daniel_bilar

algos and discrimination, metadata

May 5th, 2014
  1. Dear xxxx (AI researcher)
  2.  
  3. Thank you very much for your quick response and the paper. I will study it with interest and get back to you. My main interest pertains to decisions made by seemingly objective (but opaque) algorithms on (meta)data.
  4.  
  5. With advances in data analytics, and as the revelations about past US intelligence work have shown, it is now straightforward to identify criteria that indirectly and predominantly target any given cohort in the US (see, for example, what FB likes say about you: http://news.sciencemag.org/sciencenow/2013/03/facebook-preferences-predict-per.html). This means that any type of discrimination (good or bad, be it Affirmative Action, economic stimulus, health care allocations, etc.) can be surreptitiously encoded in innocuous measures that rely on seemingly neutral, objective criteria. The intended discrimination and decision 'unwrapping' is hidden in the data correlations of the stipulated criteria (this is conceptually like your ...) and cannot easily be ascertained by inspecting the end result, or even by inspecting the algorithm as a black box.
  6.  
  7. Toy example. Say you want to discriminate against people who are 120 and over. You formulate Rule 1a: "If you are 120 or over -> no healthcare for you". Such a direct discriminatory rule is easy to spot. But say these people, born in the 1890s, were mostly born at home. Rule 1b: "People born at home -> no healthcare". The decision criterion is now one level removed. It is still intended to target the 120+ year-olds, but the cohort is no longer so easy to spot.
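
A minimal sketch of the toy example in Python. The population counts are invented purely for illustration (nothing here is real data), and the names `Person`, `rule_1a`, `rule_1b` and `denial_rate` are hypothetical. It shows that Rule 1b never reads age, yet its denial rate differs sharply between the 120+ cohort and everyone else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    age: int
    born_at_home: bool


def rule_1a(p: Person) -> bool:
    """Direct rule: deny on age itself (easy to spot)."""
    return p.age >= 120


def rule_1b(p: Person) -> bool:
    """Proxy rule: never reads age, only place of birth."""
    return p.born_at_home


def denial_rate(people, rule) -> float:
    """Fraction of `people` that `rule` denies."""
    return sum(rule(p) for p in people) / len(people)


# Hypothetical population: every count below is made up for illustration.
old = [Person(121, True)] * 9 + [Person(121, False)] * 1
young = [Person(40, True)] * 10 + [Person(40, False)] * 980

rate_old = denial_rate(old, rule_1b)
rate_young = denial_rate(young, rule_1b)
print(f"Rule 1b denies {rate_old:.0%} of the 120+ cohort "
      f"and {rate_young:.1%} of everyone else")
```

Note that inspecting the text of `rule_1b` reveals nothing about age. The targeting only shows up once the rule is run against data in which birthplace and age are correlated.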
  8. ======
  9. Dear yyyy (constitutional lawyer)
  10.  
  11. I hope this email finds you well. The ----- programs, together with what I know about data analytics and high-confidence inferences, got me thinking about the ramifications for US jurisprudence. The distinction between data and metadata was material maybe 10-12 years ago. Today it is meaningless: because of correlations, I can infer much more about you than you think just by inspecting the metadata. This was of course tried manually before, but now we have the data and the analytical power to do this systematically, at 1, 2 or 3 orders removed.
  12.  
  13. It does not seem clear to me that the qualitative nomenclature used in landmark legal cases in the 2nd half of the 20th century is able to capture these data science implications. What do "preponderance" of evidence and "circumstantial" evidence mean nowadays? How can you argue that a law discriminates against Republicans when all that is done is selecting criteria that merely associate 20% more strongly with Republicans than with others? We do not even have proper terms for the inferences that are possible now.
  14.  
  15. Secondly, and exacerbating this: because of the sheer number of laws and Federal regulations, inspecting any type of data collected on an individual will yield, with p ~ 1, a violation of some law (in addition to the encoding above, this follows from a well-established mathematical consequence of base rates). This means that whenever one needs a rope to hang a given American with, a violation is very likely buried in his data, simply waiting to be mined. The Soviets/Russians did/do this with impossible (by design) tax laws against companies that tread on political turf. We have the NCTC database and more. See Ohm's Database of Ruin: http://blogs.hbr.org/cs/2012/08/dont_build_a_database_of_ruin.html
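
One way to read the "p ~ 1" claim, as a rough sketch: if each of n rules is violated independently with some small probability, the chance of at least one violation is 1 - (1 - p)^n. The independence assumption, the function name `prob_at_least_one` and the numbers below are mine, chosen only for illustration; they are not estimates of real violation rates.

```python
def prob_at_least_one(n_rules: int, p_each: float) -> float:
    """P(at least one violation), assuming each of n_rules is violated
    independently with probability p_each (a simplifying assumption)."""
    return 1.0 - (1.0 - p_each) ** n_rules


# Hypothetical inputs: a 0.1% chance per rule, for growing rule counts.
for n in (1, 100, 1_000, 10_000):
    print(n, round(prob_at_least_one(n, 0.001), 5))
```

Even a tiny per-rule probability approaches certainty once the number of applicable rules is large enough, which is the point about a violation "waiting to be mined".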
  16.  
  17. I think that the qualitative legal language used to gauge equal application of the law, and to evaluate criteria based on their effects, needs to be re-assessed quantitatively, with correlations/associations and confidence bounds. This is especially important when computer algorithms flag people/entities against a tremendous morass of regulations, regulations whose criteria were chosen to, say, single out a given cohort for better or worse treatment, or to exclude it.
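
As a sketch of what "quantitatively, with confidence bounds" could look like: the ratio of the rates at which a criterion flags two groups, with an approximate 95% interval from the standard normal approximation on the log of the ratio. The choice of this particular statistic is my assumption, as are the function name `rate_ratio_ci` and the counts, which are invented to mirror the "20% higher" figure.

```python
import math


def rate_ratio_ci(flagged_a: int, total_a: int,
                  flagged_b: int, total_b: int, z: float = 1.96):
    """Ratio of flag rates (group A over group B) with an approximate
    confidence interval (log-ratio normal approximation; z=1.96 ~ 95%)."""
    ratio = (flagged_a / total_a) / (flagged_b / total_b)
    se_log = math.sqrt(1 / flagged_a - 1 / total_a
                       + 1 / flagged_b - 1 / total_b)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical counts: the criterion flags group A 20% more often.
small = rate_ratio_ci(120, 1_000, 100, 1_000)
large = rate_ratio_ci(1_200, 10_000, 1_000, 10_000)
print("small sample:", [round(x, 3) for x in small])
print("large sample:", [round(x, 3) for x in large])
```

With the smaller hypothetical sample the interval straddles 1, so the "20% higher" association is not distinguishable from no association; with ten times the data the same ratio is clearly above 1. That is the kind of statement qualitative legal terms do not express.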
  18.  
  19. Just my two cents, but I think you should write a WSJ column on this, or a longer article.