Pillar 1: Mitigating Unintended Consequences

Communication, interpretation, and application of human data should be accurate and should consider social, political, and economic contexts and ramifications, especially when vulnerable populations are involved.

Science emphasizes quantitative methods and objective investigation as a means of reaching accurate, unbiased conclusions about the observable world. As such, many place science in a vacuum free of social, political, and economic influence and consider it the “closest” form of truth that we as humans can attain. As many witnessed during the COVID-19 pandemic, however, science and society are deeply interconnected and coevolving, for better and for worse. At certain points in history, science has been used as a tool to justify oppression and violence against groups of people. For instance, public sentiments and “scientific” theories advanced by Sir Francis Galton, Karl Pearson, Sir Ronald Fisher, and other scientists were used to justify eugenic laws and practices around the world, including the Racial Integrity Act of 1924 [1] and Buck v. Bell [2], the 1907 Indiana Eugenics Law [3], and a sweep of other legislation across North America and Europe [4] allowing the involuntary sterilization of groups of people. These ideas fueled further crimes against humanity, including the Holocaust, the forced sterilization of over 60,000 Americans (many of whom were African-American and Latina women), and other forms of medical genocide. More recently, these sentiments have been cited as motivation for racially-directed violence, as seen in the Buffalo shooting of May 2022 [5]. For science to continue facilitating advancements for the benefit of humanity, knowing this history and understanding how not to repeat it is crucial, as the effects can be detrimental and generationally traumatizing.

../../_images/virginia-eugenics.jpeg
Pamphlet of the Virginia Health Bulletin reporting passing of the Racial Integrity Act of 1924. “Virginia Health Bulletin: The New Virginia Law To Preserve Racial Integrity, March 1924,” Document Bank of Virginia, accessed August 1, 2023, https://edu.lva.virginia.gov/dbva/items/show/226.


While we as a society have evolved our moral compass, remnants of previous ethical shortcomings creep into scientific and technological practices in the digital age. As the rapid use of rich datasets drives discoveries in medicine, physics, engineering, artificial intelligence, and other sciences, few are slowing down to ask the questions that weigh innovation against morality. Most of the time, such ethical considerations are not acknowledged until cumulative damage is done. A well-known example is the report published by ProPublica in 2016, which described how a recidivism algorithm called COMPAS exhibited racial bias, rating Black offenders as more likely to re-offend than White offenders. The algorithm has been used in judicial decision-making processes in several states.

Strikingly, the reported analysis showed that Black offenders who did not go on to reoffend had been rated “high risk” by COMPAS at a rate of 44.9%, nearly double the 23.5% rate for White offenders, while White offenders who did reoffend had been rated “low risk” at a rate of 47.7%, versus 28.0% for Black offenders. How might an algorithm exhibit such suboptimal predictions and biases in its outputs? The answer depends on what one considers “suboptimal” or even “fair.” According to a report published by Northpointe Incorporated [6], the creator of COMPAS, the algorithm exhibits no bias under the statistical measures the company used to assess accuracy and fairness (accuracy equity and predictive parity). Affiliates of the United States Courts have also criticized the ProPublica report [7], stating that the study did not apply appropriate calculations in determining recidivism rates. Since the publication of the original ProPublica report, representatives of the academic, government, and private sectors have debated which measures of fairness are appropriate and how much weight they should carry in crucial decisions such as recidivism [8][9].
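To make the disagreement concrete, the sketch below computes two of the contested metrics for two groups: the false positive rate (the error-rate notion of fairness ProPublica emphasized) and the positive predictive value (the predictive-parity notion Northpointe emphasized). The confusion-matrix counts are hypothetical and chosen only for illustration; they are not the actual COMPAS data.

```python
# Hypothetical counts (not the real COMPAS data): two groups with different
# base rates of reoffense, scored by the same risk tool.

def group_metrics(tp, fp, tn, fn):
    """Return two commonly debated fairness metrics for one group."""
    false_positive_rate = fp / (fp + tn)        # among true non-reoffenders, share flagged high risk
    positive_predictive_value = tp / (tp + fp)  # among those flagged high risk, share who reoffend
    return false_positive_rate, positive_predictive_value

groups = {
    "Group A (higher base rate)": dict(tp=300, fp=200, tn=300, fn=100),
    "Group B (lower base rate)":  dict(tp=150, fp=100, tn=600, fn=100),
}

for name, counts in groups.items():
    fpr, ppv = group_metrics(**counts)
    print(f"{name}: FPR = {fpr:.2f}, PPV = {ppv:.2f}")

# With these illustrative numbers, both groups have identical PPV (0.60), so
# the tool satisfies predictive parity, yet Group A's false positive rate
# (0.40) is almost three times Group B's (0.14). Whether the tool looks
# "fair" depends on which metric you choose, which is the crux of the debate.
```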

Following up on this report, Dressel and Farid (2018) [10] compared the accuracy of COMPAS’s recidivism predictions to predictions made by human participants in an online survey. They found that human prediction was slightly, but not significantly, more accurate than COMPAS (67% vs. 65.2%). Using a simple classifier built on just two features, age and total number of previous convictions, they also reproduced a racial bias similar to the one reported by ProPublica. Out of context, this could be viewed at most as a major flaw in the data inputs and outputs of the algorithm (garbage in, garbage out). A more contextual interpretation of how either of these factors leads to a racial bias in recidivism ratings is supported by the numerous studies documenting racial bias in arrests and confrontations with law enforcement [11], as well as the track record of controversial police encounters in America. Thus, it is important to understand the greater macrocosm from which data are derived in order to understand what the data may mean. Furthermore, it is imperative to acknowledge the strengths and limitations of statistical models and measures in applications involving human behavior.

../../_images/Dresseletal.png
Human (no-race condition) versus COMPAS algorithmic predictions. Adapted from Dressel and Farid, 2018.
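As a rough illustration of how simple such a classifier can be, the sketch below fits a logistic regression on just two features, age and number of prior convictions, in the spirit of the model Dressel and Farid describe. The data are synthetic and generated purely for illustration; this is neither the Broward County COMPAS dataset nor their exact model.

```python
# Minimal two-feature recidivism classifier sketch (synthetic data only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 5_000

# Two features: age and number of prior convictions (made-up values).
age = rng.integers(18, 70, size=n)
priors = rng.poisson(2, size=n)

# Synthetic outcome: in this toy world, younger defendants with more priors
# reoffend more often. Real-world labels would carry real-world biases.
logit = 1.5 - 0.05 * age + 0.4 * priors
reoffended = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([age, priors])
X_train, X_test, y_train, y_test = train_test_split(X, reoffended, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
print("test accuracy:", round(accuracy_score(y_test, model.predict(X_test)), 3))
```

Even a model this simple inherits whatever disparities are baked into its inputs: if prior convictions partly reflect biased policing, the predictions will reflect that bias as well.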

While the use of tools like COMPAS remains a subject of ongoing debate, important questions and considerations have emerged from this discussion. In particular, how do tools like COMPAS affect the decisions of judges presented with recidivism predictions? Do these tools significantly impact one’s right to due process? Does automation bias work in concert with the implicit biases already prominent in the justice system [12]? Should we be using algorithmic predictions about human behavior in these circumstances at all?

Past ethical failures within the science and technology sectors often stemmed from not fully considering the damaging ramifications that certain studies, methods, and conclusions would have on society. Moreover, it is impossible to anticipate the myriad ways scientific findings may positively or negatively influence society. However, when it comes to working with human-derived data, or any data that can greatly affect human lives, these damages can be mitigated by staying well informed about the populations from which the data are collected and by using ethical and contextual discernment when interpreting, communicating, and utilizing those data. As data scientists, it is our job to ground these decisions about data in ethics and morality.


[1] General Assembly. “Preservation of Racial Integrity (1924).” Encyclopedia Virginia, Virginia Humanities, 7 Dec. 2020. Accessed 1 Dec. 2022.

[2] Buck v. Bell, 274 U.S. 200 (2 May 1927). https://www.loc.gov/item/usrep274200/.

[3] Laws of Indiana, 1907, pp. 377-78 (B050823). https://www.in.gov/history/state-historical-markers/find-a-marker/1907-indiana-eugenics-law/.

[4] Reilly, Philip R. “Eugenics and Involuntary Sterilization: 1907-2015.” Annual Review of Genomics and Human Genetics, vol. 16, 2015, pp. 351-68. https://doi.org/10.1146/annurev-genom-090314-024930.

[5] Office of the New York State Attorney General. Investigative Report on the Role of Online Platforms in the Tragic Mass Shooting in Buffalo on May 14, 2022. 18 Oct. 2022. https://ag.ny.gov/sites/default/files/buffaloshooting-onlineplatformsreport.pdf.

[6] Dieterich, William, et al. “COMPAS Risk Scales: Demonstrating Accuracy Equity and Predictive Parity.” 8 July 2016. https://go.volarisgroup.com/rs/430-MBX-989/images/ProPublica_Commentary_Final_070616.pdf.

[7] Flores, Anthony, et al. “False Positives, False Negatives, and False Analyses: A Rejoinder to ‘Machine Bias: There’s Software Used Across the Country to Predict Future Criminals. And It’s Biased Against Blacks.’” Federal Probation, vol. 80, no. 2, Sept. 2016. https://www.uscourts.gov/federal-probation-journal/2016/09/false-positives-false-negatives-and-false-analyses-rejoinder.

[8] Corbett-Davies, Sam, and Sharad Goel. “The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning.” arXiv, 14 Aug. 2018. https://arxiv.org/abs/1808.00023.

[9] Pleiss, Geoff, et al. “On Fairness and Calibration.” Advances in Neural Information Processing Systems, vol. 30, Curran Associates, Inc., 2017. https://papers.nips.cc/paper/2017/hash/b8b9c74ac526fffbeb2d39ab038d1cd7-Abstract.html.

[10] Dressel, Julia, and Hany Farid. “The Accuracy, Fairness, and Limits of Predicting Recidivism.” Science Advances, vol. 4, no. 1, Jan. 2018, eaao5580. https://doi.org/10.1126/sciadv.aao5580.

[11] Eberhardt, Jennifer L. “Strategies for Change: Research Initiatives and Recommendations to Improve Police-Community Relations in Oakland, Calif.” Stanford University, SPARQ: Social Psychological Answers to Real-world Questions, 2016.

[12] Rachlinski, Jeffrey J., and Sheri L. Johnson. “Does Unconscious Racial Bias Affect Trial Judges?” Notre Dame Law Review, vol. 84, no. 3, 2009, p. 1195. http://scholarship.law.nd.edu/ndlr/vol84/iss3/4.