2013-Competition Winner

Everybody Counts: How Scientists Document the Unknown Victims of Political Violence

by Ann Harrison (Human Rights Data Analysis Group)



Stories about the human cost of political conflict confront us every day. Reports of deaths and human rights violations spark moral condemnation and calls for action. Uncovering the truth about the scope of this suffering is essential. Military intervention, political sanctions, humanitarian aid and judicial investigations all hinge on accurate data gathering. The press, citizen and government groups collect information about these violations, but their accounts are often distorted. They often reflect only incidents that have been witnessed or single data sources that are not reliable narrators of the true patterns of violence. Inaccurate data collection can produce inflated claims or false accounts that undermine efforts to halt abuses and prosecute perpetrators.

Everybody Counts: How Scientists Document the Unknown Victims of Political Violence documents the work of the Human Rights Data Analysis Group (HRDAG), a team of scientists who use statistics, computer science and mathematical demography to analyze state-sanctioned violence. For more than twenty years, these researchers have set international standards of scientific rigor to help overcome political arguments about human rights records. They partner with war crimes prosecutors, truth commissions, human rights organizations, and the United Nations to produce scientifically defensible evidence that holds government authorities and other perpetrators accountable for crimes against humanity.

The Human Rights Data Analysis Group also trains students who study computer science, biostatistics, political science, epidemiology, applied and mathematical statistics, and demography. Together, they spend months and sometimes years in the field collecting, securing and analyzing human rights information from multiple data sources. Everybody Counts tells the story of this fieldwork and how the investigators deploy statistical methods to analyze not only documented deaths and violations, but also the undocumented killings and disappearances that nobody witnesses.

This book will help students and interested readers think critically about information that describes the scope and magnitude of human rights violations. It will describe how these investigators invent and extend scientific methodologies that allow them to frame research questions and conduct accurate analysis. Readers will follow these scientists around the world as they confront a wide range of political conflicts and examine evidence of mass killings, genocide, deportations, ethnic cleansing, systematic detention and torture. These stories are organized into a series of case studies that show how the scientists analyze biased or incomplete information. The case studies show how bias is often introduced when groups gathering primary data have limited or selective access to populations or regions. These cases will reveal how collecting data about only the “most important” violations can obscure other abuses and why observers often encounter multiple records of the same event. Everybody Counts will describe how the scientists account for these errors and make accurate statements about political violence that stand up to scientific and legal scrutiny.

The Human Rights Data Analysis Group (hrdag.org) includes approximately thirty scientists of different nationalities trained in a range of scientific disciplines. While the researchers are dispersed around the world, the group is based in San Francisco, California. Founded in 1991, HRDAG has been a project of several U.S. non-profit organizations, first the American Association for the Advancement of Science (AAAS), and later Benetech, a non-profit Silicon Valley technology company. In February 2013, the group became a project of Community Partners, a Los Angeles-based nonprofit that develops programs for the public good.

For more than twenty years, these scientists have worked closely with leading experts from academia and industry to research new methodologies and participate in formal academic mechanisms for validating knowledge. They do not take sides in political disputes or support the advocacy of any particular government or policy. The researchers are committed to uncovering the truth and the application of science in support of human rights, but they are not neutral. They are always in favor of human rights and support the protections established in the Universal Declaration of Human Rights and the International Covenant on Civil and Political Rights.

Dr. Patrick Ball, who received his doctorate in political sociology, is the Executive Director of the Human Rights Data Analysis Group. Dr. Megan Price, who earned her doctorate in biostatistics, serves as the group’s Director of Research. This book introduces Ball, Price, and the researchers they work with. These profiles will put to rest any stereotypes that readers may have of data analysts as deskbound number crunchers. The narrative includes stories of how these scientists secured key records from rioters, managed threats from military officials, and gathered critical data in unstable and violent regions. The scientists of the Human Rights Data Analysis Group are the Indiana Joneses of their profession, immersing themselves in the communities they study to develop a deep understanding of politically charged events.

The author of this book, Ann Harrison, serves as the Senior Science Writer for the Human Rights Data Analysis Group, documenting their projects for the press, the public, and human rights partners. A journalist who studied science reporting at the Columbia University School of Journalism, Harrison joined the group in 2006 after covering their analysis of human rights violations in Timor-Leste for Wired News (http://www.wired.com/science/discoveries/news/2006/02/70196?currentPage=all). In her story about HRDAG, Harrison quotes Patrick Ball explaining why he and his colleagues focus on accurate analysis of political violence. “If people can’t be remembered by name because they are lost to social memory, the least we can do is remember how many people died as a result of the conflict,” said Ball. “By having an accurate statistical picture of the suffering, we can draw conclusions about what the causes of violence might have been and identify likely perpetrators with a claim based on thousands of witnesses.”

A Global Scientific Detective Story

Armed only with laptops and scientific methodologies, Ball and his fellow investigators have analyzed human rights data for nine official truth commissions in South Africa, Haiti, Guatemala, Perú, Ghana, Sierra Leone, Liberia, El Salvador and Timor-Leste. The Human Rights Data Analysis Group has provided critical scientific analysis to three international criminal tribunals, including the International Criminal Tribunal for the Former Yugoslavia, where Ball was an expert witness in the prosecution of former Serbian President Slobodan Milošević. The scientists have also analyzed information about human rights violations for United Nations Field Missions in Timor-Leste, Guatemala, and the Democratic Republic of Congo, and for non-governmental organizations in more than thirty countries.

This book will offer photographs, illustrations, footnotes and citations to the published analysis and peer-reviewed papers produced by members of the Human Rights Data Analysis Group. It will document how these scientists support their findings with reproducible techniques based on available evidence within a range of known probability. The researchers believe that transparency helps generate trust, and their findings, software tools and data standards are available online, open to review and critique. Criticism and discussion of these methodologies by members of the scientific community will be included in the case histories to offer a balanced view of the scientists’ successes and failures.

In a series of ten short chapters, this book will profile field work conducted by the scientists in El Salvador, Guatemala, Kosovo, Timor-Leste, Liberia, Colombia, and Syria. Everybody Counts will bring readers along as the analysts sift through witness testimonies, legal depositions, morgue and cemetery records, exhumation reports, prison documents, career information on military officers, historical police archives, eyewitness interviews, and customs and immigration records. The chapters will show how these scientists work together reviewing data and considering research problems. As a scientific biography, Everybody Counts will look under the hood at the engine of inquiry. It will document how these scientists frame questions and analyze information to produce defensible findings. Each chapter is framed as a case study that includes a concise history of the conflict, evidence of violations, hypotheses about patterns of violence, answerable questions of fact, methodologies used, and the impact of this research on reconciliation and accountability. For those who teach or study quantitative social science methodologies, these stories will compare the challenges posed by different projects and evaluate the progression of the data analysis tools. Students will learn how to generate hypotheses with available data and pose answerable questions that serve a useful purpose within the context of large-scale conflicts. This information will be presented in language accessible to all readers. Each chapter will include a technical appendix that will offer more detailed data about methodologies and findings together with links to relevant scientific papers.

Chapter Summaries

Chapter One

Why is It So Difficult to Accurately Count Victims of Political Violence?

This chapter will introduce the Human Rights Data Analysis Group and examine why obvious information about human rights violations does not reveal true patterns of violence. It will explore how undercounting, selection bias, reliance on limited data sources and other errors can skew findings about state-sanctioned abuses. Readers will learn why it is critical to examine not only violations that are directly observed, but also undocumented violations that are estimated based on available information. After considering why the truth matters, this chapter introduces methodologies used by the scientists to remove their prejudices from the findings. It also discusses the critical importance of working with and respecting local partners who often take great risks to collect primary data. The chapter concludes with a look at how data analysis techniques are developed to interpret quantitative information and how scientifically defensible findings developed from this data can be used to determine a chain of command and responsibility for crimes of policy.

Chapter Two

El Salvador 1991, Who Did What To Whom?

Chapter Two documents an attempt by Dr. Patrick Ball to capture and accurately reflect the complex relationships between people and events during a 12-year period of state-sponsored violence in El Salvador. Members of the Salvadoran military committed tens of thousands of killings during the country’s civil war, which raged from the late 1970s until 1990. While working for a peace organization in El Salvador in 1991, Ball was asked by a colleague at a human rights group to help organize a large collection of human rights testimonies. Ball created the “Who Did What To Whom” (WDWTW) model for examining human rights data and used this system to build a structured, relational database of violations reported in more than 9,000 testimonies to the Salvadoran Human Rights Commission.

To determine who was most responsible for these human rights violations, Ball created another structured database showing career histories of the 400 most senior Salvadoran military officials gleaned from newspaper accounts and declassified U.S. government documents. Ball linked the two datasets together and compared the dates during which individual officials led specific military units with the dates of documented human rights violations. Ball compared this database to another dataset of Salvadoran military officials and identified 100 officers who led units involved in the worst human rights violations. When their names were reported in the press, these officers were forced into retirement. The officers sued for defamation, but withdrew after the testimonies and Ball’s database were presented to a Salvadoran court.
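The linking step described here, matching the period in which an officer commanded a unit against the dates of violations attributed to that unit, can be pictured with a small Python sketch. It is only an illustration: the record layouts, the names Tenure, Violation and violations_per_officer, and all of the sample data are invented for this example and are not taken from Ball’s databases.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Tenure:
    """Hypothetical career-history record: who led which unit, and when."""
    officer: str
    unit: str
    start: date
    end: date  # inclusive


@dataclass(frozen=True)
class Violation:
    """Hypothetical violation record attributed to a unit on a given day."""
    unit: str
    when: date


def violations_per_officer(tenures, violations):
    """Count violations that fall inside each officer's command period."""
    counts = Counter()
    for v in violations:
        for t in tenures:
            if t.unit == v.unit and t.start <= v.when <= t.end:
                counts[t.officer] += 1
    return counts


# Invented sample data, for illustration only.
tenures = [
    Tenure("Officer A", "Unit 1", date(1982, 1, 1), date(1983, 12, 31)),
    Tenure("Officer B", "Unit 1", date(1984, 1, 1), date(1985, 12, 31)),
    Tenure("Officer C", "Unit 2", date(1982, 1, 1), date(1985, 12, 31)),
]
violations = [
    Violation("Unit 1", date(1982, 6, 15)),
    Violation("Unit 1", date(1983, 3, 2)),
    Violation("Unit 1", date(1984, 7, 9)),
    Violation("Unit 2", date(1983, 11, 20)),
]

counts = violations_per_officer(tenures, violations)
for officer, n in counts.most_common():
    print(officer, n)
```

A real analysis would also have to handle uncertain dates, overlapping commands, and violations attributed to more than one unit, none of which this sketch attempts.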

Chapter Three

Guatemala 1993 – 1999, Using MSE to Estimate the Number of Deaths

Propelled by the impact of his data analysis in El Salvador, Ball applied the WDWTW model to human rights information in other countries. Throughout the 1990s, Ball worked at the American Association for the Advancement of Science (AAAS) analyzing large-scale human rights violations in Ethiopia, South Africa, and Haiti. Together with senior scientific colleagues, including statistician Dr. Herb Spirer, Ball developed new methods for analyzing state-sanctioned violence. This chapter tells the story of what happened when these scientists were asked by a group of non-governmental organizations in Guatemala to gather and analyze information about human rights violations there. Working with the International Center for Human Rights Research in Guatemala, the scientists collected evidence from testimonies and press reports of more than 43,000 human rights violations during the country’s 36 years of armed conflict. Ball assembled this information into a database protected by PGP encryption software.

After the 1996 UN-brokered peace accords in Guatemala, Ball’s database became a valuable source of information for the Commission for Historical Clarification which asked the AAAS to determine how many people were killed in Guatemala from 1960 to 1996. Ball created an information management system to analyze the reported human rights violations and patterns of violence. He developed methods to remove duplicate reports and control for statistical bias in observed human rights data. This allowed him to apply a statistical technique suggested by statistician Fritz Scheuren called multiple systems estimation (MSE) that uses overlapping accounts to estimate the number of unreported events. Using MSE, Ball and his colleagues estimated the number of unreported victims by comparing three databases of human rights information. The analysis determined that approximately 132,174 killings took place in Guatemala between 1978 and 1996 with a standard error of 6,568. These included approximately 84,468 killings that were never reported. The researchers also determined that acts of genocide were committed against the indigenous Mayan communities.
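The intuition behind using overlapping accounts to estimate unreported events can be shown with the simplest capture-recapture case, sketched below in Python. This is not the method used in Guatemala, which compared three databases and required more elaborate modeling. The sketch uses the classic two-list Lincoln-Petersen estimator and assumes the two lists are independent, that every victim is equally likely to appear on a given list, and that records are matched without error. The function name and the victim identifiers are invented for illustration.

```python
def lincoln_petersen(n_a, n_b, n_both):
    """Two-list estimate of the total population size.

    n_a    -- number of victims on list A
    n_b    -- number of victims on list B
    n_both -- number of victims found on both lists
    """
    if n_both == 0:
        raise ValueError("no overlap between lists; the estimate is undefined")
    return n_a * n_b / n_both


# Invented identifiers standing in for matched, de-duplicated victims.
list_a = {"v01", "v02", "v03", "v04", "v05", "v06"}
list_b = {"v05", "v06", "v07", "v08"}

overlap = list_a & list_b          # victims documented by both sources
documented = list_a | list_b       # victims documented by at least one source

estimated_total = lincoln_petersen(len(list_a), len(list_b), len(overlap))
estimated_undocumented = estimated_total - len(documented)

print("documented:", len(documented))
print("estimated total:", estimated_total)
print("estimated undocumented:", estimated_undocumented)
```

The smaller the overlap between the lists relative to their sizes, the larger the estimated number of victims who appear on neither list. With three or more lists, analysts can relax the independence assumption, which is one reason additional data sources matter.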

Chapter Four

Kosovo 1999, Using MSE to Examine Political Claims

Ball expanded his use of multiple systems estimation (MSE) to clarify the history of a deadly conflict in Kosovo. The violence began in 1989 when Serbian President Slobodan Milošević revoked Kosovo’s autonomous status within the Republic of Serbia, triggering fighting between Kosovar Albanians and the Yugoslav government. Allegations of widespread and systematic human rights violations were made against Serbian forces and NATO intervened to repel Serb forces from Kosovo. Ball and Scheuren gathered data from Albanian border crossings and other sources in the region. They used this information to examine the claim by the Yugoslav government that actions by NATO and the Kosovo Liberation Army (KLA) opposition sparked the mass migration of Albanian refugees from Kosovo. Ball applied MSE to examine killings in Kosovo and compare this data with information about the scope of the conflict over time. This analysis was used to prosecute former Yugoslav president Slobodan Milošević at the International Criminal Tribunal for the Former Yugoslavia at The Hague, where Ball confronted the former leader with statistical evidence of his alleged war crimes. The analysis found that the military activities of the KLA and bombardment by NATO forces were inconsistent with the observed patterns of refugee flow and the deaths of displaced people. The researchers concluded that the statistical evidence was consistent with the hypothesis that Yugoslav forces conducted a systematic campaign of killings and expulsions in Kosovo. This information was accepted by the court, but in another legal case against a Serbian military leader, the judges rejected these arguments.

Chapter Five

Timor-Leste 2006, Found Data and Innovative Surveys Uncover the Truth

Large-scale human rights violations in Timor-Leste began in 1975 when the Indonesian government invaded the small island and continued until Timorese independence in 1999. Disappearances, torture, forced displacement and extra-judicial killings took place during the Indonesian occupation, compounded by a severe famine. Estimates of deaths ranged from 50,000 to 200,000, but individual sources reflected only a fraction of total fatalities. The Commission for Reception, Truth and Reconciliation in East Timor asked the Human Rights Data Analysis Group to investigate abuses during the conflict. This chapter describes how Ball and scientists Romesh Silva and Scott Weikart discovered how inherent biases in Timorese data sources could be overcome by comparing three innovative data sets: narrative testimonies, Silva’s census of public graveyards, and a retrospective mortality survey of Timorese households. No truth commission had ever surveyed the population about past killings and no human rights project had used gravestone information to estimate deaths. The researchers matched reported deaths across the datasets and used MSE to determine the pattern and magnitude of conflict-related deaths and violations. Their analysis revealed the surprising truth that far more people died in Timor-Leste as a result of famine than in the fighting itself. The researchers found that 102,800 (+/- 11,000) Timorese died as a result of human rights violations during the conflict. Of these dead, 18,600 people were murdered or disappeared and 84,200 citizens died of hunger and illness in excess of what would be expected during peacetime.

Chapter Six

Liberia 2009, Coding Testimony to Determine Accountability for War Crimes

In July 2009, the Human Rights Data Analysis Group concluded a three-year project with the Liberian Truth and Reconciliation Commission (TRC) to help clarify Liberia’s violent history and hold perpetrators accountable. A military coup in 1979 sparked 24 years of civil war in Liberia, where warring factions subjected civilians to severe human rights abuses. The TRC sought to determine whether these violations represented a systematic pattern or policy. This chapter describes how the researchers developed a statistical analysis of the more than 17,000 victim and witness statements collected by the TRC and applied Ball’s “Who Did What To Whom?” methodology. One of the team’s scientists, Kristen Cibelli, worked with TRC staff in Liberia to develop a coding process by which countable units, such as violation types, victims and perpetrators, were identified in statements and transcribed to coding forms. This process converted qualitative testimony into data that supported quantitative analysis to detect patterns of violence. This analysis found that former Liberian president Charles Taylor – who was tried in The Hague for war crimes and crimes against humanity in Sierra Leone’s civil war – also led the Liberian rebel group responsible for the largest number of violations in Liberia. In 2012, Taylor was found guilty and sentenced to 50 years in prison for crimes in Sierra Leone.
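To make the idea of countable units concrete, the following Python sketch shows how coded records might be tallied once statements have been transcribed to coding forms. The field names, group labels and records are hypothetical and do not reproduce the TRC’s coding scheme. The point is only that a single statement can yield several coded violations, each of which becomes one row that can be counted.

```python
from collections import Counter

# Hypothetical coded records: one row per violation identified in a statement.
coded_violations = [
    {"statement": "S1", "victim": "V1", "violation": "killing", "perpetrator": "Group X"},
    {"statement": "S1", "victim": "V2", "violation": "forced displacement", "perpetrator": "Group X"},
    {"statement": "S2", "victim": "V3", "violation": "killing", "perpetrator": "Group Y"},
    {"statement": "S3", "victim": "V4", "violation": "forced displacement", "perpetrator": "Group X"},
    {"statement": "S3", "victim": "V5", "violation": "killing", "perpetrator": "Group X"},
]


def tally(records, *fields):
    """Count coded records by the chosen combination of fields."""
    return Counter(tuple(r[f] for f in fields) for r in records)


by_perpetrator = tally(coded_violations, "perpetrator")
by_perpetrator_and_type = tally(coded_violations, "perpetrator", "violation")

for key, n in by_perpetrator_and_type.most_common():
    print(key, n)
```

Counts like these describe only what was reported in the statements. They are a starting point for analysis, not an estimate of everything that happened.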

Chapter Seven

Colombia 2010 – 2012, Advancing Methods to Count Deaths and Missing People

The scientists of the Human Rights Data Analysis Group applied new statistical methodologies to investigate deaths and missing people in Colombia. A report released by Tamy Guberek, Daniel Guzmán, Megan Price, Kristian Lum, and Patrick Ball documented patterns of violence in the Colombian state of Casanare. The researchers used MSE to analyze killings and disappearances recorded in 15 datasets provided by judicial, government, security, forensic, and civil society organizations. Matching cases that appeared in more than one dataset, the statisticians modeled the process by which violations were recorded and estimated the number of undocumented killings and disappearances in Casanare. The study found that disappearances in southern Casanare peaked precisely at the time when reported killings in that region were lowest. The report also observed disappearances in different times and places than killings, suggesting that perpetrators changed their methods during the conflict. This study built on the researchers’ earlier report on missing people in Casanare that helped guide investigations of missing people across Colombia. Their findings estimated that thirty to forty percent of an estimated 2,553 missing people in Casanare were unreported.

Chapter Eight

Guatemala 2011, Sampling Methods Help Convict Perpetrators

The Human Rights Data Analysis Group returned to Guatemala in 2006 to analyze a sample of the estimated 46 million records discovered in the archive of the disbanded Guatemalan National Police. Statisticians Daniel Guzmán, Romesh Silva, Patrick Ball and Tamy Guberek, together with Paul Zador and Gary Shapiro of the American Statistical Association, developed a multi-stage random sample of the archive to get a clearer picture of its contents. Sampled documents shed light on the disappearance of Guatemalan union leader Edgar Fernando García, who vanished in 1984 while in police custody. Expert testimony by Guzmán provided key evidence in the 2010 conviction of two former police officers who were found guilty of disappearing and murdering García.

Guzmán’s testimony was based on three years of coding key variables from random probabilistic samples of the archive that allowed him to make estimates about the contents of all the documents. Through his analysis, Guzmán was able to compare the percentage of documents about García known by different police units and support the prosecution’s argument that high-level officers were involved in his disappearance. Guzmán’s statistical analysis of command responsibility provided critical information used to support the arrest of the former commander of the Guatemalan National Police, Héctor Rafael Bol de la Cruz, who was accused of complicity in García’s disappearance. In September 2013, Ball testified in a Guatemalan court in the trial of Bol de la Cruz, who was convicted and sentenced to 40 years in prison for the kidnapping and disappearance of García.
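How a coded random sample supports statements about a whole archive can be sketched in a few lines of Python. The example below is a deliberately simplified stand-in: it uses a one-stage stratified sample rather than the multi-stage design the statisticians developed, and the unit names, stratum sizes and coded flags are invented. Each stratum’s sample share is weighted by the number of documents in that stratum to estimate a total for the archive.

```python
def estimate_from_strata(strata):
    """Estimate how many documents carry a coded flag, per stratum and overall.

    strata maps a stratum name to (number of documents in the stratum,
    list of 0/1 flags coded for the randomly sampled documents).
    """
    shares = {}
    total_docs = 0
    estimated_flagged = 0.0
    for name, (size, flags) in strata.items():
        if not flags:
            raise ValueError(f"stratum {name!r} has no sampled documents")
        share = sum(flags) / len(flags)   # sample proportion in this stratum
        shares[name] = share
        total_docs += size
        estimated_flagged += size * share  # weight by stratum size
    return shares, estimated_flagged, estimated_flagged / total_docs


# Invented numbers: stratum size and flags coded from a small random sample.
strata = {
    "Unit A": (1000, [1, 0, 0, 1]),
    "Unit B": (3000, [0, 0, 1, 0]),
}

shares, estimated_flagged, overall_share = estimate_from_strata(strata)
print(shares)
print(estimated_flagged, overall_share)
```

Comparing the per-stratum shares is the kind of contrast between units that the chapter describes. A real analysis would use far larger samples and would report a margin of error alongside every estimate.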

Chapter Nine

Syria 2012, Modeling Multiple Datasets in an Ongoing Conflict

The struggle between Syrian President Bashar al-Assad’s regime and opposition forces has generated extensive global press coverage, but few accurate estimates of casualties. In January 2013, the United Nations Office of the High Commissioner for Human Rights released a report on the number of conflict-related killings in Syria. The report is based on statistical analysis conducted by HRDAG scientists Megan Price, Jeff Klingner and Patrick Ball. This chapter examines these findings which compared information from a database collected by the Syrian government with six databases compiled by Syrian human rights activists and citizen journalists. After examining information about fully identified victims, the researchers integrated more than 147,000 records of documented deaths and applied data mining techniques to match duplicate records. They concluded that 59,648 named people were killed in the Syrian conflict between March 2011 and November 2012. The UN released a statement describing the report as an exhaustive analysis, but the Syrian government condemned the findings as politically biased.
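Matching duplicate records across several lists can be illustrated with a minimal Python sketch. It assumes that two records describe the same person when their normalized name, date and place of death agree exactly. That rule is far cruder than the matching techniques the researchers applied, and the field names and records below are invented for illustration.

```python
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lower-case, strip accents and collapse whitespace for comparison."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.lower().split())


def group_duplicates(records):
    """Group records that share a normalized (name, date, place) key."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalize(rec["name"]), rec["date"], normalize(rec["place"]))
        groups[key].append(rec["source"])
    return groups


# Invented records from three hypothetical lists.
records = [
    {"source": "list_1", "name": "Example  Person", "date": "2012-03-01", "place": "Town A"},
    {"source": "list_2", "name": "example person", "date": "2012-03-01", "place": "town a"},
    {"source": "list_2", "name": "Another Persón", "date": "2012-03-04", "place": "Town B"},
    {"source": "list_3", "name": "Another Person", "date": "2012-03-04", "place": "Town B"},
    {"source": "list_3", "name": "Third Person", "date": "2012-04-10", "place": "Town A"},
]

groups = group_duplicates(records)
unique_victims = len(groups)
print(len(records), "records describe", unique_victims, "unique victims")
```

The grouped output also records which sources documented each victim. That overlap pattern is the input that multiple systems estimation needs in order to estimate deaths no source recorded.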

The scientists are now using examples of fabricated and inaccurate reports of deaths in Syria flagged by critics to build a model of records that should be excluded from analysis. They are also using MSE to estimate the number of killings that have not been documented. When combined, the results of the two models will reveal the true number of killings and patterns of violence in Syria based on a quantitatively accurate analysis of trends over time and patterns in geographic areas. This information will support transitional justice mechanisms such as victim compensation, truth commissions, vetting of public officials, and criminal prosecutions based on trustworthy evidence.

Chapter Ten

Conclusions and the Leading Edge of Research

In crimes against humanity, researchers must show that these actions follow a pattern. In the absence of firm evidence, efforts to clarify facts are paralyzed by competing claims of historical accuracy. The final chapter of this book documents how the researchers of the Human Rights Data Analysis Group have worked through scientific challenges to create data about state-sponsored violence that supports prosecutions and policy decisions. The scientific paradigms established by these scientists have transformed emotional debates about politics, policies and history into evidence-based discussions about science and methodology. By highlighting the weakness of claims based on inadequate data or improper calculations, the researchers demonstrate that careful analysis of multiple data sources can help embattled populations move past political polarization.

For the families of people who disappear during conflicts, this analysis can help provide clues to the fate of loved ones. For policy analysts, it offers an opportunity to understand the role of political decisions in disappearances, kidnappings and assassinations. These findings also present a view from the perpetrators’ perspective and help investigators understand how a national leader, or their security forces, can turn against their own citizens. Everybody Counts concludes with the assertion that rigorous human rights arguments offer the greatest opportunity to halt cycles of violence and create lasting social change during difficult-to-foresee periods of political transition. Unless the human rights community is ready with meticulously analyzed information about past abuses, opportunities for official acknowledgement, accountability and reform are lost. The goal of this book is to inspire an upcoming generation of scientists to develop new tools that address state-sponsored violence and present clearly explained, scientifically defensible data that refutes the denials of apologists, murderers and despots.