Data as Truth in Journalism

BOSTON — In 1952, the news team at CBS used a computer the size of a room to successfully predict the results of the U.S. presidential election using poll data. More than half a century later, Charleston Gazette-Mail reporter Eric Eyre built his coverage of the opioid epidemic in West Virginia on data, work that won him the 2017 Pulitzer Prize for Investigative Reporting.
Eyre obtained data from the Drug Enforcement Administration and the Centers for Disease Control and Prevention for his 2016 piece. From the data, he discovered that drug firms poured more than 780 million opioid pills into West Virginia over a six-year period. Eyre also mapped overdose data in a visual aid showing how overdose concentration differed by region.

Modern society lives in an era many have declared “the information age.” In times like these, data has the potential to explain more than was humanly possible 20 years ago. Over the last two decades, advancements in digital storage have changed the rate at which people access information. The amount of data in the world is growing by the minute, as is the number of tools to handle it. And journalists happen to be the unsung heroes fit to wield those tools.

In simple terms, a dataset is information organized into a structured form, such as a spreadsheet. Once data is organized, it can be studied for hidden trends and patterns. Journalists who look for data in less obvious places uncover some of the most groundbreaking material in modern news. For example, the Boston Globe’s Spotlight investigation, which won the 2003 Pulitzer Prize for Public Service, used court records to expose the Catholic Church’s sexual misconduct settlements.

Data helps point reporters in the right direction while investigating, said Todd Wallack, an investigative reporter and data journalist for the Boston Globe. “Public records are really important; they are a critical tool for getting good stories and reporting them out,” he said.

Wallack has been a Pulitzer Prize finalist three times and is a proponent of using coding languages like Python and R to analyze datasets he has acquired through public records requests. Public records come in many forms and are accessible in order to keep people informed about government affairs and practices. Datasets such as city health inspections, nursing board disciplinary actions, and police arrest records are all available to the public, and as government bodies collect more data, journalists are more likely to learn new information from them.

But the use of data in journalism can also do harm, especially because false information spreads just as quickly as the truth. Visualization, the creation of charts, graphs, or other aids to help communicate data, is sometimes misused by reporters to support a point the data does not actually bear out.

In 2013, Fox News came under fire for using misleading graphs in its reporting. One of the first rules in visualizing data with bar graphs is to start the y-axis at zero so that differences in the data are represented proportionally. Fox broke this rule when it showed a bar graph comparing southwest border apprehensions over time. The rise in apprehensions from 165,233 in 2011 to 192,298 in 2013 is an increase of about 16 percent. However, the y-axis starts at 150,000 apprehensions and ends at 200,000, causing the 2013 bar to appear nearly three times as tall as the 2011 bar. People rely heavily on visual impressions, so our minds can be swayed by misleading visualizations like Fox’s.
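
The arithmetic behind that distortion can be checked in a few lines of Python. The sketch below is only an illustration: the function and variable names are hypothetical, and it uses nothing beyond the figures cited above. It compares the actual percent increase with the ratio of bar heights a reader would see, first with the y-axis starting at zero and then with it starting at 150,000.

```python
def percent_change(old, new):
    """Percent change from old to new."""
    return (new - old) / old * 100


def bar_height_ratio(old, new, baseline=0):
    """How many times taller the 'new' bar is drawn than the 'old' bar
    when the y-axis starts at `baseline` instead of zero."""
    return (new - baseline) / (old - baseline)


# Figures cited in the article
apprehensions_2011 = 165_233
apprehensions_2013 = 192_298
axis_start = 150_000  # where the y-axis of the criticized chart began

actual_increase = percent_change(apprehensions_2011, apprehensions_2013)
honest_ratio = bar_height_ratio(apprehensions_2011, apprehensions_2013)
truncated_ratio = bar_height_ratio(
    apprehensions_2011, apprehensions_2013, baseline=axis_start
)

print(f"Actual increase: {actual_increase:.1f}%")
print(f"Bar ratio with y-axis at zero: {honest_ratio:.2f}x")
print(f"Bar ratio with y-axis at 150,000: {truncated_ratio:.2f}x")
```

With the axis at zero, the 2013 bar would be drawn about 1.16 times as tall as the 2011 bar, matching the real increase of roughly 16 percent. With the axis starting at 150,000, the same numbers produce a bar about 2.78 times as tall.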

For aspiring journalists looking to master data, the National Institute for Computer-Assisted Reporting has several tip sheets and listservs for writers looking to learn the best data reporting practices. The institute also offers week-long boot camps where amateur reporters can learn from veterans in the field. NICAR encourages reporters of all backgrounds to experiment with data and to have fun doing so. “Any reporter can find a way to use data in their storytelling,” said Charles Minshew, director of data services at NICAR. “If you haven’t worked with data before, you can find something fun, sports statistics, for example, that is quantifiable, and just play around with the numbers.”

Despite the growing importance of data literacy in journalism, major universities show little urgency about teaching computer-assisted reporting. Charles Berret and Cheryl Phillips studied journalism curricula for their 2016 book, Teaching Data and Computational Journalism. According to the study, which surveyed 113 university journalism programs in the U.S., schools offer only 1.4 classes in data journalism on average.

“Even as some universities add classes in web development and coding, they have not kept pace with offering courses in computer-assisted reporting skills like learning how to analyze and understand data,” Phillips and Berret say.

Amid the current political atmosphere in the U.S., people are becoming disillusioned with both the government and the press. Data may be the only remedy to such unease; when handled with care, it can reshape the relations between the people, the government, and the press using hard, indisputable facts. As high schools and colleges integrate more data science courses, it is likely we will see an extension of those practices to data reporting and the future watchdogs of society.