Big Data, Big Problems


Victor Porcelli, Staff Writer

Illustration by Victor Wu

In my Investigating Journalism course, Katherine Boss, the Librarian for Journalism, gave a presentation on the various resources at our disposal as NYU students, focusing mainly on the $22 million worth of databases we have access to. That staggering amount of money gives us access to an even more staggering amount of information. Later, while completing a quiz on the different databases, I felt the power of the knowledge at my fingertips: U.S. Census data, issues of The New York Times from decades ago, case data from a seemingly infinite number of legal cases and more. My geeking out may not be a true indicator of significance, but I am not alone in my excitement for a nice, thick data set. From political campaigns to advertising agencies, big data has played an essential role in shaping modern society.

Big data is what it sounds like: extremely large data sets. It provides enough information to let people predict human behavior more accurately than ever before. Advertisers have been quick to use big data to their advantage. Companies like Netflix use it to build algorithms that optimize the shows they suggest to their users, and others use it to create more accurate and personalized advertisements. Companies can better predict the messages consumers want to hear and form their marketing strategies around this information. Because of these benefits, 66 percent of digital advertising spending is projected to go to ads on Google, YouTube, Facebook and Instagram by 2019. However, this creates an ethical problem, as some users do not like the idea of their data being used by companies or sold to them.

Facebook in particular has had a long history of selling user information to advertisers, as well as general problems with user privacy. These actions are worrisome, because not only do they allow advertisers to use big data for behavioral targeting — using personal information on consumers to tailor advertisements to them — they do so without user consent.

Facebook’s looseness with user data does not stop with advertisers. During the 2016 election, Donald Trump’s campaign, through a company called Cambridge Analytica, was able to use information from 87 million Facebook accounts to create a targeted political ad campaign. This is but one of many ways Trump’s campaign used big data; it also tailored its messages based on reactions to Trump’s social media. Facebook also served as a conduit for Russian trolls, who bought many political ads on the platform, most of them far-right and often meant to discredit Hillary Clinton. In this way, behavioral targeting and disinformation became the main strategies for winning an election.

Unfortunately, big data threatens our democracy in more ways than one. Gerrymandering has been revolutionized by applications like Maptitude, which, combined with both public and purchased data, allow mapmakers to redistrict scientifically in ways that tremendously favor their party. Detailed information on voters, and an application that integrates it all, gives them power like never before. In many states, such as Pennsylvania and Ohio, the resulting representation does not reflect the votes. A majority of Democratic votes results in five more seats for Republicans, and a 50-50 state results in four more, tipping elections, at least for the House of Representatives, in Republicans’ favor. Big data has also resulted in infringements on the right to privacy, as government programs like PRISM and Upstream have been reinstituted, allowing the National Security Agency to gather, and the FBI to access, data on individuals from social media sites and the internet as a whole, with no warrant necessary. With privacy threatened and the political process more corrupt than ever, these abuses of big data seem to be the price of the technological innovation we often obsess over as a society.

Yet big data can be used for good. Researchers use it to do amazing things, from creating a computer simulation of the universe to improving cancer treatment. Studies analyzing equal opportunity, such as the effect of race on mobility, have used big data, and their findings could be used to push for social and policy change that furthers equality in our society. Big data can also be used in the medical field to create and share patient profiles, allowing doctors to group similar patients together and figure out which treatment works best for them. This makes medical care more personalized, and so potentially more effective. Researchers, instead of using a small sample and attempting to generalize it to the population, can use larger and larger samples, minimizing or eliminating the need to generalize.

Overall, big data is simply a shiny new tool that can be used in a variety of ways. As NYU students, we will soon be the ones to wield this tool: to do the research, lead the studies, create marketing plans and strategize political campaigns. It is up to us to decide how big data should be used, and how it should not.

A version of this article appeared in the Monday, April 9 print edition. Email Victor Porcelli at [email protected].