How Data Happened

Data and the Inevitable

This look at the history of how data assumed its place of massive world importance is instructive, but its vision for an alternative to data’s role is fairly thin.

Review by Jake Casale

If you access a computer on at least a semi-regular basis in 2023, chances are you’ve stumbled across a think piece ruminating on the specter of artificial intelligence… and one heralding its seemingly limitless potential. There are many more perspectives that mix these two poles, flitting between handwringing and awe as they try to sort out the future impact of these new technologies on everything from mundane office tasks to the trajectory of whole industries. Personally, I find the back-and-forth not only tiring, but also unmoored from historical referent and typically awash in unexamined assumptions. I’m a professional in data analytics—for me, that means that I am perpetually haunted by the compulsion to question any big claim that human beings make, and I’ve found a way to channel that into gainful employment. (I wish I were fully joking.) Yet I’m often troubled by my tenuous grip on what might be the most important context of all—how my own line of work came to be, what it really is, and where it could be going.

How Data Happened began to fill in some of this framework for me. Written by two Columbia professors, Chris Wiggins and Matthew Jones—the former of applied mathematics (who also works as a data scientist for the New York Times), the latter of history—it traces how the analysis of quantitative information came to be viewed as essential for understanding people and societies. While this certainly involves accounts of scientific and technological advancements, such as the development of infrastructure to support big data processing during World War II, it becomes clear early on that this narrative is primarily a history of imagination—of what quantitative tools are imagined to mean and reveal.

Indeed, the two authors locate the foundational innovation of this field not in the creation of a new mathematical tool, but in a new application of an existing one: nineteenth-century Belgian astronomer Adolphe Quetelet’s assertion that, in the same way that averages of innumerable observations about planetary position can yield general facts about planets, so too could a high volume of quantitative observations about human behavior reveal general facts about human beings. This caught on, in turn, as a desired input into policy and statecraft, because it promised that society could be reformed slowly over time rather than via violent and costly social revolutions. In other words, the backdrop of early-nineteenth-century Europe provided fertile soil for the concept of data-driven decision-making to take root as an ideal feature of governance—which also speaks to another theme of Wiggins and Jones’s narrative: the dark relationship between data and the human impulse to control, finding its worst expression in the justification of atrocities like eugenics.  

This illustrates what I particularly appreciate about the approach that Wiggins and Jones take: they are seeking to bring a conversation typically shaped by an attitude of technological determinism under the auspices of historical inquiry. In shedding light on the circumstances and contingencies that enabled data, and the idea of data-driven decision-making, to come to occupy its current place of societal honor, they take aim at the idea that the current status quo was inevitable and that it must remain in place forevermore.

According to them, data’s ascendance emerged from the dynamic interaction between new technologies and the role various social actors imagined—sometimes compulsively or myopically—for these technologies. Such interactions unpredictably shape the terrain of power, create and ensconce new categories of expertise, and spur investment in tools to reify particular modes of reasoning. Discovering whichever element of these interactions proved most decisive in specific moments of push-and-pull kept me turning the pages even as bits of jargon went over my head, a testament to the authors’ storytelling ability. Their desire to craft a narrative that is approachable to readers with varying levels of familiarity with data-related fields and technologies is evident, and they are relatively successful in that regard; when a subject is treated with technical depth, it generally feels in service of fleshing out the overall historical arc rather than acting as a sidebar for the technologists to nerd out on.

What I found less developed is the vague way forward proposed at the book’s conclusion, as well as the build-up to it. Wiggins and Jones argue that the speed and scale at which data technologies have been deployed leave our norms and values vulnerable to modification before our ethical imaginations catch up. As they say in one of my favorite chapter-ending summative statements, “Ethics, no matter how well considered and well intentioned, tends not to scale well.” They do devote a couple of chapters to the slippery, unanswerable question of who can authoritatively define and enforce data ethics, but I found myself hungry for a deeper treatment of the concepts of justice and fairness that ground these accounts. Moreover, the final chapter keeps its focus on the unstable relationships between the various powers that will shape the future of data (state power, corporate power, and people power), offering recent examples of how each has worked to exert its influence over the trajectory of data amidst the counterbalance of the others. This is a fine way to conclude; indeed, acknowledging and sitting in the tension of complexity strikes me as a refreshingly honest rejoinder to the temptation of using data (even historical “data”) to predict the future, the same temptation that Wiggins and Jones critique over the arc of the book and that I have bumped into, in various shades, over the course of my career.

But that conclusion also seems insufficient to generate another outcome that the authors seem to desire: spurring readers to imagine alternate futures where the use of data is subservient to our norms and values rather than potentially generative of them. They insist, in a resistance to technological determinism that I applaud, that alternatives to the current status quo are possible. But if they have robust visions of what these alternatives might be, they don’t describe them in detail beyond negating what we currently contend with, such as advertisements based on mass surveillance. In response, I wondered: what kind of commerce would fill a world where such ads have vanished? Would data have any productive alternate role to play, or would it inevitably engender temptation toward the unachievable promise of control? Imagining answers to those questions requires resources and frameworks beyond what is supplied in How Data Happened. But if nothing else, the history it presents serves, if a bit indirectly, as an invitation to search for them.

Jake Casale lives in Boston, Massachusetts. He graduated from Dartmouth College in 2017 and has worked on public health/health systems strengthening efforts both domestically and abroad. He currently works as an analyst for digital health company Cohere Health.

How Data Happened: A History from the Age of Reason to the Age of Algorithms was published by W. W. Norton on March 21, 2023.