Potemkin Data Science

Michael Correll
11 min read · Dec 30, 2020
Nighttime illumination in Honor of Catherine the Great, Jan Bogumił Plersz, ca. 1787

Huge swathes of people, in Europe and North America in particular, spend their entire working lives performing tasks they secretly believe do not really need to be performed. The moral and spiritual damage that comes from this situation is profound. It is a scar across our collective soul. Yet virtually no one talks about it.

David Graeber, On the Phenomenon of Bullshit Jobs: A Work Rant

This is one of those historical stories that probably never happened (or didn’t happen like we think), but is so useful as a metaphor that it’s best to pretend that it did. The story goes that Empress Catherine II of Russia was taking a voyage down the Dnieper river to survey lands newly conquered from the Ottomans. The land was mostly desolate and ruined, rather spoiling the sightseeing. Her on-again-off-again lover Grigory Potemkin, to make things more lively (in the most common version of the story) devised a sort of traveling village with campfires and fake buildings and members of the entourage portraying happy villagers. Each night this village would be deconstructed, carted along with the royal procession, and rebuilt the next day further down the river. The term “Potemkin village” has gotten a bit expansive of late, now applying to many sorts of false fronts, façades, or general fakery, but I wanted to bring up the original context so that you keep in mind the following:

  1. The Potemkin village, as described, is a lot of work. In the more expansive versions of the story there are elaborate sets on wheels, hired actors dancing and singing until late at night, carefully placed lights and fires, all taken apart and moved down the river for the next night. Whether that work is ultimately meaningful is the perhaps more important point.
  2. The Potemkin village isn’t for fooling everybody, but is for fooling very specific people with a very specific vantage point. Much as trompe-l’œil paintings don’t work unless you stand in the correct place, the “village” only works from the perspective of the Imperial retinue on the river. The “villagers” didn’t think they were in a real village. Compare this to, say, a cult or a totalitarian state where the lines between public and private beliefs and internal versus external propaganda are blurred, or a folie à deux where a delusion is shared and reinforced between partners.

It is my contention that actually existing data science, if we are not careful, can very easily start to smell Potemkin-y: there are already so many hazards strung in its way, and so many (in data science terms) local minima from which it is difficult for it to escape. I enumerate some of these (not necessarily mutually exclusive) scenarios here. Villages in a Potemkin country, if you will. When data science becomes too Potemkin-y, then we have lots of people working (and working very hard!) for seemingly no useful end other than to, at best, impress a handful of people in upper management.

Data Science as Marketing

The logics of capitalism demand maximization of profits. Yet, many companies spend lots of money on charitable foundations, or donate 1% (or what have you) of their profits to “good causes.” This apparent contradiction is resolved when considering the gestalt of what Žižek calls “cultural capitalism:” a company encourages consumption by including, in the price of their product, apparent ethical relief or compensation for the otherwise morally ambiguous act of consuming. In practical terms, if we see two similar products on sale, but one promises certain ethical commitments (no animal testing, fair trade, for each one you buy a certain number will be donated to the less fortunate, etc.), then we would likely choose the product that promises the most “good,” and in fact even pay a bit more for the privilege. Similarly, I’d be more likely to work (or stay longer) at a company that is having a positive impact on the world.

One catch here is that the appearance of doing good is what is valuable, rather than the actuality of doing good (and especially the actuality of changing the system such that certain kinds of charity are no longer necessary). The market benefits for niceness are a lot more about advertisement and rhetorical positioning than actual virtuous behavior, and can change quickly in response to marketing needs (look at how quickly Google dropped its “don’t be evil” slogan once it became a cudgel to use ironically rather than anything that people could take seriously as a guiding principle).

Data science in an organization can have this flavor of a marketing tool. When pitching to shareholders, who wouldn’t want to claim that their decisions are “data-driven?” For a top engineer deciding between jobs, who wouldn’t prefer to work at the place that will have them doing impressive machine learning or complex analytics rather than writing code to trick people into clicking on ads? Who wouldn’t want to brag about how doggone smart “we” are?

However, as with charity, where this can come crashing down is that the appearance of data science smartness is valuable, but the actual results of a data science effort might not be (or not as immediately visibly valuable). Building a strong data science team, restructuring so that team has the data they need to do their jobs well, and building a culture that responsibly applies what data science teams have learned is very expensive, time-consuming, and risky. Hiring a few PhDs and sticking them in a room and locking the door behind you is comparatively much cheaper and faster. In the latter case you then still get to make slide decks with graphs of exponential curves and words like “big data” or “machine learning expertise” in them, but without any of the pesky inconvenience of “actually changing anything.”

For data science as marketing, the fact that “data science” is happening somewhere nebulously in the organization is what is important for the recruitment materials and the strategy documents and the pitch meetings. Who these data scientists are, what they are doing, and what people do with what they learned is… less important.

What this might look like on the inside:

  1. Lack of buy-in or institutional support.
  2. Connections to the rest of the organization are few and tenuous.
  3. Actual datasets to work on are sparse or nonexistent.
  4. There is a focus on the process rather than the impact (for instance, immediately jumping to training a complex machine learning model to solve a problem before seeing if simpler solutions would work, or if the problem it would solve is even the right one to tackle).

Data Science as Toy

In what I’m sure will be a great shock to the people who know me, I was not always up on the latest trends growing up. Even when I had a dim idea of what was popular, there were usually other things competing for my allowance money. As such, I mostly experienced fads like Furbys and tamagotchis and slap bracelets secondhand [note to ed.: include a toy trend from this millennium so I am not forced to reflect on my own mortality]. But CEOs have a lot of allowance money, and pay good money for services like Gartner to tell them what’s popular. And data science is pretty popular! So I think a common pattern is for organizations to decide to do data science, not because they have any pressing needs or unused capabilities, but because all the cool kids are doing it, and the CEO wants to be cool too.

This is a similar scenario to the “data science as marketing” scenario above, in that it is doing data science just for the sake of doing data science rather than to address a real need, but I think the toy metaphor surfaces some key differences. For one, it implies that there is a personal champion somewhere up in the org chart who wants to play with the toy. The toy metaphor also implies something faddish and ephemeral.

Do you all remember back during the dot com bubble [ed. note: again, insert a more modern example here] when websites were new and shiny, and everybody who was anybody was getting one? This is despite the fact that the forms of internet commerce were still nascent enough that there was often very little to do on these websites, and very little institutional expertise on what makes a good website. Hence a bunch of guestbooks and site maps and animated gifs and autoplaying midi files and what have you (I still appreciate that the website for Space Jam has been kept in its pristine 1996 vintage state). Many I’m sure were abandoned after an initial flurry of effort and a few PowerPoint slides titled “cyber strategy.”

Yesterday’s unnecessary and annoying website could be today’s data science team. Somebody important hears that data science is the new hot thing, and maybe they don’t know why exactly, but they throw a bunch of money and people and attention at the problem all the same. Then they eventually get distracted and move on to another pet project, leaving the data science folks as so many tamagotchis slowly running out of battery in the bottom of a metaphorical drawer.

What this might look like on the inside:

  1. Unequal buy-in from decision-makers, with perhaps only one or two champions in the upper levels of the organization.
  2. Compartmentalization of the data science team away from other parts of the organization (both the people who generate data and the decision-makers who would use the knowledge gleaned from the data; “this is my toy and I want to play with it first”).
  3. An initial burst of resources, headcount, and enthusiasm that tapers off rather quickly.
  4. Frustration from management over the timeline, magnitude, or transformative power of results (the new toy doesn’t prove its worth immediately).

Data Science as Laundromat

A quote that has been attributed (falsely, as far as I can tell) to all kinds of people from Mark Twain to George Carlin to Emma Goldman is “if voting changed anything, they’d make it illegal.” Heather Froehlich and I, a while back, wrote about how a lot of the dashboards we were exposed to, despite allegedly being these powerful data-driven decision-making artifacts, had the curious property that all we were supposed to do was look at them, and not really make any changes. The traditional dashboard for analytics is supposed to be this complex collection of important metrics to drive immediate strategic and tactical decision-making, but a lot of the dashboards we were seeing appeared to be for “decision-laundering:” justifying stuff that had already been decided at levels above us (or even just, through visual or methodological complexity, convincing us that the perhaps mythical “levels above us” were the only places where decisions could be made).

Data science, often only a step or two removed from dashboard generation, is even more vulnerable to this sort of laundering. The data scientist exists to provide convincing post hoc support for decisions or strategies that have already been made prior to or in the absence of data. And when the data scientists don’t agree with the decision-making, things get ugly. Former Florida Department of Health data scientist Rebekah Jones is one such example: she was fired for apparently refusing to doctor the COVID data being presented to the public by the state government, in an attempt to cover up a disastrous pandemic response that saw beaches open and spring breakers welcomed while the rest of the world was mostly in lockdown (she speculated that the recent raid by law enforcement on her home was an excuse to take down her own counter-narrative dashboard that she created after her dismissal). Less dramatically, laundromat data scientists are more often ignored or downplayed when their conclusions go against the “common sense” or “the way things are done” in the organization, and brought out into the limelight only when what they suggest just so happens to match up with existing organization priorities.

What this might look like on the inside:

  1. A large data science team, but no clear paths or organizational power to make changes based on data science results.
  2. “Hot and cold” engagement from other parts of the organization depending on the conclusions from the data, including cancelling fruitful projects in midstream.
  3. Projects and research directions given out from the top rather than built up from the bottom, or based on a clear need.
  4. Being told to constantly fish for new datasets or more sophisticated methods when one arrives at “counter-intuitive” results.

Data Science as Oracle

Herodotus reports that Croesus, King of Lydia, seeing instability in the Achaemenid Empire after the fall of the Medes, went to the Oracle of Delphi to ask what would happen if he invaded. Their response was “if King Croesus crosses the Halys River, a great empire will be destroyed.” Heartened by the response, he went on the attack, only to be eventually defeated and overthrown: the “great empire” he destroyed was his own. Many other statements from Delphi were similarly ambivalent or cunningly phrased to ensure their eventual truth.

If the laundromat is just using the veneer of data science as a post-hoc justification for stuff you were going to do anyway, then oracular data is a way of covering all of your bases so you can claim that everything happened as you predicted. Ask nebulous questions, and get nebulous answers. Make a decision and be proven right no matter what happens.

An example I often use is to imagine that you are given the sales performance of a bunch of products. We’ll assume that, in reality, the sales of each product are in fact drawn from the same distribution. But we don’t know this, of course. And so, through pure sampling happenstance, one of the products significantly underperforms the others. We can pick this trend out and report on it, either to suggest intervention (say, rebranding or increasing the marketing budget) or to suggest non-intervention (because the uncertainty is too high, or the risk of a multiple comparisons problem is too great). Both action and inaction could be pitched in a way that sounds “smart” and “data-driven.” If I don’t intervene, then next year sales are likely to go up (through regression towards the mean), and so I look wise to have urged caution. If I do intervene, then my sales are still likely to go up, because of regression towards the mean, because increasing marketing budgets tends to make sales go up in any event, or both. So in either case it looks like we made a smart decision, and data science came to our rescue.
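To make the thought experiment concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than real data: the ten products, the normal distribution of sales, the mean of 100 and standard deviation of 15, and the function name `simulate_rebound` are all made up for the example. It only covers the "do nothing" branch: pick the worst seller from year one, draw its year-two sales from the very same distribution, and count how often it "recovers."

```python
import random


def simulate_rebound(n_products=10, n_trials=2000, mean=100.0, sd=15.0, seed=0):
    """Return the fraction of trials in which year one's worst seller
    sells more in year two, with no intervention at all.

    All products share one sales distribution (an assumption of the
    thought experiment), so any year-one gap is pure sampling noise.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    rebounds = 0
    for _ in range(n_trials):
        # Year one: every product's sales come from the same distribution.
        year_one = [rng.gauss(mean, sd) for _ in range(n_products)]
        worst_sales = min(year_one)
        # Year two for that same product: a fresh, independent draw.
        year_two = rng.gauss(mean, sd)
        if year_two > worst_sales:
            rebounds += 1
    return rebounds / n_trials


rebound_rate = simulate_rebound()
print(f"Worst seller improved the following year in {rebound_rate:.0%} of trials")
```

Under these assumptions the underperformer bounces back in the large majority of trials (for ten identically distributed products, the chance that a fresh draw beats the minimum of ten earlier draws is 10/11). An analyst who urged caution gets to look wise, and one who urged a bigger marketing budget gets to claim the credit, even though the intervention is not modeled here and did nothing.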

Data science is made oracular not just through equivocation or a lack of controls: the mysticism is a part of it too. If the model fails to operate as expected then the blame is placed on insufficient data, or insufficient nuance. That the questions and assumptions going into or out of the project are the wrong ones somehow never manages to be identified as the culprit.

What this might look like on the inside:

  1. Blind faith in the utility of data science to produce actionable insights, regardless of the quality of the questions being asked.
  2. The removal or reduction of uncertainty in data-driven decisions (especially as these decisions trickle upwards in the organization).
  3. Or, similarly, a tension between the data scientists who want to hedge their bets and the decision-makers who want “the” answer.
  4. A lack of reflection on process or a lack of post mortem analysis of what worked and didn’t work.

Away From The Potemkin Village

I have tried to make my red flags and categories vague enough that they can apply to almost any organization, no matter how deeply they are connected with data science. My hope is to provoke a little bit more reflection in the reader, rather than letting them assume that there’s a dichotomy between good and bad data science (with the temptation then to always put oneself on the good side, and tune out the bad as being somebody else’s problem). If you are set on doing data science work then I want your work to be meaningful, useful, and beneficial. If it’s not, then I want you to stop doing it, and instead work on changing the things that are making it meaningless or useless or harmful. It might be a better use of your time than taking everything apart only to put it all back together, just a few more miles down the river.

Michael Correll

Information Visualization, Data Ethics, Graphical Perception.