Crossing the KiloPaper Limit: A Trip Report from CHI 2024

Michael Correll
28 min read · Jun 24, 2024


A photo of banyan trees framing a view of Waikiki Beach
My camera roll from the trip is 70% pictures of plants I thought were interesting and 30% blurry pictures of other people’s slides.

CHI is the preeminent conference in human-computer interaction, and was held this year in Honolulu, HI. This post is my attempt to summarize the conference from the very narrow perspective of somebody who mostly is interested in data visualization and data ethics, which is just one slice of work that was presented at CHI. In a worrying trend, what was once a genre of blog post that was mostly “I’m going to talk about a few papers I thought were interesting” has now more or less stabilized into “I’m going to talk about a few papers I thought were interesting, but only after several paragraphs of navel-gazing,” so feel free to skip down the subheadings if you’d rather just read some paper recaps.

I’ve never had a good mental model for CHI. My first CHI was 2012 in Austin, TX (tagline: “it’s the experience!”), and I’ve now been to something like 6–7 since then (depending on how you deal with the now-standard COVID asterisks). Yet it still doesn’t feel like a “home” conference, and I don’t feel the same impulse to help organize or manage the conference beyond my general impulse to do enough reviewing to make me not feel guilty about how much I submit. I subjectively feel like I have better luck with getting my work into CHI compared to other venues, but that’s a very hesitant assessment based on an extremely noisy signal: since there’s a visualization subcommittee at CHI now that is peopled by roughly the same folks I recognize from IEEE VIS (the other main conference I go to), I assume my work is reviewed by essentially the same reviewer pool making essentially the same judgments (but I could be wrong! Maybe there’s still an “old guard” of VIS reviewers that are holding out against the invaders, or maybe VIS reviewers put on their “CHI hat” when they review for this conference and focus on or prioritize different aspects of the work). I’ve been asked “why do you submit to CHI instead of VIS?” or “would this work be a better fit for CHI or VIS?” and I used to give vague, vibes-based answers about prioritizing different methods or tolerances for theoretical framings or what have you, but now I just sort of shrug my shoulders and say “the CHI deadline is in the fall, and the VIS one is in the spring.”

I mention all of this so you can adjust your priors about me and my perspective on the conference in preparation for the following riff on what was once maybe a hot take but now I think is barely lukewarm: CHI isn’t sustainable. I mean something very specific here, which overlaps with similar recent takes of this form. For instance, the CHI steering committee has a four-part series of blog posts (parts one, two, three, four) that essentially says “whoops we ran out of money and the conference is getting increasingly large and expensive and something has to give.” The decision on what “had to give” (but don’t take my word for it; the posts themselves are thoughtful and worth your time) was a removal of some of the events and submission types (including some I think are pretty crucial) that ran in the periphery of the main conference, and a reshuffling of the events of the main conference to fit in a workweek. But ballooning budgets and hotel contracts are just one kind of unsustainable. There’s also the related critique that the model where you make everybody in your (increasingly large and what I am euphemistically calling “intellectually acquisitive”) field get on jets and fly around the world just so they can sit in some conference room and give PowerPoint talks to each other is unsustainable from a climate justice perspective*. This kind of sustainability is a big enough deal that there’s going to be a whole workshop about it at IEEE VIS this year.

But what I mean by “unsustainable” is (hence the title of this post) centered around the fact that there were over a thousand papers in the conference, with (not even including panels, special interest groups, courses, etc.) 15–20 parallel paper tracks happening simultaneously. “See all the interesting papers” was never a feasible goal even at earlier CHIs, but now I couldn’t even manage the modest goal of “see all the papers from my own lab.” Everything (even parties) was double- or triple-booked with other draws to my attention. Paper talks themselves were 10 minutes anyways (with 5 minutes for Q&A that inevitably got eaten up by the time sinks of speaker/slide switching and the technical difficulties that can result from that), so the papers you could see you didn’t really have time to fully digest or discuss, and, well, good luck finding that paper author again among slightly less than four thousand people. Listen, I’m all for the idea that conferences are about more than their technical content, but something has to give, right?

My brain cannot handle a kilopaper conference. Okay, sure, we could use some tooling and statistical modeling to assist here (and I think conferences could do some more to visualize and represent the semantic structure of the conference beyond just the ad hoc creation of paper sessions with sometimes barely informative names), but I also see that as the well-intentioned dark path that ends up in having an Algorithmically Curated Feed of Conference Content, which is not the life I want to live. Conferences are events that are for humans and about humans and (at least from the perspective of the attendee) should be as human-scale as possible.

The ML folks are probably already rolling their eyes at me about this, since NeurIPS almost hit 4 kilopapers last year (and I’ve been told has 20 kilopapers of submissions this year), and graphs of their submission rates all look exponential while ours look flat by comparison. One of their approaches to managing scale (and one that is similar to other conferences in fields like psychology and medicine and so on) is that all papers get a poster but only a select few get invited talks, which is already a slightly more sustainable approach than CHI. Although, even then, you’re still looking at lots of parallel tracks and potential 14 hour days of technical content. I guess if you treat the technical content as a fig leaf excuse for hanging out with your friends then, sure. Or if you treat the conference as the vestigial organ of the main thing you care about, which is publication in a premier CS venue, then whatever. But if I actually want to, you know, confer with people at a conference, and be exposed to new ideas, and actually take in some papers, then I think we need to break some things.

It might sound like I’m being petty here (which I won’t deny; I love being petty), but I think this particular kind of sustainability ties into the other sorts. The capstone speaker this year was Sam ’Ohu Gon III, who, as a conservationist and native Hawaiian, was very immediately asked after his talk about how to square the circle of attending a conference in a place like Hawai’i while also caring about sustainability and our climate impacts. I’ve linked to the question timestamp directly, but it’s at the ~45 minute mark if that doesn’t work. And his answer was pretty to the point, I think: “if you just came here, and nothing of consequence came out of it, that was a waste.” That’s (part of) why I’m being so weird about stuff like conference formats and paper talks. Getting people together is an investment, and an increasingly expensive one (both in dollar terms but also in harder to quantify resources). So I don’t want all of that investment to be wasted by turning CHI into an undifferentiated firehose of Content, with people siloed off in increasingly weird filter bubbles, like everything else even tangentially associated with computers.

Stuff I think we could or should break in order to surpass the kilopaper limit includes:

  1. The conference publication model. It is an increasingly archaic joke that computer science’s main form of publication is conference papers. I’ve been told that the original rationale was that the field “moves too fast” to have to wait for traditional journal publishing cadences, but, c’mon. For one, even if “move fast and break things” wasn’t a shitty motto to begin with, we (for better or worse) have the internet and preprint servers now, so if delay in publication were really the rationale, I think we can generally agree that we have tackled that problem through other means. In fact, since top-tier conferences have yearly deadlines, in many cases journal models might even be faster, since journals let you do things like revise and offer major revisions rather than having to wait around (potentially months) for the next big deadline if you’re rejected the first time around. I think there’s also a question about whether HCI specifically is even well-positioned to take advantage of the alleged speed of conference publication: it often feels that the field is very reactive to trends imposed on it from “outside”, but is structurally less able to have a proportional impact on the world in return, and so the speed of conferences is less about making sure that we are ahead of some Moore’s law-like curve of exponential growth and more about making sure people can get their papers on [insert trendy topic here] out before the hype cycle cools off. I think right now my main personal argument for tying publications to conferences is that I seem to need the structure of hard deadlines to push projects out the door. But that’s a personal problem I could solve in other ways. Like, if the main intellectual value of your publication model could be replaced by bullet journaling and a visit to a psychiatrist, I don’t think that’s a particularly strong model.
  2. The “one big conference” model. I think this particular idea comes from me mishearing and/or riffing off of something I heard Jennifer Mankoff propose, which is that you do a bunch of local HCI conferences (the “para.chi” project this year was an attempt to explore this satellite model) yearly, but only get together for the One Big HCI Conference on a much longer cadence (like once every five years or so). This kind of works because you still get to do cohort building but you get to have a bit more intentionality about when and why to get “everybody” together. The big conference could then be way way less about trying to cram in all the technical content and much more about building networks and cohorts. I liked when we had a satellite “watch party” for the virtual VIS conference in 2021, but I admittedly was in Seattle at the time, so I could hang out with a very dense cluster of people who cared about visualization. I’m not sure there are equivalently dense clusters for everybody. But this might be a practical issue of aggregation and scale rather than an existential threat to this “lots of yearly small CHIs and a periodic big CHI rather than one big yearly CHI” proposal.
  3. The field of human-computer interaction. As I have mentioned in prior posts, there is an inferiority streak in HCI research that results in it trying to expand its scope, octopus-like, until it covers literally everything where a human is or might be involved, from software engineering to design to ethics to teaching. The result is that there are lots of people who do “HCI” but have very little to say to each other, not just in terms of research areas but methodologically or epistemologically. I’m sure I would do a shitty job reviewing papers that came into essentially all of the subcommittees of HCI other than my “own” (what do I know about capital D Design, after all?), and I would do (and have done) a shitty job communicating with people who are (on paper at least) very close to what I do but are, for weird quasi-tribal reasons, siloed away in totally different academic worlds (like STS, statistics, cartography, data science, or human-centered design). I still remember when I was asked to be an external reviewer for a graphical perception paper that was sent to (in my opinion) the “wrong” CHI subcommittee, and the other reviewers went, essentially, “what? why would you study bar charts? don’t people already know about bar charts?” I think there’s been a bit too much speciation in HCI, and now we have a bunch of factions that don’t speak the same languages or care about the same things. There’s certainly a benefit to trying to get these groups to talk more, but a huge kilopaper conference is unlikely to do so except in very stochastic and unintentional ways. If the term “HCI” doesn’t describe anything coherent anymore, then why bother with it?†

These are all perhaps drastic options, but I didn’t see much thinking in other directions that does more than just sort of put a bucket underneath the metaphorical leaky roof and call it a day. There were already creaks and groans in the pre-kilopaper system around providing high-quality reviewing, integrating distinct areas of academic thought, and managing our global impact. It’s more than just frog boiling: there’s some elaborate double or triple boiling system at work here where one frog is boiled, another is being tempered like chocolate or eggs, and a third is part of an elaborate sous-vide system. The labor, expense, and footprint of academic conferences cannot grow unbounded: conferences draw on finite and diminishing resources from finite and diminishing common pools.

Minor Conference Errata

Other components of the conference worth mentioning:

  1. They did a thing this year where snacks/reception hors d’oeuvres were all vegetarian and made with local ingredients. I am admittedly biased here but I think it worked well, since a lot of vegetarian things also have the benefit of being portable and consumable at different temperatures and generally hard to screw up even at conference catering scale. I don’t think I even register chicken skewers served en masse in chafing dishes as food anymore, so hopefully this policy sticks around.
  2. I don’t know what happened but I used to remember the interactions exhibits (essentially, a big demo floor/art show/etc.) being some of my favorite and most memorable parts of CHI. For instance, I was recently trying to look up a paper I half-remembered before finding out that it was a paper I, myself, wrote and presented that I had totally blanked on. But I still remember stuff like the procedurally generative music based on calligraphy strokes, the electric forks that stimulate your tastebuds to make you taste phantom flavors, and using a pool of water as a keyboard. But now it’s a much more sedate affair. I don’t know if it’s because everybody has jumped on the AR/VR wagon and it’s hard to make those experiences look cool for people wandering by in the same way (people wearing dorky headsets and gesturing at unseen stuff look pretty much the same regardless), or I’ve become more jaded, or what, but I want things to be crazier (which is why I’m bummed that interactions and my other preferred CHI venue for crazy stuff, alt.CHI, are both on the chopping block as per the big conference re-org blog posts I linked above).
  3. I don’t know if people were just politely agreeing with me, but everybody I talked to about it confirmed that it has been harder and harder to recruit reviewers (for conferences but also just in general), with COVID as an identifiable downward inflection point. I’m sure there are lots of mundane reasons why, but I’m currently workshopping a quasi-Freudian thing where a brush with trauma and mass death made people realize that they should spend less of their one wild and precious life doing an under-appreciated thing that most people don’t even like very much, for little reward other than offsetting internalized guilt for creating work for others by the mere act of existing in academia. But I could be wrong, maybe it’s something else.
  4. Essentially nobody was masked (I’d see maybe one or two people in each crowd). The conference center had lots of large open air spaces and was built around open air atria, which was nice, though, and meant that coffee breaks and so on felt low-risk (to me). I didn’t hear from anybody that got COVID this time around (as opposed to the “superspreader event so bad it made the news” that was CHI 2022), but I don’t know if that’s because transmissions didn’t happen or if it was because people don’t self-disclose as much at this point in the pandemic or what. Nobody gave me any guff one way or the other, other than that a few people claimed not to recognize me with my mask on. But that’s fair, since “kind of schlubby balding white guy with glasses who looks like he doesn’t sleep much” is not exactly an uncommon phenotype in STEM circles.
  5. I should mention that, despite the existential dread and complaining above, I did, in fact, have fun (and not just because I was in an interesting and beautiful place I hadn’t been before). I got to talk to people I hadn’t seen in person since the pre-COVID days, see some work that excited me, and make some new connections. “There’s too much stuff that I wanted to see and it was overwhelming” is not the absolute worst outcome, after all. But I would like some intentionality here!

Papers

Okay, with all of that prelude and errata out of the way, here’s some stuff I thought was interesting. Of course, as per the preceding couple thousand words, I didn’t see anywhere close to everything but stuck mostly to the visualization paper sessions. I am also continuing my boycott of surfacing AI/ML stuff more than necessary, since I still haven’t come up with a way of being productively skeptical about it. With respect to the kilopaper limit though, per the opening session there were at least 48 sessions that were “about AI.” I don’t know if that means sessions filled with entirely AI papers or not, and I am not bored or idle enough to manually check. At a target of 5 papers a session and 1060 accepted papers, that means that something in the ballpark of 20% of papers were about AI. I wonder if 20% of NeurIPS papers are about HCI? Hey, maybe.
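The back-of-envelope estimate above can be written out explicitly. This is a minimal sketch, assuming every one of the 48 sessions held exactly 5 papers and that all of them were AI papers (an upper bound on those sessions, not a checked count); the variable names are illustrative.

```python
# Figures quoted from the opening session, as reported above
ai_sessions = 48
accepted_papers = 1060

# Assumption: the target of 5 papers per session holds, and every paper
# in an "about AI" session is counted as an AI paper.
papers_per_session = 5

ai_papers = ai_sessions * papers_per_session
ai_share = ai_papers / accepted_papers

print(f"{ai_papers} of {accepted_papers} papers, about {ai_share:.1%}")
# prints: 240 of 1060 papers, about 22.6%
```

So "ballpark of 20%" is, if anything, slightly low under these assumptions.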

Anyway, normally I have to go fishing for a theme to bundle my papers together but, when putting my notes together, there was a clear cluster of work about actually asking people what they are doing and getting them to specify what they are thinking when they look at visualizations or otherwise do analysis, an area that I will keep giving people props for as long as it continues to be relevant. The reason I like this “ask people specifically what they are doing and thinking” approach is that it highlights my particular soapbox that visualization is communication. It’s not a magic bit of technology or an algorithmic way of condensing data to be losslessly piped into your brain, it’s a genre of visual/statistical language that people use to communicate to other people.

And, like all other genres of communication, this means that the whole thing is messy: there’s register and dialect and rhetoric and affect and competing notions of literacies and all that jazz that comes with other kinds of communication like speaking or writing (and, in fact, part of my lens here might have been biased by coming to the main conference directly after the associated workshop on visualization literacy I attended and had thoughts about). It also means that when people fail to understand a visualization, it’s because you, the designer of the visualization, fucked up in some way: either you didn’t understand your audience well enough, or you just did a shitty job at design. It’s not, and I hate when I see this sort of language creep into visualization design or pedagogy, that your audience was too stupid and so should be written off, with visualizations only for the sainted elect who have enough expertise to know what you’re saying without having to be taught anything, or who can see what you’re trying to show “at a glance” rather than through actually firing brain cells. And likewise it means that the stuff that doesn’t get as much coverage in visualization papers (like onboarding and explanations and captioning and feedback and longitudinal learning experiences and so on) is vitally important to the success or failure of a design out in the real world. I think people mostly know all of this already, in their heart of hearts, but I always like to see it made explicit in papers.

rTisane: Externalizing conceptual models for data analysis prompts reconsideration of domain assumptions and facilitates statistical modeling

Eunice Jun, Edward Misback, Jeffrey Heer, Rene Just

This work is an extension of the Tisane system where the central conceit is that the type of statistical analyses you do in experimental work should be based, naturally, on how the experiment is designed, and how you think variables should be related. With the goal, then, that if you get people to be up front and explicit about what these assumptions and models are, then you can automate a lot of the currently highly manual and occasionally dramatically idiosyncratic processes of doing analysis.

The original Tisane work, and this extension which focuses on the construction of “conceptual models” of data more than just authoring linear models based on the experimental design, is framed in this positive manner of helping people design and debug their experiments, but, with my bright and sunny disposition, I think there’s a negative framing as well, which is: if you don’t correctly establish your experimental design and data models and let that guide your statistical analyses, you’re fucking up. Differently designed experiments should probably not have the same analyses, and vice versa! So, in their evaluation, where 4 out of 13 people couldn’t successfully construct a multivariate model, my response was not “oh, how nice that this system makes it easier for people” but more “well, shit, what kinds of statistics are people doing now?”

Longer whiny rant excised here, but I am skeptical of lots of allegedly quantitative components of the HCI field (including, I should be fair to mention, my own work). The first source of skepticism is just on the surface validity of the numbers we are producing and interpreting. Many HCI researchers (again, myself included) don’t have all that much formal training in statistics or quantitative research methods but instead sort of base things on models from other HCI researchers, who also may not be trained either, and so create a sort of cross between the telephone game and a cargo cult. But that’s fixable: you build better tooling, more tightly integrate methods courses into HCI curricula (which also has the nice benefit of pulling HCI further out of computer science’s clutches, where I’m still not 100% convinced it belongs), build cultures around audits and corrections and critique. I mean it’s not an easy set of fixes, but there’s at least a path there.

The second objection is less fixable, which is that we don’t seem to have a lot of shared epistemological projects, so it’s sort of unclear to me what all of these experiments are for. Like, we’re not (often) using them to build evidence for or against particular theories: a lot of the experiments I see in papers focus on performance or preference for ad hoc systems (often without control groups), which, to me, are maybe more about putting a “science-y” veneer around what’s really more like a product focus group than doing anything that inherently seems to need much in the way of inferential statistics. I don’t really know if I need a p-value attached to stuff like “9 out of 10 people said they liked our system!”, is what I’m saying. So maybe my conclusion is that if designing and analyzing quantitative experimental data is really hard to do right, maybe we need to stop expecting quantitative experiments in so many visualization papers, or stop anchoring so much of our apparent understanding of the field on the quantitative results presented in visualization papers.

Reading Between the Pixels: Investigating the Barriers to Visualization Literacy

Carolina Nobre, Kehang Zhu, Eric Mörth, Hanspeter Pfister, Johanna Beyer

This paper so far holds the record for this CHI for my count of “excitedly gesturing people over to look at a paper figure.” Here, let me walk you through it. So what this paper did was give people a standard visualization literacy test (the VLAT), which is mainly a series of questions where you give people graphs like the below (here I’m rolling my own example since I’m not certain about the legalities of figure reuse, but it’s not totally dissimilar):

A line chart showing the yearly average price of a cubic meter of crude oil for the period from 2012 to 2022. The price in 2014 is a bit over $600.

And then ask participants questions like “what was the average price of a cubic meter of crude oil in 2014?” What you are supposed to do is go down to the x-axis, look for “2014”, then go up to the correct dot, then go across to the y-axis and estimate the value (here the “right” answer is $622.4, but the real VLAT uses fake data so there’s a nice integer for you to read off). And people look at how many of these questions you get right and make an assessment about how visually literate you are.

What this paper does, which I was shocked wasn’t more common, is to look at the subset of people who got these questions wrong, and dig deeper into why they got them wrong. They did this with a combination of sketching and soliciting people’s reasoning, but it results in stuff like this:

The same line chart as in the previous figure, but annotated with sketches circling the values for the years 2014 and 2015, and drawing a line in the approximate middle of those two points, with an arrow pointing to the corresponding part of the y-axis, which is a value less than but close to $500.
“There was a lot of change in the year 2014 so I took the average”

Again, I’m changing the exact stimulus and response here so I don’t have to figure out how litigious ACM is, but you get the idea. The person got the question wrong not because they were some sort of drooling idiot who can’t even read a line chart, but because there was a mismatch in their mental model for the chart. The designer intended for people to look at (in this case) 2014 and say “oh, there’s one point, and that’s the average for the whole year”, but this person saw that as “oh, that’s January 1st, 2014, and the line is showing the daily change in price, so I need to find the average of all the points in the year” and so ended up with the wrong answer. But I don’t think the participant was totally out of line here! Lots of other line charts use axis labels in this way! And if you have (in this case) 11 discrete points that are aggregates of temporal data, maybe it’s a little weird of the designer to use the metaphor of a line chart (which at least suggests that the data are continuously estimated values over time) instead of a scatterplot or bar chart.
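The two mental models can be spelled out side by side. This is a rough sketch, not the paper's method: only the 2014 value ($622.4) comes from the example above, the 2015 value is hypothetical (picked so the midpoint lands a little under $500, as in the annotated figure), and the function names are made up for illustration.

```python
# Yearly average prices in dollars per cubic meter.
# 2014 is the value from the example chart; 2015 is a HYPOTHETICAL value
# chosen only so the midpoint falls just under $500.
yearly_average = {2014: 622.4, 2015: 370.0}

def designer_reading(prices, year):
    """Designer's model: each point already is the average for that year."""
    return prices[year]

def participant_reading(prices, year):
    """Participant's model: the point at `year` marks January 1st, so the
    year spans the segment up to the next point; average its two ends."""
    return (prices[year] + prices[year + 1]) / 2

print(designer_reading(yearly_average, 2014))
print(participant_reading(yearly_average, 2014))
```

Both functions are internally consistent procedures applied to the same chart; they just disagree about what a point on the x-axis stands for, which is the mismatch described above.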

The paper puts these together in a multi-level typology of errors but, to me, the takeaways were that a) there are lots of ways to “read” or interpret charts, even allegedly “simple” charts and b) visualization designs rely on a lot of implicit work (around shared mental models, visual metaphors, chart genres, etc.) to communicate data, and it might be nice for us to make that work more explicit sometimes instead of immediately blaming the viewer whenever we’re misinterpreted. If visualization is communication, then communication is, pretty famously, a two-way street.

In Dice We Trust: Uncertainty Displays for Maintaining Trust in Election Forecasts Over Time

Fumeng Yang, Chloe Rose Mortenson, Erik Nisbet, Nicholas Diakopoulos, Matthew Kay

A Medium post that I got partway through writing but ended up languishing in my drafts, and I am mentioning here mostly to poison the well and remove the temptation of me actually trying to finish it, was a kind of weirdly Lacanian psychoanalytic critique of the (in-?)famous New York Times election needle, the speedometer-like uncertainty communication invention that was meant to use a little wobbly line to show that estimates of voting outcomes come with uncertainty and so we shouldn’t get attached to specific numerical predictions, but had the bad fortune to debut in the 2016 U.S. presidential election and so probably did more singlehanded damage to national heart health than the recent craze for smash burgers.

The Vox article I linked above gets at some of this, but some of the backlash to the needle was that it appeared to get it wrong: it (along with almost everybody else) appeared pretty confident in a Clinton victory, then, over the course of the night, in jittering steps that initially gave people the feeling that both outcomes were still possible, flipped to the Trump side and stayed there.

The apologetics in future days and months spent lots of ink blaming people for not understanding statistics and uncertainty, or being dichotomous thinkers of some form or another, and, sure, maybe we could be better about getting people to understand risks, but I just don’t think that people see something like “we predict that X will win the election with a likelihood of Y%” (where Y is some value pretty far away from 50%) and then go into some weird frequentist trance where they think “ah, well, over the course of infinite elections, naturally Y% of them will result in victories for X, but I should of course not be surprised by the outcome of this specific election, since of course Y is less than 100.” I think they see a so-called expert making a prediction, and occasionally what appears to be a very confident one, and getting it wrong. And, if you got it wrong once, why should I trust you the next time?

This paper is an interesting attempt to try to operationalize that sort of behavior, where, even if an uncertainty visualization did nothing “wrong,” (in showing the likeliest, but still not certain, outcome) people still see it as less valuable the “next” time, in this case by having participants participate in a sort of multi-round betting procedure where they receive a form of uncertainty communication and then, after the fake “election night”, choose to either keep the form of communication they have, or switch to another one (say, to a visualization of uncertainty rather than just text, or vice versa).

It’s a pretty complex set of experiments so I don’t have enough space to do the paper justice, but it’s got lots of cool features (like a condition where they go, essentially, “okay, you think people underestimate tail probabilities? fine, then we’ll lie to people about tail probabilities and adjust them to what they should be once you account for that bias”). But in general, to me, the results seem to be that the utility of what I think of as “fancy bells and whistles” of uncertainty visualization (animations, calibrations, complex visual forms) might be overstated. These new forms don’t seem to be harmful, necessarily, but they do suggest that maybe we need to be a bit more strategic than just using elections as excuses to try a bunch of new stuff and then blaming our audience for not “getting” it.

“Yeah, this graph doesn’t show that”: Analysis of Online Engagement with Misleading Data Visualizations

Maxim Lisnic, Alexander Lex, Marina Kogan

Cunningham’s Law is “the best way to get the right answer on the internet is not to ask a question; it’s to post the wrong answer.” People love to correct other people on here, even about trivial bullshit that doesn’t matter. And, yet, shockingly, despite the existence of all of these people who like to correct other people, the internet is not full of exclusively correct information. A few months ago I had a post where I speculated that one reason that, e.g., people who deny global warming keep posting obviously “debunked” charts is not because they are legitimately ignorant of the existence of things like cherry-picking to support their conclusions, or are masochists who get a thrill out of being told that they are wrong but that, among other reasons, getting corrected all the time means that you get more engagement, and more engagement means wider reach, which is the real goal, after all. There’s also a side benefit where if you get a bunch of authoritative people correcting you, then you’re “being censored” or “must be on to something” or otherwise get to play nice David vs. Goliath rhetorical games.

This paper looks at these patterns of circulation around particular visualizations that have been marshaled for nefarious ends (I’m still sort of uncomfortable/unsettled around the terminology we use to refer to these kinds of visualizations: I think where I land now is that it’s more about intent and rhetoric than some binary misleading/not-misleading deceptive/accurate dichotomy). What the authors find (intermixed with some fun terminology from the dis-/mis-info lit that I will have to adopt immediately, like “evidence theatre” where people use “data props” to support their arguments) is that, sure enough, while people really like dunking on people that misinterpret data to fit their own ends (replying and quote-RTing these interpretations of charts more than others), all of this dunking does not seem to produce a statistically significant reduction in how much those underlying misused charts actually circulate.

Sure, this paper is example n+1 that you can’t make the dis-/mis-information problem go away by just slapping a “misleading!” badge or something on offending tweets, or by rating some chart “3 out of 4 pinocchios” or whatever, but I buy the authors’ arguments for why the visualization aspect of the problem makes what was already a wicked problem even wicked-er. The “mistakes” in these charts are often subtle or require a bit of background in statistics or data science to understand. The charts themselves, as avatars of “data”, at least rhetorically seem to have some sort of special status as being more “truth-y” than other forms of evidence. But, and the paper focuses on this last point, since, as per Ben Shneiderman, “the purpose of visualization is insight, not pictures”, it means that people can agree to the relative validity of a particular chart, but then get into heated shouting matches over the gloss that they apply to the chart. Rather than standing on its own, the construction of a dataset into a pointed rhetorical weapon is something you can watch happening in real time.

Wrapup

We are way too deep into this trip report for me to even have a prayer of leaving you with something concise and summative. I guess that’s even the point of my thesis here, that the conference has gotten so large and diffuse that even a cursory overview of a very select subarea of the conference can’t pull anything useful together. But, and maybe this is just selection bias and wishful thinking, I think I am seeing an emergence of a sea change in how people think about visualizations in the community, as less of these authoritative and instantly understandable embodiments of some supreme logos of underlying objective data, and more as situated and contingent rhetorical artifacts, with research questions like “what is the most efficient data visualization?” as point-missingly odd as “what is the most efficient spoken sentence?”. Or, you know, maybe not. There were over a thousand of these papers, after all.

Footnotes

* For another critique of CHI based partly on climate justice but mostly on tourism per se and its impact on the indigenous peoples of Hawaii, there was also a call to boycott CHI entirely this year. I have a lot to say about this appeal (and the fact that I attended in person would at least provide evidence that I didn’t find it personally compelling), but, with the caveat that my positionality here means that magic eight balls, ouija boards, and tarot decks are likely to provide more coherent critiques of the various issues at stake, I was personally disheartened to read a lot of the discussion (both in the appeal and in the reaction and discussion of it), especially since, as can be inferred from the rest of this post, I agreed with large chunks of the boycott argument (although others I vociferously disagreed with, or, at the very least, didn’t buy).

I think my main issue is that I don’t view academia blundering into exploitative relationships and reinforcing imperialist power structures as a wacky new side effect that just so happened to be bad this year because of one-off coincidences of location and natural disasters, but that this inherent domineering conservatism is pretty much the modus operandi of academia. Yes, even academics in the humanities (you don’t get a pass for abuses of power just because you’ve read enough Foucault or whatever to nicely articulate how you are abusing your power). Although maybe this point is easier to accept for us in the U.S. after a few months of watching university admins let loose cops to point fucking sniper rifles at students’ heads because they had the audacity to protest, or immediately cave to cynically-deployed moral panics around DEI or trans people’s right to exist. So it just felt very naïve and/or uncomfortable to have academics talk about, e.g., how tourism as an industry can be extractive, with the majority of the capital benefits extracted by capitalists rather than laborers, without doing the same self-critique about what the academic “industry” is doing even when it’s not sitting in conference rooms. I don’t know how to weigh the moral issues with flying to a conference hotel somewhere against the moral issues of flying students in to work in places where their human rights, physical or economic security, or even life-or-death safety are at the whims of administrations and governments that are adversarial to even the bare minimum of what I think it means to thrive as a person (in academia or just in general).

† It might sound like I’m against taking the effort to be trans-/inter-/cross-disciplinary here, but I think what I’m objecting to more is people trying to add the interdisciplinary stuff near the end of the academic system (showing off “finished” work), instead of throughout the system where it would actually do the most good. For instance, a common lament I hear from people in HCI/visualization spaces participating in allegedly domain-focused workshops is “why aren’t [more of] the domain experts in the room with us right now?” Or the similar “why aren’t the people in [AI/ML/some other ‘hot’ domain] engaging with our research?” And there are lots of potential answers to that, but to me there’s an immediate pragmatic answer, which is that a conference is a huge expenditure and time sink for somebody outside of the discipline, and asking somebody to come in and wade through the kilopaper morass to find the parts that might be useful to them on their own dime seems pretty rude and unhelpful to me.

This is perhaps a whole additional post, but it seems like there are many similar structural costs to being interdisciplinary in the way that I think HCI people allegedly aspire to be, and few concrete rewards. If you need a bunch of top-tier publications in a very small set of venues to even make it past the initial weed-out stage of a research-focused faculty interview (or to get an industry internship, or be considered for tenure, or really all of the other parts of academia where you are evaluated by a relatively homogeneous group of “peers”) then, I’m sorry, you’re not in a system that rewards exploring other domains, you’re in a field that rewards specialization. The purpose of a system is what it does! It seems like “being interdisciplinary”, from the institutional lens, is conceived of as a side project to do in your own spare time, a hobby for very senior people who are bored with their “home” discipline, or as a peacock-feather-like way to show off that you have enough academic brilliance to “waste” on “superfluous” things like being up to speed on multiple, largely independent sets of fast-moving research literature. And no, making a bunch of new, nominally interdisciplinary departments doesn’t fix this issue, for similar reasons as the old XKCD canard about standards: the solution to the problem of people not having resources to “keep up” with n disciplines is not to make it so there are n+1 disciplines.

The above rant is I think pretty paint-by-numbers at this point, so maybe a slightly hotter take here is that I think one of the most important things in research, especially if I want to make more than incremental progress or want to explore new angles or perspectives, is having time to aimlessly fart around, and the modern research world (in academia but also industry research and probably intellectual activity generally) seems to be averse to letting people fart around (not just because it’s outwardly unproductive in a world that’s largely held hostage by neoliberal bean-counting logics and a culture of hustling, but also because it’s seen as “rude” to the people who don’t have the “luxury” to be lazy). There’s a somewhat related David Graeber quote that I keep in my back pocket for this kind of sentiment:

In academic environments where most people were first drawn to their careers by a sense of intellectual excitement, but feel they then had to sacrifice that sense of joy and play in order to obtain life security, it is extremely unwise to be seen as visibly enjoying oneself, even in the sense of being excited by ideas. This is viewed as inconsiderate.

Michael Correll

Information Visualization, Data Ethics, Graphical Perception.