Laissez les bons temps tousser: A Trip Report from CHI 2022
In Antoine de Saint-Exupéry’s (you know, the Le Petit Prince guy) 1931 novella Vol de Nuit (Night Flight), Rivière, the station chief of an airmail operation in South America, intentionally sends men and machines on (at the time) extremely perilous night flights up and down the Andes and in all sorts of weather in order to maintain the thin margins of speed and efficiency which interwar aircraft could offer over mail sent by train and boat, keeping alive the promise and potential of this emerging technology. While Rivière maintains an external façade as a pitiless and by-the-book taskmaster, he internally muses about what right he has to sacrifice men on the altar of progress. The pilots, on the other hand, have no such conflicts: they take on the job for the love of flying and adventure, and resent even the implication that they might be too fearful (or even just too old) to take on these risks.
This particular allusion is in service of my immediate reflection on CHI 2022, a hybrid conference on human-computer interaction in New Orleans, LA. I attended in person because, like a pretty sizable chunk of people in my community, I was really missing in-person conferences and was relatively confident that I could at least manage my risks, even during the spread of new COVID strains. I thought about it, got antsy as mask mandates on planes got lifted a week or so before I was due to travel, but still ultimately decided that the benefits would outweigh the costs.
As somebody who is very curious about how academia is going to work once the metaphorical dust settles, my notes from the first few days are sort of funny in retrospect: “oh, hmm, pretty much no interaction between virtual and in-person attendees, interesting”, “hmm, it looks like session chairs have to monitor like three or four in-person and remote feeds simultaneously to make sure they aren’t missing questions or tech problems, that doesn’t seem sustainable”, and “oh, the ticket-based open bars are now cash bars, I wonder if that will help with alcohol-related safety issues or just shift them to outside events and/or add money to the mix”, etc. etc. The challenges of a hybrid conference, and the sheer goodwill of CHI attendees to make things work, were embodied for me when an SV (student volunteer) brought their laptop up to the front of a talk room and precariously balanced it on a podium facing the audience so the virtual presenters could see that there was, in fact, a live audience cheering them on from New Orleans.
I (and others) still have thoughts on all that stuff, but to me they are sort of dwarfed by the fact that the main thing dominating the conference hashtag on Twitter is people disclosing positive COVID results and urging people to isolate and get tested themselves. It’s only been a few days since I returned and I’ve already gotten at least three personal emails from people I had close contact with letting me know about their positive tests. While I appear to have dodged the bullet this time (🤞), this was due more to luck (and perhaps some personal isolating misanthropy) than to any particular virtues on my part.
It is inescapable that there are a non-zero number of people who got COVID that week that probably would not have gotten sick if they hadn’t attended CHI in person. How many, and how much work that “probably” is doing in the sentence, I don’t know, and I am unlikely to ever really know with much certainty. There’s just too much availability bias/friendship paradox/sampling bias stuff going on for me to know much about the relative or comparative risks here (this article says 1,900 in-person attendees, and my hasty [under-]count of positives based on disclosures on social media and via email was slightly over 30 as of a week after the conference, but those are just two uncertain values in the several Bayesian calculations I actually want).
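A back-of-the-envelope sketch of how far those two numbers go on their own. It is emphatically not one of the Bayesian calculations mentioned above: it only turns the reported count into a floor, then shows how that floor scales under some purely hypothetical disclosure rates (the 100%, 50%, and 25% values are illustrative assumptions, not estimates). It also says nothing about how many of those people would have gotten sick anyway.

```python
ATTENDEES = 1900  # in-person attendance figure quoted in the text
REPORTED = 30     # hasty under-count of disclosed positives ("slightly over 30")


def implied_share(reported, attendees, disclosure_rate):
    """Share of attendees infected if only `disclosure_rate` of cases were disclosed.

    `disclosure_rate` is a hypothetical knob, not something measured.
    """
    if not 0 < disclosure_rate <= 1:
        raise ValueError("disclosure_rate must be in (0, 1]")
    return min(1.0, reported / (disclosure_rate * attendees))


# Floor: assumes every single case was disclosed and counted.
floor = implied_share(REPORTED, ATTENDEES, 1.0)

# Illustrative scenarios only; the true disclosure rate is unknown.
scenarios = {rate: implied_share(REPORTED, ATTENDEES, rate) for rate in (1.0, 0.5, 0.25)}

for rate, share in scenarios.items():
    print(f"if {rate:.0%} of cases were disclosed: {share:.1%} of in-person attendees")
```

The only thing this makes concrete is that the answer is dominated by the part nobody knows: halving the assumed disclosure rate doubles the implied share.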
I should point out that I analytically expected that I would likely get a COVID exposure or two, probably on a plane or at a restaurant, since it would be in a city without mask mandates and with big crowds of fellow travelers. And I think that analytical expectation was probably about right: the conference had vaccine verification and a mask mandate with pretty good adherence. Inside the convention center per se I felt I was able to manage risks; it was outside and en route where I felt things got risky. But I was not really prepared emotionally for the “hey, we had lunch/dinner yesterday and I tested positive today” emails, and those emails freaked me out. I suspect this reflects my privilege more than anything about epidemiology or risk: most of the rest of my cohort is back to in-person teaching and/or has school-age children who are back to in-person learning, so maybe these exposure notifications are old hat to them. Whereas I have mostly spent the pandemic holed up inside my apartment getting increasingly weird to be around.
I have never envied the organizing chairs of big conferences, but now more than ever I feel like they are trapped among mutually contradictory desiderata, between managing cohorts with different levels of risk they believe they can assume for the benefits of having in-person interaction and professional development. Of course, there’s the voice in my head that says that this kind of moral calculus has always been at play: economic and social burdens, crime and harassment, longer-term harms like carbon costs, and, even in pre-COVID eras, “con crud” illnesses that have created lasting repercussions — the 1976 American Legion convention has its own wiki page, for instance. There’s always been a non-zero possibility of serious harm at any conference (both in-person and virtual, I should add), and those harms (and, conversely, the benefits) have never been borne equally across people from different backgrounds or life situations (but then again that’s true of pretty much all human activity).
All that being said, I still maintain what I said in this post: virtual conferences are not cutting it for a lot of pretty darn crucial parts of what makes academic conferences valuable, and, while the specific norms are still in flux/being built, the future of conferences I think is going to be inherently hybrid. But I think we have to manage urgent questions about what hybrid means in the short/medium-term and how we can better quantify and manage risks. E.g., do we enforce daily testing? What happens to people (especially students) who test positive? Do they have to eat another week of hotel costs in an expensive city while they self-isolate? Do we have to roll our own contact tracing and disclosure systems? How do we manage social or peripheral events like parties and receptions? I think we’re learning a lot about what not to do, but a real positive project for the future of hybrid conferences has yet to emerge, as far as I’m concerned.
All of this prelude is to perhaps explain why some of the verve is missing from the following list of themes and papers that I noticed: it just seems sort of… in poor taste to go on about how I thought something was a cool study design or whatever (although I’m envious of the fast typers and good notetakers who were able to get their trip reports out before the positive cases came in; they just got to write about a big long list of papers). At the same time, I don’t want to overshadow the good work that people did under very challenging circumstances, so I at least wanted to work through some of my thinking before I got to the recap of content.
For this overview, the standard caveats apply: I can’t cover everything from the conference, or even everything I personally went to, since I am one person and I am assuming limited reader patience. I tend to choose around seven talks as a good midpoint between depth and breadth, so that’s what I did here. I do visualization research so I am heavily biased towards visualization work (and there’s now consistently enough visualization stuff at CHI to fill most of the week without having to go looking for other sessions to go to), and of course I have my own biases and priorities in what I find interesting that I do not expect or anticipate that you, the reader, will completely share.
How Do People Actually Work With Data?
One concern I have with data analytics is that people in statistics and data science have very strong opinions about what you should do with data, and codify these opinions into products, but these prescriptions have pretty much nothing to do with how most people actually get their work done. We’re somewhat a victim of our own success here: lots and lots of people work with data every day, and yet we seem to be hung up on designing for a subset of people who think about data in exactly the same way we do. I think we need to do a lot more ethnographic work about how people actually use the tools we’ve made for them, so it was great to see quite a bit of that work this year.
“It’s Freedom to Put Things Where My Mind Wants”: Understanding and Improving the User Experience of Structuring Data in Spreadsheets
George Chalhoub, Advait Sarkar
As we just got on the cusp of exploring in our prior work, there’s this weird disconnect where we keep designing all of these allegedly powerful tools for analyzing data, but many (maybe even most) people are perfectly content to stick to Excel and Google Sheets, thank you very much. Part of this is the freedom these tools give you to just explore and directly interact with your data. So one solution is to meet users where they are, and provide options to structure or analyze their data without leaving the table environment, at the cost of perhaps a few constraints in the otherwise totally unconstrained grid of spreadsheet cells. This work expands on what makes spreadsheets so darn ubiquitous and tries some solutions for allowing restructuring without giving up too much of the good things about the tabular environment. Since the “transient structures” of spreadsheets are so important to the freedom that people value from tables, they found that tools, even really useful tools, that enforced particular table layouts or structures were often just non-starters, user-preference-wise. But I think there’s a really interesting design space here.
Irene Rae, Feng Zhou, Martin Bilsing, Philipp Bunge
One of the core bits of academic visualization lore is the Shneiderman Mantra: overview first, zoom and filter, then details-on-demand. It’s influential in how we teach people to do data analytics, build analytics tools, and conceptualize the ways our intended users interact with data. It’s a prescriptive way of thinking about how data analysis is done that is often taken to be descriptive as well. What this paper presupposes is… maybe it isn’t? Here the authors built up low-level interaction logs and context switching into patterns of behavior to see how and when people drilled down (or moved up) in their data. And it turns out people (very competent people who get very important data work done), regardless of our mantras, have many strategies for moving up and down. My favorite bits of terminology for these patterns were the “sawtooth” (repeatedly filtering/drilling down to a low level of detail, then clearing filters and starting again) and the “sidestep” (filter down to a low level, but then chuck out everything but the most recent filter and revise your hypotheses).
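To make those two definitions a little more concrete, here is a toy sketch of spotting them in a filter log. This is not the authors’ method or log format: the event names (`add`, `clear_all`, `keep_only`), the `min_depth` cutoff, and the example session are all made up for illustration.

```python
def label_patterns(events, min_depth=2):
    """Label drill-down patterns in a (hypothetical) filter-event log.

    events: list of (action, filter_name) tuples, where action is
    "add", "clear_all", or "keep_only". Returns (event_index, label) pairs.
    """
    active = []  # filters currently applied, oldest first
    labels = []
    for i, (action, name) in enumerate(events):
        if action == "add":
            active.append(name)
        elif action == "clear_all":
            # One "tooth": drilled down at least min_depth, then wiped everything.
            if len(active) >= min_depth:
                labels.append((i, "drill_then_clear"))
            active = []
        elif action == "keep_only":
            # Sidestep: drilled down, then kept only the most recent filter.
            if len(active) >= min_depth and active[-1] == name:
                labels.append((i, "sidestep"))
            active = [name]
    return labels


def is_sawtooth(labels, min_teeth=2):
    """A sawtooth here means the drill-then-clear move happens repeatedly."""
    return sum(1 for _, label in labels if label == "drill_then_clear") >= min_teeth


# Made-up session: two drill-and-clear cycles, then a sidestep.
session = [
    ("add", "region"), ("add", "product"), ("clear_all", None),
    ("add", "region"), ("add", "month"), ("clear_all", None),
    ("add", "region"), ("add", "channel"), ("keep_only", "channel"),
]
found = label_patterns(session)
print(found, is_sawtooth(found))
```

Real interaction logs are much messier than this (context switches, undo, partially removed filters), which is presumably where most of the actual work in building up such patterns lies.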
Visualization Annotation and Augmentation
I have no idea where people got the idea that the “data should speak for itself” or that visualizations need to stand alone to tell the whole story but it’s a really silly idea and is holding the field back. Do other forms of communication hamstring themselves in this way and assume that multimodality or explanations or even just walking the viewer through what they are seeing is a sign of weakness? I hope not. Anyway, there’s tons of promise and potential in visually augmenting information to make it more useful, trustworthy, or understandable, and we should be doing more of it.
Math Augmentation: How Authors Enhance the Readability of Formulas using Novel Visual Design Practices
Andrew Head, Amber Xie, Marti A. Hearst
Periodic advice I hear is that I should include more equations in my talks, because it makes me look smarter, and, well, I could use the help. The assumption behind that advice is that big complex formulae are often there to just dazzle the audience rather than do anything useful. And yet formulae are pretty useful, and, if you build an intuitive picture of how a particular one works, maybe it will serve to do more than to just look “mathy” in a paper somewhere. This paper was a great examination of all the ways that color and animation and interactivity can be used to tell people how a formula is put together and how it is derived, but I think it’s a useful design space for text, data stories, or any other scenario where you have to explain dense or complex or formal things to people other than yourself.
Arlen Fan, Yuxin Ma, Michelle Mancenido, Ross Maciejewski
We have lots of metaphors for how we build visualizations these days, but one affordance we’re missing is the notion of “autocorrect”: what’s the equivalent of a squiggly red line underneath a bar or line chart telling you that something might be fishy? I think these kinds of interventions could be useful for both designers and consumers (“here’s a bit of deception that could be going on in this chart”, say). This work attempts to automatically detect various line chart issues (for instance, in a shameless plug, futzing with the y-axis), and, what I think is most interesting, presents a little in situ “corrected” version for people to compare against. I worry about a system that automatically “fixes” charts, but I think having multiple versions (perhaps that you could even toggle between) is an interesting approach that places a bit more power in the hands of the user to interpret their data, and gives a bit more credit to the designer that they have important contexts or constraints in mind that wouldn’t be captured by binary dogmatic design rules.
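As a rough illustration of what a “squiggly red line” for the y-axis case could look like, here is a minimal sketch. It is not the paper’s detector: the exaggeration measure, the `threshold` of 2.0, and the function names are assumptions invented for this example, and whether a non-zero baseline on a line chart is even a problem depends on context. In the spirit of the paper, it returns an alternative to compare against rather than silently “fixing” anything.

```python
def exaggeration_factor(y_min, y_max):
    """How many times taller a change looks on [y_min, y_max] than on an axis that includes zero."""
    if y_max <= y_min:
        raise ValueError("y_max must be greater than y_min")
    zero_based_span = max(y_max, 0) - min(y_min, 0)
    return zero_based_span / (y_max - y_min)


def check_y_axis(y_min, y_max, threshold=2.0):
    """Flag a possibly truncated axis and offer zero-including limits for comparison.

    `threshold` is an arbitrary illustrative cutoff, not an established rule.
    """
    factor = exaggeration_factor(y_min, y_max)
    return {
        "flag": factor >= threshold,
        "factor": factor,
        "original_limits": (y_min, y_max),
        "comparison_limits": (min(y_min, 0), max(y_max, 0)),
    }


truncated = check_y_axis(90, 100)  # axis starts at 90
honest = check_y_axis(0, 100)      # axis already includes zero
print(truncated["flag"], truncated["factor"], honest["flag"])
```

Keeping both sets of limits in the result is the point: the designer (or reader) gets to toggle between versions and decide, instead of a binary rule deciding for them.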
(Co)-Existing with Machine Learning
Right before the next AI winter rolls around there always seems to be a period of optimism where people claim that we can “fully automate” important things. I’m never really quite sure what this means, since a human at the very least seems to have to press the “install” button on something, right? Or buy the software? Or make the decision to turn the software on, at the very least? To me the promise seems to be more along the lines of “humans will be able to fully avoid the responsibility for what the automated systems do on their behalf,” which is a pretty nightmarish scenario no matter how good the algorithms are. So I’m always appreciative of work where the human parts are put back in, and more specifically where we talk about the implicit structures around the “smart” systems we build.
How do you Converse with an Analytical Chatbot? Revisiting Gricean Maxims for Designing Analytical Conversational Behavior
Vidya Setlur, Melanie Tory
One thing that maps apps do that they didn’t used to is tell me if the place I’m going to is going to be closed when I get there. This makes sense, because most of the time when I put an address of a business in my phone to navigate to, I want to actually, you know, patronize the business. And so if my phone gave me nice turn-by-turn directions to a place that was closed, I’d be pretty peeved. I didn’t explicitly say “navigate to this place, but only if it’s open,” when I opened the app, but it is part of a shared implicature of this (implicit) conversation. Philosopher Paul Grice suggests that implicature is a large part of how conversations work, and many of the ways that we interact with virtual agents (or, I’d argue, software generally) are not respectful of these tacit but very important parts of conversation. This paper is a look at how we can improve conversations in an analytics chatbot system, but I think we should be thinking more deeply about back-and-forth in all of the rest of the interactive systems we build, even if they are not (explicitly) conversational.
“Look! It’s a Computer Program! It’s an Algorithm! It’s AI!”: Does Terminology Affect Human Perceptions and Evaluations of Algorithmic Decision-Making Systems?
Markus Langer, Tim Hunsicker, Tina Feldkamp, Cornelius J. König, Nina Grgić-Hlača
There’s a joke that goes something like “it’s artificial intelligence when you’re pitching to investors, machine learning when you’re recruiting developers, and linear regression when you have to actually ship something.” The punchline works because the language we use around ML/AI/“smart”-whatever is so slippery, so prone to hype, and so often obfuscatory about what’s actually going on. This paper looked at how attitudes around competency or complexity change based on how you refer to the system, from the grandiose “artificial intelligence” or “robot” all the way down to the significantly less exciting “sophisticated statistical model.” The results are a mixed bag for me in that some of the hype does seem to work (a “robot” is perceived as, in the aggregate, more complex than an “automated system”), but there are many nuances and complexities across attitudes and their rankings that suggest it’s not as simple as slapping the “machine learning” label on your system to make people’s brains turn off.
Nithya Sambasivan, Rajesh Veeraraghavan
The thing that keeps most AI systems running is usually a lot of poorly paid (or sometimes unpaid) human labor. Even those fun little AI generative art bots only work because they scraped up several server racks’ worth of human-made art from the internet without giving those artists any money or credit. And since this labor is often invisible in the final product, it is not surprising to me that the AI researchers collecting this data often treat their “walking datasets” with contempt or disdain: as “corrupt, lazy, and non-compliant.” And so these systems, which are meant to automate away tedious tasks and promote human flourishing or dignity, are designed in ways that are hostile to the people who make them work, full of measures to discipline and punish but fewer to reward or empower. As an example from my work, pretty much all the crowdsourced visualization studies I’ve seen will include attention checks to weed out poor quality responses or “click through” behavior, but I can count on one hand the number that give out bonuses for participants who perform well. How do we make these contributions of human knowledge and skill more participatory? And, if we can’t, in the words of the authors, “should AI labor even be performed?”
We passed the 3k word mark a paper or two ago, so I’ll keep this brief. There was a lot of content at CHI this year that hopefully troubles the vision of steady progress towards bigger and better and smarter computational tools that solve all of our problems for us. We’ve got a lot more learning to do about ourselves and our social structures before we can be sure our technological work is having the impact that it should. But, beyond the content of the conference, I hope that the conference itself, and all of its successes and failures, becomes an object of socio-technical inquiry: where we can focus on what works and what doesn’t work, who we can support and how, and what concrete and positive picture of the world we can put together from the ways we interact and work and think with each other.