Some final reflections on data visualization

I enrolled in this class frustrated with my previous attempts to learn about data and visualization, and with a sense that, as a journalist who needs to be able to understand and work with data, I was missing something. While data still isn’t as intuitive to me as I’d hoped, I learned some important things this semester that will inform the way I look at, and work with, data in the future. They include:

  • Color matters. Yes, I was the one to call this out during our Phase I presentations, but seeing the way my classmates struggled with and debated the use of color, both in terms of user experience and as a cultural signifier, emphasized for me how design elements that seem secondary are actually extremely important. It’s something I kept in mind while creating my final visualization, which relied heavily on color.
  • D3 is amazingly powerful. Because my Phase I project pushed my limited coding skills past their breaking point, I abandoned D3 in favor of a more arduous method of visualization for my final presentation. But boy did I miss it while I was hand-drawing the tint masks and captions on my 360 video. The most exciting part of studying D3 was reading about and seeing examples of its ability to process large amounts of data and change in response to the data it’s given. I got to experience this for myself on a basic level, and it was thrilling.
  • Some of the best data visualizations are also the simplest. I wrote about this in my very first blog post, reflecting on one group’s ability to prototype a clear visualization of the different areas of study represented in our class, as opposed to my group’s far more complex, and messy, visualization of nationality and food preference. I’m still struggling with the principles of good, clear design when it comes to data, and I’ll be interested to hear my classmates’ reaction to my final project: is it fun and engaging, or do the bells and whistles take away from it?

So yes, I’m still not a data whiz, and I’m aware that many of my classmates’ abilities far exceed mine. Still, I’m leaving this class much more conversant in the norms and aims of data visualization, which, for someone who isn’t aiming to become a data scientist, or even a data journalist, is enough for me!

A first 360-degree scene

Here’s what it looks like to hang out in the Matter lounge space, through an entrepreneur’s eyes:

Okay, not literally. The colors I shaded the office with correspond to Multimer’s categorized data: here, you can see that this subject’s mental experience of the space vacillated between “powerful” and “fascinating” (with one tiny blip of blue “restorative”) over a 24-minute period on a late afternoon in July.

You may notice that this scene looks a lot different from the prototype I originally drew. That’s because, as I quickly learned, there’s no real way (at least not with the tools I have access to) to overlay data on a 360 video. Instead, I had to embed the text directly onto the footage, meaning that depending on where users choose to look, they may learn what time it is, what the outside temperature is, or what’s happening in the space at that moment. This turns out to serve a useful purpose: while I don’t have footage of Kourtney conducting office hours with this subject, since that happened months ago, I can put the text describing what’s happening right on the couch where they were likely sitting.

The video quality, unfortunately, isn’t great: I used a Samsung Gear 360 to record the space. While easy to use, it’s far from the high-end equipment one would use to shoot a professional 360 video. Because nothing’s happening in the video, it might have been better to have just taken a photo of the space, since those tend to be higher quality. However, the Gear doesn’t have a way to trigger photos from afar — and since it shoots in 360, that meant any photo I took had me in the scene. The titles I added in post, particularly the time, are also hard to read. 360 videos uploaded to YouTube usually take a few hours to reach full resolution, so I’m hoping this will look a little better over time.

Project Update

I spent the last two weeks digging through Multimer’s data and determining what, given its limitations, can be done with it. Those limitations include:

  • I’m excited to work with the quad data, as I think it will give me the best possible understanding of how the subjects exist within the Matter space. However, that data is only available for sporadic stretches of the study period (and only the data for two of the three subjects is usable). There isn’t enough of it to draw large conclusions about the space.
  • The times when Multimer did collect quad data don’t always correspond with the times when Multimer was able to collect location data. There were a few cases where I was able to bring in external data about planned activities in the space to make a reasonable assumption about where the subjects were at certain times — but those activities often didn’t line up with the times for which I have quad data.
  • The data overlay I prototyped for the 360 video is difficult to actually transpose onto the footage. If you’re not looking directly at it, the text warps and gets distorted. This is a design challenge I’m working around.
  • I took the 360 video during the day, but some of the quad data was collected after sunset, so the verisimilitude of my video won’t always be perfect.

With those limitations in mind, I revised my vision for the project: it will now be a series of scenes of the three different office spaces, focusing on the times for which I have data. This seems more useful than a timelapse of the space over three days, as most of that time would be empty of data. However, I’m uncertain about how useful the result will be to the Matter team, as it’s pretty anecdotal. While I won’t be able to draw any solid conclusions from the data, I do want to start with a scene that shows which quad category was strongest in which location, so that some sort of big picture is presented.

The incredible, horrifying growth of Thanksgiving turkeys

Back when I was an environmental journalist, one of my jobs was to ruin people’s enjoyment of the holidays. It was kind of fun! And it was especially easy to do on Thanksgiving, because I could tell them about turkeys.

The visualization above tells you just about everything you need to know about how the American tradition of eating turkey on Thanksgiving has morphed into a supersized horror show. As I reported back in 2014, the modern commercial turkey has been bred to grow at twice the rate, and to twice the size, of its wild cousins. Because we Americans love white meat, most of the added weight is concentrated in the birds’ chests: many commercial turkeys struggle to walk or even stand upright, and none of them are able to mate.

Actually, this illustration from Mother Jones is a more accurate portrayal of how turkeys have changed:


This is a feat of science that also comes with huge gains in efficiency, as I wrote of factory-farmed turkeys:

They require just 2.5 pounds of feed in order to put on a pound of body weight, while the feed-conversion ratio for heritage breeds can be as high as 4-to-1. From a carbon footprint perspective, they’re much lighter on the planet than other forms of meat, particularly beef.
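The feed math in that quote is worth spelling out. Here’s a quick back-of-the-envelope comparison in Python — note that the finished bird weights are hypothetical round numbers I’ve chosen for illustration, not figures from the original article:

```python
# Pounds of feed needed to raise a bird to a given finished weight,
# using the feed-conversion ratios quoted above.

def feed_required(finished_weight_lbs, feed_conversion_ratio):
    """Total feed (lbs) = finished weight (lbs) x feed per lb of gain."""
    return finished_weight_lbs * feed_conversion_ratio

# Commercial turkey: 2.5 lbs of feed per lb of body weight
commercial_feed = feed_required(30, 2.5)  # hypothetical 30-lb bird

# Heritage breed: as high as 4 lbs of feed per lb of body weight
heritage_feed = feed_required(18, 4.0)    # hypothetical 18-lb bird

print(commercial_feed, heritage_feed)  # 75.0 72.0
```

In other words, at those ratios a commercial bird nearly twice the size of a heritage bird can be raised on roughly the same amount of feed.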

However, I was able to identify a downside to that, too:

The USDA’s Economic Research Service estimates that when consumers bring home turkey, a full 35 percent of the edible meat, through a combination of cooking, spoilage and plate waste, is lost. One reason why turkey is used so much less efficiently than chicken (which has an estimated “loss rate” of just 15 percent), the ERS report posits, may be precisely because it’s typically eaten on holidays, when, according to the report, people faced with mountains of leftovers may be more inclined to discard them. Dana Gunders with the Natural Resources Defense Council calculates that about 204 million pounds of turkey are thrown away on this one day alone — it’s by far the most wasted food on the Thanksgiving table. Into the garbage with the excess meat go the resources used to produce it; despite that efficiency, it adds up to some 1 million tons of CO2 and 105 billion gallons of water.

I know, I know, I’m a huge downer. But I’ll be spending my holiday filling up on stuffing and sweet potato pie, and it’s hard to complain about that!

Mapping the Moon

While I was in Arizona learning about media and emojis, I also took an afternoon to visit my college roommate, who now works as a research analyst for NASA’s Lunar Reconnaissance Orbiter Camera (LROC). Her team creates detailed maps and models of the moon’s surface, and their work is best experienced by slipping on a pair of 3D glasses and exploring the maps on one of these computers:


But even without access to the tech, the LROC mission has released a lot of fascinating images. One person I met was excited to show me “proof” of the Apollo 11 mission: thanks to the lack of atmosphere, the landing site and rover tracks can still be seen on the moon’s surface.


They’ve also released a data visualization tool that you can try out yourself. QuickMap allows users to overlay vector data on a 3D model of the moon — I used it to chart craters that are between 5 and 20 km in diameter (blue) and Copernican craters, the moon’s youngest, at less than one billion years old (pink).
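Layers like those are essentially filters over a table of crater records. Here’s a minimal sketch of the idea in Python, using invented crater data (QuickMap’s real layers draw on the LROC mission’s actual databases):

```python
# Hypothetical crater records, for illustration only.
craters = [
    {"name": "A", "diameter_km": 12.0, "age_byr": 3.9},
    {"name": "B", "diameter_km": 45.0, "age_byr": 3.8},
    {"name": "C", "diameter_km": 8.5,  "age_byr": 0.8},
    {"name": "D", "diameter_km": 3.0,  "age_byr": 0.5},
]

# Blue layer: craters between 5 and 20 km in diameter
blue = [c for c in craters if 5 <= c["diameter_km"] <= 20]

# Pink layer: Copernican craters, younger than one billion years
pink = [c for c in craters if c["age_byr"] < 1.0]

print([c["name"] for c in blue])  # ['A', 'C']
print([c["name"] for c in pink])  # ['C', 'D']
```

A crater like “C” above can land in both layers at once, which is exactly why QuickMap draws them as separately toggled overlays.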


While in Tempe, I also saw some new, not-yet-released visualizations that make very interesting use of color. I’ll update this once they’re made public!

What makes for a good data viz?

Data visualizations aren’t always very intuitive to me. I often feel like it takes me a long time to really understand what they’re trying to convey, and that I’d prefer text just explaining what I’m supposed to know. But there’s one viz that hit me incredibly hard, and was almost too effective at conveying information.

Yep, it’s the New York Times’ election forecast needle (aka the “speedometer of stress,” aka the “gray needle of death”).


Twitter, television news, and texts from friends in the know all helped me understand what was happening the night of the 2016 Presidential Election, but it was these three, jittery, nerve-wracking meters that gave the simplest and clearest picture of what was happening. Amidst mixed messages and an unwillingness, among everyone I knew, to admit that Hillary might lose, the needle didn’t lie. Long before I was willing to admit it myself, I knew, from watching it, that the election was over.

What made this visualization so effective? Its ability to synthesize real-time data during such a dramatic evening certainly helped, and the inclusion of the margin of error as a jitter, while not perfect, heightened that real-time effect. So did the decision to highlight both the popular vote margin and the electoral vote count next to the forecast: it helped me understand where the confidence interval was coming from. Looking back on it over a year later, it’s a good reminder of why the electoral system is…not ideal.

The choice of color, obviously, was simple and effective. I like the language on the forecast meter, which is easy to understand, and while I wouldn’t know off the top of my head how many electoral votes one needs to clinch an election, or exactly how the stats behind the popular vote margin work, those components made a lot of sense in context. My only complaint is that the viz was a little too effective at conveying the drama of the election. At one point, it came close to giving me a panic attack, and my roommates forced me to close my laptop.
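The jitter, as I understand it, works by redrawing the needle at a random position inside the forecast’s error band rather than pinning it to the point estimate. A rough sketch of that idea in Python, with made-up numbers (this isn’t the Times’ actual forecast model):

```python
import random

def needle_position(estimate, margin_of_error, rng=random):
    """Sample a needle reading uniformly within the error band."""
    return rng.uniform(estimate - margin_of_error,
                       estimate + margin_of_error)

# e.g. a hypothetical forecast of 270 electoral votes, plus or minus 15:
# each animation frame redraws the needle somewhere inside that band.
for _ in range(10):
    reading = needle_position(270, 15)
    assert 255 <= reading <= 285  # always inside the band
```

Because every redraw is a fresh sample, a wide margin of error makes the needle visibly twitchy — which is why it communicated uncertainty so viscerally.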

A less effective visualization I’d like to discuss is one I came across through my capstone project, for which I’m working on ways to effectively convey information, including data, on social media. One thing my team has learned is that on social media, data visualizations need titles that very clearly convey exactly what you’re looking at, preferably in an attention-grabbing way (an example we often point to is “You’re more likely to be killed by your own clothes than by an immigrant terrorist”). One case in which I don’t believe this was done effectively was a recent series of charts for Instagram:


Without the accompanying article, this visualization didn’t contain enough information for me to understand what I was looking at, or what larger point the charts were trying to make. I got that it was calling Fox News out for framing the Mueller narrative differently from other networks, but titles like “Percent of times Robert Mueller was mentioned in the context of credibility” are confusing — I’m not sure why they chose to collect and visualize this data. Beyond the social-media concerns, it’s not clear to me why a timeline format was used, or what the dotted line on the chart about Hillary Clinton signifies.

The headline for the article on which this Instagram gallery was based helps put things into context: “A week of Fox News transcripts shows how they began questioning Mueller’s credibility.” So do the quick bullet points summarizing this narrative that are included in the article:

  • Fox News was unable to talk about the Mueller investigation without bringing up Hillary Clinton, even as federal indictments were being brought against top Trump campaign officials.

  • Fox also talked significantly less about George Papadopoulos — the Trump campaign adviser whose plea deal with Mueller provides the most explicit evidence thus far that the campaign knew of the Russian government’s efforts to help Trump — than its competitors.

  • Fox News repeatedly called Mueller’s credibility into question, while shying away from talking about the possibility that Trump might fire Mueller.

If I were to recreate these charts for social, I would use those bullet points to shape their titles; for example: “Fox News was more focused than any other network on questioning Mueller’s credibility,” with a subtitle explaining exactly what the chart was showing. I would also probably start with a title slide explaining what the charts that follow are illustrating. Otherwise, these are just a bunch of data points that I know are supposed to make me angry, but that I can’t make sense of.