Category: In Action

Modeling the DNC

The Democratic National Convention held in Denver last week was an overall success thanks to countless hours spent planning by law enforcement, the convention committee, local leaders and a math class from the University of Colorado.  Yep, that’s right – a math class.

NPR aired a story last week about a math class at the University of Colorado that created models to best locate resources such as volunteers and free bike rental stations. For volunteers, the class had to take into account variables such as the volunteers’ skills, interests, and availability, where volunteers would be needed, and so on. Similar variables were considered for bike rentals. To further complicate matters, the models were constructed without knowing the values of many variables, such as how many bikes would be available.
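To make the idea concrete, here is a minimal sketch in Python of a volunteer-placement model. It is not the class's actual model, which the NPR story does not describe in detail. The volunteer names, the posts, and the fit scores are all made up for illustration, and the brute-force search is only practical for a handful of volunteers.

```python
from itertools import permutations

# Hypothetical fit scores (0-10) combining each volunteer's skills,
# interests, and availability for each post. All values are invented.
FIT = {
    "Ana": {"bike station": 3, "info desk": 9, "shuttle stop": 5},
    "Ben": {"bike station": 8, "info desk": 4, "shuttle stop": 5},
    "Cy": {"bike station": 5, "info desk": 7, "shuttle stop": 2},
}


def best_assignment(fit):
    """Try every one-to-one assignment of volunteers to posts and
    return the plan with the highest total fit score."""
    volunteers = list(fit)
    posts = list(next(iter(fit.values())))
    best_total, best_plan = -1, None
    for ordering in permutations(posts, len(volunteers)):
        total = sum(fit[v][p] for v, p in zip(volunteers, ordering))
        if total > best_total:
            best_total, best_plan = total, dict(zip(volunteers, ordering))
    return best_plan, best_total


plan, total = best_assignment(FIT)
print(plan, total)
```

Note that the best overall plan does not give Ana her top-scoring post (the info desk); placing her at the shuttle stop lets the group as a whole score higher. Trade-offs like this are exactly why a model helps. A real model would also need to cope with inputs that are not known in advance, such as the number of bikes.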

This challenge made me think of some of the optimization models we make at Corona. Often we have teams working in parallel: one constructing the model and the other crunching numbers to create the inputs. The model has to be flexible enough to allow for a broad range of values without knowing the exact values (or range of values), while still accurately representing the desired real-world situation. While the teams work closely throughout the process, it is still an anxious moment when the two parts of the process are combined and we hit the “go” button.

Of course, constructing the actual model is the easy part – designing the model to mimic reality is where the art (and fun part!) comes in. While limitations always exist, nearly any problem can be modeled. The DNC is just one example, of course. Need to pick a new location for your business? You could model where your market is to make sure you minimize cannibalization of your other locations. How about maximizing your marketing budget? You could use a model to maximize return on your money (and even time) spent. Consumer behavior? Population growth? You get the idea – modeling can help make better decisions for real-life problems.

(for our observations on the DNC, see our other post here)

Be Careful What You Ask For

It’s the season for political polling, which is a convenient occasion for illustrating the many potential pitfalls of conducting opinion research.  Last week there was a particularly good example of biases in opinions caused by the way a question is asked.

There is currently a bill (House Bill 1366) in the North Carolina State Legislature that aims to reduce bullying in the public schools, and (at least at one point) specifically calls for harsher penalties for bullying that is based on group membership, including sexual orientation.

So what do North Carolinians think about the bill?  Well, apparently only 24% of them support it.

No, wait a minute—74% of them support it.

What gives?

There’s an easy explanation — you get what you ask for. Here’s how the more liberal Public Policy Polling phrased the question in their survey (which showed 74% support):

There is currently a proposal in the General Assembly that specifies the need to protect children from bullying based on their sexual orientation. Do you think this provision should be passed into law?

And here’s how the more conservative Civitas Institute phrased the question in the poll that received 24% support:

Do you think public schools in North Carolina should implement an anti-bullying policy that requires students be taught that homosexuality, bisexuality, cross-dressing and other behaviors are normal and acceptable?

Regardless of your politics, I think anyone would agree that there is a pretty big difference in the emotional tone, the choice of absolutes (“specifies” vs. “requires”), and the choice of descriptors (“sexual orientation” vs.  “homosexuality, bisexuality, cross-dressing and other behaviors”; “children” vs. “students”) in these two questions.  This all adds up to big differences in what those questions are asking, so it is unsurprising that they got such divergent results*.

As recognized by the local media, the two polling groups typically operate on opposite sides of the political spectrum, but I have to agree with the reporter that the Civitas question is the more biased** of the two. Casting political questions in terms of absolutes (e.g., “requires”) often lowers levels of support because most Americans do not like the idea of being told what to do by the government. Throwing in the ambiguous (and scary) “other behaviors” invites respondents’ imaginations to run wild. Finally, framing the bill in terms of “teaching” rather than “preventing bullying” is arguably a misstatement of what the bill is supposed to do. You can make the argument that children are “taught” what is normal and important by viewing how adults punish and reward their behavior, but “taught” in the context of public education explicitly conjures the image of direct classroom instruction. For all of these reasons, the Civitas question looks like it was written to get the exact result they got. It may not be a public opinion question, but a marketing question, designed to get headlines and shift attention.

In other words, to ensure you get good quality data, you need to be careful what you ask. Which, if either, of these questions is likely to provide an accurate estimate of how people will vote on the bill? And to ensure that you as a reader are not misled when biased questions are reported in the media, you need to know what was asked!

*If I had to be evenhanded to both sides I would argue that the Public Policy Polling question was asking about support for the intent of the bill, while the Civitas Institute was focused on support for a potential effect of the bill.

**The Public Policy Polling Question isn’t perfect either.  “To protect children” is a fairly loaded phrase (Simpsons fans will recall the often exclaimed, “won’t somebody think of the children?!”)

Photo of the North Carolina State Capitol in Raleigh courtesy of Jim Bowen and licensed via a Creative Commons Attribution 2.0 license.

Forget Gen Y, What About the “Google Generation”?

Since our work on digital natives (pdf) for the Idaho Commission for Libraries (mentioned in this post), we’ve been noticing others’ work on defining the behavior of Gen Y and the subsequent generation (whom I refuse to call Gen Z), who have all grown up with ubiquitous computers, cell phones, and the Internet.

University College London, working for the British Library, recently released yet another interesting report examining individuals born after 1993 (whom the report dubs the “Google Generation”).

The report, based on literature reviews and analysis of library database search data, focuses on how the Google Generation searches for and uses information (and how that behavior is different from other cohorts), with a focus on searches for “scholarly” articles.

A great feature of this report is that the researchers have indicated their confidence (from low to very high) in the validity of each of the hypotheses and myths they set out to examine.

To me, one of the most arresting results* lay in this graph**:


Personal relationships, across all cohorts, are a common way to find scholarly articles, but the younger cohorts are more likely to search Google Scholar, examine an electronic table of contents, or visit a journal publisher’s website.

Members of the Google Generation are also much less likely to visit the library in person, which provides still more support to the idea that academic libraries of the future will feature far fewer physical stacks and far more virtual ones.

*OK, this result isn’t perfect. Since the data is cross-sectional, we can’t be completely sure whether the differences in behavior between cohorts are due to the fact that they are in different generations or whether some developmental change (i.e., some systematic difference in behavior, preferences, or training between older and younger individuals that younger individuals will eventually “grow out” of) is causing the differences here.
**To nitpick some more, the graph isn’t perfect either. The y-axis isn’t labeled (nor is the x-axis, which we believe to be age), and the text accompanying the graph says only “the graph shows the relative value that members of the academic community place on a range of methods for finding articles,” so there’s no way to tell what scale was actually offered for the values (e.g., 1 to 6, or 1 to 10, etc.), or whether numerical “values” were accompanied by verbal labels that aren’t included on the graph. Also, the smoothed curves are unnecessary, and give the illusion of a continuous variable when, in reality, there are no values between the labeled cohorts. Using a simple straight line that connected visible dots would have been clearer.

Three Laws of Great Graphs?

What graphs should you use in your presentations?

Marketing uber-guru Seth Godin recently posted an interesting set of guidelines (and a follow-up coda) on his website.  As is customary for our culture, Seth’s rules were three:

1. One Story
2. No Bar Charts
3. Motion

His rules quickly became a lightning rod for controversy, so let’s separate the wheat from the chaff:

1. One Story. Seth says (and has said before) that a graph should avoid nuance, be easily understandable in two seconds, and make only one main point. But Seth is giving his advice in the context of making a memorable, high-impact presentation, where (in Seth’s words) you need to make a “point in two seconds for people who are too lazy to read the forty words underneath.”

Other types of contexts require different types of graphs.  Reports can handle more complex graphs (but executive summaries should probably be simpler), exploratory analyses can go even more complex, and data-visualization as art need not even be readable!

But let’s take a second look at that quote. By calling your audience “too lazy to read the forty words underneath,” Seth is assuming that the audience doesn’t care. This echoes what he says in his follow-up post:

In a presentation to non-scientists (or to bored scientists), the purpose of a chart or graph is to make one point, vividly. Tell a story and move on. If you can’t be both vivid and truthful, it doesn’t belong in your presentation.

Again, this only matters when your audience is not invested in your topic. For topics and presentations where there is much more intrinsic interest among your audience, you can get more nuanced. But there is no reason a nuanced chart can’t also have a simple message, as the Junkcharts blog points out. Or, as Jon Peltier puts it, by removing nuance, you are insulting your audience, telling them that “They can’t handle the truth!”

2. No Bar Charts. Seth has made the point before that a bar chart can obscure the truth and muddle your story. Instead, he suggests using a pie chart. A confession: I hate pie charts (as do others who make graphs for a living), but they do have a time and a place for simple graphic displays, especially since audiences are familiar with them and expect them. But don’t scorn the useful bar chart! Yes, bar charts can be misused, but so can any other type of graph, and (as pointed out by Stephen Few) pie charts are actually perceptually inferior to bar charts even for presenting simple data. The one point I do agree with is that bar charts should not usually be used to display time series results (line graphs are often better for that). Seth defended his no-bar-charts decision in his follow-up post; to stop this from turning into Moby Dick, we’ll address that in our own follow-up post. The issue here is really using a chart, and choosing the data, that best illustrates your story for your audience.

3. Motion. Seth really dropped the ball here. For someone who understands the distraction caused by PowerPoint’s dubious dissolves and annoying sound clips (pdf), his suggestion of creating two slides with graphs set up to show changes is just as cheap and distracting a trick. Stephen Few has a much better suggestion (although even this can be improved): show the change by making parallel bar graphs. Here the story is “Trolls were a problem, but Gremlins are now.”

If trolls and gremlins were the only categories on the graph, then Seth’s suggestion would work great.  And if the biggest problems are all you care about, then that is fine.  But if you have this data and your story is different, you need a different graph and a different presentation style than Seth suggests.


Design your charts (and all your materials) with your story and your audience in mind.

And as for this non-controversy, we’re really all on the same team here: we all want clear, interesting presentations. Seth wants to accomplish that by limiting what people do with graphs; those whom he (dismissively) calls data purists want to educate people to do more with their graphs so they make the right choices. Seth’s plan works for the novice. But when the training wheels are ready to come off, I think it’s better if presenters know how to make the best choices for the needs of their story.

Shift Happens

While this video has been making the rounds for a while, I recently ran across it again. Clean presentation, gets to the point and it’s more motivating than daunting. That’s one of the reasons I like research – here are the questions, so now what are the answers?

Occasionally, we get to provide some of those answers. We have done many education-related projects here at Corona. Recently, one of our clients posted our findings on digital natives’ needs and desires related to library services. Like any business, libraries must change and adapt to remain relevant to fulfill their mission with the next generation. Just one example of how research is helping determine the needs of today’s generation.