There is much to learn from studying how the 2016 USA presidential election campaign unfolded. The final results have not yet been tallied, although the outcome is now clear.

As an educator, I am particularly curious about possible linkages between educational level and voting preference. This is a matter covered in various exit polls and other studies, but I wanted to get into the data myself.

This essay tells how I assembled and analyzed some relevant data.


I downloaded state-by-state educational data from [1] and converted it to a CSV file for easier processing (provided at note [2] below). Then I visited [3] and copied the results table into a file. After removing the percent signs, I saved the result as the file provided at note [4]. Finally, I ran the R code provided in note [5] to produce the graph shown in the next section.
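The cleanup and joining steps might look roughly like the sketch below. This is not the code in [5]: the column names (`state`, `dem_pct`, `rep_pct`, `advanced_pct`) and all values are invented stand-ins, since the layout of the real files is not described here.

```r
# Hypothetical sketch: column names and values are invented for illustration;
# they are not taken from 12s0233.csv or election2016votes.dat.

# A stand-in for the pasted election table, with percent signs still present.
raw_votes <- c("state\tdem_pct\trep_pct",
               "StateA\t55.0%\t40.0%",
               "StateB\t35.5%\t60.5%")

# Remove the percent signs so the columns parse as numbers.
clean_votes <- gsub("%", "", raw_votes, fixed = TRUE)
votes <- read.delim(text = paste(clean_votes, collapse = "\n"),
                    stringsAsFactors = FALSE)

# A stand-in for the education CSV.
education <- read.csv(text = "state,advanced_pct\nStateB,7.5\nStateA,12.0",
                      stringsAsFactors = FALSE)

# Join the two tables on the state name.
combined <- merge(education, votes, by = "state")
print(combined)
```

With real files, `readLines()` on the pasted file would take the place of the `raw_votes` vector, assuming state names are spelled identically in both sources.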

This analysis focuses on the percentage of the population educated beyond the bachelor's level, since that proved to be the educational variable with the highest explanatory power.

As a measure of voting preference, I constructed an index defined as (D-R)/(D+R) where D is the number of votes cast for the Democrat, and R is the number cast for the Republican. This index is zero in a tied state, with positive values indicating a preference for Clinton and negative values a preference for Trump. By definition, the index is bound to lie between -1 and +1, i.e. between 100% votes for the Republican and 100% for the Democrat. I did not perform regression analysis, because it is not clear what that would mean if applied to this index.
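As a small illustration of the index, here is a sketch in R. The function name `preference_index` is my own label for this example and is not necessarily what appears in [5].

```r
# Preference index as defined in the text: (D - R) / (D + R).
# D and R are votes for the Democrat and the Republican.
preference_index <- function(D, R) {
  (D - R) / (D + R)
}

tied      <- preference_index(50, 50)   # tied state: 0
all_dem   <- preference_index(100, 0)   # every vote for the Democrat: 1
all_rep   <- preference_index(0, 100)   # every vote for the Republican: -1
print(c(tied, all_dem, all_rep))
```

Because the index is a ratio, it gives the same value whether D and R are raw vote counts or percentages of the same total, so either form of the voting data should work.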


The results of the analysis (which may be reproduced by running [5] with datasets [2] and [4]) are presented in the graph shown below. The vertical line indicates an educational level of 10.2%, which was chosen by visual inspection as a divider between Trump-favouring and Clinton-favouring educational levels. The blue dots indicate a preference for Clinton over Trump.
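For readers who want to build a similar graph, a minimal sketch follows. It is not the code in [5]. The four data points are made up, and the red colour for Trump-favouring states and the horizontal line at zero are my assumptions, since the text specifies only the blue dots and the vertical line.

```r
# Hypothetical values for illustration only; not the data in [2] or [4].
advanced_pct <- c(6.5, 8.0, 11.0, 14.5)      # % educated beyond bachelor's level
index        <- c(-0.30, -0.10, 0.05, 0.25)  # (D - R) / (D + R)

# Blue marks a Clinton preference (index > 0), as in the text.
# Red for the remaining points is an assumption.
dot_colour <- ifelse(index > 0, "blue", "red")

plot(advanced_pct, index, col = dot_colour, pch = 20,
     xlab = "Population educated beyond bachelor's level (%)",
     ylab = "(D - R) / (D + R)")
abline(v = 10.2, lty = "dashed")  # divider chosen by visual inspection
abline(h = 0, lty = "dotted")     # tied-state line (an addition for this sketch)
```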

A noticeable feature is the isolated point indicating high educational level and strong preference for Clinton. This is the District of Columbia, which may have a particular reason to favour Clinton, since Trump has pledged to “drain the swamp.”

The overall data cloud reveals a trend of stronger preference for Clinton in states with higher educational levels.

Focussing on which candidate won the count, it is possible to construct a simple statement about the results of this election.


Clinton won a majority of the votes in states with high educational levels.

Further steps

I plan to revisit the data over the coming days or weeks, as the final votes are tallied. I would be happy to post more items to this blog if readers contact me with requests. Please note that this blog does not permit comments, since they are so often problematic on any blog relating to politics.

References and resources

  1. Source of educational-level data:

  2. Comma-separated file of educational level, created from [1]: 12s0233.csv

  3. Website with a table providing 2016 election results:

  4. Tab-separated file of voting records, created from [3]: election2016votes.dat

  5. R code to read education and voting data and graph the results: election2016.R

  6. Jekyll source code for this blog entry: 2016-11-10-election-usa-2016.Rmd
