March 2017 - The Numbers don't Lie

A Non-Technical Overview of My Research

Recently I have been writing up a draft of a research article on my general election model to submit for academic publication. But that paper is technical and requires you to have some exposure to statistical research to understand. I wanted to explain my research without going into all the technical details.

Introduction

The President of the United States is elected every four years. The Electoral College decides the winner through the votes of electors chosen by their home states. Usually a state’s electors are chosen based on who wins that state, and they vote for that winner. Nate Silver correctly predicted the winner of the 2008 election with Bayesian statistics, getting 49 out of 50 states correct. Silver certainly wasn’t the first person to predict the election, but he received a lot of attention for his model. Silver runs Five Thirty Eight, a site that covers statistics and current events. Bayesian statistics is a branch of statistics that starts with information you already know (called a prior) and adjusts the model as more information comes in. My model, like Nate Silver’s, used Bayesian statistics. We do not know the details of Silver’s model beyond the fact that it used Bayesian statistics. To the best of my knowledge, my method is the first publicly available model that used poll data from other states as the prior. I made a prediction for 2016, where I incorrectly predicted 6 states. I then applied the model to 2008 and 2012, where my predictions of state winners matched the predictions of Five Thirty Eight.

Methodology

I took poll data from Pollster, which provided CSV files for the 2016 and 2012 elections. For 2008 I had to create the CSV files by hand. I wrote a series of computer programs in Python (a common programming language) to analyze the data. My model used the normal distribution. My approach divided the 50 states into 5 regional categories: swing states, southern red states, midwestern red states, northern blue states, and western blue states. The poll data sources used as priors were National, Texas, Nebraska, New York, and California, respectively. I believe this approach is unique, but since several models are proprietary, I cannot know whether it has been used before. I only used polls that were added to Pollster before the Saturday before Election Day. For the 2016 election analysis this meant November 5th. I posted my predictions on November 5th.
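
To give a feel for what "using polls from another state as the prior" can look like, here is a minimal sketch of a textbook normal-normal Bayesian update. It is only an illustration, not the code behind my model: the function name `normal_update`, the assumption that the variances are known, and all of the numbers are made up for this example.

```python
def normal_update(prior_mean, prior_var, poll_mean, poll_var):
    """Combine a normal prior with a normal poll estimate (variances assumed known).

    Each source is weighted by its precision (1 / variance), so the more
    certain source pulls the result closer to itself.
    """
    prior_precision = 1.0 / prior_var
    poll_precision = 1.0 / poll_var
    post_var = 1.0 / (prior_precision + poll_precision)
    post_mean = post_var * (prior_mean * prior_precision + poll_mean * poll_precision)
    return post_mean, post_var


# Hypothetical numbers: the prior comes from polls in a "reference" state,
# and the new data comes from polls in the state being predicted.
prior_mean, prior_var = 0.48, 0.03 ** 2   # candidate's share in the reference state
poll_mean, poll_var = 0.52, 0.02 ** 2     # candidate's share in the state's own polls

post_mean, post_var = normal_update(prior_mean, prior_var, poll_mean, poll_var)
print(f"posterior mean: {post_mean:.4f}, posterior sd: {post_var ** 0.5:.4f}")
```

The updated estimate lands between the prior and the new polls, and it is less uncertain than either one alone. That is the basic appeal of borrowing information from a similar state.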

I outline more of my method here.

Results and Discussion

My model worked pretty well compared to other models. Below is a table of other models and their success rates at predicting the winning candidate in all 50 states and Washington D.C.

Race	Real Clear Politics	Princeton Election Consortium	Five Thirty Eight (Polls Plus)	PredictWise (Fundamental)	Sabato’s Crystal Ball	My Model
2008 Winner Accuracy	0.96078	0.98039	0.98039	N/A	1	0.98039
2012 Winner Accuracy	0.98039	0.98039	1	0.98039	0.96078	1
2016 Winner Accuracy	0.92157	0.90196	0.90196	0.90196	0.90196	0.88235
Average Accuracy	0.95425	0.95425	0.96078	0.94118	0.95425	0.95425

As you can see, all the models do a similar job at picking the winner in each state, which is what predicts the Electoral College outcome. There are other ways to compare accuracy, but I don’t want to discuss them here since it gets a little technical. No one was right for every state in every election. It would probably be impossible to create a model that consistently predicts the winner in all states, because of the variability of political opinions. Election prediction is not an exact science. But there is the potential to apply polling analysis to estimate public opinion on certain issues and politicians. Right now the errors in polls are too large to determine public opinion on close issues. But further research could find ways to reduce error in polling analysis.
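
The accuracy figures in the table are simply the fraction of the 51 contests (50 states plus Washington D.C.) called correctly. As a small sketch of that arithmetic, the counts below are back-calculated from the "My Model" column of the table rather than taken from my original files, and the variable names are only for illustration.

```python
CONTESTS = 51  # 50 states plus Washington D.C.

# Correct calls per year, back-calculated from the table (e.g. 0.88235 * 51 = 45).
my_model_correct = {2008: 50, 2012: 51, 2016: 45}

# Winner accuracy for each election year.
accuracy = {year: correct / CONTESTS for year, correct in my_model_correct.items()}

# Average accuracy across the three elections.
average = sum(accuracy.values()) / len(accuracy)

for year, value in accuracy.items():
    print(year, round(value, 5))
print("average", round(average, 5))
```

Because each contest counts equally here, a narrow miss in a swing state costs the same as a miss in a safe state. That is one reason other accuracy measures exist.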

Only You can Prevent Bad Political Polls

My research relies heavily on polls. So I understand why it is important to do polls. If I see a poll and determine it’s well written, I do it. But I think this position is rare because people don’t know the importance of polls. I want to explain why I think polls are important. Pre-election polls are commonly used to predict elections, and favorability polls are often used to judge a politician’s popularity. Polls are an important part of American politics.

I get that polls are annoying. I know they take time and you are probably busy (like me). But doing 1 political poll a year can greatly help improve the accuracy of polls. You don’t have to answer every poll, but increased participation in polls improves accuracy. Now, there are a lot of bad polls, and it’s difficult to tell if a phone poll is good based on the phone number. Some people run “polls” that are really marketing calls. I understand if you are hesitant to do phone polls. But internet polling provides a good alternative. I think the future of polling is quality internet polls. When you do a good internet poll, you know more about the quality of the poll than you do with a phone poll. But internet polls from scientific polling agencies require a large base of people to create accurate samples. You can randomly call 1000 phones, but you really can’t send 1000 random internet users a poll. To combat this problem, polling agencies keep databases of users to send polls to. Polling agencies send surveys to certain users to create a good sample. Joining a survey panel with political polls is a way to get your voice heard.

My view on participating in political polls is that you can’t complain if you don’t participate. Polls need a diverse sample to be accurate. If you feel your political stance is not heard in the polls, then you should do more polls instead of fewer. We need all kinds of people to do good polls. Not everyone has internet access, but enough voters do to create a good sample. What you can do is join a poll panel. My two recommendations are https://today.yougov.com/ or https://www.i-say.com/. They also do non-political polls and market research, which are also important (I might do a post on this later). I recommend them because they are user friendly and statistically sound. I am not receiving anything for recommending these agencies; I just think they are good.

If you want polls to be more accurate, the best (and easiest) thing to do is participate in polls. As a statistician, I value good data. But for data to be good it needs a representative sample. Regardless of your politics, you should participate in political polls.

A look at Alternatives to the Current Electoral College Process

First, I want to be clear that there is no universally fair way to elect a president. All methods have pros and cons, and you can have your own opinion about which way is best.

Current System

Right now, with the exception of Nebraska and Maine, each state awards all of its electors to whoever has the most support in that state. The winner usually has a majority of votes, but sometimes no single candidate has a majority. This method also helps smaller states, as they have a lower ratio of voters to electors than larger states.

Pros

This method makes it easy to determine the winner on election night. You don’t necessarily need all the votes to come in if you have enough information to predict the winner.

Cons

Most states have a party that clearly wins every time. So most of the attention goes to swing states, which do not have a regular winner.

Popular Vote

The popular vote method is based on the winner of the popular vote. Whoever gets the most votes wins. This method can be implemented if enough states change their laws to award their electors to the popular vote winner.

Pros

Every vote counts the same. Larger states would have more power than they do under the current system.

Cons

Smaller states lose some electoral power compared to the current system.

Congressional District System

This system awards 2 electors to the state winner and 1 elector to the winner of every congressional district. This is the method Maine and Nebraska use.
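
As a rough illustration of how this differs from winner-take-all, here is a small sketch for a made-up state with three congressional districts and two candidates, "A" and "B". The vote counts and function names are hypothetical, and the sketch ignores ties.

```python
from collections import Counter


def statewide_winner(district_votes):
    """Return the candidate with the most votes summed over all districts."""
    totals = Counter()
    for votes in district_votes:
        totals.update(votes)  # adds each district's counts to the running totals
    return totals.most_common(1)[0][0]


def winner_take_all(district_votes):
    """Current system in most states: every elector goes to the state winner."""
    n_electors = len(district_votes) + 2  # one per district plus two statewide
    return Counter({statewide_winner(district_votes): n_electors})


def district_method(district_votes):
    """Maine/Nebraska style: 2 electors to the state winner, 1 per district winner."""
    electors = Counter({statewide_winner(district_votes): 2})
    for votes in district_votes:
        electors[max(votes, key=votes.get)] += 1
    return electors


# Hypothetical vote counts for a three-district state.
districts = [
    {"A": 120, "B": 80},
    {"A": 90, "B": 110},
    {"A": 95, "B": 105},
]

print("winner-take-all:", dict(winner_take_all(districts)))
print("district method:", dict(district_method(districts)))
```

In this made-up state, candidate A narrowly wins statewide and would take all 5 electors under winner-take-all, but only 3 of the 5 under the district method, because B carries two of the three districts.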

Disclaimer: This is my personally preferred system.

Pros

It’s a compromise between the current system and the popular vote system. The electoral college would probably mimic the congressional makeup.

Cons

Like the current system, this system could elect a president who didn’t win the popular vote.

All of these systems have pros and cons. There isn’t necessarily a “best” way to pick the president.

Here is a Five Thirty Eight article about different methods of deciding the electoral college.