The Numbers don't Lie

Model Update 10/2

Note that this update doesn’t account for the President’s positive covid test yesterday. I don’t know what that means for the race.

This week’s map:

Scale (in terms of estimated win probability):

0-0.05 Safe Red (darkest)
0.05-0.15 Likely Red (second darkest)
0.15-0.25 Lean Red (light red)
0.25-0.75 Tossup
0.75-0.85 Lean Blue (lightest blue)
0.85-0.95 Likely Blue (second darkest)
>0.95 Safe Blue (darkest)

Each candidate is likely to lose about 1 in 4 of his lean states and 1 in 10 of his likely states. This means Biden will probably lose 22-46 electoral votes from this map, but his winning only 300 electoral votes is still possible because of correlated errors. Except for Texas, the likely and lean red states carry those labels because of insufficient polling data.

The 95% intervals should be accurate at this point in time.

This map changed from last week. The predicted vote share hasn’t changed, but a minor bug messed up the uncertainty estimates last week.

Analysis

I have no idea how Trump’s covid infection will play out. If he is lucky and has a mild case, I don’t think much will change. But if he dies or has complications that prevent him from finishing his term or running for re-election, then I think anything could happen. I hope he recovers, but his age and weight make him high risk. So for now I’m watching and waiting.

The Model Review: 9/25

After fitting the model for a week, it’s time to look at the model.

Every week, I’ll include a model release that reports the historical accuracy of this model at this point in the 2008-2016 elections. These will be called weekxsummary in the Google Drive. The first one is here.

Here is an approximate map:

First up, a couple of cautions. We still have thirty-eight states with fewer than 5 polls. With that little data, the model will be very uncertain.

These are the states with fewer than 5 polls, which should be taken with a grain of salt: AK, AL, AR, CO, CT, DC, DE, HI, ID, IL, IN, KS, KY, LA, MA, MD, MO, MS, MT, ND, NE, NH, NJ, NM, NV, NY, OK, OR, RI, SC, SD, TN, UT, VA, VT, WA.

DC is a very odd election to model. I’m going to come up with something special for it since it is incredibly partisan and not like any of the states. I would ignore it.

This map comes from estimating the proportion of simulations in which the model had Biden above 50%. Right now there is uncertainty in the model because it’s not a forecast, so my labels are very cautious.
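To make the idea concrete, here is a minimal Python sketch of turning simulated vote shares into a win probability. The draws below are made up for illustration (a hypothetical state centered on 52% with a 3-point standard deviation); they are not output from my model.

```python
import random


def win_probability(simulated_shares):
    """Fraction of simulated two-party vote shares that are above 50%."""
    wins = sum(1 for share in simulated_shares if share > 0.5)
    return wins / len(simulated_shares)


# Hypothetical draws, NOT real model output: 10,000 simulated Biden shares.
rng = random.Random(2020)
draws = [rng.gauss(0.52, 0.03) for _ in range(10_000)]

biden_win_prob = win_probability(draws)
print(round(biden_win_prob, 2))
```

The win probability is just a count: how often Biden ends up ahead across the simulations, divided by the number of simulations.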

Scale:

<0.05 Safe Red

0.05-0.15 Likely Red

0.15-0.25 Lean Red

0.25-0.75 Tossup

0.75-0.85 Lean Blue

0.85-0.95 Likely Blue

>0.95 Safe Blue
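The scale can be written as a small lookup function. This is only a sketch: the scale above doesn’t say which label a value exactly on a boundary gets, so the boundary handling here is my assumption for the example.

```python
def rating(p_biden):
    """Map an estimated Biden win probability to a label on the scale.

    Assumption for this sketch: a value exactly on a boundary goes to the
    less certain category (for example, 0.95 is Likely Blue, not Safe Blue).
    """
    if p_biden < 0.05:
        return "Safe Red"
    if p_biden < 0.15:
        return "Likely Red"
    if p_biden < 0.25:
        return "Lean Red"
    if p_biden <= 0.75:
        return "Tossup"
    if p_biden <= 0.85:
        return "Lean Blue"
    if p_biden <= 0.95:
        return "Likely Blue"
    return "Safe Blue"


for p in (0.02, 0.20, 0.50, 0.80, 0.97):
    print(p, rating(p))
```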

The reason my tossup category is so large is that there is plenty of room for shifts of 1-2 points in the model over time and if that happens in certain directions these states could flip.

If we have at least 3-5 polls per state the probability estimates are generally reliable individually. When you have very few polls there is a lot of uncertainty because individual polls are random. Roughly for every 5 states that are lean for a candidate about one will be won by the other party. Roughly for every 10 states that are likely for a candidate, about one will be won by the other party.

Analysis:

This is a really good map for Biden. I think my uncertainty estimates are matching the historical error well. In most of the likely states for Biden, the historical error is less than Biden’s margin in the past three elections. I’m a bit worried about the craziness of 2020 affecting poll accuracy and the effects of mail-in voting. If enough mail-in ballots get thrown out, that could throw the election if it is close. Polls aren’t going to capture the effects of rejected ballots. We also don’t really know how many rejected ballots to expect. I’m cautiously optimistic that the model will do as well as in the past. If this model does as well as even 2016, Biden has a high probability of winning, but it’s hard to accurately pin down that number. If I had to pick a number, I would say probably in the 0.80-0.90 range, because Trump would need a massive polling failure, probably combined with a real shift toward him. Neither of those things is that likely.

My 2020 Model

First up, I want to be super clear: this is NOT A FORECAST or a prediction of what happens on election day. This is a polling aggregation model. Think of it as a fancy Real Clear Politics average, except that this model comes up with good estimates of uncertainty and is a little better at predicting the final outcome. This model only predicts the election well at the very end of the cycle. At about six weeks before the electio

This model may not be final. I am going to test a few new features on historical polling data. If they work, they will be added to the 2020 model.

I want to explain why I do what I do.

For starters, I was a little bit surprised when the Economist came out with their model. It did some things I wanted to do. I agree with most of how it is structured. But I didn’t want to basically copy them. I wanted something novel.

One interesting thing I have discovered in my research is that FiveThirtyEight’s model is not that much better at predicting election outcomes (51% for Trump, 49% for Clinton) than basic polling averages where you average the last few polls. My models from my undergraduate research were better than a polling average but not as good as FiveThirtyEight. I wanted to know how accurate a Bayesian election model could be if it could be run on a standard laptop in a couple of minutes.

Data Inclusion Criteria

I am using the Economist’s I assume that the results in one poll don’t affect the results in another poll. One type of election poll is the tracking poll, which interviews the same people multiple times. Tracking polls depend on the previous poll results, so I exclude them.

What I Learned From 2016

Recently I have been working on a chapter of my dissertation about polling accuracy and modeling polling error. I’ve done a big analysis on state-level Presidential polls from 2008-2016 and I will talk about that in a later post. Looking at the data from 2016 has been a chance to really dive into the data I collected over the years and reflect on what happened.

In 2016, I was a college sophomore in my first mathematically based statistics class. Now I’m a third-year Statistics PhD student with most of my classes completed. I’ve grown a lot as a statistician over these past four years, and I think I still have a little more growing to do. But I’ve compiled three things I’ve learned.

Don’t Forget that the General Public Aren’t Experts
Embrace Uncertainty
Reflect on the Mistakes of the Past to Prevent Them in the Future

Don’t Forget that the General Public Aren’t Experts

I’ve always viewed election modeling as a pathway to public engagement and education about statistics. Still, you have to remember that not everyone knows what margin of error means or understands how to interpret a probability. Formal statistical training often focuses on the limitations of statistical modeling and how it will always have uncertainty. But people often equate statistics with mathematics, as something that provides the same solution with every attempt and won’t be wrong. If you are going to put a model out there to the public, you need to explain it in a way that can be understood by a non-expert but still provides the information experts need to evaluate your model.

Embrace Uncertainty

Polling and polling averages are limited in their precision. If you have an election within a point, polls will not be able to predict the winner. Models have more precision, but there are many examples of cases where they, too, are uncertain. Polling data can be limited, and elections happen once. Sometimes we can’t know who will win. Additionally, it’s hard to evaluate probability predictions because there is only one 2020 presidential election. It is important to be open about the uncertainty in your model and the potential for error.

Reflect on the Mistakes of the Past to Prevent them in the Future

The polling and election modeling world has spent a lot of time reflecting on the 2016 Presidential election. The errors in 2016 were a mixture of insufficient data, some bad methodological choices, and bad luck. Some of the methodological mistakes are easy to fix, but some of the important ones are difficult. A big issue in 2016 was weighting the data for education, because non-college-educated individuals are less likely to respond. But it’s not clear how to weight the data well when the composition of the electorate is constantly changing and turnout is hard to predict. Still, pollsters and modelers know what the challenges are and know how polls have performed in the past, which can help us assess the likely accuracy of polls. We can learn from 2016 so that our models and polls can be the best they can be.

I hope that I’m not writing a piece like this in 2024. I hope that the public will listen to the experts on what typical polling accuracy is and not blindly assume that margin of error covers everything. I hope that the polls and models do well and that polling will be something that the public values and trusts. But now all we can really do is wait and see.

The Nuance of Polling

I’m an election modeler, and my entire dissertation focuses on analyzing public opinion polling data in one form or another. I love polling. Often on this blog or on Twitter I’m cautious about a new poll or what an election model can actually tell us. So I thought perhaps I should explain why polling is important even if it may not tell us who is going to be the next President.

I feel there is an imbalance in how polling is viewed. Some treat it as completely certain, as if any result outside the margin of error were impossible. Others dismiss polling because they can’t understand how one thousand people can tell us what the entire country thinks, or because they believe 2016 showed polling was a failure. Neither of these views is accurate.

The truth is polling remains our only rigorous and mathematically grounded tool to estimate public opinion. Elections can be forecast using economic and other data but that is only because the true proportion voting for a candidate is eventually known. But polling can tell us what percentage of individuals approve a certain policy or unravel how an individual’s policy preferences to prevent terrorism are related to their risk assessment of future terrorist attacks (as I’ve done in a recent project). We can understand how and when people’s opinions do and don’t change.

Polling isn’t a magic problem solver. The results from a poll can not be treated as 100% correct. Polling has error. Sometimes that error puts us in positions where all we know is that a race is too close to call or that the country is evenly split in its support for a policy. We have to acknowledge that margin of error won’t solve all our problems and that polling is hard work. It’s not easy to predict who is a likely voter or decide between an expensive phone poll or a larger internet panel or try to determine why someone left a question blank.

Polling can be very important, because it signals to our government what the people want, and still sometimes fail to give us a clear answer. It’s possible for polling to “be wrong” just by random chance. But it is also possible that it gives us a clear answer. Often, it gives us something to point to as important for the government to act on in a way that is far more representative than calls to a congressman or your friend’s opinions or social media comments. If followed by leaders, polling could be a pathway for a more direct democracy without forcing every citizen to give opinions on every issue.

This election, it’s important to embrace the nuance in polling. Every poll is unique and needs to be interpreted holistically considering when and how it was conducted. Every poll on the same issue or election should have different results and that’s expected and ok. Polling is usually going to be off by a handful or two of percentage points, but sometimes the message is clear because the support is so strong or weak. But polling can give us answers when nothing else will, and for that, it will always be valuable.

Unraveling Polling Error (with GIFs and no math)

You can’t understand what polls mean until you understand how they work. The most common misunderstanding people have is about what margin of error means and what type of error it covers. The margin of error doesn’t tell the whole story. Polling error has many different components. Some types of errors are easy to predict, but others can be impossible to guess. I’m going to talk about the three most important sources of error in polling broadly but focusing on election polling. I want to do this so that you better understand why margin of error is not the only type of error and that polls work well all things considered. This post is not about how polls are wrong, it is about how they are right in the midst of numerous challenges.

Three Types of Error:

Sampling Variation: effects of using only a subset of a population
Incorrect Responses and Changes in Opinion: a respondent to the survey intentionally or unintentionally does not give the correct response, or later changes their mind
Miscoverage: the population that was sampled was not the population of interest

Sampling Variation

[GIF: SpongeBob, endless possibilities]

One concept people struggle with in statistics is variation. Statistics involves math, but there is no longer one solution. If I solve an algebra equation over and over again, my answer shouldn’t change. Statistics is based on samples that are subsets of a population. If I collect a sample over and over again, the numbers will be slightly different almost every time. In most cases, my estimate from the sample will not exactly match the true population. You can see this by flipping a coin a bunch of times. A US coin is going to be split relatively evenly, but when you start flipping the coin yourself you might not always get exactly 50% heads and 50% tails. This is because coin flips are random. Polls work in a similar way.
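If you would rather not flip a real coin a thousand times, here is a small Python sketch that does it for you. It assumes a perfectly fair coin and repeats the same 1,000-flip “sample” five times.

```python
import random


def share_of_heads(n_flips, rng):
    """Flip a fair coin n_flips times and return the share that were heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [share_of_heads(1000, rng) for _ in range(5)]
print(samples)
```

Each entry should land near 0.5, but they will generally not all be the same number. That spread between repeated samples is sampling variation.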

Sampling variation is a huge driver of polling error. It is completely mathematically expected that a sample of a few hundred to a few thousand people isn’t going to tell us the exact true proportion of support for a candidate or policy. If we make some assumptions and adjustments we can calculate a quantity called the margin of error that gives us an estimate of how much randomness to expect. Margin of error only includes error from sampling variation and not the other two types of error I will talk about later.

Typically you are going to make the following assumptions:

People who were sampled but didn’t participate in the survey are not any different than those that did after you control for demographic variables. This is something that is obviously impossible to verify directly.
You have a decent estimate of the target population. The target population is who you want to poll, and the main target populations are likely voters (people thought likely to vote), registered voters (people actively registered to vote), and all adults (including people that can’t vote). The census gives us a pretty good idea about the population characteristics of all adults, and you can use this information combined with other information to get estimates for registered voters. Likely voters are hard to identify because the respondent isn’t always the best predictor of if they will vote, and turnout varies from year to year.

Assumptions are very common in statistics and sometimes it’s difficult to assess how reasonable an assumption is. You may have some doubts about both assumptions being valid. To a certain extent, we know that these assumptions aren’t completely true, but there is a concept in statistics called robustness. Robustness says that under certain conditions (usually large sample sizes) small violations in assumptions are ok, but it has to be acknowledged that violations in assumptions can affect results.

Depending on details about the poll, the margin of error is usually about 2-5 points. I’ll omit the calculation because it gets complicated in practice: most surveys use complicated (but statistically valid) procedures to adjust the poll to match estimates of the population characteristics (this is called raking or weighting, and it involves a lot of math). Now what does this margin of error mean? Typically you add and subtract the margin of error from each estimated quantity, and this gives you a range of probable values. Theoretically, if no other types of error existed (they do), the assumptions held, and you had dozens upon dozens of polls, 95% of those intervals should contain the true population proportions. But in election polls, it’s common to have undecided voters, and while it is important to track undecided voters, they complicate things since undecided isn’t really a ballot option. A workaround in election polling is to look at the margin between the candidates, double the margin of error, and see if that interval contains 0. If it does, the election is too close to call based on this poll.
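For a rough sense of where those numbers come from, here is the textbook 95% margin of error for a simple random sample, sketched in Python. This is not what pollsters actually compute: it ignores weighting and raking, which generally make the real margin of error larger. The sample size and support level are made-up examples.

```python
import math


def margin_of_error(share, n, z=1.96):
    """Textbook 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(share * (1 - share) / n)


def interval(share, n):
    """Range of probable values: the estimate plus and minus the margin of error."""
    moe = margin_of_error(share, n)
    return share - moe, share + moe


# Hypothetical poll: 1,000 respondents, candidate at 48%.
moe = margin_of_error(0.5, 1000)   # worst case, support at 50%
low, high = interval(0.48, 1000)
print(round(moe, 3), round(low, 3), round(high, 3))
```

With 1,000 respondents the worst-case margin of error comes out to about 3 points, which is why so many polls report a number in that neighborhood.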

Incorrect Responses and Changes in Opinion

[GIF: Brittany Pierce from Glee, is it the truth]

People make mistakes in all aspects of life, and polls are no exception. Additionally, you can lie to a pollster. Unfortunately, in most polls you don’t know if the answer from the respondent was accurate. If we didn’t have to ask the respondent to answer the question because we knew the answer already, there wouldn’t be much reason for polling. There is no mathematical formula that tells us exactly how to adjust for lying or mistakes in surveys. Respondent mistakes and lying are not in the margin of error because they are too hard to estimate exactly. You can test a respondent’s ability to follow directions by telling them what response to pick on test questions (e.g., “Select True for the next question”), and if they get the test questions wrong you could throw them out. Individuals who later changed their mind fall into this category. Early in an election, there are going to be undecided individuals and individuals who later change their minds. Undecided voters are likely a large driver of polling error because it is unknown whether they will vote and, if so, for whom. Sometimes the people who decide later in the campaign vote differently than other deciders, and this is believed to be a large factor in 2016 polling error. If a poll has a higher percentage of undecided voters than the difference between the candidates, this indicates that the race can be competitive regardless of the margin of error.
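That last rule of thumb is easy to check for any poll. Here is a tiny Python sketch; the poll numbers are invented for illustration.

```python
def undecideds_could_flip(candidate_a, candidate_b, undecided):
    """True when the undecided share is larger than the gap between the candidates."""
    return undecided > abs(candidate_a - candidate_b)


# Hypothetical polls (percentages), not real data.
print(undecideds_could_flip(46, 44, 8))   # 8 undecided vs. a 2-point gap
print(undecideds_could_flip(52, 41, 5))   # 5 undecided vs. an 11-point gap
```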

Miscoverage

[GIF: tight shoes, make it fit]

There are a few ways to get a sample. Ideally, your sample is random, but there are only a few ways to recruit survey respondents: call random phone numbers, select random addresses, “randomly” recruit people on the internet, or, for exit polls, stand outside polling locations and ask every nth voter. None of these sources of data is exactly representative of the American voter. Selecting random addresses can be random and representative of American adults, but people move, and mail samples are typically used to recruit people for future phone or internet polls because mail is slow. Not everyone has a phone or internet access. Statistical theory often assumes you have a perfect source to draw a sample from. We don’t have a perfect source, so we have a little bit more error. And standing outside of every single polling place is obviously impractical, and it’s hard to select a subset that will be completely representative. When we use a sample that doesn’t exactly fit, we don’t always get perfect results.

Conclusion

[GIF: what does it all mean]

I haven’t listed every source of polling error but I wanted to give a few examples of types of error that help explain why margin of error doesn’t always match up with actual survey error.

You may be wondering: if margin of error doesn’t provide the whole picture, what do we know about polling error? Are polls reliable? What is the difference between margin of error and predicted error on average?

Thankfully, we have a great, large database from Huffington Post’s Pollster that can help us answer these questions. We have over 5,000 state-level presidential polls from 2008-2016, and over 3,000 have a listed margin of error. In most cases we care about the difference between the Democratic and Republican candidates as the main metric because it tells us who is leading. The margin of error for that difference is double the listed margin of error, because the listed margin of error is for one candidate. For about 4 in 5 polls conducted in the last 60 days before the election, the true election margin was inside that interval. But in 1 in 20 polls the observed error was 5 points higher than the margin of error. A key thing about this data set is that the polls are not evenly distributed across years or states.

Statistical models like the ones I commonly build can help to predict a polling error given information about where, when, and how the poll was collected. While these models are helpful, they also have uncertainty. Usually polls can be used as signals of what races are competitive, but can’t always predict winners. It is helpful to take a conservative approach when looking at a poll and acknowledge the potential sources of error.
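The “4 in 5” number is a coverage rate, and the check behind it can be sketched in a few lines of Python. The five polls below are invented so the example runs; they are not rows from the Pollster database.

```python
def margin_covered(poll_margin, listed_moe, actual_margin):
    """True when the actual margin is within double the listed margin of error."""
    return abs(actual_margin - poll_margin) <= 2 * listed_moe


def coverage_rate(polls):
    """Share of polls whose doubled-margin-of-error interval held the result."""
    hits = sum(margin_covered(*poll) for poll in polls)
    return hits / len(polls)


# Invented rows: (poll margin, listed margin of error, actual election margin).
example_polls = [
    (3.0, 3.5, 1.0),
    (-2.0, 4.0, -9.0),
    (6.0, 3.0, 14.0),
    (1.0, 4.4, -0.5),
    (8.0, 3.0, 5.0),
]
print(coverage_rate(example_polls))  # 0.8
```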

The key thing to remember is that polling is one of our best tools for evaluating public opinion in general. Sometimes people use other types of models to predict elections, but for non-election public opinion questions about policy or presidential approval, polling data is required for statistical analysis.

How to Interpret Election Polls

This is the first of my approximately weekly posts I’m planning about the 2020 election.

As election day approaches, polls are going to become more prominent. It’s important we carefully interpret polls. I suggest you stick to focusing on polls from poll aggregators (like FiveThirtyEight or Real Clear Politics) or those tied to prominent news organizations. Polls brought up by polling experts (myself included) are typically going to be good sources. But you can encounter polls out in the wild that are complete garbage and you should be skeptical of a poll from a website you have never heard of. I’m going to briefly talk about three things you have to always consider when you analyze polls:

Margin of error
Polls are not predictions
Outliers happen

Margin of Error

The margin of error is probably one of the most misunderstood polling concepts. The margin of error comes from a statistical formula that quantifies the natural randomness that comes from estimating a proportion for an entire population with a small sample. The margin of error is meant to be added to and subtracted from a single candidate’s support. When we are talking about US elections, we normally care about the difference between the Democratic and Republican candidates (sometimes called the margin and written as Trump +x or Biden +y), and to examine that we must double the error. The reason for doubling the error is that in a poll with a margin of error of three points, Biden could be underestimated by three points, and Trump could be overestimated by three points, which leads to a six-point gap. The margin of error doesn’t cover the rare scenario where a respondent lies or makes a mistake. The margin of error calculation assumes the individuals who respond to a poll are not that much different than the population we are aiming to poll, and that we can reach every member of that population, which isn’t exactly the case. The margin of error underestimates the polling error. It’s hard to quantify before an election how much margin of error underestimates the real error, but it is typically less than one percentage point in the analysis I did on polling error (details will come later). If the difference between two candidates is less than double the margin of error, the poll does not provide enough information about who is winning, and this signals the race is too close to predict from just that poll.
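Here is that doubling rule as a short Python sketch. The poll numbers are hypothetical.

```python
def too_close_to_call(candidate_a, candidate_b, listed_moe):
    """True when the lead is smaller than double the poll's listed margin of error."""
    return abs(candidate_a - candidate_b) < 2 * listed_moe


# Hypothetical polls, each with a listed margin of error of 3 points.
print(too_close_to_call(50, 45, 3))  # 5-point lead vs. a doubled margin of 6 -> True
print(too_close_to_call(52, 44, 3))  # 8-point lead vs. a doubled margin of 6 -> False
```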

Polls are not Predictions

Polls are not designed to predict elections. Polls are designed to estimate the percentage of voters who support each candidate, and the percentage who are undecided, at the time the poll is conducted, not necessarily on election day. The margin of error estimates the error between the poll and the true support for the candidates while the poll is conducted. People change their minds occasionally, and it’s hard to predict what direction undecided voters will go. A good guideline from research (taken from this book and replicated in my own analysis) is that polls are not predictive of the election day result until Labor Day weekend. The predictiveness of polls improves over time. For example, my model will start collecting data on September 6th and will hopefully be fit by September 22nd, which is about 45 days before the election.

Outliers happen

Occasionally we will see a poll somewhere that is different from other polls. Strange poll results do happen, and you can’t say there has been a change in the race until multiple polls from different pollsters have similar findings. You should look at multiple polls when you check the state of the race. I like to look at two poll aggregators: FiveThirtyEight and RealClearPolitics. FiveThirtyEight is where I am planning to get my data from for my model. The tricky part about comparing two polls is that you have to add both their margins of error together to compare single candidates, and double that number if you want to look at the difference between two candidates. Suppose Poll A had Trump at 45 and Biden at 48 with a 3 point margin of error, and Poll B had Trump at 48 and Biden at 42 with a 4 point margin of error. The difference between Trump and Biden in Poll A is -3 with a margin of error of 3*2=6, and in Poll B it is +6 with a margin of error of 4*2=8. The margin of error to compare these polls would be 6+8=14, which means that if the difference between the polls is less than 14, we would say that the polls aren’t showing statistically different results and the difference we observe could be explained by random sampling error. Here the polls differ by 9 points, so they are not statistically different. A lot of times, outliers aren’t really outliers after you adjust the margin of error to fit your comparison.
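The Poll A and Poll B arithmetic can be written out as a short Python sketch, using the simple add-the-margins rule described above.

```python
def margin_and_moe(trump, biden, listed_moe):
    """Trump-minus-Biden margin and its doubled margin of error."""
    return trump - biden, 2 * listed_moe


def compare_polls(poll_a, poll_b):
    """Return the gap between two polls' margins, the combined margin of
    error, and whether the gap is large enough to call the polls different."""
    margin_a, moe_a = margin_and_moe(*poll_a)
    margin_b, moe_b = margin_and_moe(*poll_b)
    gap = abs(margin_a - margin_b)
    combined_moe = moe_a + moe_b
    return gap, combined_moe, gap >= combined_moe


# Poll A: Trump 45, Biden 48, margin of error 3.
# Poll B: Trump 48, Biden 42, margin of error 4.
gap, combined_moe, different = compare_polls((45, 48, 3), (48, 42, 4))
print(gap, combined_moe, different)  # 9 14 False
```

A 9-point swing between two polls looks dramatic, but it is well inside the 14 points that sampling error alone could explain.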

Now you have learned some basic tools to critically analyze polls on your own to follow the races that matter to you. There are many more things you can do with polling, and the analysis I do is far more complicated than this. I’ll continue to write about polls and the election if you want to learn more.

Open Letter to College Students about Online Classes

Dear College students whose classes have been moved online,

Please have grace for your professors/instructors during this difficult transition to online classes. We probably got just as much notice as you students did. Some of us (like me) are students ourselves, wondering how this thing is going to work.

I teach an Intro to Statistics class of 51 students. Even before it was official, I started prepping for teaching online on Wednesday. I looked into software options for lectures, considered how to give exams online, and thought about how to deal with less class time. I am still working out how to do things. I anticipate that this will be more effort for me than teaching in person, but my pay will not change.

This next week or however many days your professors got to prepare will be spent working. This is not a second spring break for us. There are decisions to be made about office hours and homework and exams. Homework and exams to convert. Software to be installed. Papers to be scanned. Test lectures to be recorded. Plus for professors and grad students, our research isn’t going to get a break either.

I’m sorry that seniors attended their last class without knowing. I’m sorry your events and activities have been postponed. Our lives as instructors are also interrupted by the virus. Our conferences are being canceled. We are discouraged from holding large meetings in person. Social events for faculty and grad students have been delayed or canceled.

Overall, teaching takes much more time than the three hours each week spent in front of the class and office hours. I don’t think anybody has all the answers yet on how this will work and what the best thing is to do. We don’t know if Zoom and Blackboard are up to the challenge of much more usage. We don’t know what to do for students without reliable internet or computer access. Sure, online classes have existed in the past, but moving classes that have never been taught online to online takes a lot of work. But we are trying to do what is best for the communities we live in by shutting our physical classroom doors while also trying to do what is best for our students. I think everyone would prefer a world where there wasn’t a pandemic and life went on as normal, but that isn’t the world today. So let’s have grace for all the people in our lives as difficult decisions are made and things change.

New Hampshire Prediction Correction

The 95% margins of error were missing a decimal place, and the margins of error were not properly converted to variance estimates in the Iterative model, which changed the prediction and standard deviation of the iterative model. The corrected predictions are above.

GOP Primary Prediction: New Hampshire

I have decided to predict the GOP primary. Although I know Trump will almost certainly win, I want to evaluate my model. Many states have canceled primaries and polling data is highly limited which constrains what I can do.

My data source is FiveThirtyEight. I am also finding the margin of error for those polls so I can do my iterative model. My prior is based on the national polls. I will only predict states with three or more polls after December 1st. Texas may be the only other race that will meet that polling requirement. I only include polls after December 1st because I want to exclude the polls with Sanford, who has since dropped out. The candidates I am tracking are President Trump and Bill Weld. These are the only candidates currently in the race who appear in the polls. There are some other candidates on the ballot in some states that I am ignoring. Joe Walsh ran until last Friday and appeared in many polls. I am assuming that 75% of his supporters will vote for Weld. I believe that if you preferred a long-shot candidate over Trump who then dropped out, you are likely to support any candidate that is not Trump, but there is also a chance you will not vote.
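As a sketch of the Walsh adjustment in Python: the poll numbers below are invented, and treating the remaining 25% of Walsh supporters as simply dropping out (without rescaling the other shares) is an assumption made for this example.

```python
def reallocate_dropout(shares, dropout, recipient, transfer_rate=0.75):
    """Move transfer_rate of a dropped-out candidate's support to the recipient.

    Assumption for this sketch: the rest of that support is dropped
    (treated as not voting) and the other shares are not rescaled.
    """
    updated = dict(shares)
    freed = updated.pop(dropout)
    updated[recipient] = updated.get(recipient, 0.0) + transfer_rate * freed
    return updated


# Hypothetical poll (percentages), not a real New Hampshire poll.
poll = {"Trump": 86.0, "Weld": 6.0, "Walsh": 4.0}
adjusted = reallocate_dropout(poll, "Walsh", "Weld")
print(adjusted)  # {'Trump': 86.0, 'Weld': 9.0}
```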

I am using the Gaussian Iterative and Gaussian Polls models from my JSM Proceedings paper. The Gaussian Iterative model is my preferred method, but I had to drop two polls because I could not find their margin of error, which is important to the iterative model.
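To show why the margin of error matters for this kind of model, here is a generic Python sketch of a Gaussian update applied one poll at a time. It is a standard normal-normal update, not necessarily the exact model in my paper, and the prior and poll numbers are made up. The one piece that carries over directly is that a 95% margin of error gets converted to a variance by dividing by 1.96 and squaring.

```python
def moe_to_variance(moe_95, z=1.96):
    """Convert a 95% margin of error into a variance."""
    return (moe_95 / z) ** 2


def gaussian_update(prior_mean, prior_var, poll_value, poll_var):
    """Combine a Gaussian prior with one Gaussian poll (precision weighting)."""
    post_var = 1 / (1 / prior_var + 1 / poll_var)
    post_mean = post_var * (prior_mean / prior_var + poll_value / poll_var)
    return post_mean, post_var


# Hypothetical prior and polls: (Trump share, 95% margin of error).
mean, var = 0.85, 0.05 ** 2
for share, moe in [(0.88, 0.04), (0.90, 0.05), (0.86, 0.03)]:
    mean, var = gaussian_update(mean, var, share, moe_to_variance(moe))

print(round(mean, 3), round(var ** 0.5, 3))
```

A poll with a smaller margin of error gets a smaller variance and therefore more weight, which is why a poll with no reported margin of error can’t be used in this kind of update.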

Predictions: Trump will win all the delegates because Weld will likely not surpass the 15% delegate requirement. Below is a table with my predictions. The last column is the approximate margin of error of this model. Since my model is Bayesian, I am predicting a 95% chance that the result is within that margin.