Thursday, October 30, 2008

National and State-Level Factors in US Presidential Election Outcomes: An Electoral College Forecast Model

The following is an electoral college forecasting model that grew out of a paper Paul-Henri Gurian and Damon Cann first presented at the Western Political Science Association meeting in San Diego, CA this past March. The value of that paper lay in showing that the variation in two-party vote shares over the last 15 presidential elections (1948-2004) can be explained by a combination of national and state-level factors, the latter separated into long- and short-term influences. It is a natural extension, then, to use the data from those 15 elections to project the 2008 electoral college outcome.

What follows is a brief summary of the model and a discussion of some of the issues both Paul and Damon see in it. As Paul said, "The forecasts in the paper are really preliminary. However, if we wait a few months, till we've re-specified the model, it won't be a forecast anymore." Questions, comments and concerns can be left in the comments section. I will forward them to Paul and Damon.

There is no shortage of presidential election forecasting models, academic or otherwise. In 2008, there are at least 15 political science forecasts, the average of which shows Obama winning approximately 52% of the two-party vote. Most rely on some combination of economic factors, presidential approval and/or incumbency to explain vote shares in presidential elections. Those factors are completely national in scope, and what is lost in the process are many of the relevant state-level variables that could play a role in determining the electoral outcome. To be sure, there are also forecasting models that include state-level factors, but what Paul Gurian and Damon Cann have done is to draw a distinction between the long- and short-term state-level influences. [You can view their forecasting paper here.] In much the same way that the past polls in FHQ's weighted averages serve as an anchor to the short-term fluctuations in state polling, the long-term factors included in this forecasting model serve as a baseline of sorts for their forecast.

Those same national factors, then, are included, but are buttressed by short-term, state-level impacts (state primary divisiveness, home state, home region, etc.) as well as some of the more historical, state-level influences (state partisanship and ideology, etc.) that play a role in explaining the variation in the shares of the two-party vote. [A more thorough description of the state-level factors can be found on p. 6-7 in the paper linked above.]

The beauty of this is that you get 51 different forecasts, not just one on the national level. And that is certainly more suitable to the electoral college system. Based on the included variables over the last fifteen presidential elections, a projection of the two-party vote in each state can be made. The results can be found on p. 10-11, but a map of those results is included below. [No, I can't help myself. I have to include a map.]

The result is a rather close outcome between John McCain and Barack Obama. The line between a solid and a toss-up state is whether a state's division of the two-party vote is within the margin of error. You'll no doubt notice that there are several states that are on opposite sides of where they may be expected given other forecasts and projections. Iowa, New Hampshire and New Mexico, for example, are shaded in red while Arkansas, North Dakota and West Virginia appear in Obama's column.
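For anyone who wants to see that solid/toss-up rule spelled out, here is a minimal sketch in Python. It is not Paul and Damon's code. The function name, the 3-point margin of error and the vote shares are all made up for illustration, and it only covers the solid/toss-up boundary described above.

```python
def classify(dem_share: float, margin_of_error: float) -> str:
    """Label a state from its projected Democratic two-party share (percent).

    A state is a toss-up when the projection sits within the margin of
    error of an even 50-50 split; otherwise it is solid for the leader.
    """
    lead = dem_share - 50.0
    if abs(lead) <= margin_of_error:
        return "toss-up"
    return "solid Obama" if lead > 0 else "solid McCain"


# Hypothetical projections and a hypothetical 3-point margin of error.
for share in (51.0, 44.0, 56.5):
    print(share, classify(share, margin_of_error=3.0))
```

The wider the margin of error, the more states fall into the toss-up bin, which is worth keeping in mind when reading the map.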

Here are some caveats that Damon adds:
A few thoughts on the states:

NV: I think the "home region" variable swings the prediction for NV
toward McCain more than has actually happened in this instance.
Without that, McCain would still be in the margin of error.

AR and ND both had strong Democratic showings in House and Senate
races in 2006/2004, probably stronger than past history would suggest
for those states. Plus AR has the Democratic history from the "old
south" and our fixed effects may be picking that up a bit with the
1948-1970s elections.

I think NH is just a matter of history--while they went for Clinton in
'92 and '96, prior to that they only went to a Democrat once, Johnson
in '64. While NH has been battleground recently, our fixed effect
(based on all elections in the sample) moves NH just outside the
margin of error.

FL is probably similar. Like NH, most of the variables for 2008
suggest it ought to be perhaps R leaning but still battleground.
However, the fixed-effect for FL slides it about 2 points closer to
McCain.

I re-ran the model dropping the fixed effects, but that decreases the
general predictive power of the model by about 10%, seemingly
generating more error than it would eliminate.

Also, thinking about this statistically, since our margin of error is
based on the 95% level of confidence and we're making 50 forecasts, we
should actually expect to see 2.5 (OK, let's call that 2-3) of our
predictions that are significantly different from 50% by sampling
error alone. But since these errors are random, they should cancel
each other out in the EC tally (as long as it's not CA that is one
error and WY as the other).

I finally re-ran the model using national fatalities per 100,000
rather than state-level fatalities. The coefficient still comes out
insignificant statistically.
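
Damon's point about making 50 forecasts at the 95% level of confidence can be checked with a little arithmetic. The sketch below is mine, not his. It assumes the extreme case in which every state is truly a 50-50 state and the 50 forecasts are independent of one another, which real state forecasts are not, and the function names are hypothetical.

```python
from math import comb


def expected_false_positives(n_forecasts: int, confidence: float) -> float:
    """Expected number of forecasts flagged as different from 50% by chance."""
    return n_forecasts * (1.0 - confidence)


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k chance flags out of n), assuming independent forecasts."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


n, confidence = 50, 0.95
print(round(expected_false_positives(n, confidence), 2))  # 2.5
print(round(prob_at_least(1, n, 1.0 - confidence), 3))
```

Under those assumptions the expected count is the 2.5 Damon mentions, and the chance of at least one such miss is better than nine in ten.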

I want to thank both Paul and Damon for sharing this and I hope that we can get a good discussion going that will generate some helpful feedback.


7 comments:

Unknown said...

It's interesting, and at some point I may take a look at the paper, but it does look strange. It's not just the states that "change sides," it's the states that are "solid" and don't seem like they should be:

Colorado, Virginia, Florida, Montana, and New Hampshire for instance.

There are also the pairs of states that seem odd:

How come Montana is solid McCain, but North Dakota is toss up Obama?

How come Virginia and Florida are solid McCain, but North Carolina is only lean?

The model is hooking on to some peculiar factors if it's doing that.

MSS said...

Seems like a model set up not to be able to cope with either a 'change' election or with other more recent trends that upset the longer-term patterns.

I'll stick with FHQ and 538, thank you very much.

Anonymous said...

One factor that is not included -- and one I perhaps should have described -- is the economic crisis. It isn't figured into this at all.

And the state partisanship variable -- one that is a part of their long-term state-level influences -- is one they are continuing to work on. More or less, the time frame being used is being reexamined.

Anonymous said...

Matthew,
When I finished putting the map together, that was my first thought as well. Again, I think much of it has to do with that state partisanship variable (operationalized as the average Democratic percentage in the most recent gubernatorial, Senate and House races). In the case of North Dakota and Arkansas, that meant an atypical number of Democrats. And I would wager that the House percentages may be swaying things in states like Virginia and Florida, where despite the Democratic wave in 2006, Republicans did alright because of gerrymandered districts.

Paul has mentioned that that variable and the one on national party divisiveness were the ones they were in the process of re-specifying.

MSS said...

Well, that's the general problem I have with multiple regression analysis, anyway: "respecify" things enough, and they are bound to look more "reasonable."

(This potentially plagues 538, where they frequently talk about "adjustments." Does it also plague this site? I'll refrain from comment!)

Anonymous said...

Matthew,
Respecification hasn't plagued FHQ's methodology, but we have tweaked things from time to time to make the averages a little more responsive to changes in polling. Responsive but not too responsive has been the mantra. I think we've got a pretty good balance now.

And I really like being on the conservative end of the spectrum. Now, that could mean that we just lag behind every other electoral college analysis, but I like couching it in terms of it really meaning something when our measure finds a change in categories.

...that a lasting change has occurred.

The fear is that we potentially miss out on states like Georgia or Arizona that make a last minute push.

MSS said...

Yes, the reasons you mention are why I prefer this site to 538. (Yet I really do like 538; Nate and Co. do a lot of very interesting things. I have been a BP subscriber almost from the start, so it was quite fascinating when 'Poblano' revealed his identity.)