Elections, Public Opinion, & Voting Behavior
 
Symposium on Mid-Term Elections

 

Generic Polls, Late Balancing Acts, and Midterm Outcomes: Lessons from History for 2002

Robert S. Erikson and Joseph Bafumi

Columbia University

Can the Democrats gain House seats in 2002?  Can they gain enough to wrest control of the House from the Republicans?  On the one hand, the presidential party almost always loses seats, although not in the most recent instance, 1998.  On the other hand, the president is rather popular in Fall 2002, despite a slumping economy.  How can one forecast with such competing signals?

Suppose we consult the polls for guidance. By this we mean the generic polls, which ask survey respondents which party they plan to vote for in the House elections. In 2002 there have been numerous generic House polls, and throughout the long campaign they have been extremely close. Some recent polls are shown in Table 1.

 Table 1.  Generic House Polls, Early October 2002. 

Poll                   Dates        Universe            Republican   Democrat
Newsweek               Oct. 10-11   712 Likely Voters   43%          46%
Marist Poll            Oct. 9-10    769 Reg. Voters     44%          39%
Fox News/Op. Dyn.      Oct. 8-9     900 Likely Voters   42%          40%
Wirthlin               Oct. 4-7     869 Reg. Voters     41%          43%
CNN/USA Today/Gallup   Oct. 3-6     606 Likely Voters   47%          48%
Pew                    Oct. 2-6     1158 Adults         44%          46%
CBS News               Oct. 3-5     304 Likely Voters   43%          46%
Poll of Polls                                           43%          44%

Source: Pollingreport.com

The polls suggest an extremely close election. But how good have generic polls been as augurs of past elections? Are they noisy, or have they generally predicted the vote well? Have they been vulnerable to last-minute trends? Have they been systematically biased toward one party? This short paper explores the forecasting potential of the generic polls based on the historical record, with an eye toward predicting 2002.

We have gathered the record of numerous generic House polls, going back as far as 1946, from Gallup and (more recently) many other polling houses, using the Roper Center, pollingreport.com, and Moore and Saad (1997) as sources. Some were conducted early in the election season, others toward the end. They variously report the vote intentions of prospective voters in samples of adults, registered voters, and “likely” voters.

For this analysis, we use generic polls conducted no more than thirty days before the election. For each of the 14 midterm elections from 1946 to 1998, we compute the percent Democratic of the major-party vote in polls conducted over the last thirty days of the campaign. We adjust for type of sample (adults, registered voters, or likely voters) by using our best estimate of the partisan differential due to sample type and then projecting what the results would be if the poll were a “likely voter” poll. In practice, each adult sample is adjusted to make the generic vote 3.83 percentage points more Republican than reported, and each registered voter sample is adjusted to make it 3.10 percentage points more Republican than reported.[1] The median number of polls over the last thirty days is four, and the median N (Democratic plus Republican vote intentions) is 1,945. We summarize the generic preferences as the percent Democratic of the two-party vote in the generic poll.[2]
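The two steps just described (two-party share, then sample-type adjustment) can be sketched in a few lines. This is only an illustration, not the authors' code: it assumes the 3.83 and 3.10 figures are subtracted directly from the Democratic two-party percentage, and the function and variable names are hypothetical. The inputs are three rows of Table 1.

```python
# Sample-type adjustments from the text, in points of the Democratic
# two-party share. ASSUMPTION: they are applied as a simple subtraction,
# which moves the poll in the Republican direction.
ADJUSTMENT = {"likely": 0.0, "registered": 3.10, "adult": 3.83}


def dem_two_party_share(rep, dem):
    """Percent Democratic of the Democratic-plus-Republican total."""
    return 100.0 * dem / (rep + dem)


def likely_voter_equivalent(rep, dem, sample_type):
    """Democratic two-party share projected onto a likely-voter basis."""
    return dem_two_party_share(rep, dem) - ADJUSTMENT[sample_type]


# Three rows of Table 1: (poll, sample type, Republican %, Democrat %).
polls = [
    ("Newsweek", "likely", 43, 46),
    ("Marist Poll", "registered", 44, 39),
    ("Pew", "adult", 44, 46),
]

adjusted = {name: likely_voter_equivalent(rep, dem, kind)
            for name, kind, rep, dem in polls}

for name, share in adjusted.items():
    print(f"{name}: {share:.2f}% Democratic (likely-voter basis)")
```

How the adjusted polls within the thirty-day window are then combined into one value per election is not spelled out in the text, so the sketch stops at the individual poll.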

As dependent variables, we use the actual major-party division measured as votes and also as seats, the ultimate target. To aid the assessment of possible bias, we measure each vote and seat variable (and its lagged value, when applicable) not on a zero-to-100 percentage scale but as a deviation in percentage points from an equal division of 50% Democratic and 50% Republican. We look only at midterms and ignore presidential years.

From the literature (Erikson and Sigelman, 1995; Moore and Saad, 1997), it is known that the answer to the question “How accurate are the generic polls?” must be highly nuanced. They perform poorly as point estimates. For instance, an 18-point Democratic lead in the polls would probably translate into a far smaller vote lead on election day. However, they perform well as predictors in regression equations predicting votes or seats. And here we will show that they perform extremely well in regression equations when other variables, most notably the presidential party, are also included.

Table 2 shows some regressions using as a predictor the generic polls over the last 30 days of the campaign in the 14 midterm elections, 1946-1998. Column 1 begins with a regression of the vote on the generic poll result, which accounts for 77% of the variance. Column 2 shows the same regression but with the percent of seats held by the Democrats as the outcome variable. This model performs about as well as the first, with 76% of the variance explained by the generic ballot poll. Can we do better?

Almost as much variance in the outcomes can be explained by another variable acting alone in the regression equation: the president’s party. On average, each party gains the most votes and seats when it does not control the presidency. A likely explanation is ideological balancing. With balancing, some segment of the electorate hedges its ideological bets (to get more moderate policies) by tilting against the presidential party at midterm (Erikson, 1988; Alesina and Rosenthal, 1995). Our question is whether generic polls absorb this balancing behavior, or whether balancing cognitions ignite mainly late in the campaign, between the last poll and election day.

The interesting answer is that balancing historically has worked in the out-party’s favor apart from the generic poll verdict. That is, by using the presidential party and the generic polls together, it is possible to raise the explained variance. Columns 3 and 4 of Table 2 show regressions for our two dependent variables when the presidential party variable is included. This new predictor is coded -1 in years when Republicans occupy the White House and 1 in years when the Democrats do. The explanatory power of these equations substantially outpaces that of the equations in columns 1 and 2. The new model explains 91% of the variance in the national vote. Including the president’s party also helps predict the percent of House seats held by the Democrats, allowing 80% of the variance to be explained.

It also helps to include the lagged dependent variable. Column 5 shows that including the lagged vote as a predictor improves the prediction of the vote slightly, with an adjusted R squared of .92. Column 6 shows that adding lagged seats improves the prediction of seats, with an adjusted R squared of .86.[3]

Votes and seats are all measured as deviations from 50-50. Thus, the intercept has special meaning as a measure of potential bias. Intercept estimates are small, tend to be positive, and are sometimes significant. This means that the likely voter polls carry a small Republican bias: if the generic polls are at 50-50, the Democrats should expect to win about 52 percent of the vote and perhaps a few percentage points more than that of the seats. (By comparison, surveys of registered voters or adult samples have no partisan bias.) The slopes for the vote equations are appreciably less than unity, indicating that the leading party’s margin in the polls compresses by election day.
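To make the reading of intercept and slope concrete, here is a minimal sketch that plugs two poll values into the column 1 equation of Table 2. It assumes an "18-point lead" means a 59-41 split of the two-party poll preference, and the function name is hypothetical.

```python
# Column 1 of Table 2: vote regressed on the generic poll result,
# both measured as deviations from 50%.
CONSTANT = 1.31
SLOPE = 0.63


def expected_dem_vote(poll_dem_share):
    """Expected Democratic % of the two-party vote, given the poll's Democratic %."""
    deviation = poll_dem_share - 50.0
    return 50.0 + CONSTANT + SLOPE * deviation


even_poll = expected_dem_vote(50.0)  # generic poll at 50-50
big_lead = expected_dem_vote(59.0)   # ASSUMPTION: 59-41 is the 18-point poll lead

print(f"Poll at 50-50 -> expected Democratic vote {even_poll:.2f}%")
print(f"Poll at 59-41 -> expected Democratic vote {big_lead:.2f}%, "
      f"a lead of {2 * (big_lead - 50.0):.2f} points")
```

Under these assumptions the intercept alone lifts an even poll to a little over 51 percent Democratic, while the slope shrinks an 18-point poll lead to roughly 14 points in the vote.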

We could show more. The presidential-party effect holds nicely if we use only the last poll of the campaign. Strong predictions are maintained if polls are measured as aggregates over a period longer than 30 days. It matters little whether we adjust for sample type or lump adult, registered, and likely voter surveys together. Based on regressions predicting the generic vote (not shown), the generic polls absorb party identification and are only mildly related to presidential approval. Neither party identification nor presidential approval nor the economy matters when generic poll preferences are in the equation. To the extent these variables matter, they are absorbed by the generic polls.

What are the implications for 2002? The generic polls have been close for the entire campaign, yielding a virtually even split of the two-party vote. In past elections the generic polls have been extremely constant throughout the campaign, as if the fundamentals of the election were decided very early. This history of stability suggests little further change in their values in 2002. The important exception is that the presidential-party effect historically kicks in late in the campaign, boosting the out party beyond its yield from the generic polls. Finally, likely voter polls (but not registered or adult polls) are mildly biased toward over-reporting Republican votes. If the generic ballot is 50-50, the expected vote is a slight Democratic tilt plus another mild tilt against the presidential party (of equal magnitude for Democrats and Republicans).

By this reasoning, the Democrats should be expected to win slightly more votes and seats than the Republicans, even if the generic vote in the polls persists at 50-50. Applying the equations in columns 5 and 6 of Table 2, a 50-50 generic vote would yield the Democrats about 53 percent of the vote and 54 percent of the seats. This possible outcome would be roughly the mirror image of the actual result in 1998, when the Democrats violated the midterm loss rule. The generic polls of 1998 showed virtually a dead heat, with a slight tilt to the Democrats; but the out party (the Republicans) won the most votes and seats by a slight margin, as the Democrats gained seats but not enough to regain Congress.
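The arithmetic behind this forecast can be sketched as follows, using the column 5 and column 6 coefficients from Table 2. The lagged vote and seat shares are not reported in the text, so the example sets them to 50% as a loudly labeled placeholder; the function and variable names are hypothetical.

```python
# Coefficients from Table 2; every variable is a deviation from 50%.
COLUMN_5_VOTE = {"poll": 0.47, "pres_party": -1.54, "lagged": 0.15, "constant": 1.28}
COLUMN_6_SEATS = {"poll": 0.89, "pres_party": -2.79, "lagged": 0.37, "constant": 2.30}


def forecast(coefs, poll_dem_share, pres_party, lagged_dem_share):
    """Predicted Democratic % (vote or seats) from one Table 2 equation.

    pres_party is 1 for a Democratic president and -1 for a Republican one.
    """
    deviation = (coefs["constant"]
                 + coefs["poll"] * (poll_dem_share - 50.0)
                 + coefs["pres_party"] * pres_party
                 + coefs["lagged"] * (lagged_dem_share - 50.0))
    return 50.0 + deviation


# 2002: generic poll at 50-50, Republican president (-1).
# PLACEHOLDER: lagged shares set to 50.0 because the text does not report them.
vote_2002 = forecast(COLUMN_5_VOTE, 50.0, -1, 50.0)
seats_2002 = forecast(COLUMN_6_SEATS, 50.0, -1, 50.0)

print(f"Democratic vote:  {vote_2002:.2f}%")
print(f"Democratic seats: {seats_2002:.2f}%")
```

With the placeholder lag, the vote forecast lands close to the 53 percent cited above, and the seat forecast comes out about a point above the cited 54 percent. Because the lagged-seats coefficient is positive, a previous Democratic seat share below 50% would pull the seat forecast down.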

Table 2.  Votes and Seats by Generic Poll Results, Party of President, and Lagged Dependent Variable,  Midterm Elections 1946-1998.

 

Column                              1         2         3         4         5         6
Dependent variable (% House Dem.)   Vote      Seats     Vote      Seats     Vote      Seats

Poll Results                        0.63      1.20      0.50      1.07      0.47      0.89
  (%D using adult polls in          (0.09)    (0.18)    (0.07)    (0.19)    (0.06)    (0.18)
  last 30 days)
Presidential Party                                      -1.41     -1.62     -1.54     -2.79
  (1=D, -1=R)                                           (0.33)    (0.94)    (0.32)    (0.93)
Lagged Dependent Variable                                                   0.15      0.37
                                                                            (0.10)    (0.16)
Constant                            1.31      4.46      1.58      4.75      1.28      2.30
                                    (0.50)    (1.01)    (0.32)    (0.95)    (0.36)    (1.30)

Adj. R2                             .77       .76       .91       .80       .92       .86
Root MSE                            1.72      3.45      1.10      3.20      1.03      2.68
N                                   14        14        14        14        14        14

All vote and seat variables are measured as deviations from 50%. Standard errors are in parentheses.

For illustration, Figure 1 plots the vote by the pooled 30-day poll results. Figure 2 plots seats by the same poll results. In each, two lines are drawn: one for the equation when Republicans occupy the White House and another for when Democrats do. First, the linear relationship between poll results and the vote or seats held is evident and strong. Second, midterm years when Republicans occupy the White House show boosts for the Democrats, compared to years when the Democrats hold the presidency. Most important, the pattern rarely fails. If one draws a “regression line” halfway between the lines for a Republican and a Democratic presidency, then in almost every case, with only marginal exceptions, the “out” party did better than the late generic poll projection that assumes a neutralized presidential party effect (i.e., as if the presidency could be halfway between Democratic and Republican). The out party surges beyond what the polls would suggest, perhaps as voters complete their “balancing.”

Of course, any forecast for 2002 is problematic, given the unique political conditions and the recent redistricting, which has favored incumbents and possibly Republican incumbents most of all. Still, the consistent historical pattern is a late current favoring the out party even beyond what the generic polls show. That should be good news for the Democrats.

 

Figure 1:   House Votes by Late-Campaign Generic Poll Preferences,  Midterms 1946-98.

Figure 2:   House Seats by Late-Campaign Generic Poll Preferences,  Midterms 1946-98.

References:

Abramowitz, Alan. 2002. “Who Will Win in November? Using the Generic Vote Question to Forecast the Outcome of the 2002 Midterm Election.” Internet posting, October 2002.

Alesina, Alberto and Howard Rosenthal. 1995. Partisan Politics, Divided Government, and the Economy. New York: Cambridge University Press.

Erikson, Robert S. 1988. “The Puzzle of Midterm Loss.” Journal of Politics 50: 1011-29.

Erikson, Robert S. and Lee Sigelman. 1995. “Poll-Based Forecasts of Midterm Congressional Elections: Do the Pollsters Get It Right?” Public Opinion Quarterly 59 (Winter): 589-605.

Moore, David and Lydia Saad. 1997. “The Generic Ballot in Midterm Congressional Elections: Its Accuracy and Relationship to House Seats.” Public Opinion Quarterly 61 (Winter): 603-614.


[1] For these estimates, adult polls, registered voter polls, and likely voter polls (within 30 days of the election) were pooled in one equation where the dependent variable is the generic vote and the independent variables are year and sample-type dummies.

[2] In instances where the same poll is reported for two sample types (e.g., among likely voters and registered voters separately), we used the more heavily screened sample (likely over registered over adult), even though this reduces sample size.

[3] This is similar to Alan Abramowitz’s (2002) report on the Internet. Abramowitz models seats as a consequence of the vote intention of likely voters in Gallup’s final generic poll plus (like us) the presidential party and lagged seats. The difference is that Abramowitz uses the last poll, whereas we use all polls of the last 30 days.
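As an illustration of the kind of equation described in note 1, the sketch below regresses a generic vote on year dummies and sample-type dummies using NumPy least squares. The numbers are synthetic and were constructed only to show the mechanics; they are not the authors' data, and the year labels, variable names, and choice of likely voters as the omitted category are all assumptions.

```python
import numpy as np

# SYNTHETIC toy data, built so that registered-voter polls run exactly
# 3 points and adult polls exactly 4 points more Democratic than
# likely-voter polls in the same (hypothetical) year.
# Each row: (year label, sample type, Democratic % of two-party poll).
rows = [
    ("year_1", "likely", 52.0), ("year_1", "registered", 55.0), ("year_1", "adult", 56.0),
    ("year_2", "likely", 47.0), ("year_2", "registered", 50.0), ("year_2", "adult", 51.0),
    ("year_3", "likely", 50.0), ("year_3", "registered", 53.0), ("year_3", "adult", 54.0),
]

years = sorted({year for year, _, _ in rows})


def design_row(year, sample_type):
    """One dummy per year, plus dummies for registered and adult samples.

    Likely-voter polls are the omitted sample-type category.
    """
    year_dummies = [1.0 if year == y else 0.0 for y in years]
    type_dummies = [1.0 if sample_type == "registered" else 0.0,
                    1.0 if sample_type == "adult" else 0.0]
    return year_dummies + type_dummies


X = np.array([design_row(year, kind) for year, kind, _ in rows])
y = np.array([share for _, _, share in rows])

# Ordinary least squares; the last two coefficients are the sample-type offsets.
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
registered_offset, adult_offset = coefs[-2], coefs[-1]

print(f"registered vs. likely: {registered_offset:+.2f} points Democratic")
print(f"adult vs. likely:      {adult_offset:+.2f} points Democratic")
```

The year dummies soak up each election's overall level, so the sample-type coefficients measure only the within-year gap between sample types. The analogous coefficients from the real polls are what produce the 3.10 and 3.83 adjustments reported in the text.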

 
Created November 1, 2000
Last updated: October 10, 2002