Possums Pollytics

Politics, elections and piffle plinking

The Headline Forecast – regression prediction model.

Posted by Possum Comitatus on November 16, 2007

I’ve finally built the model to forecast the ALP TPP result. This gets a little stats heavy, so I’ll walk through it as best I can for those who might find it hard going, and I’ll answer any questions you have in the comments.

What I used to build the forecast is the monthly average of Newspolls going back to 1996, when Howard was elected. The reason I use the monthly average is that it dampens a lot of the noise in the individual polls and gives us a time-consistent series of data that can be used for long-term analysis.
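As a rough illustration of the averaging step, here is a minimal sketch. The poll readings are made up and the function name is hypothetical; it only shows the idea of collapsing several polls in a month into one figure.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical poll readings as (year, month, ALP TPP). These numbers are made up.
polls = [
    (2007, 9, 55.0),
    (2007, 9, 57.0),
    (2007, 10, 54.0),
    (2007, 10, 56.0),
    (2007, 10, 58.0),
]


def monthly_average(readings):
    """Group poll readings by (year, month) and average each group."""
    by_month = defaultdict(list)
    for year, month, value in readings:
        by_month[(year, month)].append(value)
    return {key: mean(values) for key, values in sorted(by_month.items())}


monthly = monthly_average(polls)
print(monthly)
```

Each month ends up with a single value no matter how many polls were taken in it, which is what gives the evenly spaced series.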

I don’t trust the preference allocations for Newspoll, so I constructed my own from the preference distributions at each election and let the preference flows adapt over time between elections. The two party preferred vote from polls straight after an election uses nearly all of the preference distribution from the previous election; polls from halfway between elections use a preference allocation based on half the previous election and half the next election; and polls just before an election have nearly all of their preferences distributed as they were at that impending election. For the 2007 election preference distribution I’ve simply used the 2004 preferences (which may slightly underestimate the ALP TPP, but not by very much).
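One way to picture the sliding preference allocation is a straight linear blend between the two elections. That linear weighting is an assumption for illustration (the post only says the flows shift from the previous election towards the next one), and the function names and numbers below are hypothetical.

```python
def blended_flow(flow_prev, flow_next, months_since_prev, months_between):
    """Linearly blend the preference flow to the ALP between two elections.

    The weight on the next election grows from 0 (straight after the previous
    election) to 1 (at the next election). Linear weighting is an assumption.
    """
    weight_next = months_since_prev / months_between
    return (1 - weight_next) * flow_prev + weight_next * flow_next


def alp_tpp(alp_primary, other_primary, flow_to_alp):
    """ALP TPP = ALP primary plus the share of non-major primaries flowing to it."""
    return alp_primary + other_primary * flow_to_alp


# Made-up numbers: 60% of minor party preferences flowed to the ALP at the
# previous election and 70% at the next one, with 36 months between them.
flow_mid = blended_flow(0.60, 0.70, months_since_prev=18, months_between=36)
tpp_mid = alp_tpp(alp_primary=40.0, other_primary=15.0, flow_to_alp=flow_mid)
print(round(flow_mid, 4), round(tpp_mid, 2))
```

Halfway through the term the blend sits midway between the two flows, matching the "half the previous election and half the next election" description.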

The model itself is a regression model built specifically to forecast one month and only one month ahead.

The model is a little unorthodox because using polling data in a model is a little unorthodox to begin with, but the important thing here is that it works – even if it suffers a little econometric impurity in the process.

The variables I’ve used are split into two types.

Firstly, Dummy Variables – which are variables that have a value of either zero or one. When we regress the ALP vote against them, they let us measure how the level of the ALP TPP vote changed over specific periods of time that represent events. You can get the gist of how they play out here:

The Dummy variables I’ve used are:

Dummyhhmoon – which is a dummy variable representing the Howard “honeymoon period” in 1996 as well as the two months after every later election. It has a value of one for the first 12 months of Howard’s government as well as for the two months after every election other than 1996. At all other times its value is zero.

Dummylatham – which has a value of 1 for those months Latham was leader and a value of zero for all other periods.

Dummyrudd – which has a value of 1 for the months Rudd has been leader and a value of zero for all other periods.

Dummyworkchoices – which has a value of 1 from November 2005 onwards, when Workchoices was in Parliament and the union campaign against it revved up, and a value of zero before then.

DummyElection – which has a value of 1 for the month an election is on and a value of zero at all other periods. I use this as an interactive dummy variable so I can emulate special election campaign effects with long term satisfaction rating changes.
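To make the dummy definitions concrete, here is a minimal sketch that builds them for a given month. The exact month boundaries (for example, treating December 2003 to January 2005 as the Latham months and the election month itself as the first honeymoon month in 1996) are assumptions for illustration, and the helper names are hypothetical.

```python
# Federal election months as (year, month).
ELECTIONS = [(1996, 3), (1998, 10), (2001, 11), (2004, 10), (2007, 11)]


def month_index(year, month):
    """Convert a (year, month) pair into a running month count."""
    return year * 12 + (month - 1)


def dummies(ym):
    """Return the five 0/1 dummy variables for the month ym = (year, month)."""
    idx = month_index(*ym)
    months_after = [idx - month_index(*e) for e in ELECTIONS]
    # First 12 months of the Howard government, plus the two months after
    # every later election (boundary choices here are assumptions).
    honeymoon = (0 <= months_after[0] < 12) or any(
        1 <= m <= 2 for m in months_after[1:]
    )
    latham = month_index(2003, 12) <= idx <= month_index(2005, 1)
    return {
        "dummyhhmoon": int(honeymoon),
        "dummylatham": int(latham),
        "dummyrudd": int(idx >= month_index(2006, 12)),
        "dummyworkchoices": int(idx >= month_index(2005, 11)),
        "dummyelection": int(ym in ELECTIONS),
    }


print(dummies((2004, 10)))
print(dummies((2007, 11)))
```

October 2004 comes out as a Latham month and an election month, while November 2007 switches on the Rudd, Workchoices and election dummies together.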

Secondly, the other type of variables I use in the model are:

ACNALPTPP(-1) – which is the previous month’s value of the ACNielsen two-party preferred vote for the ALP. By using ACN, I can effectively anchor the forecast to the less volatile ACN series, while still using the Newspoll estimates and its qualitative data in a consistent way without running into too many “house effect” issues that may be occurring in the Newspoll weighting.

PMDISAT(-1) – which is the previous month’s average of the Prime Minister’s dissatisfaction rating using Newspoll data.

OPPRIMARY(-1) – which is the Opposition’s primary vote in the previous month using Newspoll data. This lets the forecast ALP TPP vote adapt to the size of the ALP primary vote.
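The (-1) notation just means "the value one month earlier". A minimal sketch of that lagging step, with a hypothetical helper name and made-up numbers:

```python
def lag(series, k):
    """Shift a monthly series back by k months, padding the start with None."""
    if k <= 0:
        return list(series)
    return [None] * k + list(series[:-k])


# Made-up monthly values for illustration only.
opprimary = [44.0, 46.0, 47.0, 48.0]
opprimary_lag1 = lag(opprimary, 1)
print(opprimary_lag1)
```

The first month has no lagged value, which is why a model with a 12 month lag loses its first year of data.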

Then these two, which are probably the two most important variables in the model and fill very specific roles.

(PMDISAT(-1)-PMDISAT(-12))*DUMMYELECTION

What this represents is the difference between last month’s PM dissatisfaction rating and the PM dissatisfaction rating of 12 months ago, but it is only modelled during the month of an election.

So what it effectively does is modify the forecast of the model only in months that an election is on, and does so on the basis of the size of the long term change in the PM’s dissatisfaction rating.

Similarly, our other complicated variable is:

(OPSAT(-1)-OPSAT(-3))*DUMMYELECTION

What this represents is the recent medium term change in the Opposition leader’s satisfaction rating. It’s the difference between last month’s satisfaction rating and the satisfaction rating of 3 months ago – but is only modelled during the month of an election.

What it effectively does is modify the forecast of the model only in months that an election is on, and does so on the basis of the size of the medium term change in the Opposition leader’s satisfaction rating.

What these two variables do is simulate the process of voters coming to a conclusion about who they will vote for in the month of the election, using long term changes in satisfaction and dissatisfaction with each party and its leader. It allows for “it’s time” factors and “he has certainly improved over the last year” and “he’s getting worse as time goes on” and “he wasn’t what I thought he was like” type factors to be accounted for in terms of the way they influence voter movement in an election, but through an error correction type mechanism.
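The two interaction terms can be sketched like this. The series values are made up and the function name is hypothetical; the point is only that the change in the rating is multiplied by the election dummy, so it drops out entirely in non-election months.

```python
def election_interaction(series, t, short_lag, long_lag, election_dummy):
    """Compute (series[t - short_lag] - series[t - long_lag]) * election_dummy."""
    if t - long_lag < 0:
        raise ValueError("not enough history for the longer lag")
    change = series[t - short_lag] - series[t - long_lag]
    return change * election_dummy


# Made-up monthly ratings, index 0 = oldest month, index 12 = current month.
pmdisat = [40.0 + i for i in range(13)]
opsat = [50.0] * 10 + [52.0, 55.0, 56.0]
t = 12

# (PMDISAT(-1) - PMDISAT(-12)) * DUMMYELECTION in an election month.
pm_term_election = election_interaction(pmdisat, t, 1, 12, election_dummy=1)
# (OPSAT(-1) - OPSAT(-3)) * DUMMYELECTION in an election month.
op_term_election = election_interaction(opsat, t, 1, 3, election_dummy=1)
# The same PM term in an ordinary month contributes nothing.
pm_term_ordinary = election_interaction(pmdisat, t, 1, 12, election_dummy=0)

print(pm_term_election, op_term_election, pm_term_ordinary)
```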

So the Election Forecasting Model is:

forecastequation2.jpg

And we’ll use ordinary least squares regression to do the number crunching which turns out as:

forecastoutput11.jpg

What is important here is that all of these variables are statistically significant. This model explains about 76% of the variation in the Newspoll estimate of the ALP TPP vote since 1997, but it’s built with the aim of being more accurate for the election date than it is at other times via those two long-looking variables.
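For readers who have not met ordinary least squares before, here is a minimal self-contained sketch of the number crunching. It is not the actual model or data behind the forecast: the data are synthetic and noiseless (generated from known coefficients so the fit can be checked), there are only two regressors, and the helper names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(rows, y, beta):
    """Share of the variation in y explained by the fitted values."""
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Synthetic data: y = 10 + 0.8 * lagged_tpp + 2 * dummy, with no noise.
lagged_tpp = [48.0, 50.0, 52.0, 49.0, 51.0, 53.0]
dummy = [0, 0, 0, 1, 1, 1]
rows = [[1.0, x, d] for x, d in zip(lagged_tpp, dummy)]  # leading 1 = intercept
y = [10 + 0.8 * x + 2 * d for x, d in zip(lagged_tpp, dummy)]

beta = ols(rows, y)
fit = r_squared(rows, y, beta)
print([round(b, 4) for b in beta], round(fit, 4))
```

With real polling data the fit would not be perfect, and the R-squared would fall below 1.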

Onto more of the forecast stats:

forecastalptpp31.jpg

That’s mainly for the stats people; it shows the model does its forecasting job extremely well, with very little overall error.

The forecasts this model produces don’t exhibit a lot of the polling overshoot that Newspoll experiences when a new leader comes along, or when a Tampa or S11 shocks the system. But it still tracks the changes in the TPP vote for the ALP, as we can see with the following graphic:

forecastgraph11.jpg

The blue line represents the forecasts the model produces for each period, whereas the red line shows the actual Newspoll TPP vote for the ALP that the model is attempting to predict. The model misses the troughs and peaks of most of the big volatile movements in the ALP vote because the underlying dynamics of satisfaction ratings, primary vote level and, importantly, the slow moving ACNielsen in the previous month don’t support the overactive Newspoll during these periods.

So how good is the model using previous elections?

In 1998 the model predicted an ALP TPP of 50.82 whereas the actual result was 50.91.

In 2001 the model predicted an ALP TPP of 49.15 whereas the actual result was 49.07.

In 2004 the model predicted an ALP TPP of 47.23 whereas the actual result was 47.20.
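Putting those three comparisons side by side, the misses can be tallied with a few lines of arithmetic. The figures are the ones quoted above; the variable names are just for illustration.

```python
# (model prediction, actual result) for the ALP TPP at each election, as quoted above.
backtests = {
    1998: (50.82, 50.91),
    2001: (49.15, 49.07),
    2004: (47.23, 47.20),
}

# Signed error = prediction minus actual, rounded to two decimals.
errors = {
    year: round(pred - actual, 2) for year, (pred, actual) in backtests.items()
}
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)

print(errors)
print(round(mean_abs_error, 3))
```

The misses are 0.09, 0.08 and 0.03 points in size, an average of under a tenth of a point.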

It’s actually more accurate at election times than ordinary periods because those two little complicated variables were arrived at to simulate the processes involved in voters coming to a decision in the campaign. 2001 and 2004 were very different elections with movements going in opposite directions during the campaign period, but the model estimated both results fairly accurately by any measure.

Rather than building error correction components into the model for every period, I took the approach that it only needed to be done for the specific periods when elections occur. It’s probably also worth mentioning that the model predicts each month’s vote based on last month’s figures.

So what is the forecast for the election?

An ALP two party preferred result of 55.15%

How that will split between the States will come (hopefully) by Monday.

91 Responses to “The Headline Forecast – regression prediction model.”

  1. kwoff.com said

    The Headline Forecast – regression prediction model. « Possums Pollytics

    I’ve finally built the model to forecast the ALP TPP result. This gets a little stats heavy, so I’ll try to walk those folks through it that might find it hard going as best I can, but I’ll answer any questions you have to the best of my ability….

  2. John V K said

    Holymoly what kind of fit is that.

    I knew I should have gone to maths instead of hanging out at the refec, drinking beer and checking out other curves.

    Hope you’ve padlocked it with a patent of some sort and a paper for peer.

  3. Don said

    That gives 94 to Labor, 54 to the Coalition, 2 independents on Antony’s calculator.

    Roll on!

  4. Rod said

    Gee, I wish that model of HMS Victory I made when I was 12 (about 40 years ago – well, alright, a bit more) had been half as accurate as that one, Poss.

    Beautifully done. I hope, think, (and expect) that you are pretty well right.

    Cheers

    Rod

  5. Tom said

    Very close fit! Pity MathsIS was in like 1968 :)

  6. psephoblog said

    Very elegant, Possum. Hard to see how you could be wrong. If we accept today’s ACN at 54% as accurate, and then allow for the Coalition’s self-inflicted disasters since that poll was taken, 55% seems to be the most likely outcome.

  7. Geoff Lambert said

    Very impressive, I liked the Latham variable- The Great Repellor.

    However, if you do a simple linear regression of just the pooled poll results for the campaign period, you get the same TPP at election day (55.2%), but with an R-sq of 0.97.

    I fret about all these projections, including my own. What happens if a bone-crushing handshake ensues and the arse drops out of everything? Can’t you factor in a random disaster variable, Possum?

  8. Possum Comitatus said

    Why thank you all, this is about Model Mk 85. It only dawned on me this morning that I was approaching the error correction issues from the wrong end. As is the way with these things, the model I had was becoming ridiculously complicated and unusable – then it dawned on me and this, which is a hell of a lot simpler, was born.

  9. Possum Comitatus said

    Thanks Geoff,

    As you know, I’m not a great believer in straight lines (not that there’s anything wrong with them, it’s just the way my brain works), so it’s interesting that we both came to almost identical conclusions using completely different methodologies.

    Random disaster variable – I need one of them when my better half cooks :mrgreen:

  10. Mercurius said

    Ummmmm, 42?

  11. flute said

    Bollocks Possum.

    From finance markets neural nets to day trader twat moving averages, it is very easy to map some simple formulas with tweak factors to fit data in the past.

    The result won’t be far off what you say, depending on what your margin of victory will be, but it will be bugger all to do with the predictive power of your stats.

    When I looked in the porcelain this morning, it spelt a 5 and a 6. I’ll interpret those foul incarnations on November 25th.

  12. Country Kid said

    Just this morning I plumbed on 94 while trying to reassure my neighbour about Labor’s chances.

  13. John V K said

    I won’t argue the Maths.

    It relies on the smoothing of two fairly accurate poll samples over time and leaders. People very rarely vote for teams.

    But where the real test will be, will be after the election if the two polling systems keep their poll sampling methods similar.

    People vote on feelings and leaders. Polls only reflect feelings on leaders and how they feel they are doing.

    I’m not defending it or attacking it. But if the track is that good in the past something is working.

  14. Amaranthus said

    Poss, impressive fit, no doubt. But did you consider trying a few different GLMs of varying complexity and then evaluating Kullback-Leibler distance via AIC or k-fold cross-validation where k=1? Essentially, I suggest you try model selection based on bias-adjusted likelihood rather than frequentist p-values. I dislike the phrase “seeking the most parsimonious model”, on the basis that parsimony is simply a property reinforced by the bias correction in AIC rather than an explicit objective of AIC, but it amounts to basically the same thing.

    In a nutshell, is your model over-parameterised and thus are you risking fitting noise? It is hard to judge based on what you’ve presented. Alternatively, if you cross-validate by splitting your data set at (say) end-2000, how does it go at independently predicting the rest of the time series (e.g. the 2001 and 2004 results)? Or split at end-2003 and just predict to 2004 (and 2007). This would be a nice evaluation of true predictive verification rather than goodness of fit, which is what your r2 values and “forecasts” are currently showing.

  15. Harmless Cud Chewer said

    Possum, remember my comment about the standard model of particle physics, where they keep tossing in independent variables?

    I’m also tempted to repeat something I heard in a mathematical modeling lecture once when I was actually in attendance. Someone famous (at least in mathematics) being quoted as saying “give me 20 independent variables, and I can model an elephant.”

    I do hope your approach does have some ‘physical basis’.

    Anyhow here’s to your legend status :)

  16. Enemy Combatant said

    My compliments to the chef.

  17. The Finnigans said

    Poss,

    Don’t know much about history,
    Don’t know much biology,
    Don’t know much about a science book,
    Don’t know much about the statistical probabilities that I took,
    But I do know that I like: “An ALP two party preferred result of 55.15%”,
    What a wonderful world this would be.
    TQ.

  18. Amaranthus said

    @Harmless Cud Chewer

    …and give me 22 and I’ll make its tail waggle!

    Yep, famous quote amongst modellers and gets to the heart of my point above. See: http://users.fmg.uva.nl/ewagenmakers/2003/elephant.pdf

  19. Bruce said

    Hi Poss,

    Do you have an MOE (95% CIs) for your prediction? Overparameterised models will often give very wide CIs.

  20. Lomandra said

    Couldn’t have put it better, The Finnigans.

    If Possum’s right, we’ll have to have him crowned First Secular Saint of the New Republic. :)

  21. Alan H said

    Poss, some time ago, on PB, I just published a weighted average (weighted by sample size) of all polls by the four pollsters of all published polls this year up to that time. By accident it came out at 55.2, hence my prediction of 94 seats, in every tipping competition I have entered. Meaningless, but interesting, nonetheless.

    cheers,

    Alan H

  22. Well that is the most impressive looking calculations I have seen this election (and i have seen a lot)

    Are you saying you are working on State by State at the moment?

    That would be worth seeing.

  23. Charles said

    Unfortunately any curve can be predicted using the previous data if you know what you are predicting. You only have one unknown point, the next one, and we all know about curve prediction using overfitted data.

    It looks nice

    I think it’s going to be 55 because that is what it has been for about 8 months ( if you take away the polling error). It’s the stability that makes the prediction simple. It’s like the weather man predicting it will be sunny in the middle of the Simpson on election day.

  24. Possum Comitatus said

    Amaranthus:

    Rule number 1: Build your model specifically to fit your forecast horizon
    Rule number 2: See rule number 1

    In a navel gazing world, we could wax lyrical about the theoretical wonders of Bayesian mathematics and all the squidgy softness it contains till the cows come home, but the point here isn’t to theorise about a unified model of psephology; it is to produce a one step ahead, one month ahead forecast based on the way we know humans behave.

    With polling data, overparameterization stands out like dogs balls when you do it – polling data doesn’t give you nice cuddly variables where you can sponge some statistical significance off the noise. It gives you great fat and useless p-values when you try it because the human behaviour underlying the polling data is deliberate, and focused.

    You are essentially modelling people’s decision making.

    I can’t split the data set and forecast ahead in *years*; it’s a model specifically designed to produce a forecast for 1 single month.

    The next month.

    See rule number 1… or 2 ;-)

    Parsimony is one of those funny things that mean a lot in the lab (so to speak) but very little outside of it. It’s essentially a question of “how much of what you know affects a given thing are you willing to sacrifice just to get a theoretically pretty model?”

    As long as your significance is very tight with your variables, as long as the underlying explanations of WHY a variable influences your series are well founded (and in this case, historically accurate) and as long as you understand what you are doing, what you’re measuring, the horizon you are forecasting through and why you are measuring it a given way – parsimony quickly becomes a bit of a cliche (cue howls of derision coming from .edu addresses across the land :mrgreen: )

    You give someone 20 variables and they will be able to model an elephant?

    If you give them 20 variables that don’t contain time (or polynomials thereof), not much of it will be statistically worth its weight in exhaust fumes, let alone the elephant.

    I’m a frequentist through and through with this type of stuff – because it works. Human behaviour has all been done before… in the eternal words of the great mathematical philosopher Mr Jon Bon Jovi – “it’s all the same, only the names will change”. ;-)

  25. Possum Comitatus said

    All roads seem to be leading to Rome Alan. I got the same result as you and Geoff, all of us doing different things.

  26. Amaranthus said

    Actually, re-reading what you’ve done in detail, I see you haven’t actually used Newspoll TPP as an independent variable, just ALP primary and satisfaction ratings. There will be a strong correlation between those however, so I still maintain that I’d like to see some parameterizations based on part of the series and then projections to an independent part (e.g. fit to month before 2004 election and then predict election result – don’t fit to all data and then see the forecast for these given events).

    Or is that indeed what you are reporting above when you give the “So how good is the model using previous elections?” results?

  27. Possum Comitatus said

    Charles – if we just used the usual variables to predict the next month, it looks fine on paper, but it fails dismally with previous elections when you have to account for the 10% or so of voters moving en masse.

    To give an example, if I removed those last two variables (that are essentially a specific error correction process for the election campaign), the last 3 election results would have been 2 and 3 points different from the basic forecast.

    So in this case it’s not really a matter of plugging in as much as you can to get a good fit (if you try that with polling data, or with nearly any data where you don’t use time as a variable, your variables become statistically insignificant and irrelevant anyway), nor is it even a matter of plugging in the usual suspects – it’s about accounting for that last 10% of voters that operate under a different set of behaviour in the campaign to give accurate results.

  28. rossco said

    This is the simpler version of the model? Thank heavens you didn’t give us the complex version.
    Any way, I like the result. Hope it stands the test on the 24th

  29. Beach Ball said

    Mercurius @ 10 – very well said. However, given the Rodent’s demeanour of late, they haven’t read the cover of the intergalactic forerunner of Wikipedia.

  30. Amaranthus said

    Heh, I hail from one of those .edu addresses! I agree wholeheartedly with your statement about having to rationalise your predictors.

    However, you say: “Parsimony is one of those funny things that mean a lot in the lab (so to speak) but very little outside of it. It’s essentially a question of “how much of what you know affects a given thing are you willing to sacrifice just to get a theoretically pretty model?””

    I think this is a bit flippant :) First, if you re-read what I wrote above you’ll note that I said I dislike the term parsimony. The real point of model selection using AIC or CV (or ideally, multi-model inference) is to provide the optimum predictive model given limited data. The “true model” (that which is fruitlessly sought [theoretically] using BIC, called Schwarz IC by your stats package) is almost always incredibly complex in semi-stochastic systems, but involves a suite of dominant and tapering effects. AIC basically says: “given this limited data set, how many main effects and how many tapering effects can I afford to parameterise before the variance of the coefficients overwhelms the gain in bias reduction?”. It’s the trade-off between overfitting and underfitting that matters for prediction.

    I work in ecological modelling, and so the true model (reality, for want of a better term) is always elusive because it is always extremely complex. But that may not matter for predictive purposes, if the majority of the variance can be captured by a few main effects – we don’t always aim to get a theoretically pretty model, but rather a functionally predictive one that can be parameterized on the basis of finite data. I can send you a paper I wrote on this recently if you want :P

  31. 2 tanners said

    For the more statistically ignorant of us, is this best summarised as (a) this is what polls showed the last few times (b) this is what happened (c) hopefully, through dummy variables, this is what has changed and how, leading to (d) Labor to win?

    If not, please tell me where I’ve got this wrong. OTOH, if so, what is your margin of confidence expressed as (a) percentage; and (b) amount of cold hard cash you are willing to bet on it?

  32. True Believer said

    How has western civilization existed so long without this degree of open analysis and experimentation. Isn’t it time we started to ask why the “owners” of newspoll can’t conduct (or contract) this level of analysis?? On another note…Possum…do we need to book you into psephologist detox?? What will you do with yourself after the 24th???

  33. Possum Comitatus said

    Amaranthus,

    Noooooooooooooooooooooooooooooooooooooooooooooooooo! For the love of God, no paper! :mrgreen:

    Sorry – I thought I was going to get bashed by a parsimony fundamentalist (it’s happened before)

    I know what you’re saying – I’m not a model monkey that pumps the r-sq, the model was actually refined on the basis of minimising my Akaike and my Schwarz (this is all starting to sound a little rude!).

    With polling data, I’ve found it’s best to first remove the bulk of the big issues (dummy variables do the job for that – things like major policy fallout, honeymoon periods etc), then look at why people (in the case of the ALP) vote for them because of them and how many do this, then look at how many vote for them because of their dislike of the other guy.

    Once you’ve got that mess sorted out, then it’s a question of refining the underlying residual, unexplained human behaviour of how the remainder react on election day.

    You work in eco-modelling?

    When you guys get quantum computing, you won’t know what to do with yourselves ;-)

  34. Possum Comitatus said

    On the error margins folks, they don’t work like we’re used to with polling data.

    If we took the theoretical uncertainty and applied it as a confidence interval – it would be about 15 percentage points wide on either side because every piece of polling data we have contains its own uncertainty and when we start compounding it together it blows out dramatically.

    On the graph where the blue point forecast is bounded by the dotted red lines – those red lines serve as practical forecast errors.

    But this was actually designed to be more accurate for elections than outside of elections (that’s what those last two variables do – they effectively model how the undecideds swing and how history plays out for the soft vote) – and on that score you can’t really give an interval as such, just the comparisons of how its predictions compared to the actual results.

    If we had a couple of hundred years of data, we’d be able to give really good quality intervals for these things.

  35. Rod said

    “That’s right Lisa, Daddy’s a teacher”.

    All this maths talk is making my brain hurt. ;)

    55.15?? Isn’t that a record for Labor?

  36. Possum Comitatus said

    2 Tanners, basically this is what the polls have shown for the last 10 years, and what the last 3 elections have shown, and how undecided and soft voters have reacted in the last 3 elections (even taking into account the fact that they moved different ways in each of the last 3 elections).

    True Believer, They’re two very good questions. Dunno about Newspoll, and have absolutely no idea about what I’m going to do!

    Thinking about starting up a twelve step psephologist’s anonymous program ;-)

  37. George said

    Poss, can you settle something for me, which I’ve never looked into myself. Again tonight on LL, there’s this “25% of voters don’t make their mind up until the last week, and 10% on polling day”.

    1. is this the case? Are the above numbers correct?
    2. if yes, does this have any major impact on the result?

  38. SIEV XI said

    Hurt my head a bit, that did, but the pretty pics won me over. Seriously, that’s some impressive crunching. How the hell, in plain English please, do you keep track of the relationship between so many variables without being hoist on a matrix of your own design? Does the formula do the sheep herding for you? (Sheep being the collective noun for a bunch of unruly variables).

  39. SirEggo said

    Possum

    Love your work mate, but I’ll be honest and say that I didn’t understand a single word of it until “Labor estimate 55.15%”

    That made me want to get down

    And boogie

    People who do analyses like this expose some of the crap that comes up in opinion polls

    Good onya!

  40. True Believer said

    Amaranthus – your post elegantly shows the difference between communicating with the <1% of people who have done uni-level stats and who read journal articles on the subject and the rest. Hint – it’s the latter who get on with things. I hereby nominate possum for a PhD in Mathematics (Statistics(Applied(Psephology))).

  41. Grant said

    Ecological modellers represent! I have to say, Possum, I have been sending links to your modelling exercises around to my ecological modelling colleagues as well in recent weeks. We all start by criticizing your old-school p-values and wish you’d do some multi-model averaging…but eventually we all just kick back and appreciate your fit. Hot stuff.

  42. Charles said

    My real job plaything is neural nets so I look upon this as training sets etc. and all the risks that go with setting up the structure to do the modeling.

    I think the difference between our belief systems is that I believe most of the variation in the polls since Rudd came on the scene is sampling error and people are making things up to explain the error, while your modeling tries to look past the sampling error.

    Or to put it another way, Bayesian mathematics requires prior knowledge of what’s going on; I think the prior knowledge is made up.

    I don’t believe the result has shifted at all since the campaign has started.

    Having said all that, I think your story looks sensible ( or nice as I said in my last post).

  43. STAR said

    Poss,

    this is all too much for my poor old tired brain. You psephs are too good for me; at least in my field I can hold my own.

    In fact, Like Daisy the cow, I am out standing in my field.

    But, I digress. tell me in one sentence , as my kids often do when they ask me such a simple question as, “What is the meaning of life ” Who will win the election on the 24th.

    Ps Forget McPherson.

  44. mate said

    Amaranthus, Possum… Please God make them stop…make them stop!…please you two, no more tonight. Everytime you post something I find I HAVE to read it… and with all this stuff you’re going on about well Jesus wept! I’ve got a zillion pages opened checking this word and that meaning, re-reading certain sentences etc trying to gain even the slightest idea WTF you are sayin… So far I get the elephant bit…I think… but PLEASE.. no more, my brain is leaking out my ears ;-)

  45. Amaranthus said

    Well guys, in simple terms, I asked –

    “Poss, if the election was a cup of tea, is your model tracing the path of the individual tea leaves, or the swirl of the teaspoon”. ;)

  46. Guido said

    Mmmm, Possum…Amaranthus…I love it when statisticians talk dirty.

  47. imacca said

    So, Poss has a model which the stats heads will no doubt debate until it has accurately picked the next three ALP TPP actual election results. Which means at some time in the next decade it may well predict their demise as well?

    And Amaranthus @45 is coming over all Zen on us.

    I like this blog, it’s fun.

    Particularly at this election since it looks like someone who thoroughly deserves it is going to take a good kicking where it matters.

  48. Harmless Cud Chewer said

    -moo-

  49. Frogg said

    Possum,

    So far it is a line fitting exercise, not a predictive model. Unless you show us some cross validated results (assuming or not prior knowledge of your dummy variables).

    The validation on previous elections is not cross validated …

    Having said that, 55.15% sounds pretty right to me; I had 55.2% in mind the other day when I went for a prediction of 95 ALP seats.

    Cheers,

  50. Enemy Combatant said

    The Shillster slips in with the Raoul Merton as El Rodente falters.

    [Dennis Shanahan, Political Editor | November 17, 2007
    JOHN Howard enters the final week of his last campaign facing defeat as Kevin Rudd and Labor hold their election-winning lead in key marginal seats.

    According to the latest Newspoll survey, covering both parties' election launches this week, the Coalition has failed to peg back Labor's lead in the Government's 18 most marginal seats in NSW, Victoria, Queensland and South Australia.

    On primary votes in the 18 seats, Labor extended its lead in the past two weeks to five points -- 47 per cent to the Coalition's 42 per cent -- to give the ALP a two-party preferred lead of 54 per cent to 46 per cent.

    To be even competitive, the Government has to pull back at least two or three percentage points in the final week of the election campaign.

    The survey of marginal seats is in keeping with all the national polling in recent months, which has shown a consistent eight-point lead for Labor.]

    Citizen Rupert doesn’t like to be seen backing losers. Bad for Bidness.

  51. Alex said

    Possum, was there a clue as to your identity on Lateline tonight (Fri)? Nice graph.

  52. Stephen T said

    Poss I just love the way your modeling has a high degree of consistency, coherence and continuity; you sure are doing something right. Don’t pretend to understand it all but getting the general gist. That 55 ain’t movin and it ain’t gonna move except up. It’s just too late to factor in a major stuff up. It’s a done deal.

  53. adam said

    hi poss

    your analysis here calls for a measured response.
    so i got my jigger out…

    the following is my range of recipes for an appropriate election evening shindig. this is the actual plan for the evening at mine. i’m sharing this because 1) some of the recipes are appropriate and 2) I now feel that it makes little difference either way, as if rudd fails (hahaha) then we’ll just go ahead anyways… ;^)

    “Teh steps”

    1) first, make a few sizable howard pinatas from old GGs and araldite, copiously filled with bounty bars, pickled ham and lamingtons. cricket bats at the ready, poll tragics! smash one vigorously to commence proceedings. eat contents. start on the XXXX.

    2) early in the evening, probably about 7.30pm, start serving a special toast to the falling junta with “malibu on the rocks” – all zip with no tip left in higgins, or rodent in bennelong, so just icebergs and coconut down the hatch. toast one round for each declaration of seat loss by cabinet member, until all results in.

    3) a special commemorative round of “bundy and dry ginger”, coinciding with official concession white flag going up at kirribili, in honour of the team that did the unglue-ing. smash next pinata at this moment! remember to scoff “teh goodies” like your snout’s in the trough.

    4) as howard begins concession speech, smash next pinata. during speech, play drinking game in which we skull a shot of bourbon (in honor of obama being the turning point) each time he threatens to break down, mentions his legacy or thanks a cabinet member. double skull if he cries like a baby. triple skull if hyacinth loses it in any way.

    5) quiet for the acceptance, you rabble! bundy and dry all round, three cheers the PM! then, after the acceptance, serve a feral round of the patented “mad monk” cocktail – a special mix of benedictine and crazed fruit juice – to celebrate the (most likely) new opposition leader’s rise to impotence and probable removal inside a year. if downer looks to get the nod instead, laugh and whack on the rocky horror dvd.

    6) basic rule: XXXX at all other times. down here in victoria we say: thanks for everything Qld, we’ll even drink that frickin piss tonight because life, as they say, is sweet.

    7) get up, go to church, say thank you.

  54. Mercurius said

    But Possum, if I take a cat into the polling booth with me, does that mean I can only know who won the election, but not by how many seats; or I can only know the number of seats won, but not which party won them?

    And will the cat survive this process?

  55. Ha! Very good Mercurius!

    It would seem that you (or Schrödinger, your pussy cat) will only know the exact election result after the election has been held – but then it’s not a prediction any more…

    A paradox indeed.

  56. Mike said

    Tidy work, poss.

    Quick thought; there’s a line of argument in politics that after September 11 2001, security concerns contributed to the strength of incumbents. That’s a story that your model is able to tell (sudden drop in opposition TPP after 2001, flatline for a bit, then building back up). I’m not asking for you to plug in a 2001 dummy (don’t think there’s one already in there) as the effect already seems substantial and it’d mess with your efforts to avoid overspecification. I’m just curious about the degree to which “time since September 2001” can be used to explain the ALP vote. Any thoughts? I’d be interested to see if the data supports or contradicts the theory.

    Maybe I’ll have to procrastinate a bit next week and gather together some data for myself…

  57. Mike said

    PS – has anyone written a journal article on comparing inaccurate models entitled “I See Your Schwartz Is As Big As Mine”? If not, I think I’ll apply for copyright now and figure out the paper later.

  58. Robert Beswick said

    Enemy Combatant @ 50

    The GG is an amazing read this morning. Sol being even more frank in handing it to Labor and Dennis being almost measured. As you say. The word from head office clearly arrived yesterday and they broke out the alternative medication.

    I imagine the folks at Lib headquarters either setting fire to their copy or starting drinking with breakfast. Or both.

  59. Stephen T said

    Now that the fat lady is singing it’s time to give the GG the serve it deserves. Their bias has done more damage to the conservatives than all of the left wing pundits. They need to look at psychology 101 and get a handle upon cognitive dissonance. Shanaham, Albrectdictator Overnwoman and Kelly have put journalism in disrepute through uncritical bias. They have abnegated their responsibility to advise the government of the day in matters of ethics and social responsibility. If they, by some miracle, realize how much they have damaged the right they should hang their heads in shame. There is a cost for uncritical fanaticism and they will pay the price. In some respects it is hilarious to see the doyens of intellectual conservatism imploding and self-destructing under the weight of their own polemic. There is a lesson here for the Labor party as well. Obsession leads to blindness which leads to fanaticism which leads to masochism which eats like a disease from within.

  60. Don said

    Poss, as more polls come in, will your forecast algorithm simply add more decimal points, or is it likely to move around a bit?

  61. Tom said

    I will be celebrating the Rodent’s downfall in 7 days with my homebrewed (fullmash) Whitbread 1850 London Porter.

    Something appropriate in drinking an 1850 era beer that night :)

  62. Lukas said

    Possum,

    Thanks for an interesting and strong analysis.

    Can I confirm a couple of interpretations that I take from your coefficients?

    1. WorkChoices has added 1.5 percentage points to the ALP 2PP?

    Actually, quite a bit of the WC effect would also be reflected in the AC Nielsen survey (Nielsen showed an even bigger leap in ALP 2PP in November/December 2005, when WC was introduced and debated in Parliament, than did Newspoll). Without the ACN variable, would the WC effect rise to about 1.7 points or so?

    I agree that WC is the single most important policy variable in explaining this election outcome. Published ACTU polling showed that it was far and away the most important issue given by voteswitchers, and this is consistent with more anecdotal material since then. Were there any other policy variables you tried to include and junked?

    2. Latham (contra post #7) on its own added 1.9 points to the ALP 2PP vote, but the true Latham effect may have been negative because of the drop in his approval rating as reflected in your OPSAT(-1)-OPSAT(-3) variable?

    3. Rudd has added 2.0 points to the ALP 2PP, and that is probably roughly the true effect as his approval rating hasn’t shifted much in the last 3 months?

  63. Evan said

    Possum, You’ve gotta be the Dr Strangelove of Australian psephology (I mean this in the nicest possible way). Impressive work, mein Fuehrer.

  64. smssiva said

    Possum,

    People have read your conclusion. How do I know? Centrebet blew out to 4.50 today from 3.60 yesterday

  65. Gus said

    Hi Possum

    Love your work!!

    One issue though. Your validation of the 1998, 2001 and 2004 elections is bound to overstate the accuracy of the model because your model includes those elections in the calculations of the parameters. Common options to get around this problem are to use cross-validation (randomly split a dataset in half, use one half to build the model and the other half to validate it) or temporal validation (develop the model up to 2001 and test it on 2004).

    Your problem is that you only have three elections in the series, so the loss of one is likely to seriously alter the parameters. Still, it will give you a more realistic sense of the predictive capacity of the model.

    Bye for now, Gus
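The temporal validation described in this comment can be sketched in a few lines. This is only an illustration under loud assumptions: the series below is synthetic (a trend plus an alternating wiggle, not Newspoll data), the model is a plain straight-line fit rather than the post's regression, and the names `fit_line`, `rmse` and `cutoff` are made up for the example.

```python
# Synthetic monthly series, NOT real polling data: a gentle upward trend
# plus a small alternating wiggle standing in for poll noise.
months = list(range(30))
alp_tpp = [45.0 + 0.1 * t + 0.3 * (-1) ** t for t in months]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rmse(xs, ys, intercept, slope):
    """Root mean squared error of the fitted line on the given points."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5


# Temporal split: estimate on the early months only, test on the later ones.
cutoff = 24  # hypothetical cut-off month
train_x, train_y = months[:cutoff], alp_tpp[:cutoff]
test_x, test_y = months[cutoff:], alp_tpp[cutoff:]

intercept, slope = fit_line(train_x, train_y)
in_sample_rmse = rmse(train_x, train_y, intercept, slope)
out_of_sample_rmse = rmse(test_x, test_y, intercept, slope)

print(f"in-sample RMSE:     {in_sample_rmse:.3f}")
print(f"out-of-sample RMSE: {out_of_sample_rmse:.3f}")
```

The point of the split is that the out-of-sample figure is computed on months the fit never saw, so it is the more honest guide to forecasting error. With only three elections in the real series, the held-out set would be very small, which is the caveat raised above.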

  66. Enemy Combatant said

    Rodents!?! We don wan no Steenkin’ Rodents!

    1. LABOR 1.20
    2. COALITION 4.60

    CBet 10 am Q time.
    Surprised The Surge took so long really.

  67. Ratsak said

    Poss,

    How do you climb trees with a pair of cojones that big?

    It’s one thing to predict an ALP win, or even to give a number of seats. To predict the TPP to 2 decimal places and then promise to do same for state splits is a whole ‘nother order of magnitude. The vast majority of us have absolutely no idea what most of what you’ve written means. As a professional I trust other professionals to know their stuff, but that statement in bold is BOLD. I know the graph is showing upper and lower limits, but nary a + or – on the number is putting em on the line. I love it!

    I also hope you’re right. Firstly it’s the least the Coalition deserves, and secondly I’d hate for a final result in the 52-53 range on the night to deliver a comfortable Rudd victory but open your good self up to criticism from the “Hendersons” out there who are so very wise in hindsight. More power to your paw possum.

  68. Ratsak said

    EC @ 66

    Holy Firetruck Batman!

    $4.60? The damn wall is about to go.

  69. barney said

    Possum,

    Don’t pretend to understand the maths but the graphs are self-evident.

    The only caveat I have is from a book of unnatural laws that I used to have.

    O’Toole’s commentary on Murphy’s Law: MURPHY WAS AN OPTIMIST!!

    Let’s hope that this applies to the rodent rather than the rudster.

  70. Neilbris said

    Sportingbet has blown out to 1.25/3.85 ALP/LNP. “She canna take it Cap’n! She’s gonna blow!!”

  71. dare u 2 said

    Please, someone, anywhere in Australia in this forum: hang out in your local shopping centre and try to get a photo op with johnny in front of a TV crew. Proclaim loudly, but in a friendly, jovial manner, “Hey Mr Howard, I live in a marginal electorate, will you buy me a Plasma? Aw, please?” Post this on every forum in the land; this will sink him definitely. Can you imagine the lead in the night’s 6pm news!!!

  72. codger said

    Wiggles my trunk Possum & Mr A.

  73. stevet said

    Dare u 2, what I have wanted to say to Howard in front of TV cameras for quite a while is something along the lines of, “I get really angry and insulted when you refer to my Mum and Dad as thugs and bullies, because they are anything but. They are the salt of the earth, and as union delegates they did their best to get decent working conditions for their members…”

  74. Andos the Great said

    Enemy Combatant @ 50 and Robert Beswick @ 58:

    That has got to be the most of “The Australian” that I have read for a long time. Loved Lynton Crosby’s effort in the “Who Won Week 5?” panel.

    “Mixed fortunes this week. No clear winner. A better launch for the Coalition – even starting on time! State Labor was again the spotlight (sic) with cockroaches in a Sydney hospital, a toddler’s sad death in Queensland, corruption in Western Australia, tram stoppages in South Australia and police corruption in Victoria all pointing to the problems of Labor governments. Labor’s attempt to misrepresent the cost of the policies by taking those on just one day played into the Government’s economic management hands. Rudd’s education revolution is a clever line but a good education requires more than a laptop. Meanwhile, after weeks of shouting from the treetops, Peter Garrett seems to have gone missing in the forest.”

    Seriously. He almost did as well as Piers Akerman last week on “The Insiders”…

    The rest of the panel: Paul Kelly, Dennis Shanahan, Stephen Loosley (fmr ALP national president), Patricia Karvelas, Chris Uhlmann, Matthew Franklin, AND David Speers ALL called the week for Labor.

    Great read.

  75. paulyt said

    I have no idea what half the words in that post meant but it sure turned me on.

  76. Andos the Great said

    This story is also very interesting:

    http://www.theaustralian.news.com.au/story/0,25197,22772644-2702,00.html

    “Andrews orders snap review of detainees

    IMMIGRATION Minister Kevin Andrews has ordered a snap review of all 450 people held in immigration detention in Australia to establish whether an administrative error means their incarceration is technically unlawful.”

    How does it play in the context of this story:

    http://www.abc.net.au/news/stories/2007/11/14/2091103.htm

    “Govt silent on ALP’s immigration secrecy claims

    The Federal Government is refusing to respond to an accusation from the Labor Party that it is failing to abide by the caretaker conventions during the election campaign.”

    Very interesting.

  77. Stig said

    Heh. We talked about the supermodel a few months back, and now that she’s out in public, she is truly beautiful to behold. Not quite what I’d expected, but with a TPP like that then I for one am not arguing with it!

    Well done as always.

  78. Andos the Great said

    You forgot to say something about “nice curves”, Stig…

  79. Gecko said

    As you may or may not be aware, I have repeatedly said on this and other blogs that the ALP will win 104 seats.

    In light of your extraordinary analysis… I thought it only fair to pass over my own mathematical theorems when applied to Psephology and the coming election.

    Some considered facts:

    1. I am five foot eleven and three quarter inches and two bits. (As we know, two bits are f*ck all… so should be discounted… and since fraction is only an ‘r’ away from faction, this creates political noise and should be avoided.) 5’11” therefore equals 71 = 8 in numerology.

    2. On a good day I can throw a house brick the length of a cricket pitch: which is twenty-two yards. My wife who was a liberal voter (but not any more) can throw a brick only seven. And finally, if I try really hard I can throw my wife six. (Interesting but irrelevant fact is that when she tries to throw me she falls backwards a negative two.) This gives us: 22-7+6 = 21

    3. 150 (seats) when counted numerically out loud requires 3 breaths before turning blue.

    4. John Howard has been in power for eleven and a half years, is a square and completely rooted. This gives us: √11.5 = Error (as my calculator is broken). But if I spin the calculator while hopping on alternate legs (to give equal time to left and right bias) ‘Error’ looks eerily like four (4)… and this forms an important part of the equation.

    This gives us: (8+21-3)4 = 104.

    I hope this has been of assistance.

  80. Stig said

    Oh yes Andos – the curves. The curves are very, very nice.

  81. Alan H said

    Thanks, both possum and gecko. I think possum is more likely to be right, but hope fervently that the gecko trumps us all, as his method has more lateral thinking involved, and that is to be encouraged.

    cheers,

    Alan H

  82. Ze said

    Possum: How well does your method predict the data if you don’t use the full data set in training and use part of it as a test set? :)

    I’ll happily grant you that it’s a great model for the data it’s trained on; hopefully it’s a great model for the election since I like the conclusions :p However, I would like to see how well your methodology predicts unknown data. We can do this by rewinding a bit, using known data as a test set and retraining without it, then seeing how well it predicts it.

  83. Possum Comitatus said

    Ze – it’s based on using leadership dummy variables to level the ceteris paribus party vote, and then a dummy variable for the “event” of the election. That then allows PM dissatisfaction and the Opposition primary to “take the weight”, with a lagged ACN to remove the volatility of the noisy Newspoll, and those two longer variables at the end to adjust the ALP overshoot in the election campaign itself that reliance on PMDISAT as a variable produces for the ALP TPP vote.

    So if I go back and build the equivalent model for the 2004 election and only use the period up to September 2004 as the sample, I’d institute a dummy variable starting at the “troops home by Christmas” remark and going through to the election (the equivalent of which in the 2007 model is the “Workchoices” DV). Once I do that, and then forecast ahead the last month into the election itself, I end up with an ALP TPP of 47.6, which is close to the 47.2 which was the result.

    But now for 2007, I don’t need that event DV of “troops home by Christmas”; I can just let the other variables take the weight of that over the end of the 2004 period and redo the model for the 2007 election.

    The way it works is that it accounts for the leadership and honeymoon effects, then adjusts for the big issue going into the election. Yet, by doing it that way, it creates an overweight in the other explanatory variables which is then adjusted in the forecast with the last two variables in the model. Those last two variables look a bit counterintuitive to start with, but they actually account for the “running home to momma” effect of late deciders if people start moving away from the government in large amounts because of their dissatisfaction with it, or to the ALP because of their satisfaction with it. So the error-correction type function they fulfil is based on both types of voters that move – those that do it because of their earlier dislike of one side, and those that do it because of their earlier attraction to one side.
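The procedure described in this comment (level the vote with dummy variables, estimate on everything except the final month, then forecast that month) can be sketched as follows. This is a toy under stated assumptions, not the model from the post: the series is synthetic, only two dummies are used (a hypothetical leadership change at month 12 and a hypothetical campaign issue from month 18), and the satisfaction, ACN and campaign-correction variables are left out. The helper names `solve`, `ols` and `design_row` are invented for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(t):
    """Regressors for month t: intercept, leadership dummy, issue dummy."""
    leader = 1.0 if t >= 12 else 0.0  # hypothetical leadership change
    issue = 1.0 if t >= 18 else 0.0   # hypothetical big campaign issue
    return [1.0, leader, issue]


# Synthetic TPP series, NOT real data: level shifts plus an alternating wiggle.
months = list(range(24))
tpp = [48.0 + 2.0 * design_row(t)[1] + 1.5 * design_row(t)[2] + 0.2 * (-1) ** t
       for t in months]

# Hold out the final month, estimate on the rest, then forecast one month ahead.
train = months[:-1]
coefs = ols([design_row(t) for t in train], [tpp[t] for t in train])
forecast = sum(c * x for c, x in zip(coefs, design_row(months[-1])))

print("coefficients (intercept, leader, issue):", [round(c, 3) for c in coefs])
print(f"forecast {forecast:.2f} vs held-out value {tpp[-1]:.2f}")
```

On this toy series the held-out forecast comes out at about 51.54 against a synthetic value of 51.30. Because the dummies are nested, each coefficient is just the change in the average level between periods, which is the sense in which they "level" the vote before the other variables do their work.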

  84. Grumps said

    Poss,

    Thanks for the links to help explain some of the tools of your modelling. I am becoming quite brain dead, and any additional information to stimulate brain cells is joyfully devoured.

    Great work on your behalf but do feel gut instinct says it will be a win but not by your 55.15% :)

  85. Lukas said

    Possum,
    any comments on #62?

  86. Possum Comitatus said

    Lukas,

    Yep, Workchoices added about 1.5 points in this model, but it could be more as some of the Rudd variable could actually be Workchoices – but for the model it doesn’t really matter either way. If you take out ACN, the Workchoices coefficient lifts to a little over 1.8.

    I’ve played around a lot with other policy variables doing other things for the site, and none apart from Workchoices really amount to a hill of beans over the last couple of years.

    The true Latham effect was a little negative at the election (those campaign period error correction variables did that), but before the campaign it was positive because of the general slump the ALP found itself in before his leadership.

    Rudd added about 2 points to the leadership in this model. Using other models to analyse the Rudd effect, it’s sometimes a bit higher, sometimes a bit lower in TPP terms. The big effect of Rudd was to lift the ALP primary vote – and he did that by around 5 points depending on how you model the primary vote.
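The lift in the Workchoices coefficient when ACN is dropped is what the standard omitted-variable identity for least squares would lead one to expect. As a sketch, with symbols that are assumptions for this note and not notation from the post: let $\hat{\beta}_{WC}$ and $\hat{\beta}_{ACN}$ be the coefficients in the full model, $\tilde{\beta}_{WC}$ the Workchoices coefficient once ACN is removed, and $\hat{\delta}_{WC}$ the coefficient on the Workchoices dummy when the lagged ACN series is itself regressed on the remaining explanatory variables.

```latex
\tilde{\beta}_{WC} \;=\; \hat{\beta}_{WC} \;+\; \hat{\beta}_{ACN}\,\hat{\delta}_{WC},
\qquad
\tilde{\beta}_{WC} - \hat{\beta}_{WC} \;\approx\; 1.8 - 1.5 \;=\; 0.3
```

Read this way, the gap of roughly 0.3 points is the part of the Workchoices period that the ACN variable had been carrying. Neither $\hat{\beta}_{ACN}$ nor $\hat{\delta}_{WC}$ is reported in the thread, so only their product can be inferred.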

  87. Lukas said

    Thanks, Possum. That all makes sense. What is the WC effect if you take out ACN & Rudd?

  88. OMG…Have you seen this ?

    http://www.thewest.com.au/default.aspx?MenuID=145&ContentID=47610

  89. I’m sorry if you explained this, but it was not clear to me: Do you have any out-of-time test results?

    -Will Dwinnell

    http://matlabdatamining.blogspot.com/

  90. Possum Comitatus said

    Will, see comment 83.

    I could only use an equivalent model for 2004 because the number of observations is too small to go back any further than that. Of course, using 2004, the dummy variables slightly change, but the election day correction component is exactly the same.

  91. Terry said

    Poss

    nice work – can you do an analysis of Howard’s/ALP’s numbers in Bennelong since 1996, predict his demise and make my bloody day!
    Terry
