BarCamp is the inspiration for RootsCamp. If you are from the BarCamp/Geek world, consider spearheading a RootsCamp in your area.
More notes from RootsCampDC...
Led by David Boyle in a personal capacity. beglen at gmail dot com
(works at Catalist)
This session focused on the basics of how models are constructed, with an emphasis on understanding the modelling process and on setting out a low-end modelling process that campaigns with limited budgets can use. While these models will not rival their costly counterparts in quality, they still represent a significant improvement over traditional targeting for two reasons: (1) they are more fine-tuned than traditional segmentation; and (2) when done properly, the models' predictions are verified against real responses, so their performance rests on evidence rather than anecdote.
On a basic level, modelling is the process of taking a small set of data -- typically survey results -- and then using those results to predict attitudes of an entire universe. In a campaign context, this generally plays out as follows:
To generate model scores, you follow three steps:
To generate a model, Boyle started with a voter file sample in Excel that contained 3,000 responses from a phone survey. He then loaded it into AnswerTree, which is an SPSS module, and ran a full CHAID (Chi-Square Automatic Interaction Detector) analysis. AnswerTree starts with a single box containing your entire universe. When you click on that box, it shows you all of your variables, ranked from the most significant to the least. You choose a variable with a high Chi-square score and a low P-value, and it gives you two or more branches based on that variable. You can then repeat the process on each of the branches until you've reached the level of granularity that you're seeking.
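The variable-selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not AnswerTree's implementation: it only computes a Pearson Chi-square statistic for each candidate variable and splits on the highest one, whereas a full CHAID analysis also merges similar categories and adjusts P-values. The field names (`party`, `gender`, `support`) and the toy survey rows are hypothetical.

```python
from collections import Counter


def chi_square(rows, variable, outcome="support"):
    """Pearson Chi-square statistic for one variable against the outcome."""
    observed = Counter((r[variable], r[outcome]) for r in rows)
    variable_totals = Counter(r[variable] for r in rows)
    outcome_totals = Counter(r[outcome] for r in rows)
    n = len(rows)
    stat = 0.0
    for v, v_count in variable_totals.items():
        for o, o_count in outcome_totals.items():
            # Count expected in this cell if variable and outcome were unrelated.
            expected = v_count * o_count / n
            stat += (observed[(v, o)] - expected) ** 2 / expected
    return stat


def best_split(rows, candidates, outcome="support"):
    """Score every candidate variable and return the strongest one."""
    scores = {c: chi_square(rows, c, outcome) for c in candidates}
    return max(scores, key=scores.get), scores


def split(rows, variable):
    """Divide the rows into one branch per value of the chosen variable."""
    branches = {}
    for r in rows:
        branches.setdefault(r[variable], []).append(r)
    return branches


def make_rows(party, gender, support, count):
    """Build `count` identical hypothetical survey responses."""
    return [{"party": party, "gender": gender, "support": support}
            for _ in range(count)]


# Hypothetical survey sample: support tracks party, not gender.
rows = (
    make_rows("D", "F", True, 20) + make_rows("D", "M", True, 20)
    + make_rows("D", "F", False, 5) + make_rows("D", "M", False, 5)
    + make_rows("R", "F", True, 5) + make_rows("R", "M", True, 5)
    + make_rows("R", "F", False, 20) + make_rows("R", "M", False, 20)
)

best, scores = best_split(rows, ["party", "gender"])
branches = split(rows, best)
print(best, scores)
print({value: len(members) for value, members in branches.items()})
```

With this toy data, `party` scores 36.0 and `gender` scores 0.0, so the first split is on `party`. Repeating `best_split` on each branch mirrors the click-and-branch process in the session.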
At the end of this process, you'll have a model. The next step -- which is very important -- is to verify your results. Call through a number of people who had not been previously ID'ed, and see if the model seems like a good predictor of support. The more people you call, the better. You should probably aim to call about a hundred people who the model thought would be 30% likely to support your candidate and see if around 30 of them were. Then call around 100 people who the model thought would be 70% likely to support your candidate. If around 70 of them do, then it looks like your model is a helpful predictor of actual behavior.
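How close is "around 30"? One rough way to judge it, offered here as an assumption and not something stated in the session, is to allow about two binomial standard errors either side of the predicted rate. The function name `check_band` and the call results below are hypothetical.

```python
import math


def check_band(predicted_rate, calls, supporters, z=2.0):
    """Compare observed support in one score band with the model's prediction.

    Uses a normal approximation to the binomial; z=2.0 gives a range of
    roughly two standard errors (an assumed tolerance, not a fixed rule).
    """
    observed = supporters / calls
    std_err = math.sqrt(predicted_rate * (1 - predicted_rate) / calls)
    low = predicted_rate - z * std_err
    high = predicted_rate + z * std_err
    return {
        "observed": observed,
        "low": low,
        "high": high,
        "consistent": low <= observed <= high,
    }


# Hypothetical verification calls: 100 people in each score band.
band_30 = check_band(0.30, calls=100, supporters=33)
band_70 = check_band(0.70, calls=100, supporters=52)

print(band_30["consistent"])  # 33 of 100 sits inside the expected range
print(band_70["consistent"])  # 52 of 100 falls well below the expected range
```

Under this tolerance, 100 calls into the 30% band should turn up roughly 21 to 39 supporters, and 100 calls into the 70% band roughly 61 to 79. The range narrows as the number of calls grows, which is one way to see why calling more people is better.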
Once your model has been verified, append it to your original voter file and start cutting better universes. Organizers and volunteers will be able to pull lists of voters who are, for example, 30%-70% likely to support your candidate for persuasion programs, and 70%+ likely to support your candidate to GOTV them.
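Cutting universes from the scored file can be as simple as filtering on the score. The sketch below assumes each voter record carries a hypothetical `score` field between 0 and 1, and it assigns a score of exactly 70% to GOTV; the session did not say which program that boundary belongs to.

```python
def assign_program(score):
    """Map a support score to a program, using the thresholds from the session."""
    if score >= 0.70:
        return "gotv"
    if score >= 0.30:
        return "persuasion"
    return None  # below 30%: not pulled into either list


def cut_universes(voters):
    """Group voter IDs into persuasion and GOTV lists."""
    universes = {"persuasion": [], "gotv": []}
    for voter in voters:
        program = assign_program(voter["score"])
        if program is not None:
            universes[program].append(voter["voter_id"])
    return universes


# Hypothetical scored voter file.
voters = [
    {"voter_id": 1, "score": 0.10},
    {"voter_id": 2, "score": 0.30},
    {"voter_id": 3, "score": 0.55},
    {"voter_id": 4, "score": 0.70},
    {"voter_id": 5, "score": 0.92},
]

universes = cut_universes(voters)
print(universes)
```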
Question and Answer:
Q: If your original survey covered only a certain demographic or geographic location, can you still apply your model to the entire voter file?
A: Your results are only valid for the universe of your original survey. If you only called women, your model would only be predictive of female behavior. If your IDs are only for people in Baltimore City, beware of applying a model created from them to people who live in rural areas.
Q: Given limited resources and statistical knowledge, how do candidates get through this?
A: Catalist provides lots of data points that can help in building models. If you're thinking about it, contact Copernicus Analytics, Ken Strasma or other modelling firms. They will talk to you about doing it properly. If there isn't a budget for that, some modelling companies will work with you on your data, including finding gaps to fill in, rather than conducting an entire new poll. If, for instance, all of your IDs focus on certain counties, they'll help you by telling you which counties you need to call into to create a valid model for your entire universe.
With that said, more expensive models produce better results, and building a home-grown model, or a model on the cheap, shouldn't be seen as an equal substitute for a full vendor model.
... but don't think that the $100,000 option is the only one out there.