by Tom Adams
I need your help in extending this study. I encourage pool managers and others to send the details of all pool entries in other pools and/or other years to me at tadamsmar@aol.com so that this study can be extended. This study gives you an idea of what can be learned from simulating office pools.
I will keep your entry names confidential, or you can change them before you send the data. I did not do this with the Packard pools in this study because the pool entries were already posted on the web.
I will be happy to share the results of the analysis of any pool data sent to me.
Two multiple-entry strategies were also evaluated. In a standard-scoring pool, a multiple-entry strategy that involved varying the championship pick was better. But in a pool with upset incentives, a multiple-entry strategy based on entering EPM pool sheets derived from various Markov models proved to be better.
The validity of this study is limited by the fact that only two pools are analyzed from one year. Pool managers and others are encouraged to send the details of all pool entries in other pools and/or other years to tadamsmar@aol.com so that the study can be extended.
Office pools based on the NCAA men's basketball tournament have been of widespread popular interest for many years, but the question of the best strategies for playing these pools has only recently been addressed by statisticians and operations research experts. In 1997, Breiter and Carlin [1] described a Monte Carlo process for estimating the pool entry sheet that would have the highest expected point total. In 2001, Kaplan and Garstka [2] described an improved direct calculation method for the expected-point-maximizing (EPM) entry sheet, explored three new alternative methods for determining the Markov model used to calculate the EPM entry, and reported results from participating in a number of pools, including two web-based pools managed by Eric Packard: Packard #1 and Packard #2.
The Markov model in question is just a table of probabilities for the outcome when one team meets another. These probabilities are derived from the predicted spreads or scores from rating systems and/or pre-tournament first-round betting lines. Given the number of rating systems (Sagarin, Massey, etc.) and the variety of ways that the ratings and betting lines can be combined to yield win probabilities, there are actually more than a dozen potential Markov models, although only four such models have been developed in the literature. The Kaplan-Garstka Markov models are based on non-constant variance. They claim this as an advantage over the Breiter-Carlin model, which uses a constant variance. But, at this point, the advantage is theoretical, since it has not been demonstrated to provide an edge in practice. Such an edge could well be subtle and hard to demonstrate.
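To make the constant-variance idea concrete, here is a minimal sketch of how a predicted point spread might be converted into a win probability in the style of the Breiter-Carlin model; the 11-point standard deviation and the example spread are illustrative assumptions of mine, not values taken from either paper.

```python
from math import erf, sqrt

def win_probability(predicted_spread, sigma=11.0):
    """P(team A beats team B), modeling A's actual margin of victory as
    Normal(predicted_spread, sigma^2). sigma = 11 is an illustrative
    constant-variance assumption, not a value from [1] or [2]."""
    return 0.5 * (1.0 + erf(predicted_spread / (sigma * sqrt(2.0))))

# Example: a 6-point favorite wins about 71% of the time under this model.
print(round(win_probability(6.0), 2))
```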
The direct calculation method reported by Kaplan and Garstka was independently discovered by Tom Adams and used to develop the Poologic Calculator, a Java applet that may be used to calculate the EPM pool sheet for a variety of pool scoring rules.
In a letter to the editor of Chance, Tom Adams pointed out that the EPM pool sheet might not always be best, since there are situations where the EPM sheet will not maximize your probability of winning. For instance, the EPM pool sheet is not best in some situations where the EPM sheet picks the favorite as champion and that favorite is grossly overbet in the pool. Carlin's response letter argued that contrarian strategies involve the complex field of game theory and that it is by no means clear how or when to implement an effective contrarian strategy.
The matter of the best multiple-entry strategy is also in dispute. Breiter and Carlin suggested entering multiple estimated EPM pool sheets resulting from different estimation strategies. The Poologic web site recommends varying only the championship pick on a single EPM pool sheet. These approaches are strikingly different, because multiple estimated EPM pool sheets will often all have the same championship pick.
This paper reports on a novel approach to analyzing a pool. If the details of all the entries are known, then it is possible to use a Markov model to simulate the tournament. The simulation may be repeated thousands of times and the pool winners tabulated. This is similar to replaying the tournament thousands of times. On a moderately powerful computer it is possible to simulate the tournament and tabulate the results 10,000 times in a few minutes.
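As an illustration of this simulation loop, here is a minimal sketch in Python. It assumes the Markov model is already available as a table win_prob[a][b] of head-to-head win probabilities, that each entry is stored as a game-by-game list of predicted winners in bracket order, and that ties are split at random; the function names and data layout are my own, not the format of the Packard data.

```python
import random
from collections import Counter

def simulate_tournament(bracket, win_prob):
    """Play one single-elimination tournament.
    bracket: teams listed in bracket order; win_prob[a][b] = P(a beats b).
    Returns the winners of every game, round by round."""
    results, field = [], list(bracket)
    while len(field) > 1:
        next_round = []
        for a, b in zip(field[::2], field[1::2]):
            winner = a if random.random() < win_prob[a][b] else b
            next_round.append(winner)
            results.append(winner)
        field = next_round
    return results

def per_game_factors(n_teams, round_factors):
    """Expand per-round scoring factors into one factor per game."""
    factors, games = [], n_teams // 2
    for f in round_factors:
        factors.extend([f] * games)
        games //= 2
    return factors

def score_entry(picks, results, game_factors):
    """Sum the factors for every game whose winner the entry picked correctly."""
    return sum(f for p, r, f in zip(picks, results, game_factors) if p == r)

def run_pool(entries, bracket, win_prob, round_factors, n_sims=10_000):
    """Tabulate how often each entry wins the pool over n_sims simulated tournaments."""
    game_factors = per_game_factors(len(bracket), round_factors)
    wins = Counter()
    for _ in range(n_sims):
        results = simulate_tournament(bracket, win_prob)
        scores = {name: score_entry(picks, results, game_factors)
                  for name, picks in entries.items()}
        best = max(scores.values())
        wins[random.choice([n for n, s in scores.items() if s == best])] += 1
    return wins
```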
The one limitation of this approach is that the details of each pool entry must be available. This information is not readily available from the thousands of office pools held each year, but it is available on the web for Packard #1 and Packard #2. So, these pools are used in a pilot study of this pool simulation method.
The Packard pools are similar to an office pool, but there are some differences worth noting. First, the pools don't require an entry fee and do not have a cash prize, so this might change the behavior of players somewhat. Second, someone (probably Eric Packard) adds entries directly derived from various rating systems.
Scoring rules vary from pool to pool. Most pools have standard scoring factors that are awarded for a win regardless of the team's seed. Others have seed factors that are multiplied by the team's seed, so that a 7-seed is awarded 7 times as many points as a 1-seed for a win in a particular round. These are the types of factors used in the Packard pools. Other pools award points using factors based on the seed difference (winner seed minus loser seed) when the winner beats a better-seeded team.
Packard #1 uses only standard scoring, with standard scoring factors of 32, 48, 72, 108, 162, and 243 for rounds 1 to 6. This type of scoring is common, but the round factors seem to vary from pool to pool. Kaplan and Garstka noted that the EPM pool sheet provides less advantage under these scoring rules, since simply betting on the lowest seed or the highest-ranked team provides a good approximation of the EPM sheet.
Packard #2 scoring rules provide incentives for picking a higher seed to advance. Packard #2 used both standard factors and seed factors, with both sets of factors having values of 945, 1980, 3696, 6160, 9240, and 13860 for rounds 1 to 6. When high-seed or upset incentives are present, it is hard for a pool player to get close to an EPM pool sheet without the aid of a computer program. Relative to the standard factors, the Packard #2 upset incentives fall in the middle of the range; I am aware of pools that have relatively larger and smaller upset incentives. The most extreme upset incentives I have seen used in an office pool are seed factors of 1, 2, 4, 8, 16, and 32 with zero standard factors.
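Here is a minimal sketch of how scoring under these two rule sets might be implemented, assuming (as the description above suggests) that a correct pick in a pool with seed factors earns the round's standard factor plus the round's seed factor multiplied by the winner's seed; the additive combination and the function name are my assumptions rather than a specification published with the Packard pools.

```python
# Round factors for rounds 1 through 6 (index 0 = first round)
PACKARD1_STANDARD = [32, 48, 72, 108, 162, 243]
PACKARD2_STANDARD = [945, 1980, 3696, 6160, 9240, 13860]
PACKARD2_SEED     = [945, 1980, 3696, 6160, 9240, 13860]

def points_for_win(rnd, winner_seed, standard, seed_factors=None):
    """Points for correctly picking a round-`rnd` winner with the given seed:
    a flat standard factor plus (optionally) a seed factor times the seed."""
    bonus = seed_factors[rnd] * winner_seed if seed_factors else 0
    return standard[rnd] + bonus

# A correctly picked 12-seed first-round upset:
print(points_for_win(0, 12, PACKARD1_STANDARD))                  # 32
print(points_for_win(0, 12, PACKARD2_STANDARD, PACKARD2_SEED))   # 945 + 945*12 = 12285
```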
In 2001, Kaplan and Garstka entered three EPM pool sheets, based on the three Markov models described in their paper, in Packard #2. Tom Adams entered the Poologic Calculator EPM pool sheet (which is based on the Markov model described in Breiter and Carlin) in Packard #2. In Packard #1, Kaplan and Garstka entered two EPM pool sheets and Tom Adams entered none.
You can view the 2001 Packard #1 entries here and the original 2001 Packard #2 entries here.
In order to make this analysis similar to an office pool analysis, I removed all of the ranking and rating system based pool sheets entered by Eric Packard from the pools before simulation. While it is possible that some office pool players do base their entries strictly on systems like the seed ordering or the Sagarin rating, it seems unlikely that 10% of the pool participants would do this, particularly in a pool like Packard #2 where the scoring rules have upset or high-seed incentives. With the ranking and rating system entries removed, Packard #1 has 126 entries remaining and Packard #2 has 37 entries remaining.
The names on the EPM pool sheets are Stan Garstka, Ed Kaplan, linda beise, and Mr. Poologic. Note that Poologic should have some advantage in these simulations, since both the Poologic entry sheet and the simulation are based on the same Markov model.
In Packard #1, the Ed Kaplan entry is based on Vegas betting lines and the Garstka entry is based on the Massey Rating System. In Packard #2, the Ed Kaplan entry is based on Sagarin, the Garstka entry is based on the Massey Rating System, and the Beise entry is based on Vegas betting lines. (Information from private communications with Ed Kaplan.) The Poologic entry is based on Vegas lines for the first round and Sagarin for subsequent rounds. Of all these EPM entries, only Poologic treated spread variance as a constant. The Poologic entry was based on methods from [1], whereas the other EPM entries were based on methods from [2].
Here are the top entries (all entries with more than 1000 wins) in the Packard #1 pool (based on 100,000 simulations of the tournament), ranked by number of wins:
The EPM pool sheets were not very effective in this type of pool. This is partly due to the fact that Duke is overbet in this pool and all three EPM entries picked Duke as champion. It seems better to adopt the contrarian strategy of picking a team other than the favorite as champion on your entry sheet. The seven best sheets are contrarian entries that picked a champion other than Duke. One of the EPM sheets (beise) did not even get 1000 wins in the simulation. (Poologic did not have an entry in this pool.)
Here are the top entries (all entries with more than 1000 wins) in the Packard #2 pool (based on 100,000 simulations of the tournament), ranked by number of wins:
In this pool, the EPM pool sheets did very well. Garstka won 8823 pools, or 8.8% of the total. Four of the top six are EPM sheets. And this result may even understate the effectiveness of EPM analysis, since the four EPM pool sheets are somewhat similar and therefore tend to compete for the same wins. Any single EPM sheet bet in an office pool would do relatively better.
With 37 entries in this pool, a player would have to win 1/37 of the time to break even if this were a winner-take-all office pool. That would be 2703 wins in 100,000. Only 13 of 33 non-EPM players, or 39%, got more than the 2703 wins required for a break-even sheet. So, 61% of non-EPM players made bad bets, bets that had a negative expected return, in this pool.
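The break-even arithmetic above can be extended into a rough expected-return estimate for a hypothetical winner-take-all pool with equal entry fees: an entry's expected profit per unit staked is its simulated win fraction times the number of entries, minus one. This is a sketch under those hypothetical payout assumptions, not the Packard #2 prize structure (the Packard pools have no entry fee).

```python
def expected_return(wins, n_sims, n_entries):
    """Expected profit per unit stake in a hypothetical winner-take-all pool
    with equal entry fees: (win fraction) * (pot of n_entries units) - 1."""
    return wins / n_sims * n_entries - 1.0

print(round(100_000 / 37))                             # break-even: about 2703 wins
print(round(expected_return(8823, 100_000, 37), 2))    # Garstka's 8823 wins: +2.26
print(round(expected_return(1000, 100_000, 37), 2))    # an entry with 1000 wins: -0.63
```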
It is notable that there is a 2-fold difference in the performance of the best (Garstka) versus the worst (beise) EPM pool sheet. This indicates that there are important factors not fully captured by the EPM analysis. But it is unclear what these factors are, and it may not be possible to predict these factors in advance to improve on an EPM pool sheet. Note that the performance of an entry can be cut in half if someone by chance happens to bet a similar entry.
One factor in the Garstka sheet's performance seems to be that it advanced 4-seed Kansas to the Final Four rather than 2-seed Arizona. Kansas and Arizona provide about the same expected score, but the Kansas pick tends to distinguish the Garstka sheet, since only the Garstka and Kaplan sheets advanced Kansas. Also, the Garstka sheet was unique in advancing Wake to the final 8. Other simulations (data not reported) indicate that these are important factors. (Ed Kaplan brought these factors to my attention in private communications.)
The Garstka advantage might point to an advantage for this particular Kaplan-Garstka Markov model over all the other Markov models, but this observation is not statistically significant. Similar results for at least three different tournament years would be needed for this observation to be considered statistically significant at the .05 level. (With four EPM models in play, the chance that one pre-specified model would come out on top in each of n independent years is roughly (1/4)^n, which falls below .05 only when n is at least 3.)
Poologic had the best mean score, but this is due to the fact that the Poologic Markov model was used in the simulation.
One additional improvement might be to use 2-seed Arizona rather than 1-seed Illinois as the fourth championship pick, since the EPM sheets tended to pick Arizona over Illinois for the Final Four in 2001.
Table 5
(Note that the Stanford prediction is correct due to the fact that the EPM sheet strength was set to make it correct.) The ROI predictions are generally good. Duke is the worst prediction, off by a factor of about 2. The ROI predictions (not shown) for North Carolina, Kentucky, and Maryland are also reasonably consistent with the Table 4 results.
These ROI inputs were derived with the pool and the simulation results in hand. If the default EPM sheet strength of 6 is used in the ROI calculator, then the ROIs are too large by a factor of about 2. And I did not have to guess the number of sheets that would be picked for each champion, as one has to do when using the ROI calculator prior to a tournament. So, larger errors are to be expected in the real-world application of the ROI calculator.
The main application of the ROI calculator is to determine which champions to bet. Therefore, the relative order of the ROIs is perhaps more important than the ROI values. The ROI calculator picks Stanford as the best contrarian pick, and the simulation supports this. In general, the relative order of the predicted ROIs is consistent with the simulation results in Table 4.
As pointed out in the ROI calculator documentation, it is unclear how to apply the ROI calculator to a pool like the Packard #2 pool. The ROI calculator does a poor job of predicting the Packard #2 pool simulation results represented in Table 3. The main problem seems to be that the ROI calculator assumes that it is necessary to pick the correct champion to win the pool, but this is not true according to the Table 3 results. For instance, the Poologic sheet for Michigan loses at least 50% of the time when Michigan wins. This is due to the importance of upset incentives in the Packard #2 pool.
The results indicate that the strategy must be tailored to the scoring rules of the pool.
For pools like Packard #1, with only standard scoring and a relatively high number of points awarded for correctly picking the champion, the EPM strategy does poorly. It is better to make a contrarian pick for champion, avoiding the consensus favorite. A multiple-entry strategy based on varying the championship pick performs quite well.
For pools like Packard #2, with significant incentives for predicting upsets, the EPM strategy is very powerful. A multiple-entry strategy based on varying the championship pick in an EPM sheet does a good job, but it seems that betting multiple estimated EPM sheets based on multiple Markov models does even better.
Many office pools have rules similar to the Packard #1 pool. Some have upset incentives as large as or larger than the Packard #2 pool. But some pools fall in the middle, with upset incentives that are smaller than Packard #2's incentives relative to the standard scoring factors in the pool. For these pools, it is still unclear whether a contrarian or EPM strategy is best.
Note that the best strategy is based on assumptions about the competition. In some pools, the nationwide consensus favorite might not be the local favorite. If use of the EPM strategy becomes widespread, then its advantage will decrease. Poologic received about 4000 hits in 2001, so we seem to be a long way from saturation knowledge of the EPM strategy at this point.
The overperformance of the Garstka sheet in Packard #2 implies that there may still be opportunities for improvement in pool strategy. If the factors that make one EPM sheet outperform another can be identified without access to all the pool entries, then further improvements in strategy may be possible.
The results support the use of the ROI calculator as an aid in predicting the best champion or champions to bet in a pool with only standard scoring. The values of the ROIs were off by a factor of 2 or more, but the relative order of the ROIs was approximately correct for the Packard #1 pool. However, it is not clear that the ROI calculator improves much on the obvious contrarian strategy of betting on the #2 team in the pools (Stanford) rather than the overbet #1 team (Duke).
The validity of generalizations from these simulation results is limited by the fact that only two pools from the same year are analyzed. If you are a pool manager or have access to the details of all entries in a pool, please send them to me at tadamsmar@aol.com so that I can extend this analysis. The names associated with entries sent to me will not be revealed, in order to protect privacy. (Privacy was not a concern with the Packard pool entries since they were already posted on the web.)
[1] "How to Play Office Pools If You Must" by David Breiter and Bradley Carlin (Chance Vol. 10, No 1, 1997, pp. 5-11)
[2] "March Madness and the Office Pool" by Edward H. Kaplan and Stanley J. Garstka (Management Science Vol. 7, No 3, March 2001, pp. 369-382)