Let’s wrap up this series of posts about logistic regression and predicting awards (here’s part one and part two if you missed them).
Accuracy
We last left off with a working model. Let’s look at how it performs on 2018 after training on data from 2010-2017:
Award | Accuracy |
---|---|
Event Winner | 73.76% |
Event Finalist | 68.74% |
Chairman’s | 89.20% |
Engineering Inspiration | 88.30% |
Creativity | 90.10% |
Engineering Excellence | 90.16% |
Entrepreneurship | 90.23% |
Dean’s List Finalist | 87.46% |
Gracious Professionalism | 89.07% |
Imagery | 89.06% |
Industrial Design | 89.77% |
Safety | 91.83% |
Innovation in Control | 89.00% |
Quality | 89.32% |
Spirit | 89.78% |
Volunteer | 98.26% |
Woodie Flowers | 95.31% |
Judges | 89.26% |
Not bad! Keep in mind that we can’t predict the autonomous award because it’s only been awarded twelve times in history; the sample size is just too small. Also, accuracy isn’t the whole story here: most teams don’t win any given award in a given year, so a model can rack up high accuracy while rarely picking winners. The interested reader can go through and compute the precision and recall, which are not nearly as good.
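If you want to reproduce numbers like these, here’s roughly what the evaluation loop looks like. This is a minimal sketch using scikit-learn (which may differ from the exact setup in the earlier posts), and the `X_train`, `y_train`, `X_test`, and `y_test` names are placeholders for the per-award feature matrices and binary labels we built before:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score

def evaluate_award(X_train, y_train, X_test, y_test):
    """Fit one logistic regression for a single award on 2010-2017 data
    and score it on the held-out 2018 season."""
    model = LogisticRegression(solver="lbfgs")
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, preds),
        # Precision: of the teams we predicted to win, how many actually did?
        "precision": precision_score(y_test, preds),
        # Recall: of the teams that actually won, how many did we flag?
        "recall": recall_score(y_test, preds),
    }
```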
Coefficient Analysis
If you were with us from the beginning, you’ll recall that our features are just a geometric series of past award wins. If we look at the coefficients of each of these models, we see some pretty interesting things (a short sketch of how to inspect them follows this list):
- For some awards, past wins heavily predict future wins. For example, the single dominating factor in predicting the safety award, imagery award, and entrepreneurship award is having won that same award before.
- The engineering inspiration award is widely regarded as very similar to the Chairman’s award. Interestingly, the model learned this: the coefficients place almost the same weight on the two when predicting future Chairman’s performance.
- Some awards seem to not be very related. The coefficients suggest that winning the Chairman’s award is not a strong indicator of winning the engineering excellence, imagery, or innovation in control awards. In contrast, the awards that make up the engineering “quinfecta” (the creativity, engineering excellence, industrial design, innovation in control, and quality awards) are good indicators of winning other quinfecta awards.
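Pulling these coefficients out is straightforward. A sketch, assuming one fitted scikit-learn `LogisticRegression` per award in a dict called `models` (a name I’m making up here, not something from the earlier posts) and a `feature_names` list naming the past-award feature behind each column:

```python
def top_coefficients(models, feature_names, n=5):
    """Print the past awards each model weights most heavily."""
    for award, model in models.items():
        weights = model.coef_[0]  # coef_ has shape (1, n_features) for binary problems
        ranked = sorted(zip(feature_names, weights), key=lambda p: -abs(p[1]))
        print(award)
        for name, weight in ranked[:n]:
            print(f"  {name}: {weight:+.3f}")
```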
So who’s going to win these awards at the events I’m attending in 2019?
It turns out that, given the way our model is constructed, this isn’t an easy question to answer. The main problem is that different events have different numbers of teams, and a variable number of input dimensions is not something machine learning usually handles nicely. We dealt with a similar problem using an exponential decay factor in a previous post, but that won’t help us here: that was dimensionality over time, and this is more like dimensionality over space. How are we going to get out of this one?
But there is a way. As it turns out, our model computes a log-odds score for each team (the raw value before it gets squashed into a probability). Most neural network classifiers are built to do the same thing, but then softmax those scores together into final probabilities for each of the classes. We can do the same thing here: score every team registered for an event, then softmax across them (a sketch of this step follows the table). Without further ado, here are my predictions for the 2019 Utah Regional, using registration data from December:
Award | Team | Probability |
---|---|---|
Event Winner | 399 | 14.89% |
Event Finalist | 399 | 14.30% |
Chairman’s | 399 | 38.90% |
Engineering Inspiration | 399 | 28.90% |
Creativity | 971 | 14.00% |
Engineering Excellence | 399 | 22.58% |
Entrepreneurship | 399 | 21.34% |
Dean’s List Finalist | 399 | 19.32% |
Gracious Professionalism | 399 | 21.87% |
Imagery | 3239 | 12.43% |
Industrial Design | 399 | 25.27% |
Safety | 3239 | 45.16% |
Innovation in Control | 971 | 21.78% |
Quality | 399 | 24.02% |
Spirit | 2102 | 30.47% |
Volunteer | 399 | 40.82% |
Woodie Flowers | 399 | 25.26% |
Judges | 2102 | 11.90% |
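Here’s a sketch of that softmax step. With scikit-learn, `decision_function` gives the raw log-odds score for each row, so we can score every registered team and normalize across the event (the function and variable names here are mine, not from the earlier posts):

```python
import numpy as np

def event_probabilities(model, X_event, team_numbers):
    """Softmax one award's log-odds scores across an event's teams."""
    logits = model.decision_function(X_event)  # one log-odds score per team
    exps = np.exp(logits - logits.max())       # subtract the max for numerical stability
    probs = exps / exps.sum()                  # probabilities now sum to 1 over the event
    best = int(np.argmax(probs))
    return team_numbers[best], probs[best]
```

Note that this neatly sidesteps the variable-dimensionality problem: the model still scores one team at a time, and the event size only matters at the normalization step.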
“But Brennon,” you might ask, “I’m not going to the Utah Regional. What about my particular event?” Because this technique lets us construct hypothetical events at will, I think the best way to answer this for everyone simultaneously is to create an event that every team in FRC attends. Here you go:
Award | Team |
---|---|
Event Winner | 67 |
Event Finalist | 67 |
Chairman’s | 1241 |
Engineering Inspiration | 1241 |
Creativity | 67 |
Engineering Excellence | 67 |
Entrepreneurship | 1676 |
Dean’s List Finalist | 1241 |
Gracious Professionalism | 217 |
Imagery | 2246 |
Industrial Design | 67 |
Safety | 1622 |
Innovation in Control | 469 |
Quality | 67 |
Spirit | 1266 |
Volunteer | 1986 |
Woodie Flowers | 1305 |
Judges | 217 |
And, to answer the question this blog post title raises, I’d bet on 1241.
A few caveats:
- If your team isn’t on these lists, don’t despair! First of all, keep in mind that we didn’t explicitly train our model to do this. We’re also only showing the model’s single most likely pick and omitting some pretty competitive runners-up. Further, none of the predictions are very confident; the most confident selections for Utah are still under 50%.
- Due to the large number of teams, it isn’t meaningful to quote confidences for the all-of-FRC event. Just know that the model has very little confidence here: under a single percent for each of the teams I’ve listed.
- We’re assuming that winning an award at some point in a year is a good proxy for winning it at a particular event. That’s probably a reasonable assumption, but I would expect the model to be much less accurate here because of the relationship between variance and sample size. Put simply, it’s a lot easier to predict averages than single instances.
- The model has no concept of event difficulty. It may be that some events are more competitive for certain awards than others. It also doesn’t consider multiple wins of the same award within a year.
- Because we’ve structured this as separate logistic regressions, we’ve implicitly assumed that the probability of winning each award is independent of winning the others. This is a pretty bad assumption: as good as they are, it’s unlikely that 399 will sweep every award the model predicts for them at Utah. The same goes for 67 in the all-of-FRC event.
In Summary
We’ve learned a lot together over the last few posts! We’ve learned what logistic regression is, written some Python code, and used the API for The Blue Alliance. We’ve harnessed the power of logistic regression to make some pretty compelling predictions. And perhaps most importantly, we’ve been able to inspect the internals of our model and gain insight into the problem it’s solving.
We’ll see you in our next post.