Let’s wrap up this series of posts about logistic regression and predicting awards (here’s part one and part two if you missed them).
Accuracy
We last left off with a working model. Let’s look at how it performs on 2018 after training on data from 2010-2017:
Award | Accuracy |
---|---|
Event Winner | 73.76% |
Event Finalist | 68.74% |
Chairman’s | 89.20% |
Engineering Inspiration | 88.30% |
Creativity | 90.10% |
Engineering Excellence | 90.16% |
Entrepreneurship | 90.23% |
Dean’s List Finalist | 87.46% |
Gracious Professionalism | 89.07% |
Imagery | 89.06% |
Industrial Design | 89.77% |
Safety | 91.83% |
Innovation in Control | 89.00% |
Quality | 89.32% |
Spirit | 89.78% |
Volunteer | 98.26% |
Woodie Flowers | 95.31% |
Judges | 89.26% |
Not bad! Keep in mind that we can’t predict the autonomous award because it’s only been awarded twelve times in history; the sample size is just too small. Also, accuracy isn’t the whole story here: most teams don’t win any given award in a given year, so a model can rack up high accuracy while rarely picking winners. The interested reader can go through and compute the precision and recall, which are not nearly as good.
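If you want to reproduce numbers like these, here’s roughly what the evaluation loop looks like. This is a minimal sketch using scikit-learn (which may differ from the exact setup in the earlier posts), and the `X_train`, `y_train`, `X_test`, and `y_test` names are placeholders for the per-award feature matrices and binary labels we built before:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score

def evaluate_award(X_train, y_train, X_test, y_test):
    """Fit one logistic regression for a single award on 2010-2017 data
    and score it on the held-out 2018 season."""
    model = LogisticRegression(solver="lbfgs")
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, preds),
        # Precision: of the teams we predicted to win, how many actually did?
        "precision": precision_score(y_test, preds),
        # Recall: of the teams that actually won, how many did we flag?
        "recall": recall_score(y_test, preds),
    }
```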
Coefficient Analysis
If you were with us from the beginning, you’ll recall that our features are just a geometric series of past award wins. If we look at the coefficients of each of these models, we see some pretty interesting things (a short sketch of how to inspect them follows this list):
- For some awards, past wins heavily predict future wins. For example, the single dominating factor in predicting the safety award, imagery award, and entrepreneurship award is having won that same award before.
- The engineering inspiration award is widely regarded as very similar to the Chairman’s award. Interestingly, the model learned this: the coefficients place almost the same weight on the two when predicting future Chairman’s performance.
- Some awards seem to not be very related. The coefficients suggest that winning the Chairman’s award is not a strong indicator of winning the engineering excellence, imagery, or innovation in control awards. In contrast, the awards that make up the engineering “quinfecta” (the creativity, engineering excellence, industrial design, innovation in control, and quality awards) are good indicators of winning other quinfecta awards.
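Pulling these coefficients out is straightforward. A sketch, assuming one fitted scikit-learn `LogisticRegression` per award in a dict called `models` (a name I’m making up here, not something from the earlier posts) and a `feature_names` list naming the past-award feature behind each column:

```python
def top_coefficients(models, feature_names, n=5):
    """Print the past awards each model weights most heavily."""
    for award, model in models.items():
        weights = model.coef_[0]  # coef_ has shape (1, n_features) for binary problems
        ranked = sorted(zip(feature_names, weights), key=lambda p: -abs(p[1]))
        print(award)
        for name, weight in ranked[:n]:
            print(f"  {name}: {weight:+.3f}")
```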
So who’s going to win these awards at the events I’m attending in 2019?
It turns out that, given the way our model is constructed, this isn’t an easy question to answer. The main problem is that different events have different numbers of teams, and a variable number of input dimensions is not something machine learning usually handles nicely. We dealt with a similar problem using an exponential decay factor in a previous post, but that won’t help us here: that was dimensionality over time, and this is more like dimensionality over space. How are we going to get out of this one?
But there is a way. As it turns out, our model computes a log-odds score for each team (the raw value before it gets squashed into a probability). Most neural network classifiers are built to do the same thing, but then softmax those scores together into final probabilities for each of the classes. We can do the same thing here: score every team registered for an event, then softmax across them (a sketch of this step follows the table). Without further ado, here are my predictions for the 2019 Utah Regional, using registration data from December:
Award | Team | Probability |
---|---|---|
Event Winner | 399 | 14.89% |
Event Finalist | 399 | 14.30% |
Chairman’s | 399 | 38.90% |
Engineering Inspiration | 399 | 28.90% |
Creativity | 971 | 14.00% |
Engineering Excellence | 399 | 22.58% |
Entrepreneurship | 399 | 21.34% |
Dean’s List Finalist | 399 | 19.32% |
Gracious Professionalism | 399 | 21.87% |
Imagery | 3239 | 12.43% |
Industrial Design | 399 | 25.27% |
Safety | 3239 | 45.16% |
Innovation in Control | 971 | 21.78% |
Quality | 399 | 24.02% |
Spirit | 2102 | 30.47% |
Volunteer | 399 | 40.82% |
Woodie Flowers | 399 | 25.26% |
Judges | 2102 | 11.90% |
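Here’s a sketch of that softmax step. With scikit-learn, `decision_function` gives the raw log-odds score for each row, so we can score every registered team and normalize across the event (the function and variable names here are mine, not from the earlier posts):

```python
import numpy as np

def event_probabilities(model, X_event, team_numbers):
    """Softmax one award's log-odds scores across an event's teams."""
    logits = model.decision_function(X_event)  # one log-odds score per team
    exps = np.exp(logits - logits.max())       # subtract the max for numerical stability
    probs = exps / exps.sum()                  # probabilities now sum to 1 over the event
    best = int(np.argmax(probs))
    return team_numbers[best], probs[best]
```

Note that this neatly sidesteps the variable-dimensionality problem: the model still scores one team at a time, and the event size only matters at the normalization step.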
“But Brennon,” you might ask, “I’m not going to the Utah Regional. What about my particular event?” Because this technique lets us construct hypothetical events at will, I think the best way to answer this for everyone simultaneously is to create an event that every team in FRC attends. Here you go:
Award | Team |
---|---|
Event Winner | 67 |
Event Finalist | 67 |
Chairman’s | 1241 |
Engineering Inspiration | 1241 |
Creativity | 67 |
Engineering Excellence | 67 |
Entrepreneurship | 1676 |
Dean’s List Finalist | 1241 |
Gracious Professionalism | 217 |
Imagery | 2246 |
Industrial Design | 67 |
Safety | 1622 |
Innovation in Control | 469 |
Quality | 67 |
Spirit | 1266 |
Volunteer | 1986 |
Woodie Flowers | 1305 |
Judges | 217 |
And, to answer the question this blog post title raises, I’d bet on 1241.
A few caveats:
- If your team isn’t on these lists, don’t despair! First of all, keep in mind that we didn’t explicitly train our model to do this. We’re also only showing the model’s single most likely pick and omitting some pretty competitive runners-up. Further, none of the predictions are very confident; the most confident selections for Utah are still under 50%.
- Due to the large number of teams, it isn’t meaningful to quote confidences for the all-of-FRC event. Just know that the model has very little confidence here: under a single percent for each of the teams I’ve listed.
- We’re assuming that winning an award at some point in a year is a good proxy for winning it at a particular event. That’s probably a reasonable assumption, but I would expect the model to be much less accurate here because of the relationship between variance and sample size. Put simply, it’s a lot easier to predict averages than single instances.
- The model has no concept of event difficulty. It may be that some events are more competitive for certain awards than others. It also doesn’t consider multiple wins of the same award within a year.
- Because we’ve structured this as separate logistic regressions, we’ve implicitly assumed that the probability of winning each award is independent of winning the others. This is a pretty bad assumption: as good as they are, it’s unlikely that 399 will sweep every award the model predicts for them at Utah. The same goes for 67 in the all-of-FRC event.
In Summary
We’ve learned a lot together over the last few posts! We’ve learned what logistic regression is, written some Python code, and used the API for The Blue Alliance. We’ve harnessed the power of logistic regression to make some pretty compelling predictions. And perhaps most importantly, we’ve been able to inspect the internals of our model and gain insight into the problem it’s solving.
We’ll see you in our next post.