Today’s tech talk is about a new feature on The Blue Alliance for this season: predicted match times. It is an open secret among our community that events rarely run on schedule and that published match start times are often meaningless. That changes this season: TBA now shows a live-updating prediction of when each match will start. This blog post gets into the weeds of how these predictions are generated, which turns out to be a surprisingly tricky problem. It is an important one to pin down, since accurately guessing when a match will start is a key component of the BlueZone (our beta automatic webcast switcher), but more on that later.
Let’s start with some definitions and assumptions. A match cycle is the period between successive match starts: it includes one match, the reset time from the previous match, and team introductions. The cycle time is the amount of wall-clock time elapsed between two successive match starts. Typically, events are scheduled with the assumption of 7-minute cycles. If you look at the distribution of cycle times from a few past events, it is centered around 7 minutes but has a “long tail” of slower cycles. Typically, these slow cycles are due to robots not connecting to the field and requiring FTA intervention, field repairs, or other technical difficulties. These delays are essentially random and very difficult to predict. Thankfully, they’re infrequent, although it takes a long time for the schedule to recover from their effects. Let’s start by examining the distribution of qualification match cycle times this year.
Interestingly, this is pretty close to a skewed Gaussian distribution centered at about µ=400 seconds (slightly less than 7-minute cycles) with a standard deviation σ=120 seconds and a skew parameter γ=4.9. If we draw the (scaled) probability density function over the histogram, it’ll fit nicely.
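For readers who want to draw that curve themselves, here is a minimal sketch of the skew-normal density. It assumes that µ, σ, and γ map onto the location, scale, and shape parameters of the standard skew-normal parameterization; the post does not say which parameterization was used for the fit, so treat the defaults as illustrative. The function name `skew_normal_pdf` is made up for this example.

```python
import math


def skew_normal_pdf(x, loc=400.0, scale=120.0, shape=4.9):
    """Skew-normal density at x (seconds).

    Assumption: loc, scale, and shape stand in for the post's mu, sigma,
    and gamma. The original fit may use a different parameterization.
    """
    z = (x - loc) / scale
    # Standard normal density evaluated at z.
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Standard normal CDF evaluated at shape * z; this term adds the skew.
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * big_phi


# Evaluate the density on a grid of cycle times, e.g. to overlay on a histogram.
curve = [(t, skew_normal_pdf(t)) for t in range(300, 1000, 50)]
```

To overlay it on a histogram of counts, multiply each density value by the number of samples times the bin width.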
Next, let’s briefly examine how these outlier cycles affect the overall schedule. It’s very common to get behind due to slow cycles, but it’s much harder to pull off many fast cycles to make that time up.
This data shows us the average range of event “on-timeness”. Events can run over two hours behind schedule, but they are rarely more than 15 minutes early. This is because FTAs want the event to be on schedule whenever possible. They can adjust the speed of match flow to shift the field earlier or later. Of course, it’s easier to waste time (hello, dance break!) than it is to make it up. We will want our predictions to err on the side of being early. The time prediction algorithm must take this into account and try to “skew” its error in that direction.
The first step of predicting match times is to find the ideal cycle time. This is the time it takes to run a match, reset the field, and do team introductions. In other words, this is the cycle time less any random delays. Once we find this time, we can assume that each future cycle will take this long. We find this value by taking into account all cycle times for played matches on the current day. This calculation has a few quirks, however:
- Matches with a scheduled gap of longer than 15 minutes are ignored. We don’t want lunch breaks skewing the calculations.
- Cycles that take more than 150% of the scheduled time are ignored. In such cases, it is likely there were connectivity delays or a field repair before the match that delayed its start. Since we want to undershoot match start times, these outlier cycles are ignored.
- We assume that “the schedule is king”, so cycle times are slightly biased towards the scheduled time:
  biased_cycle = 0.7 * real_cycle + 0.3 * scheduled_cycle
- We take the 35th percentile of the biased cycle times. Again, we want our computed cycle times to be early and unaffected by long outlier cycles.
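The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production TBA code: the names `ideal_cycle_time` and `percentile` are hypothetical, matches are represented as `(scheduled_start, actual_start)` pairs in seconds, and the percentile uses linear interpolation because the post does not say which method is used.

```python
def percentile(values, pct):
    """Percentile with linear interpolation between ranks (an assumed method)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty list is undefined")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)  # rank is non-negative, so int() floors it
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def ideal_cycle_time(played, max_gap=15 * 60, max_ratio=1.5,
                     real_weight=0.7, pct=35):
    """Estimate the ideal cycle time in seconds, or None if no cycle qualifies.

    `played` is a list of (scheduled_start, actual_start) pairs for the
    matches already played today, in match order (hypothetical data shape).
    """
    biased_cycles = []
    for (prev_sched, prev_real), (cur_sched, cur_real) in zip(played, played[1:]):
        scheduled_cycle = cur_sched - prev_sched
        real_cycle = cur_real - prev_real
        if scheduled_cycle > max_gap:
            continue  # scheduled break (e.g. lunch), not a normal cycle
        if real_cycle > max_ratio * scheduled_cycle:
            continue  # outlier, likely a connectivity delay or field repair
        # "The schedule is king": pull the measured cycle towards the schedule.
        biased_cycles.append(real_weight * real_cycle
                             + (1.0 - real_weight) * scheduled_cycle)
    if not biased_cycles:
        return None
    return percentile(biased_cycles, pct)


# Four cycles on a 7-minute schedule; the 900-second cycle is discarded.
played = [(0, 0), (420, 400), (840, 800), (1260, 1700), (1680, 2100)]
cycle = ideal_cycle_time(played)
```

Picking a percentile below the median is what pushes the estimate early: even among the cycles that survive the outlier filter, the slower half has little influence on the result.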
Once we have the ideal cycle time, we can apply predictions to future unplayed matches. We know the real start time of the most recently played match, so we add one ideal cycle length for each subsequent match.
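As a rough sketch of that projection step (the function name `predict_start_times` is hypothetical, and times are plain seconds):

```python
def predict_start_times(last_actual_start, cycle_seconds, unplayed_count):
    """Predict start times for the next `unplayed_count` matches.

    Each upcoming match is assumed to start one ideal cycle after the one
    before it, counting from the most recently played match. Scheduled
    breaks are not handled here; the post does not describe how they are
    treated, so that is left out rather than guessed.
    """
    return [last_actual_start + cycle_seconds * (i + 1)
            for i in range(unplayed_count)]


# Last match started at t=1000 s; ideal cycle is 406 s; three matches remain.
upcoming = predict_start_times(1000.0, 406.0, 3)
```

Because every prediction is anchored to the latest real start, the whole list shifts as soon as another match is played, which is what makes the displayed times live-updating.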
Let’s see how the algorithm performs!
In conclusion, this is a decent attempt at providing more accurate schedule data at events. We can reliably predict a match within 5 minutes of its actual start. Ideally, we still want predictions to be more accurate, otherwise BlueZone will get boring quickly. But this is pushing the limits of what we can do, given the known variability of cycle times. Furthermore, our predictions have to be a little extra lenient because we don’t know exactly when a match starts (instead, we only know when matches end). This means it’s difficult to tell if a 2-minute delay is due to the match not having started yet or if the match is already over but scores haven’t been posted yet.
Next year, we’re looking at using Computer Vision technology (similar to what powered @FRC_Replay during champs) to determine when matches actually start by inspecting the Audience Display overlay on the webcast. We aim to always be improving our offerings to make it easier for you to follow your favorite teams.
Like this kind of thing? All the code for this blog post is on my GitHub. In addition, The Blue Alliance is an open source community driven project and we’re always looking for new contributors. Drop us a line if you’d like to get involved!