At The Blue Alliance, we pride ourselves on maintaining the largest database of FRC event results. This data powers everything we do – showing it to you on thebluealliance.com and in the Android app, calculating numerical insights into how various games play out, and powering your own applications via our APIs.
Often, people ask us interesting questions – like “how many times have teams 254 and 971 played against each other?” Traditionally, one of the admins would write a script in python that would calculate this by directly querying the TBA datastore or anyone could write a script using the API. While a fun exercise in programming, this is not always the best approach. Having an easier way to compute various queries over the TBA data would unblock many insightful observations across the FRC community.
To solve this, we have created a public Google BigQuery dataset with all of TBA’s data. BigQuery is a large-scale data warehouse that is easy to query. You can run queries across the TBA database using standard SQL 2011. The dataset is available at https://thebluealliance.com/bigquery.
Note: The schema of the BigQuery tables exactly matches the schema of the internal TBA datastore, so some fields have been deprecated or are not ideally represented. Many times, enums are stored as integers – you can match them to human readable values in the code
For example, if you wanted to answer the above question (“how many times have teams 254 and 971 played against each other?”), you can run the following query:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#standardSQL | |
SELECT match_key, red_score, blue_score, red_teams, blue_teams | |
FROM ( | |
SELECT __key__.name as match_key, team_key_names, | |
JSON_EXTRACT_SCALAR(alliances_json, '$.red.score') as red_score, | |
JSON_EXTRACT_SCALAR(alliances_json, '$.blue.score') as blue_score, | |
REGEXP_EXTRACT_ALL(JSON_EXTRACT(alliances_json, '$.red.teams'), '"(frc[0-9]+)"') AS red_teams, | |
REGEXP_EXTRACT_ALL(JSON_EXTRACT(alliances_json, '$.blue.teams'), '"(frc[0-9]+)"') AS blue_teams | |
FROM `tbatv-prod-hrd.the_blue_alliance.match` | |
WHERE 'frc971' IN UNNEST(team_key_names) AND 'frc254' IN UNNEST(team_key_names)) | |
WHERE 'frc971' IN UNNEST(red_teams) AND 'frc254' IN UNNEST(blue_teams) | |
OR 'frc971' IN UNNEST(blue_teams) AND 'frc254' IN UNNEST(red_teams) | |
ORDER BY match_key | |
LIMIT 100 |
In less than 5 seconds, we can see that the most recent of these matches was Quals 59 at the 2017 San Francisco Regional.
What if we wanted to find the event that had the most Unicorn Matches (where the winning alliance obtained the Rotor Bonus and KPA Bonus) during qualifications this year? Easy.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#standardSQL | |
SELECT | |
event_key, | |
COUNT(*) as unicorn_count | |
FROM ( | |
SELECT | |
__key__.name as match_key, | |
event.name as event_key, | |
JSON_EXTRACT_SCALAR(alliances_json, '$.red.score') as red_score, | |
JSON_EXTRACT_SCALAR(alliances_json, '$.blue.score') as blue_score, | |
JSON_EXTRACT_SCALAR(score_breakdown_json , '$.red.kPaRankingPointAchieved') as red_kpa, | |
JSON_EXTRACT_SCALAR(score_breakdown_json , '$.red.rotorRankingPointAchieved') as red_rotor, | |
JSON_EXTRACT_SCALAR(score_breakdown_json, '$.blue.kPaRankingPointAchieved') as blue_kpa, | |
JSON_EXTRACT_SCALAR(score_breakdown_json, '$.blue.rotorRankingPointAchieved') as blue_rotor | |
FROM `tbatv-prod-hrd.the_blue_alliance.match` | |
WHERE comp_level = 'qm' | |
) | |
WHERE (red_score > blue_score AND red_kpa = 'true' AND red_rotor = 'true') OR (blue_score > red_score AND blue_kpa = 'true' AND blue_rotor = 'true') | |
GROUP BY event_key | |
ORDER BY unicorn_count DESC |
Running this query, we see that IRI far and away has the most Unicorn Matches, but the Darwin Division and the Carver Division were tied for the most of any official event.
As an exercise, try changing this query to find out if there were any qualification matches where an alliance earned both bonuses but still lost.
In the coming weeks, we will build a system that automatically keeps this dataset up-to-date with the TBA database, so you’ll always be able to get the most accurate results for your queries.
We hope that you’ll enjoy having the full power of the TBA dataset available to query and can compute some interesting insights. When you do, don’t forget to share them with the community!