Tech Talk: We <3 Big Data

At The Blue Alliance, we pride ourselves on maintaining the largest database of FRC event results. This data powers everything we do – showing it to you on thebluealliance.com and in the Android app, calculating numerical insights into how various games play out, and powering your own applications via our APIs.

Often, people ask us interesting questions – like “how many times have teams 254 and 971 played against each other?” Traditionally, one of the admins would write a script in python that would calculate this by directly querying the TBA datastore or anyone could write a script using the API. While a fun exercise in programming, this is not always the best approach. Having an easier way to compute various queries over the TBA data would unblock many insightful observations across the FRC community.

To solve this, we have created a public Google BigQuery dataset with all of TBA’s data. BigQuery is a large-scale data warehouse that is easy to query. You can run queries across the TBA database using standard SQL 2011. The dataset is available at https://thebluealliance.com/bigquery.

Note: The schema of the BigQuery tables exactly matches the schema of the internal TBA datastore, so some fields have been deprecated or are not ideally represented. Many times, enums are stored as integers – you can match them to human readable values in the code

For example, if you wanted to answer the above question (“how many times have teams 254 and 971 played against each other?”), you can run the following query:

In less than 5 seconds, we can see that the most recent of these matches was Quals 59 at the 2017 San Francisco Regional.

What if we wanted to find the event that had the most Unicorn Matches (where the winning alliance obtained the Rotor Bonus and KPA Bonus) during qualifications this year? Easy.

Running this query, we see that IRI far and away has the most Unicorn Matches, but the Darwin Division and the Carver Division were tied for the most of any official event.

Screenshot from 2017-08-31 22-10-49

As an exercise, try changing this query to find out if there were any qualification matches where an alliance earned both bonuses but still lost.

In the coming weeks, we will build a system that automatically keeps this dataset up-to-date with the TBA database, so you’ll always be able to get the most accurate results for your queries.

We hope that you’ll enjoy having the full power of the TBA dataset available to query and can compute some interesting insights. When you do, don’t forget to share them with the community!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s