Tech Talk: Migrating the TBA Backend to Python 3

At the start of the 2021 season, we announced that the majority of the site was being served by a new python3 backend. Now, two and a half years later, we can report that the legacy python2 application has been fully decommissioned. We’re incredibly proud that a volunteer team was able to complete this migration with minimal site downtime. Let’s take a look back at the migration and share some lessons we learned along the way.

TBA’s backend is built in python and runs on the Google App Engine standard runtime. The database is Google Cloud Datastore, which we access through the NDB ORM library. There are periodic jobs which fetch data from upstream sources (like FIRST’s APIs) or do some computations and write the results to the datastore. Separately, the site’s frontend queries the data requested by the user from the datastore and renders the response that’s ultimately returned to your browser.

Benefits of Python3

Even if python2 had not reached end-of-life, there would still be a number of reasons the site is better served by a modern and actively supported stack, for both developer experience and performance.

I think type checking is one of the biggest reasons – in any large python project, it becomes difficult to track the expected schemas of objects, or to know “can this variable be None or will it always exist?”. We use the pyre type checker to validate type hints in the code, which allows development to move faster and with more confidence than it could before.
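As a minimal sketch of the “can this be None?” question, here is how a type hint makes the answer explicit. The `Team` class and `display_name` function are hypothetical names for illustration, not code from the TBA repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Team:
    # Hypothetical model, not TBA's actual schema.
    key: str
    # Optional[str] tells readers and the type checker that this may be None.
    nickname: Optional[str] = None


def display_name(team: Team) -> str:
    # Without this None check, a type checker such as pyre would report
    # that an Optional[str] is being returned where a str is promised.
    if team.nickname is None:
        return team.key
    return team.nickname


print(display_name(Team(key="frc254")))
print(display_name(Team(key="frc254", nickname="The Cheesy Poofs")))
```

The annotation records in the code itself which fields may be missing, so nobody has to remember it.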

Principles of the Migration

Incrementalism – We’d have been doomed to failure if we had structured the migration so that we built up the new version independently from the old one and cut the site over all at once. We wouldn’t have been able to learn as we went from observing how the new stack works in production, and the risk of having to redesign something after we tried it would have been huge. By cutting over pages one at a time using App Engine’s dispatch routing and separate subdomains (py2 and py3), we could test out new approaches on a small scale and prioritize pages based on how heavily used they were. Plus, if there were any issues, we’d only break a single component of the site and could more easily roll it back.
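The page-by-page cutover can be modeled in a few lines of plain Python. This is only an illustration of the idea: the real routing lives in App Engine’s dispatch configuration rather than in application code, and the URL patterns and the `pick_service` helper below are assumptions made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical routing table: (URL path pattern, service name).
# Rules are checked in order and the first match wins.
DISPATCH_RULES = [
    ("/team/*", "py3"),   # migrated pages are served by the new backend
    ("/event/*", "py3"),
    ("/*", "py2"),        # everything else stays on the legacy backend
]


def pick_service(path, rules=DISPATCH_RULES):
    """Return the name of the service that should handle `path`."""
    for pattern, service in rules:
        if fnmatchcase(path, pattern):
            return service
    return "default"


# Rolling back a single page type means dropping one rule; nothing else moves.
ROLLED_BACK = [rule for rule in DISPATCH_RULES if rule[0] != "/event/*"]

print(pick_service("/team/frc254"))
print(pick_service("/event/2021casj", ROLLED_BACK))
```

Because each rule covers one slice of the site, migrating or reverting a page is a one-line change that leaves every other route untouched.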

No Data Changes – Both python2 and python3 interact with the same NDB. Rewriting the application layer was a big enough lift, so we didn’t want to include a full data migration as well. Keeping the data format unchanged also kept the migration backwards compatible, because we didn’t want to risk one version writing data the other didn’t know how to process. We migrated the “read only” parts of the site and made sure they were robust before migrating the “write path” to further reduce the risk of data corruption. In fact, up until the very end, the python2 version of the site still worked – a testament to prioritizing compatibility at the data layer as a migration strategy.

Testable & Typed – Since we already had to touch essentially the entire codebase to make the necessary python3 syntax changes, we decided to take a bit of extra time and make the code better while we were there. This meant adding type hints and ensuring we had appropriate test coverage. Investing in improving the baseline code quality makes us more confident in the migration, and reduces the barriers to entry for future contributors – effort which will certainly pay off in the long run!

Boring – However, this was not the time for a full-stack rearchitecture of the site. We wanted to decouple functional changes to the site from the runtime migration as much as we could, because that reduced the scope of work, and an expectation of equivalent behavior makes it easier to validate changes. While scope creep is tempting, this is a time to pick the boring thing, and take the approach we know to work.

App Engine Bundled Services

App Engine originally came with a number of awesome libraries with a generous free tier, which made it very easy to build apps in the ecosystem. However, in 2020, few of these services had equivalents in python3 (and if they were present, they were no longer free). This is one of the reasons we put the migration off for so long, but once python2 went out of support, we were left with no choice.

This GitHub issue shows a comparison of the bundled services between python2 and python3. Many of the new offerings had compatibility or feature gaps, which made an in-place migration quite difficult. This wiki page tracks some of the compatibility hacks we had to make initially – like a datastore emulator for unit tests, specific test coverage for datastore backwards compatibility, batch sending of notifications, and a custom serialization implementation to maintain pickle compatibility between versions. At the beginning, we definitely spent the majority of our effort on re-implementing parts of the GAE builtins instead of focusing on TBA code, because the ecosystem was still not mature enough to make the migration straightforward.
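To give a feel for what pickle compatibility between versions involves, here is a minimal sketch using only the standard library. It is not TBA’s actual serialization code: the helper names are hypothetical, and the choice of `latin1` for decoding python2 strings is an assumption.

```python
import pickle

# Protocol 2 is the newest pickle protocol that python2 can read.
PY2_COMPATIBLE_PROTOCOL = 2


def dumps_compat(obj):
    """Serialize on python3 in a form that python2 can still load."""
    return pickle.dumps(obj, protocol=PY2_COMPATIBLE_PROTOCOL)


def loads_compat(data):
    """Load a pickle that may have been written by python2.

    encoding="latin1" (an assumption in this sketch) lets python3 decode
    python2 8-bit `str` values instead of failing on non-ASCII bytes.
    """
    return pickle.loads(data, encoding="latin1")


payload = dumps_compat({"team": "frc254", "year": 2021})
print(loads_compat(payload))

# A pickle of the python2 str 'frc254', written with protocol 0.
print(loads_compat(b"S'frc254'\np0\n."))
```

Python 3 defaults to a newer pickle protocol that python2 cannot read, so anything written to a shared store has to pin the protocol for as long as both versions are running.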

Thankfully, later in 2021, Google announced proper support for the bundled services on python3, which meant we could continue using the libraries with the same expectations as in python2. This was huge, because it let us get back to our core principles, re-evaluate how we want to interact with our dependencies, and ultimately remove a lot of janky code.

The last dependency that needed investigation was Cloud Endpoints, a framework that we used to build an RPC contract with the mobile apps (to register for push notifications, update myTBA preferences, etc). The library only supported Python2 and had since become unmaintained, so there was no officially recommended migration path available. Ultimately, we had all the pieces to implement what we needed much more simply, although it took a fair amount of investigation and reading the legacy code to figure out how to glue everything together.

Once we got the dependencies sorted out, the work became an exercise in getting all the remaining application code ported over. All in all, the Python3 version of TBA is over 40,000 lines of code (and about 50,000 lines of tests).

A Thank You

A huge thanks to everyone who has contributed to the py3 branch over the last 2.5 years – TBA is fully volunteer driven, and none of this would have been possible without the community’s contributions.

If you’d like to get involved with TBA, reach out to us on GitHub (we’re always looking for new contributors and would be happy to help get you ramped up!) or consider donating to help fund the site’s operating costs.

See below for the git shortlog summary of the work that’s been done since the beginning of the migration, and thanks again!

     8  AGPapa
     1  Abhay Shukla
     2  Alan
     4  Andrew Dassonville
    17  Andrew Ke
     1  Blake Bourque
     3  Caleb Denio
     1  Cheru Berhanu
     1  EmreAdabag
   244  Eugene Fang
     3  Greg Marra
     1  Greg Stanton Marra
     1  Jack R
     4  Jared Hasen-Klein
    19  Jordan Miller
     1  Josh Bacon
    11  Justin
     6  Justin Tervay
     1  Kim Flynn
     2  Michael Leong
     4  Ofek Ashery
   357  Phil Lopreiato
   243  Zachary Orr
     1  bovlb
     1  gaming
     1  julianewberry
