How We Made Rails Upgrades Manageable at Jane, Part 1

Rails upgrades don't have to be a dreaded, all-hands event. We share the repeatable framework our team uses to upgrade incrementally, shipping changes to production as we go, so the actual version switch is just an environment variable flip.

Hi, I'm Akshay, a developer on the Backend Foundations team at Jane. Our team is responsible for maintaining the foundations of Jane, and one of those is Rails, including Rails upgrades. This post is an attempt to share what our team has learnt from the upgrades of recent years.

Introduction

Framework upgrades in software development are inevitable - a bit like taxes. Just as taxes need to be tracked and paid to stay in sound financial standing, frameworks, languages, and their dependencies need to be kept up to date.

It sounds boring, but upgrades bring important benefits: they keep the application secure, help us adopt modern practices, and bring new features.

Jane started as a Rails 3 app, and as I write this we are working towards upgrading to Rails 8 (from 7.2). Along the way we’ve had plenty of opportunities to learn, and we’ve built a framework that we’ve used nearly a dozen times to keep upgrades smooth and predictable. We’d love to share it with you. Let’s dive in.

Upgrades - the manageable way

I want to begin by admitting that historically we were not good at upgrades, especially Rails upgrades. There were times when we fell behind the security-supported versions and stayed there for a long time. One thing we did realize, though, was that if we want to stay close to the edge (not leading, not trailing, but somewhere in the middle), our upgrades need to be predictable.

But how do you do that? Well, for us, the answer was to identify a repeatable framework. Every upgrade is different, but the shape of the work is mostly consistent.

The ease or difficulty of upgrading a language/framework depends on a lot of moving parts, but one of the most important factors is dependencies. In a Rails app, they are in the form of gems and can also be forks or monkey patches on top of something fundamental in Rails. The key is to try to identify them early in the process.

Another core part of our strategy is to make the upgrade part of day-to-day work. Whether it is fixing deprecation warnings or adopting changed syntax, we always upgrade little by little, releasing changes as we go. This has several advantages:

  • There’s never a long-running branch that needs constant attention
  • Dependency resolution and technical debt are paid down slowly over time
  • The work is divided into smaller chunks where domain experts can choose their own pace, risk mitigation and deploys
  • The actual version change deploys are seamless
  • The application is always in a state where it can be rolled back safely

Next, let’s talk about the broad strategy, but first - a few things about setup.

The Setup

Let's take a look at a few fundamental tools that work for us. They fit pretty well with the broad strategy. Feel free to skip to the strategy section and come back when needed.

Dual booting

Dual booting means setting up the app so that it can boot on both the current version and the next (target) version of the upgrade. It ties back to the fundamental strategy of not having to maintain a separate branch. While it is not required from the get-go, it's highly recommended, because it makes code changes that are not compatible with the current version of Rails easy to make.

Dual booting can either be achieved by using a bundler plugin such as bootboot from Shopify or by creating a symlink of a Gemfile. We use the bootboot plugin because it maintains a single Gemfile and two lockfiles (Gemfile.lock and Gemfile_next.lock). We also add a rails_next? or rails_#{next_version}? method to Kernel so as to always have a conditional available to execute incompatible code between versions.

plugin 'bootboot'

# Version check shared by the Gemfile and, once Bundler has loaded it, app code.
module ::Kernel
  def rails_8_0?
    %w[1 true].include?(ENV["RAILS_8_0_NEXT"])
  end
end

# Pick the Rails version based on the same flag.
if rails_8_0?
  gem 'rails', '~> 8.0'
else
  gem 'rails', '~> 7.2'
end

Deprecation visibility

With every version change, some code or configs are deprecated and marked for removal from the next version. The idea is to draw a line in the sand - the current deprecations in the app get fixed from now on and don't increase.

There are a few different ways of doing this. Deprecation toolkit is one example - it integrates with RSpec and records the current deprecations in yml files. Any new deprecations being added result in test failures.
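To make the "line in the sand" idea concrete, here is a minimal sketch in plain Ruby. It is not Deprecation toolkit's API, just an illustration of the principle: compare the deprecations seen in a run against a recorded baseline and flag anything new. The method name, the YAML layout and the messages are all made up for this example.

```ruby
require "set"
require "yaml"

# Returns the deprecation messages seen in this run that are not part of the
# recorded baseline. An empty result means no new deprecations crept in.
def new_deprecations(baseline_yaml, seen_messages)
  baseline = Set.new(YAML.safe_load(baseline_yaml) || [])
  seen_messages.uniq.reject { |message| baseline.include?(message) }
end

# Hypothetical recorded baseline (what the app already emits today).
baseline_yaml = <<~YAML
  - "DEPRECATION WARNING: foo is deprecated"
  - "DEPRECATION WARNING: bar is deprecated"
YAML

# Hypothetical messages captured during a test run.
seen = [
  "DEPRECATION WARNING: foo is deprecated",
  "DEPRECATION WARNING: baz is deprecated"
]

offenders = new_deprecations(baseline_yaml, seen)
puts offenders
```

In a real suite the non-empty `offenders` list would be turned into a test failure, which is roughly the behaviour the toolkit gives us out of the box.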

The other way to get this signal is through logs: not only turning the deprecation warnings on, but also collecting them in a meaningful way using the ActiveSupport::Notifications API:

# In the environment config
config.active_support.deprecation = :notify

# In an initializer: forward every "deprecation.*" event to our logger
ActiveSupport::Notifications.subscribe(/deprecation\./) do |_name, _start, _finish, _id, payload|
  Loggingservice.log_deprecation(payload)
end

For us, this means piping deprecation warnings to a Datadog dashboard which makes the information readily available to every dev.
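As a rough illustration of what "collecting them in a meaningful way" can look like, the sketch below turns a deprecation payload into a single JSON log line that a log pipeline could index. The payload keys it reads (:message, :callstack, :gem_name, :deprecation_horizon) are, to the best of my knowledge, the ones ActiveSupport sends with the :notify behavior; the method name and the output field names are assumptions made for this example, not our actual logging code.

```ruby
require "json"

# Builds one structured log line from a deprecation notification payload.
# Output field names (event, source, removal_in, ...) are hypothetical.
def deprecation_log_line(payload)
  JSON.generate(
    event: "deprecation",
    message: payload[:message].to_s.strip,
    # First callstack entry is the code that triggered the warning.
    source: Array(payload[:callstack]).first.to_s,
    gem: payload[:gem_name],
    removal_in: payload[:deprecation_horizon]
  )
end

# Hypothetical payload, shaped like the one the subscriber receives.
sample = {
  message: "DEPRECATION WARNING: foo is deprecated ",
  callstack: ["app/models/user.rb:12:in `save'"],
  gem_name: "Rails",
  deprecation_horizon: "8.1"
}

puts deprecation_log_line(sample)
```

Grouping by `source` on the dashboard side makes it easy to see which files produce the most warnings and who is best placed to fix them.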

Dual booting CI

Running CI against both versions is another part of the setup. We’ll talk about it more in the testing section, but I’m mentioning it here because it plays a crucial role later on.

The upgrade path

Now let’s dive in. I’ll try not to get super detailed with the tools that can be used here - AI will help move things along much faster, but good old manual ways of doing things also achieve the same goals.

I've broken down the process into functional phases - they don't necessarily have to be executed serially, but we do always begin with an exploration.

Exploration

Think of exploration as creating a mental model of the things that need to be done for the upgrade. It is a bit like creating an inventory, but it does not have to be perfect. Scrappy works, with one note: as soon as it's clear that a dependency needs to be updated, or that there are changes to the way we autoload constants, or that configs need updating, we note it down (for us, in the form of tickets). Whether it's a task or a project, we refine that later.

To get started, we create a branch with the sole goal of booting in the next version of Rails. This is mostly achievable by one dev (leveraging AI is great here), unblocking one error at a time. It’s fine to assume that the branch that boots the app will not be anywhere near the shipped version of the upgrade. We take a lot of liberties at this point. Need to add 20 inflectors to get autoloading working? Perfectly fine. Need to change a config in a way that would be too risky in a real production app? Totally okay.

The key here is to try to build out actionable spikes or tasks out of this process and not get too much into the weeds of everything. At the end of this exploration, there's a greater chance that these things will be clear:

  • A list of gems/dependencies that need to be upgraded
  • Adoption of the framework changes needed, for example moving to strong params, or use of Zeitwerk autoloader
  • Code changes needed in the initialization process - that could be updating configs or monkey patches that use syntax/abstraction no longer valid in the next Rails version
  • A bunch of unknowns that need deeper digging
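For the first item in that list, a small script can do part of the inventory. The sketch below uses RubyGems' own Gem::Requirement and Gem::Version classes to check which gems declare a Rails constraint that excludes the target version. The gem names and constraints are invented for the example; in practice they would come from the lockfile or from each gem's gemspec.

```ruby
require "rubygems"

# The version we are upgrading to.
TARGET_RAILS = Gem::Version.new("8.0.0")

# Hypothetical data: each gem's declared constraint on Rails.
RAILS_CONSTRAINTS = {
  "gem_a" => [">= 6.1", "< 8.0"],
  "gem_b" => [">= 7.0"],
  "gem_c" => ["~> 7.2"]
}.freeze

# Returns the names of gems whose Rails constraint rejects the target version.
def blocking_gems(constraints, target)
  constraints
    .reject { |_name, requirements| Gem::Requirement.new(*requirements).satisfied_by?(target) }
    .keys
    .sort
end

puts blocking_gems(RAILS_CONSTRAINTS, TARGET_RAILS)
```

Each name in the output is a candidate ticket: upgrade the gem, replace it, or find out whether the constraint is just overly cautious.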

A successful exploration may not even have a booting branch by the end. It could very well generate a few project spikes to begin with. Anything that throws light on the path forward is a successful exploration. Sometimes it uncovers hidden technical debt, like unmaintained dependencies that have to be removed, and it makes more sense to deal with that first and come back to exploration afterwards, which is totally fine. There have been times when we paused exploration, for example when Zeitwerk became the only autoloader supported in Rails 7.0 and we had to make sure we were able to switch to it before doing any more exploration.

Execution

At this point, we've already done some form of exploration. We've also set up a way to capture deprecation warnings generated from code, and a line is drawn in the sand so that while old deprecation warnings are fixed, new deprecated code is not introduced. There are five main goals of this phase:

  • Run spikes identified from the exploration phase
  • Work on things identified from these spikes (these can be project-sized, for example strong params)
  • Fix deprecation warnings generated from code
  • Update dependencies/gems and pay technical debt if needed
  • Adopt any configurations needed

In my opinion, this phase benefits from parallelism. Most of the work is divided into silos: the Foundations team runs spikes and we lean on domain experts to help with anything that comes out of them, while deprecation warnings are either fixed by domain experts or, for the simpler ones, fixed in a single sweep.

This is the longest-running phase of the project. For us, it can last up to 3-4 months.

All the code changes here are merged into master/production, even if that means incompatible changes live in an if/else block that depends on which Rails version the app is running.
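A version-gated change can be as small as the sketch below. The helper mirrors the one from the dual booting setup; the ReportExporter class and its two private methods are purely hypothetical stand-ins for "code that works on the next version" and "code that works on the current one".

```ruby
module ::Kernel
  # True when the app is booted on the next Rails version.
  def rails_8_0?
    %w[1 true].include?(ENV["RAILS_8_0_NEXT"])
  end
end

# Hypothetical class whose behaviour differs between Rails versions.
class ReportExporter
  def export
    if rails_8_0?
      export_with_new_api     # only valid on the next version
    else
      export_with_legacy_api  # only valid on the current version
    end
  end

  private

  def export_with_new_api
    :new_api
  end

  def export_with_legacy_api
    :legacy_api
  end
end

puts ReportExporter.new.export
```

Both branches ship to production, but only one ever runs, and the dead branch is deleted during cleanup once the upgrade is out.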

Also, as we move through this phase, it is normal to uncover more dependencies that need to be untangled. One such example is upgrading gem A which depends on a different version of gem B, which in turn depends on updating syntax in the code.

Something that helps here, again, is keeping code changes small: each link in the dependency chain becomes its own change, shipped in whatever order untangles it. In our experience, a slow burn is much more productive than trying to overwhelm ourselves.

Also, before we move on, I want to touch a bit on the role of CI/test suite:

CI plays a major role in an upgrade. More test coverage means more confidence. Tools like deprecation toolkit also depend entirely on the quality of the tests an application has.

Even a single failure in our test suite has led us to uncovering the need to upgrade multiple dependencies. During this phase, as the needle on deprecation warning fixes and dependency upgrades moves, the signal from CI becomes clearer. We tend to keep an eye on it by running CI in the next version of Rails fortnightly. As the deprecation-related failures shrink, it becomes easy to identify and investigate errors not related to obvious deprecation in the code.

At the end of the execution phase:

  • All deprecation warnings are fixed
  • All dependencies have been upgraded/updated
  • CI runs successfully in the next version of Rails
  • The infrastructure is in place to be able to run CI on current and next version of Rails (more in next section)
  • We are able to switch between Rails versions using dual boot, and the app runs on the next version

Testing & Preparing for deploy

Now, testing for a Rails upgrade is subjective and depends on the setup of the application. An app can solely rely on the test suite, manual QA, or a combination of both.

We use a combination of both, so there are a couple of weeks between the end of the execution phase and the deploy. At this point we don't want to go back in time by accidentally introducing deprecated or incompatible changes, so we enable dual CI runs on all release branches. This means that CI runs the test suite not only on the current version of Rails, but also on the next (target) version. This keeps master in a deployable-to-target-version state for the whole period. We also run all our sandbox instances on the next version of Rails during this time.
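The mechanics of a dual CI run are simple: run the same command twice, once with the flag unset and once with it set. Here is a minimal sketch, assuming the RAILS_8_0_NEXT variable from the Gemfile above; the constant and method names are made up for the example, and a real pipeline would more likely express this as two jobs in the CI config.

```ruby
# Environment for each boot mode. A nil value unsets the variable in the
# child process, so the "current" run is not affected by the parent shell.
BOOT_ENVS = {
  "current" => { "RAILS_8_0_NEXT" => nil },
  "next"    => { "RAILS_8_0_NEXT" => "1" }
}.freeze

# Runs the command once per boot mode and returns { label => passed? }.
def run_in_both_versions(command)
  BOOT_ENVS.to_h do |label, env|
    [label, system(env, *command) == true]
  end
end

# Example (not executed here):
#   results = run_in_both_versions(%w[bundle exec rspec])
#   exit(1) unless results.values.all?
```

Failing the build when either run fails is what stops incompatible changes from landing during the testing window.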

We also spend some time rehearsing the actual deploy and rollback, making sure users don’t have a bad experience such as getting kicked out of their session.

Deploy

At this point, deploying is just a formality. With tools like bootboot, the app boots on the next version of Rails depending on whether an env variable is set. We choose a date and time, make sure we are monitoring the right things, then change the env variable and restart. Rollbacks are as easy as removing the environment variable and restarting.

Cleanup

After the deploy is out comes the fast follow-up: cleaning up artifacts. This mostly means removing conditional code for the previous version of Rails and preparing bootboot and the lockfiles for the next version, so we can start again from the top.
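Finding the leftover conditionals is easy to script. The sketch below walks a directory and lists every line in a Ruby file that still mentions the version helper. The method name is made up for the example, and a plain text search is assumed to be good enough for this purpose.

```ruby
# Lists "path:line" for every line in a .rb file under root that still
# references the version helper (plain substring match, no parsing).
def leftover_version_checks(root, helper: "rails_8_0?")
  Dir.glob(File.join(root, "**", "*.rb")).sort.flat_map do |path|
    File.foreach(path).with_index(1).filter_map do |line, number|
      "#{path}:#{number}" if line.include?(helper)
    end
  end
end

# Example (not executed here):
#   puts leftover_version_checks("app")
```

An empty list is a reasonable signal that the helper itself can be removed and renamed for the next upgrade.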

Conclusion & the next part

I hope you are able to draw some inspiration from this framework. I’ve intentionally not gone deep into the tools that we use, because we are re-imagining how we leverage AI to help us get there. The broad framework remains the same. We’ll talk about that in the next part, along with other tools that help us, after we are done upgrading to Rails 8. Thanks for reading and stay tuned!