Automated Dependency Management

Aug 19th, 2020

engineering

Managing dependency versions for your applications and modules can be a time-consuming pain in the bum. Not only must you keep tabs on the release cycles of all of your dependencies, but you must then assess each release individually, manually update the versions in all the consuming applications you own, and make any changes required to support these latest versions - every. single. time.

Luckily, as software development practices and the dependency ecosystems we use have matured, we've seen the advent of several automation tools that turn this once burdensome process into something far more manageable.

Why Manage Dependencies At All?

We've already highlighted some of the pain points in keeping dependencies up to date, but we haven't really discussed why you should even care to do so in the first place. When I suggest the need to keep dependencies up to date, I'm regularly met with the same refrain:

"It works with the current version, so I won't even bother. I'll update when I need a new version."

This is an understandable - albeit misguided - position, for several reasons:

Practically speaking, by the time you find yourself needing functionality exposed in newer versions of any given dependency, you risk finding yourself several breaking versions behind. In this (all too common) situation, you artificially increase the burden of incorporating newer versions by having to make several unrelated changes to your existing codebase - just to get access to a single version which itself may not actually require any changes at all!

Another more nebulous concern is that of security. Any given dependency may be functionally adequate for the given state of your application, but security vulnerabilities are discovered and patched every day. By neglecting to keep your dependency tree up to date, you put your application, and therefore your users, at significant risk. This is perhaps the single most critical reason to keep your dependencies up to date.

Automatic Versioning

Hopefully by now you're convinced that keeping your dependencies up to date is an important task. However, we also know that it is by no means a trivial one. Depending on the scope of your ownership, keeping dependencies as up to date as possible could be a full-time job in itself. That said, there are several tools we can leverage to make this process much easier.

Tools

SemVer

Perhaps the most important concept to build upon is that of Semantic Versioning, or SemVer. Semantic Versioning is a practice that aims to imbue any given version of a code module with an explicit meaning that correlates to the code itself. When releasing new versions of modules that you own, it's important to adhere to this standard so as to make life easier for those who consume your work.

In practice, Semantic Versioning manifests itself as three different release categories, each indicated by the various pieces of a version tag (you may be familiar with the syntax: v1.2.3). Let's look at those in more depth:

Example: v1.2.3

Major

A Major version release is often the least common. This is because a major version change indicates that something, well, major has changed with the underlying code that necessitates consumers modify their consumption thereof. This is often referred to as a "Breaking" change or even a "Major Breaking" change (cue military salute: "Major Breaking Change").

In the example above, 1 is the major version.

Minor

A Minor version release indicates that the underlying module has changed in such a way that any consumer should be able to update without having to make any changes to their existing codebase. These minor versions, however, likely contain new features that we may wish to make use of in our applications.

In the example above, 2 is the minor version.

Patch

A Patch version release is often the most common. Patches indicate changes to the underlying code that should be invisible to those consuming it. These sorts of changes include (but are not limited to): bugfixes, security vulnerability patches, and performance optimizations.

In the example above, 3 is the patch version.
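To make the three components concrete, here is a minimal sketch that splits a version tag into its parts and classifies the jump between two versions. The names `Version`, `parse_version`, and `release_type` are made up for illustration, and the sketch assumes plain MAJOR.MINOR.PATCH tags with an optional leading "v" (no pre-release or build metadata).

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Matches tags such as "v1.2.3" or "1.2.3"; pre-release and build
# metadata are deliberately out of scope for this sketch.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag: str) -> Version:
    """Split a version tag into its major, minor, and patch numbers."""
    match = _VERSION_RE.match(tag.strip())
    if match is None:
        raise ValueError(f"not a simple MAJOR.MINOR.PATCH version: {tag!r}")
    return Version(*(int(part) for part in match.groups()))


def release_type(current: str, candidate: str) -> str:
    """Classify the jump from current to candidate.

    Returns "major", "minor", or "patch", or "none" when the candidate
    is not newer than the current version.
    """
    old, new = parse_version(current), parse_version(candidate)
    if new <= old:  # tuples compare component by component
        return "none"
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        return "minor"
    return "patch"
```

Under these assumptions, `release_type("v1.2.3", "v1.3.0")` is a minor upgrade, while `release_type("v1.2.3", "v2.0.0")` is a major one that likely needs changes on the consuming side.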


It's worth acknowledging that the version definitions above are flexible. It may be impossible to address a given security vulnerability without introducing a breaking change to a codebase, and therefore a major version change may be required for what would normally simply be a patch. Being cognizant of how your code is used by others is a critical facet of developing shared modules.

Semantic Release

Understanding what semantic versions mean is one thing, but having to make this decision every time you wish to release a version of your application can be a nightmare - especially if you share ownership of a codebase with other developers who may have unreleased changes that you're unfamiliar with. In that scenario, how do you know what category is correct for releasing your code changes if that release will also include changes from others? This can lead to many wasted hours of effort simply communicating between developers who may not even inhabit the same time zone!

One excellent way to ensure that this communication can be done quickly and without coordinating disparate human beings (see: automate) is to leverage what is known as Semantic Commits (are you noticing a trend?). Semantic Commits aim to make Git commits convey meaning similar to the way that Semantic Versioning conveys meaning of releases. With semantic commits, each commit message is prefixed with a common keyword that indicates to other developers what that change includes. Examples of semantic commit prefixes include (but are not limited to) feat for new features, fix for bug fixes, docs for documentation changes, and chore for routine maintenance.

Semantic commit prefixes are entirely subjective. The goal is simply to foster better asynchronous communication between developers, so establishing a shared language with your collaborators is crucial. That said, you'll often find that folks gravitate towards the prefixes established by the Angular project, as they are the most commonly used and many automation tools build on them.

One such tool is Semantic Release, and this tool is critical for streamlining adherence to SemVer for maintainers of shared codebases. Semantic Release is a tool for automatically creating new releases of your codebase, typically on every merge to the master branch. Semantic Release requires Semantic Commits, as it will use the aforementioned prefixes to determine what level of release to create.

This ensures significantly higher fidelity of release versioning, as it's much more likely that any individual commit will align with its semantic meaning as opposed to a large bundle thereof. It also ensures that any release created by an individual developer is not burdened by unrelated changes made by another.
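As a rough illustration of how commit prefixes can drive the release level, the sketch below maps each commit message to a level and takes the most significant one. It is not Semantic Release's actual implementation. The mapping (fix and perf to patch, feat to minor, a "BREAKING CHANGE:" note to major) is an assumption in the spirit of the Angular convention, and `commit_level` and `next_release_level` are hypothetical names.

```python
import re
from typing import Iterable

# Ordered from least to most significant so levels can be ranked by index.
LEVELS = ["none", "patch", "minor", "major"]

# Assumed prefix-to-level mapping; a team may agree on a different one.
PREFIX_LEVEL = {"fix": "patch", "perf": "patch", "feat": "minor"}

# Matches headers such as "feat: subject" or "feat(scope): subject".
_HEADER_RE = re.compile(r"^(?P<type>\w+)(?:\([^)]*\))?: ")


def commit_level(message: str) -> str:
    """Return the release level implied by a single commit message."""
    if "BREAKING CHANGE:" in message:
        return "major"
    header = _HEADER_RE.match(message)
    if header is None:
        return "none"  # not a semantic commit, so it triggers no release
    return PREFIX_LEVEL.get(header.group("type"), "none")


def next_release_level(messages: Iterable[str]) -> str:
    """Pick the most significant level among all commits in a release."""
    return max(
        (commit_level(message) for message in messages),
        key=LEVELS.index,
        default="none",
    )
```

Because the highest level wins, a release containing one feat commit and several fix commits comes out as a minor release, and a single breaking change anywhere in the batch makes it a major one.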

By cutting a new release on every merge to the master branch, we also wind up creating far more releases than we may be used to. This gives consumers far more control over which code changes to incorporate into their applications, but also increases the cognitive load of making that decision. Wouldn't it be great if we could automate that as well?

Automatic Upgrades

Finally, we get to the best part: making sure that our consuming applications are always on the latest version of their dependencies! As SemVer has rapidly taken over the world of open-source software development (thanks in no small part to the many tools and practices outlined above), more tools have been created that build on the assumption of SemVer to simplify consumption as well as maintenance. We'll look at two popular tools below: WhiteSource's Renovate and GitHub's Dependabot.

Renovate

Long the reigning champ of automated dependency management, Renovate (recently purchased by WhiteSource Software) is an application that runs on a dedicated compute instance and scans source code repositories (think GitHub, GitLab, Bitbucket, etc.) that specify their dependencies in code (package.json, pom.xml, etc.). It then checks public module repositories (npm, Maven, etc.) to know when a dependency needs to be updated.
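The core idea of that scan can be sketched in a few lines: read the versions declared in a manifest and compare them against the latest published ones. This is only an illustration, not how Renovate itself works internally. The `find_outdated` function is hypothetical, the `latest_versions` mapping stands in for a real registry lookup, and the sketch assumes exact pinned versions (no ranges such as ^1.2.3).

```python
import json
from typing import Dict, Tuple


def _as_tuple(version: str) -> Tuple[int, ...]:
    """Turn "1.2.3" (or "v1.2.3") into (1, 2, 3) for numeric comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def find_outdated(
    manifest_text: str, latest_versions: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Map each outdated dependency to its (pinned, latest) versions.

    manifest_text is the content of a package.json-style file and
    latest_versions is a stand-in for what a module registry would report.
    """
    manifest = json.loads(manifest_text)
    outdated = {}
    for name, pinned in manifest.get("dependencies", {}).items():
        latest = latest_versions.get(name)
        # Skip dependencies the (pretend) registry knows nothing about.
        if latest is not None and _as_tuple(latest) > _as_tuple(pinned):
            outdated[name] = (pinned, latest)
    return outdated
```

A real tool layers a lot on top of this (version ranges, lock files, changelogs, and opening the actual pull request), but the comparison at the heart of it is about this simple.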

If you're using GitHub to host your consuming application, Renovate will open a Pull Request to your application whenever a new version of a dependency is released. This completely removes the burden of having to track dependencies from developers. Simply monitor your codebase for pull requests from Renovate's bot, and merge away! With some simple configuration, you can even configure GitHub + Renovate to automatically merge these pull requests without having to verify them yourself.

Be sure to only automerge minor or patch releases, however! Major versions usually call for changes that depend on an understanding of your application, which Renovate simply won't be able to automate!
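As a hedged sketch of what such a configuration might look like, the snippet below builds a Renovate-style config as a Python dictionary and renders it as JSON, ready to be saved as a renovate.json file. It assumes Renovate's `packageRules`, `matchUpdateTypes`, and `automerge` options; check the option names against the documentation for the Renovate version you run. The `render_config` helper is purely illustrative.

```python
import json

# Assumed Renovate options: "packageRules", "matchUpdateTypes", "automerge".
# Verify these against the documentation for your Renovate version.
renovate_config = {
    "packageRules": [
        {
            # Minor and patch updates should be safe to merge automatically,
            # provided the test suite passes.
            "matchUpdateTypes": ["minor", "patch"],
            "automerge": True,
        },
        {
            # Major updates stay as pull requests for a human to review.
            "matchUpdateTypes": ["major"],
            "automerge": False,
        },
    ]
}


def render_config(config: dict) -> str:
    """Render the config as pretty-printed JSON with a trailing newline."""
    return json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    print(render_config(renovate_config), end="")
```

Keeping the major rule explicit is a design choice: it documents the intent, even where leaving automerge off is already the default behaviour.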

Dependabot

Perhaps less well known, Dependabot is another tool for automating updates to dependencies. After being acquired by GitHub, Dependabot has become far more prevalent within the GitHub ecosystem as a tool of first resort when approaching security. If you're using public GitHub, you'll likely have noticed the new "Security" tab on some of your code repositories; this is largely powered by Dependabot!

Because Dependabot's integration into the GitHub suite of tools is maturing rapidly, it's quickly becoming a qualified alternative to tools such as Renovate. With its security integrations baked into the GitHub interface, and enterprise support coming soon, it's definitely worth giving it a shot if you're just getting started with automated dependency management or are still using Renovate.

A Word of Caution

As with any automation, the lack of manual oversight presents its own concerns. While systems and tools like SemVer aim to give every version a strict meaning that tells consumers what an upgrade requires, erroneous release versions can and do occur - e.g., a minor release can include breaking changes. By automatically consuming such a minor version without modifying our consumption, we risk breaking our application quickly and without warning.

The best way to mitigate this possibility is the same way we mitigate any other potentially destructive change to our applications: with a robust automated testing ecosystem. By automatically ensuring the integrity of our applications on every single change, we're able to comfortably allow automation to assume more and more responsibilities so that we may focus on driving value for our business.