Technology Subset

Contributing to open source at GitHub

Ariel Deitcher (@mntlty) is a Senior Software Engineer at GitHub, working on Pull Requests and Merge Queue (beta). In this post, he shares the challenges he encountered finding his path to contributing to open source, what it was like contributing to open source at GitHub, and some of the lessons he learned.

Getting started with open source can be overwhelming

As a computer science graduate in 2011, searching for my first tech job, I read that contributing to an open source project could help. It was a great way to build skills, make industry connections, and gain practical experience with a real-world problem. Perfect, I thought, I’ll just pick an open source project on this new website called GitHub, and, well, actually I wasn’t sure how to do that. Finding that “Goldilocks” project (where the size, language(s), domain, and community felt just right) was a lot harder than I thought, and I didn’t feel confident enough to make much progress. Overwhelmed, I decided the timing wasn’t right but resolved to try again someday.

It bugged me that the contribution graph on my GitHub profile remained stubbornly empty, as all the code I had committed lived in private repositories. That changed in 2016, when contributions to private repositories could be shown on my profile, but I still had no open source contributions to show. Between my family, work, life, and the explosive growth in projects to choose from, making that first contribution to open source felt more daunting than ever.

The opportunity to contribute to open source at GitHub

Fast forward to 2021. I read Working in Public: The Making and Maintenance of Open Source Software by Nadia Eghbal while interviewing at GitHub. I was especially captivated by the Stadium model of open source projects, where a small number of maintainers and occasional contributors are vastly outnumbered by a project’s users. This aligned with my mental model of open source projects, where a few performers on a digital stage would conjure feats of coding wizardry. I could only imagine how vulnerable working in public could be, and hoped it would feel less intimidating working at GitHub.

I joined the GitHub team building Merge Queue (beta), a feature which helps users coordinate their merges to a protected branch and ensures that changes are up to date and that all required checks pass before automatically merging a pull request. Early on, I shared my long-held goal of contributing code to an open source project with my manager, and we discussed the GitHub CLI, an open source tool written in Go which lets users interact with GitHub from the command line, as a possible candidate.

While building Merge Queue, our team carefully integrated it with GitHub’s many APIs and tools, checking each one for compatibility and correctness. Testing different scenarios of merging a pull request with the GitHub CLI, I saw that once a Merge Queue was required, running the CLI command gh pr merge would fail in most cases. The Merge Queue was correctly preventing direct merges to its protected branch, and so I began scoping out what changes the CLI might need to support Merge Queue.

As I didn’t have write access to the CLI repository, I forked it, started a new Codespace, and spent some time getting familiar with the CLI’s contributing guidelines and code. Wanting to minimize my changes, I targeted a few places in the merge command to modify. When I was ready, I pushed a commit to my fork and opened a pull request to share with the CLI maintainers. I expected that I would provide support but defer to them for the final implementation.

In reviewing my pull request with the CLI maintainers, it quickly became clear that my changes were hard to reason about. The merge command had accumulated sufficient technical debt that adding more complexity to it was risky. The team asked if I could refactor the merge command in an initial pull request and follow up with a subsequent pull request for the Merge Queue changes after the first was merged. What I had thought would be a rough guide of changes for the CLI maintainers was, in fact, the opportunity I had been looking for to contribute to open source at GitHub. I confirmed that my manager was on board with this increased commitment, and was ready to get started.

Refactoring the merge command and adding Merge Queue support

I set out to refactor the merge command with a focus on simplicity, readability, and returning early over deeply nested conditionals. The existing test coverage gave me a confidence boost as I began stepping through the code, copying each section into a separate file for later reference and writing comments which I felt captured the intent of the removed section. I then grouped related Git and API operations, consolidated common code into appropriately named functions and variables, trimmed unreachable code paths, created a MergeContext struct to encapsulate state, and leaned into Go’s explicit error returns – all of which gave the code a more linear and consistent structure.

As an example, the mergeRun function, which is the heart of the merge command, went from over 220 lines to just 30:

func mergeRun(opts *MergeOptions) error {
    ctx, err := NewMergeContext(opts)
    if err != nil {
        return err
    }

    // no further action is possible when disabling auto merge
    if opts.AutoMergeDisable {
        return ctx.disableAutoMerge()
    }

    ctx.warnIfDiverged()

    if err := ctx.canMerge(); err != nil {
        return err
    }

    if err := ctx.merge(); err != nil {
        return err
    }

    if err := ctx.deleteLocalBranch(); err != nil {
        return err
    }

    if err := ctx.deleteRemoteBranch(); err != nil {
        return err
    }

    return nil
}
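
The post does not show MergeContext itself or the methods that mergeRun calls. As a rough illustration only, the sketch below shows how a struct can hold the state that the steps share, so that each step becomes a small method that returns early on failure. Every field, value, and helper here (prState, mergeable, log, run, and the extra parameters on NewMergeContext) is hypothetical and is not taken from the CLI's source; the real implementation talks to Git and the GitHub API rather than appending to a slice.

```go
package main

import (
	"errors"
	"fmt"
)

// MergeOptions is a hypothetical, trimmed-down stand-in for the CLI's options.
type MergeOptions struct {
	DeleteBranch bool
}

// MergeContext is a hypothetical sketch: it gathers the state the steps share
// so that each step can be a small method with an explicit error return.
type MergeContext struct {
	opts      *MergeOptions
	prState   string   // illustrative values: "OPEN", "MERGED", "CLOSED"
	mergeable bool     // illustrative flag, set by the caller in this sketch
	log       []string // records actions instead of calling Git or an API
}

// NewMergeContext validates its input up front so later steps can assume it.
func NewMergeContext(opts *MergeOptions, prState string, mergeable bool) (*MergeContext, error) {
	if opts == nil {
		return nil, errors.New("merge options are required")
	}
	return &MergeContext{opts: opts, prState: prState, mergeable: mergeable}, nil
}

// canMerge uses guard clauses: each failing condition returns immediately,
// which avoids nesting the success path inside several conditionals.
func (m *MergeContext) canMerge() error {
	if m.prState != "OPEN" {
		return fmt.Errorf("pull request is %s, not OPEN", m.prState)
	}
	if !m.mergeable {
		return errors.New("pull request is not mergeable")
	}
	return nil
}

// merge pretends to merge by updating the in-memory state.
func (m *MergeContext) merge() error {
	m.log = append(m.log, "merged")
	m.prState = "MERGED"
	return nil
}

// deleteLocalBranch returns early when there is nothing to do.
func (m *MergeContext) deleteLocalBranch() error {
	if !m.opts.DeleteBranch {
		return nil
	}
	m.log = append(m.log, "deleted local branch")
	return nil
}

// run chains the steps in order and stops at the first error.
func run(m *MergeContext) error {
	if err := m.canMerge(); err != nil {
		return err
	}
	if err := m.merge(); err != nil {
		return err
	}
	return m.deleteLocalBranch()
}

func main() {
	ctx, err := NewMergeContext(&MergeOptions{DeleteBranch: true}, "OPEN", true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if err := run(ctx); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ctx.log)
}
```

Because each method reads what it needs from the struct, the top-level function stays a flat list of steps, and a failure in any step stops the sequence before later steps run.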

When I was finished, I opened a pull request from my fork to the CLI repository, and was blown away by how supportive the code review process was. After a few rounds of feedback, my code was merged and ready to ship in the next release. I was an open source contributor at GitHub!

Returning to my fork, my original Merge Queue changes were now completely out of date. In fact, much of the code I had on my branch no longer existed on the CLI’s trunk branch. Fortunately, I was now intimately familiar with the merge command and was able to make the Merge Queue changes and tests in a subsequent pull request quickly and with confidence.

Lessons learned

Looking back, I learned that searching for the right open source project on my own, trying to create time outside of work, and context switching from my existing projects were obstacles I could not overcome. Instead, the key for me was to find an open source project that was important to what I was already working on, and that I was accountable for. If this sounds familiar, consider asking your manager if you can devote some time to work on an issue in an open source project that you or your team rely on. It’s much easier to get started with an open source project you know and can align with work you’re already committed to. I recognize how fortunate I was to be in the right place at the right time, and with the right support from my manager, but it wasn’t easy.

Many people I know struggle with impostor syndrome, and working in public made me even more aware of mine. I am learning to accept that I should still contribute, even though my commits aren’t perfect and I’m afraid of being judged for creating bugs like this regression, which will be discoverable forever. Despite these challenges, I enjoy picking up new issues in the CLI labeled “help wanted (contributions welcome)” whenever I can, and hope you will too!


