Fetching a private GitHub repository in CircleCI to build a Docker image

Private dependencies from GitHub in your Docker container

Kamil Lelonek - Software Engineer

A deployment pipeline is often complex and consists of multiple steps, including testing, building, packaging and releasing your entire application.

Such a Continuous Integration process is not always trivial: it sometimes requires authorization in different services to fetch necessary dependencies and to deploy your project.

During Continuous Delivery steps you may want to lint your codebase, run the test suite, build or compile your project, create artifacts and prepare an executable package to be run in an external environment.

In this article, I’d like to focus on the trickiest parts and answer the following questions:

  1. How to use a private package from GitHub in your package manager?
  2. How to access a private library in CircleCI from another repository?
  3. How to fetch a private dependency from GitHub on CircleCI when building a Docker image?

Package Manager

No matter what language you use, you almost certainly use some kind of package manager, be it npm, Bundler, Hex, pip or any other.

Most of the time, fetching a private package is very easy, especially when you are working locally. With Bundler, for example, you may put the credentials directly in the URL:

gem 'private', git: 'https://<username>:<password>@github.com/org/repo.git'
# or
gem 'private', git: 'https://x-access-token:<token>@github.com/org/repo.git'

If your Git repository requires authentication, such as basic username:password HTTP authentication in URLs, it can be achieved via Git configuration, keeping the access rules outside of source control.

git config --global url."https://YOUR_USER:YOUR_PASS@example.com/".insteadOf "https://example.com/"

For more information, see the git config documentation: https://git-scm.com/docs/git-config#git-config-urlltbasegtinsteadOf
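
On a CI server, the same rewrite can be driven by an environment variable, so the token never lands in the repository. This is only a minimal sketch: the GITHUB_TOKEN variable and the helper names insteadof_key and configure_github_token are assumptions made for illustration, not part of git or GitHub.

```shell
#!/bin/sh
# Build the git config key that rewrites URLs for one host so that they
# carry credentials. insteadof_key is a hypothetical helper name.
insteadof_key() {
  # $1: user name, $2: password or token, $3: host
  printf 'url.https://%s:%s@%s/.insteadOf' "$1" "$2" "$3"
}

# Apply the rewrite for github.com using a token from the environment.
# GITHUB_TOKEN is an assumed variable name; define it in your CI settings.
configure_github_token() {
  git config --global \
    "$(insteadof_key x-access-token "$GITHUB_TOKEN" github.com)" \
    "https://github.com/"
}
```

Calling configure_github_token once, before installing dependencies, should make git substitute the authenticated URL for every https://github.com/ remote, while the credentials stay in the global git configuration of the CI machine.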

The problem is that sometimes you want neither to expose your own credentials nor to generate a personal access token. You usually want a more fine-grained solution with better security control.

Considering that, we will stick to the SSH configuration like this:

defp deps do
  [
    {:repo, git: "git@github.com:org/repo.git"}
  ]
end

What are the alternatives?

Most package managers offer some kind of private registry, usually a paid one.

However, if for some reason (security or cost) you want to fetch dependencies directly from private GitHub repositories, I’ll describe how to do that in the next section.

GitHub

Each time you want to clone a public repository from GitHub, a plain git clone is enough. If the project is private, you can provide your username and password, or a personal access token, in the URL. However, as we decided not to do that, we need to find another way. This is where deploy keys come in.

A deploy key is an SSH key that is stored on your server and grants access to a single GitHub repository. They are often used to clone repositories during deploys or continuous integration runs.

GitHub attaches the public part of the key directly to your repository instead of a personal user account, and the private part of the key remains on your server.

The procedure is described in the GitHub documentation:

https://developer.github.com/v3/guides/managing-deploy-keys/

The official docs say to use:

ssh-keygen -t rsa -b 4096 -C "email@example.com"

However, a key generated this way won’t work on CircleCI: you will get an error there, even though GitHub accepts the key.

To avoid that, use this command instead:

ssh-keygen -t rsa -m PEM -C "email@example.com"

Do not provide a passphrase either (press Enter for no passphrase); otherwise, you will get another error.
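
The key generation can also be scripted so that it runs without prompts. The sketch below assumes, as the -m PEM fix suggests, that the error comes from the private key format; the helper names and the default ./deploy_key path are hypothetical.

```shell
#!/bin/sh
# Generate a passphrase-less RSA deploy key in PEM format.
# -N "" sets an empty passphrase and -f picks the output file, so the
# command does not prompt. The default path is a hypothetical example.
generate_deploy_key() {
  key_file="${1:-./deploy_key}"
  ssh-keygen -t rsa -b 4096 -m PEM -N "" -C "email@example.com" -f "$key_file"
}

# Succeeds when the private key file starts with the PEM RSA header.
# Keys in the newer OpenSSH format start with
# "-----BEGIN OPENSSH PRIVATE KEY-----" instead.
is_pem_rsa_key() {
  [ "$(head -n 1 "$1")" = "-----BEGIN RSA PRIVATE KEY-----" ]
}
```

Running is_pem_rsa_key on an existing private key is a quick way to tell whether it has to be regenerated before you paste it into CircleCI.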

Once we have a deploy key generated and assigned to a particular GitHub repository, let’s see now how to integrate it with your favorite CI server.

Continuous Integration

In more advanced environments and complex setups, you don’t build your projects locally. There are multiple continuous integration servers to automate this process. They are usually very similar and it doesn’t matter much which one you choose; as you might have already guessed, I’ll pick CircleCI. I assume you have your project integrated there, so I’ll skip directly to its configuration.

The integration between CI and GitHub works as follows: your CI tool automatically creates a new deploy key for your repository. A deploy key is a repo-specific SSH key; GitHub has the public part while CI stores the private one. It authorizes the server to check out your code and to run all defined commands such as linting, testing and building, sometimes even deploying.

Because our build process refers to multiple repositories, CircleCI needs an additional GitHub deploy key for the dependency, since each deploy key is valid for only one repository.

  1. Open https://circleci.com/gh/you/your-private-dependency/edit#ssh and add the key you created in the previous step. Leave the Hostname field empty, and press the submit button.
    A side note: if you put github.com as a hostname, CircleCI will try to fetch the actual project using this key but, as you might expect, it won’t work because the deploy key was configured for another repository, not for the entire GitHub user nor for the “main” repository.
  2. In your config.yml, add the fingerprint using the add_ssh_keys key:
version: 2
jobs:
  build:
    docker:
      - image: circleci/elixir
    steps:
      # register the additional deploy key by its fingerprint
      - add_ssh_keys:
          fingerprints:
            - "SO:ME:FIN:G:ER:PR:IN:T"
      - checkout
      - run: install_dependencies.sh

However, if you execute the build now, you will see the following error:

* Getting dependency (git@github.com:org/dependency.git)
ERROR: Permission to org/dependency.git denied to deploy key
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
** (Mix) Command "git --git-dir=.git fetch --force --quiet --progress" failed
Exited with code 1

CircleCI tries to use the id_rsa deploy key, configured for the checked-out repository, to fetch our private dependency as well.

What I recommend now is to remove all identities after checkout and add only the new deploy key. First of all, the checkout is already done; secondly, we want to fetch our private dependency using the other key we created.

version: 2
jobs:
  deploy-job:
    docker:
      - image: circleci/elixir
    steps:
      - add_ssh_keys:
          fingerprints:
            - "SO:ME:FIN:G:ER:PR:IN:T"
      - checkout
      - run:
          name: Configure SSH agent
          command: |
            # drop every identity, then load only the additional deploy key
            ssh-add -D
            ssh-add ~/.ssh/id_rsa_somefingerprint
      - run: install_dependencies.sh

Now, you can fetch your private dependency.
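
The key file name used above follows from the fingerprint. Here is a minimal sketch of that mapping, assuming that keys added through add_ssh_keys are stored as ~/.ssh/id_rsa_<fingerprint without colons>; the helper names are hypothetical.

```shell
#!/bin/sh
# Derive the private key path from a fingerprint, assuming the key is
# stored as ~/.ssh/id_rsa_<fingerprint with the colons removed>.
circleci_key_path() {
  printf '%s/.ssh/id_rsa_%s' "$HOME" "$(printf '%s' "$1" | tr -d ':')"
}

# Drop every identity from the agent, then load only the deploy key
# that belongs to the given fingerprint.
use_only_deploy_key() {
  ssh-add -D
  ssh-add "$(circleci_key_path "$1")"
}
```

With these helpers, the "Configure SSH agent" step could call use_only_deploy_key with the same fingerprint that is listed under add_ssh_keys, which keeps the two values in sync.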

What are the alternatives?

If your server needs to access multiple repositories, you can create a new GitHub account and attach an SSH key that will be used exclusively for automation. Since this GitHub account won’t be used by a human, it’s called a machine user.

In such a case multiple keys are not needed; one per server is adequate. You can add the machine user as an outside collaborator on an organization repository (granting read, write, or admin access), or to a team with access to the repositories it needs to automate (granting the permissions of the team).

Docker

The final step of our deployment pipeline is to build an application as a Docker image.

Initial setup

Since I’m using alpine as the base image, it requires adding git to clone a git repository via SSH:

FROM alpine

RUN apk add \
    --update \
    --no-cache \
    git

otherwise, you would get:

Step 3/3 : RUN git clone git@github.com:org/dependency.git
---> Running in 632f576ecdf5
/bin/sh: git: not found
The command '/bin/sh -c git clone' returned a non-zero code: 127

All instructions described in a Dockerfile run in an isolated environment and don’t have access to the host machine directly. Thus, neither SSH agent nor SSH keys are available there out of the box. If you try to fetch our private dependency inside a container, you will see:

Step 8/18 : RUN install_dependencies.sh
---> Running in 48738e8ae342
* Getting dependency (git@github.com:org/dependency.git)
error: cannot run ssh: No such file or directory
fatal: unable to fork
** (Mix) Command "git --git-dir=.git fetch --force --quiet --progress" failed
The command '/bin/sh -c install_dependencies.sh' returned a non-zero code: 1
Exited with code 1

SSH agent forwarding

What I suggest now is to use SSH agent forwarding, which allows you to use your local SSH keys instead of copying them to your server. In our case, “local” actually means CircleCI and the “server” is the Docker image being built.

There are a couple of steps to cover. The first is to make sure you are using at least Docker Engine 18.09. With that in place, you can enable BuildKit, which brings improvements in performance, storage management, feature functionality, and security.
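
The version requirement can be checked before building. This is a minimal sketch with a hypothetical helper, version_at_least, which is not part of Docker or CircleCI; it compares dot-separated versions component by component.

```shell
#!/bin/sh
# Succeeds when dot-separated version $1 is at least version $2.
# Only the leading digits of each component are compared, so a suffix
# such as "-ce" is ignored.
version_at_least() {
  have="$1"
  want="$2"
  while [ -n "$have" ] || [ -n "$want" ]; do
    h="${have%%.*}"
    w="${want%%.*}"
    # drop the component that was just read
    case "$have" in *.*) have="${have#*.}" ;; *) have="" ;; esac
    case "$want" in *.*) want="${want#*.}" ;; *) want="" ;; esac
    # keep the leading digits, strip leading zeros, default to 0
    h="$(printf '%s' "$h" | sed 's/[^0-9].*$//; s/^0*//')"
    w="$(printf '%s' "$w" | sed 's/[^0-9].*$//; s/^0*//')"
    h="${h:-0}"
    w="${w:-0}"
    if [ "$h" -gt "$w" ]; then return 0; fi
    if [ "$h" -lt "$w" ]; then return 1; fi
  done
  return 0
}

# Example (not executed here):
#   version_at_least "$(docker version --format '{{.Server.Version}}')" 18.09 \
#     || echo "Docker Engine is older than 18.09"
```

Failing early with a clear message is usually easier to debug than an unknown flag error in the middle of the build.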

To enable BuildKit by default, set the buildkit feature to true in the daemon configuration at /etc/docker/daemon.json and restart the daemon:

{ "features": { "buildkit": true } }

On CircleCI you can alternatively use:

version: 2
jobs:
  build:
    docker:
      - image: circleci/elixir
    environment:
      DOCKER_BUILDKIT: 1

Additionally, modify .circleci/config.yml to make sure the job uses the specific Docker version we need:

- setup_remote_docker:
    version: 18.09.3
    docker_layer_caching: true

A side note: SSH Agent Forwarding has its flaws. If your proxy machine is compromised and you use SSH agent forwarding to connect to another machine through it, then you risk also compromising the target machine. You might say that host only belongs to yourself, there is no other user on it, even less so someone with root access. But then again: Why take the chance? You might also say that the window of compromise is small since it is only open while you’re connected to the host. Again: Why take the risk?

You may add your keys with ssh-add -c, which requires a confirmation each time a program wants to use the agent to authenticate somewhere. Moreover, you can use either ProxyCommand or ProxyJump instead of agent forwarding. That way, ssh will forward the TCP connection to the target host and the actual connection will be made on your workstation. If someone on the proxy machine tries to MITM your connection, you will be warned by ssh.

Building

The easiest way to enable BuildKit on a fresh install of Docker is to set the DOCKER_BUILDKIT=1 environment variable when invoking the docker build command, such as:

$ export DOCKER_BUILDKIT=1
$ docker build .
# or just
$ DOCKER_BUILDKIT=1 docker build .

Since we are adjusting the docker build command, let’s stay here for a moment. It has an --ssh option that allows the Docker Engine to forward SSH agent connections. Therefore, our final command will be:

DOCKER_BUILDKIT=1 docker build --ssh default .

The flag accepts a key-value pair defining the location of the local SSH agent socket or the private keys. The value can be left empty if you want to use default=$SSH_AUTH_SOCK. Note that with the default configuration you need to add your keys to your local SSH agent, as we did in the previous step, because Docker won’t pick up your ~/.ssh/id_rsa key automatically.
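
The two forms of the flag can be wrapped in a small helper. This is only a sketch: ssh_mount_arg and build_with_ssh are hypothetical names, and only the default id is covered.

```shell
#!/bin/sh
# Build the value passed to `docker build --ssh`.
# Without an argument, the agent socket from $SSH_AUTH_SOCK is used;
# with a path, that key file (or socket) is forwarded instead.
ssh_mount_arg() {
  if [ -n "${1:-}" ]; then
    printf 'default=%s' "$1"
  else
    printf 'default'
  fi
}

# Build the current directory with BuildKit and SSH forwarding enabled.
# $1 (optional): path to a private key or an agent socket.
build_with_ssh() {
  DOCKER_BUILDKIT=1 docker build --ssh "$(ssh_mount_arg "${1:-}")" .
}
```

Calling build_with_ssh without arguments corresponds to the command shown above, while passing a key path skips the agent and hands that key to the build directly.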

Dockerfile

Only the commands in the Dockerfile that explicitly request SSH access by defining a type=ssh mount have access to SSH agent connections.

To request SSH access for a RUN command in the Dockerfile, define a mount with type ssh. This will set up the SSH_AUTH_SOCK environment variable to make programs relying on SSH automatically use that socket.

You might imagine we will need a command like:

RUN --mount=type=ssh git clone git@github.com:org/dependency.git

However, having it running out of the box would be too simple. You will quickly see the error: Dockerfile parse error line 8: Unknown flag: mount.

At the time of writing, the mount flag is not available in the stable channel of the Dockerfile frontend, so you need to use one of the releases in the experimental channel. To do that, make the first line of the Dockerfile a comment naming the frontend image: # syntax=docker/dockerfile:experimental.

OK, let’s add that and check once again:

=> ERROR [3/3] RUN --mount=type=ssh git clone git@github.com:org/dependency.git    0.6s
------
> [3/3] RUN --mount=type=ssh git clone git@github.com:org/dependency.git:
#10 0.356 Cloning into 'dependency'...
#10 0.358 error: cannot run ssh: No such file or directory
#10 0.358 fatal: unable to fork
------

It seems we don’t have openssh-client installed in our alpine image but we can change that easily:

RUN apk add \
    --update \
    --no-cache \
    git \
    openssh-client

I know you expect to have it working finally, but I have to disappoint you again. Don’t run it yet, I’ll tell you what would happen:

=> ERROR [3/3] RUN --mount=type=ssh git clone git@github.com:org/dependency.git    1.5s
------
> [3/3] RUN --mount=type=ssh git clone git@github.com:org/dependency.git:
#10 0.594 Cloning into 'dependency'...
#10 1.204 Host key verification failed.
#10 1.204 fatal: Could not read from remote repository.
#10 1.204
#10 1.204 Please make sure you have the correct access rights
#10 1.204 and the repository exists.
------

What we have to do is use the ssh-keyscan tool to gather GitHub’s public SSH host keys and put them into ~/.ssh/known_hosts. To do that, add the following lines to your Dockerfile (the directory gets mode 0700 because a directory needs the execute bit to be entered):

RUN mkdir -p -m 0700 ~/.ssh && \
    ssh-keyscan github.com >> ~/.ssh/known_hosts

and that’s it. At this stage, everything will work as expected. To confirm that, run:

$ DOCKER_BUILDKIT=1 docker build --ssh default -t blog .
[+] Building 5.1s (12/12) FINISHED
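
Putting the pieces of this section together, the whole Dockerfile can be sketched as a script that writes the file and builds it. The helper names write_dockerfile and build_image are hypothetical; org/dependency.git and the blog tag are the placeholders used throughout this article.

```shell
#!/bin/sh
# Write the Dockerfile assembled from the steps in this section.
# $1: target path for the generated Dockerfile
write_dockerfile() {
  cat > "$1" <<'EOF'
# syntax=docker/dockerfile:experimental
FROM alpine

# git and an SSH client are both needed to clone over SSH
RUN apk add --update --no-cache git openssh-client

# trust GitHub's host keys so that host key verification passes
RUN mkdir -p -m 0700 ~/.ssh && \
    ssh-keyscan github.com >> ~/.ssh/known_hosts

# only this instruction gets access to the forwarded SSH agent
RUN --mount=type=ssh git clone git@github.com:org/dependency.git
EOF
}

# Build the image with BuildKit and SSH agent forwarding.
# $1: build context directory that contains the Dockerfile
build_image() {
  DOCKER_BUILDKIT=1 docker build --ssh default -t blog "$1"
}
```

Running write_dockerfile ./Dockerfile followed by build_image . should reproduce the build shown above, provided the deploy key has already been added to the SSH agent.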

What are the alternatives?

Sometimes, you don’t have to be so concerned about security. You may run everything in some private space or a completely isolated environment. There are some simpler solutions for such cases.

For instance, you could clone the repository on CI, outside the container, and copy it into the image. Alternatively, you may just ADD your id_rsa to the image and use it from there, keeping in mind that the key then stays in the image layers.

Everything depends on your use case and the security level you need. Once you know these methods and are aware of the risks, you may adjust the implementation to your particular needs.

Summary

Right now, your deployment pipeline is pretty complete. You are able to check out your project from a private GitHub repository, with private dependencies stored there as well. Then, you can run the test suite and lint the codebase on any continuous integration server, and build an artifact in the form of a Docker image ready to run as a container. Finally, you may deploy it to a Kubernetes cluster and expose it to the world using your favorite load balancer.

I hope you will find this tutorial useful and it will help you when configuring your own automation process.
