Friday, December 30, 2016

Ansible and Vault

Let's start with a cliché. Everyone agrees that security is important, right? We work on a deployment process and we need to get secrets into the system securely. Secrets can be passwords and private keys, for instance.

We use Ansible for configuration management, so we started with ansible-vault. It has some drawbacks, so we found a compelling contender - Vault, made by HashiCorp. I would like to describe how we made Ansible work with Vault.

Ansible

Ansible (https://www.ansible.com) - automation for everyone. Dubbed the most developer-friendly automation tool, it sports an easy syntax, it is simple to translate to from what devs are usually most acquainted with - Bash - and it runs masterless, so one does not need a special master node to handle updates as in the cases of Chef and Puppet. The learning curve is favorable.

Vault

Vault (https://www.vaultproject.io) - a server application for storing secrets securely and allowing access to them via a remote API. It supports a host of features like access tokens, fine-grained access rights, auditing, revoking only the secrets that were compromised, on-demand certificate provisioning, and more.

Running Vault

We use Vault in the development configuration, since setting up a production Vault is more complicated and by nature hard to automate. The development configuration suffices to get you started.

We will provision it locally with Ansible and we will directly set it up with some access policy, enter secrets and get a token which will give us read-only access to some of the secrets.

The structure will be the following:

A simple playbook that invokes the vault role.
vault.yml:
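A minimal version could look like this (the role name vault and the local connection are assumptions):

    - hosts: localhost
      connection: local
      roles:
        - vault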

It can be invoked with ansible-playbook vault.yml.

We leverage the fact that we are working with Docker (v1.12.5) and use the Ansible Docker integration (available in Ansible v2.2). We deploy the official Vault image on Docker first and then perform operations on it.
The steps are (a sketch of the corresponding role tasks follows the list):

  • Run the image with Docker.
  • Get the image logs.
  • Grep them for the root token.
  • Run the setup-vault.sh.
  • The script outputs an access token for the specified read policy and we save it to ~/.vault-token file where the Ansible Vault lookup plugin expects it.
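The vault role tasks could be sketched roughly like this; the container name, the wait step and the way the script output is handled are assumptions:

    # roles/vault/tasks/main.yml
    - name: run the official Vault image in dev mode
      docker_container:
        name: dev-vault
        image: vault
        capabilities:
          - IPC_LOCK
        published_ports:
          - "8200:8200"
        state: started

    - name: wait until the dev server is listening
      wait_for:
        port: 8200
        delay: 2

    - name: get the container logs and grep them for the root token
      shell: docker logs dev-vault 2>&1 | grep 'Root Token:' | awk '{print $NF}'
      register: root_token

    - name: run setup-vault.sh with the root token
      command: ./setup-vault.sh {{ root_token.stdout }}
      args:
        chdir: "{{ role_path }}/files"
      register: read_token

    - name: save the read-only token where the lookup plugin expects it
      shell: printf '%s' "{{ read_token.stdout | trim }}" > ~/.vault-token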
The setup-vault.sh script:
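Roughly, it could do something like the sketch below; the secret values, the read-policy name and the use of the HTTP API (instead of the vault CLI) are assumptions, with the root token passed in as the first argument:

    #!/bin/sh
    # sketch of setup-vault.sh - sets up a dev Vault via its HTTP API
    set -e

    VAULT_ADDR=http://localhost:8200
    ROOT_TOKEN=$1

    # create the read-only access policy from read-policy.json
    curl -s -H "X-Vault-Token: $ROOT_TOKEN" -X PUT \
         -d @read-policy.json "$VAULT_ADDR/v1/sys/policy/read-policy"

    # enter two secrets
    curl -s -H "X-Vault-Token: $ROOT_TOKEN" -X POST \
         -d '{"value": "ldap-secret"}' "$VAULT_ADDR/v1/secret/app/test/ldap-pwd"
    curl -s -H "X-Vault-Token: $ROOT_TOKEN" -X POST \
         -d '{"value": "db-secret"}' "$VAULT_ADDR/v1/secret/app/test/db-pwd"

    # create a token limited to the read policy and print it
    curl -s -H "X-Vault-Token: $ROOT_TOKEN" -X POST \
         -d '{"policies": ["read-policy"]}' "$VAULT_ADDR/v1/auth/token/create" \
         | python -c 'import json,sys; print(json.load(sys.stdin)["auth"]["client_token"])'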

The setup creates the access policy, enters two secrets and creates a token which has the aforementioned policy so one can only read these secrets with the token.

The read-policy.json:
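Assuming the script posts it straight to the sys/policy endpoint as sketched above, it can be as small as this (the rules wrapper and the path pattern are assumptions):

    {
      "rules": "path \"secret/app/test/*\" { capabilities = [\"read\"] }"
    }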

You can check the setup works by running:

  • export VAULT_TOKEN=$(cat ~/.vault-token) - to export the file content into an environment variable.
  • curl -H "X-Vault-Token: $VAULT_TOKEN" localhost:8200/v1/secret/app/test/ldap-pwd - to hit the Vault API.

Ansible with Vault

Now that we have Vault up and running, let's get secrets into Ansible. It is surprisingly easy thanks to the good work of Johan Haals on his ansible-vault lookup plugin (https://github.com/jhaals/ansible-vault). I would say that the name is confusing (being the same as the ansible-vault tool), but other than that the plugin works very well. I have also read that chances are the plugin will make it into Ansible itself, so you wouldn't have to install it manually.

ansible-vault installation

Let's say we stick with the default Ansible plugin location as outlined in the configuration.

  • sudo mkdir /usr/share/ansible/plugins/lookup
  • sudo vim /etc/ansible/ansible.cfg and uncomment the line lookup_plugins = /usr/share/ansible/plugins/lookup.
  • sudo git clone https://github.com/jhaals/ansible-vault.git /usr/share/ansible/plugins/lookup/ansible-vault
Well, you're done!

Retrieving Secrets

In our case we have configuration templates and we want to replace hard-coded secrets with a lookup. A typical template in our case is an XML file or a properties file. It may look like this:
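For instance a properties file with an LDAP password hard-coded in it (the property names and values are just illustrative):

    # application.properties - before, with the password hard-coded
    ldap.url=ldap://ldap.example.com:389
    ldap.user=cn=admin,dc=example,dc=com
    ldap.password=changeit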

We put the password in Vault and modify the template:
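A sketch of the templated version, assuming the plugin's lookup('vault', <path>, <field>) signature and an env variable holding the environment name:

    # application.properties - the Jinja2 template source
    ldap.url=ldap://ldap.example.com:389
    ldap.user=cn=admin,dc=example,dc=com
    ldap.password={{ lookup('vault', 'secret/app/' + env + '/ldap-pwd', 'value') }}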

Ansible's template module uses Jinja2, so we wrap variables in {{ }}. Inside we put the lookup call that invokes our freshly installed vault lookup plugin.
We can also see how to use a variable within an Ansible lookup: simply put it in without any quotation. In case you need some literals around it, like we do, concatenate them with +.

The last step is to invoke the template module from a playbook:
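A minimal playbook for that could be (the host group and role name are assumptions):

    - hosts: app-servers
      roles:
        - app-config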

And the role:
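It might be no more than a single template task (the destination path is an assumption):

    # roles/app-config/tasks/main.yml
    - name: render application.properties with secrets from Vault
      template:
        src: application.properties
        dest: /opt/app/conf/application.properties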

The source application.properties can be found in the files folder of the role.

That's it! Personally, I struggled a bit with the Vault setup and its API call syntax, but the documentation was helpful, and once I had that running, making Ansible use Vault was very smooth.

Tuesday, December 20, 2016

Building Docker Images with Packer and Ansible

I recently worked on a project where we were explicitly asked to combine these three technologies. While I'm not a hundred percent convinced it is the right pick for the customer, it may be useful in some cases, so I'll outline how to use them together.

Let's start with what they are.

Docker

Docker (https://www.docker.com) - build, ship, run. It is a tool made for running many small services. It replaces the now-common paradigm of hardware running VMs with containers running in the operating system. It is useful when developing microservices or a generic service-oriented architecture, or when one has a few system dependencies and wants to decrease the dev-prod disparity by having everything the same from the bottom up.

Packer

Packer (https://www.packer.io) is a tool for creating machine and container images for multiple platforms from a single source configuration. The configuration is expressed with a set of provisioners, which can be any combination of shell, Chef, Puppet, Ansible, Salt, you name it. The target platform is expressed with a builder. One can provide multiple builders at once, though I guess it won't be as straightforward, but we will get to that.

Ansible

Ansible (https://www.ansible.com) - automation for everyone. Dubbed the most developer-friendly automation tool, it sports an easy syntax, it is simple to translate to from what devs are usually most acquainted with - Bash - and it runs masterless, so one does not need a special master node to handle updates as in the cases of Chef and Puppet. The learning curve is favorable.

Why Would One Combine Them?

In our case we have a few services (2-5) which need some other services (LDAP, RabbitMQ, a database) and need to be highly available (so everything needs to run at least twice and an ambassador pattern comes in handy). There is also a DMZ part with reverse proxies for SSL termination and load balancing. We have tens of environments, but there are roughly three categories of them and within a category they differ only in secrets - database credentials, passwords, certificates. The rest can be configured using convention over configuration. We decided to bake the environment-specific configuration into our images at deployment time.

Now we need to build the Docker images we deploy. We already knew Ansible, and as it turns out one can leverage that knowledge (and Ansible tools like templates, the Vault plugin, and more) to configure and, to some extent, build Docker images when one employs Packer. Another reason is that the customer is not hard set on Docker and may turn to AWS, for instance. Then, so far only theoretically, part of the job is already done and we only need to configure another builder.

I would say it may also be handy for operations in bigger organizations which need to maintain base images used by several teams with different target platforms. Then one may be able to output different images on demand relatively smoothly. One may also leverage different provisioners (like all the main four - Chef, Puppet, Ansible, Salt) if, say, the security team likes one and the sysadmin team likes the other.

Anyway, for us it was a requirement and this is how we made it work.

Putting Things Together

In our case we have a Packer configuration that contains an Ansible provisioner and a Docker builder. packer build first starts the base image in the local Docker engine, then runs the Ansible provisioner on it, and then can tag the resulting image and push it to a Docker registry.

Packer

Packer is configured with a JSON file. The build is then started with packer build config.json.

In our case it looked like this:
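A sketch of such a configuration follows; the image names, file names and tagging scheme are assumptions, while the options discussed below are the parts that matter:

    {
      "variables": {
        "env": null,
        "app_version": null,
        "docker_registry": "my-registry"
      },
      "provisioners": [
        {
          "type": "ansible",
          "playbook_file": "./web-app.yml",
          "groups": ["web-app"],
          "extra_arguments": [
            "--extra-vars", "env={{user `env`}} app_version={{user `app_version`}}",
            "--extra-vars", "ansible_connection=docker ansible_user=root",
            "--vault-password-file", ".vault-password"
          ]
        }
      ],
      "builders": [
        {
          "type": "docker",
          "image": "{{user `docker_registry`}}/java-tomcat-base:latest",
          "commit": true,
          "run_command": ["-d", "-i", "-t", "--name", "default", "{{.Image}}", "/bin/bash"]
        }
      ],
      "post-processors": [
        [
          {
            "type": "docker-tag",
            "repository": "{{user `docker_registry`}}/web-app",
            "tag": "{{user `app_version`}}"
          },
          { "type": "docker-push" }
        ]
      ]
    }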

Let's go over it section by section, since it was not entirely evident how to figure it all out.

Variables

  • One needs to declare every variable that will be used in the config, even if it is passed as a parameter. The parameters are passed into the build like this:
    packer build -var "env=$ENV" configure-web-app.json
  • Variables are referenced in the Packer configuration with {{user `app_version`}} placeholders.

Provisioners

  • type ansible says we will use Ansible. Sounds obvious, but there is also ansible-local, which invokes Ansible on the image itself, but then one needs to install Ansible on the image.
  • playbook_file - references an Ansible playbook. The path is relative.
  • groups - needs to match with hosts in the playbook.
  • extra_arguments - here one passes variables that get into the playbook as well as some Ansible related configuration. The particularly hard ones were:
    • ansible_connection="docker" since the default is ssh and the documentation around Docker does not even mention another type.
    • ansible_user="root" since otherwise it throws some weird error and one finds the right answer in some bug reports. Again, sadly, not much help in the documentation.

Yes, it is annoying to have to enumerate all the variables. We assume that ansible_connection and ansible_user would have to be overridden to something else for other builders.

Builders

We use only the Docker builder. We define:

  • image - the base image to start from.
  • run_command - it may look cryptic at first, but these are just the parameters Packer invokes docker with. The result looks something like: docker run -dit --name default my-registry/web-app:1.0.42 /bin/bash. Hence it will not attach to standard output (-d/--detach), it will run interactively (-i/--interactive) so it will not terminate immediately, it will allocate a pseudo-TTY (-t/--tty), the container will be named default so you know what to remove if you terminate Packer in the middle of a run, and the base image will be what you've specified. It will run Bash.
  • commit - means the container will be committed to an image (Packer does not really spell out what that means, but it is essentially docker commit), so it ends up in the local Docker engine's image store where it can later be tagged and pushed.

Post-processors

Post-processors allow you to say what should happen with the build result. We tag the image and push it to the local repository where it is picked up by the deployment (docker-compose).

Ansible

We should mention that running Ansible implies Python has to be installed on every image built, which may bloat the images a bit.

An Ansible playbook for this can be something simple, and the role's tasks/main.yml then contains the actual steps; sketches of both follow.
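The playbook can be as small as this (the host group must match the groups value in the Packer provisioner; the role name is an assumption):

    # web-app.yml
    - hosts: web-app
      roles:
        - web-app

The role's tasks/main.yml might contain something along these lines (the paths, file names and variable names are assumptions based on the rest of this post):

    # roles/web-app/tasks/main.yml
    - name: include secured variables
      include_vars: "{{ role_path }}/files/sec-vars-{{ env }}-{{ config_version }}.yml"

    - name: deploy the application WAR
      copy:
        src: "web-app-{{ app_version }}.war"
        dest: /usr/local/tomcat/webapps/web-app.war

    - name: render the environment specific properties
      template:
        src: application.properties
        dest: /usr/local/tomcat/conf/application.properties

    - name: render the logback configuration
      template:
        src: logback.xml
        dest: /usr/local/tomcat/conf/logback.xml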

Just a bit on what we are doing. We deploy a Java Spring web application running on Tomcat so the deployed image is based on a base image with Java, Tomcat and some helper scripts installed (like wait-for-it.sh). Then we add the application WAR file and configuration like environment properties or logback XML.

Since the configuration may differ a bit for different use cases, we support different profiles. It may also change as the application evolves, so it needs to be versioned. As I discussed here, I find it useful to have such configuration separated from the source code, so I used Spring Cloud Config Server backed by a Git repository (Spring Cloud Config example).

Secrets - Vault and ansible-vault

There are people who don't like secrets sitting in Git in plaintext. So far we have found two ways to deal with it:

  • ansible-vault - a tool built into Ansible that allows encrypting and decrypting files.
  • Vault (by HashiCorp) - an application that stores secrets securely and allows access to them via a remote API. It supports a host of features like access tokens, fine-grained access rights, auditing, revoking only the secrets that were compromised, on-demand certificate provisioning, and more.

Ansible-vault

Any file can be encrypted with ansible-vault encrypt file. ansible-vault also supports the operations edit and decrypt.
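For example (the file name matches the one referenced by the include secured variables task; the password file name is an assumption):

    # encrypt the file with secured variables
    ansible-vault encrypt sec-vars-my-env-5.yml

    # edit or decrypt it later, reading the password from a file
    ansible-vault edit sec-vars-my-env-5.yml --vault-password-file .vault-password
    ansible-vault decrypt sec-vars-my-env-5.yml --vault-password-file .vault-password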

If you look at our Ansible role, the include secured variables task references a file in the files folder with a name like sec-vars-my-env-5.yml, which is encrypted with ansible-vault. When Ansible encounters such a file it decrypts it and then uses it. The password can be provided manually, but to avoid having it floating around in commands one can specify a password file - have a look at the --vault-password-file option in the Packer configuration JSON.

The downside of ansible-vault is that one can use only a single password to secure all the files in one context - so the most fine-grained it gets is one password per environment, and only if there is no generic encrypted file shared by all of them. Also, there is hardly any auditing of changes, since one can only tell that the file was changed. One can only dream about revoking someone's access, etc.

So we used it only to showcase that we can secure some parts of the configuration; it also helped us identify everything that needs to be secured.

Vault

We are yet to probe in the Vault direction so there may be another post about that. I think it is also not relevant for this post.

Issues

Combining a few tools with different levels of abstraction inevitably introduces some problems.

Building Debian-based Image on CentOS

For instance, we work for a big company which provided us with VMs running the latest CentOS, while locally we run mostly Debian-based distros. Everything works fine until at some point we need to build an image on such a host and want to install a package in the Ansible role. The base image is Debian. The build fails with:

The Internet is silent on this. Digging did not help, so I had to resort to thinking and documentation :) It turns out Ansible's package module abstracts over the different package managers but under the hood uses the specific modules like apt or yum, which then call the real apt-get or yum. But there is a catch - the apt module requires python-apt and aptitude on the host that executes the module, in our case the CentOS host. So our builds run only if the underlying host has the matching package tooling, effectively the same distribution family as the image. Well, we can (and did) revert to Dockerfiles for these kinds of tasks, but that breaks the whole abstraction. We could also probe the ansible-local Packer provisioner, since it runs on the very machine it configures.
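For illustration, a task as innocent as this one (the package name is just an example) is enough to hit the problem:

    # uses the distribution-agnostic package module, which delegates to apt on Debian
    - name: install curl in the image
      package:
        name: curl
        state: present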

The Documentation Could be Better

Case in point: the whole setup of the Packer config magic. Once it is set up it's done for good, but the lack of documentation on things like ansible_connection and ansible_user may repel some people right at the very beginning.

Proxy

How does one run Packer with Docker behind a corporate proxy? Well, so far we don't know, and we resorted to running these tasks with plain Dockerfiles and docker build --build-arg http_proxy, which propagates the system proxy setting into the Docker image build.
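A sketch of such an invocation (the image name is an example):

    # pass the host's proxy settings into the image build
    docker build \
      --build-arg http_proxy=$http_proxy \
      --build-arg https_proxy=$https_proxy \
      -t my-registry/web-app-base:latest .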