Sensitive info in the project!
Are your AWS keys public?

I have recently been more and more involved in tasks such as deployment and server management, and I am still working through the nuts and bolts of the whole process: pushing changes to the repository, deploying them, and making sure that each resource in the infrastructure has started running successfully with the new changes.

When going through our deployment scripts and resources the other day, this question popped up in my head.

How do we manage our sensitive info in the project?

Sensitive info consists of any of the below and more-

After some digging, I found that we had stored these in the codebase itself, so they were included in the version control system (git). This is a basic mistake that people make in the initial phase of a project without understanding the consequences. I'll talk about it more later in the post.

Before proceeding, we must understand how and why this happens. Say you have created your awesome app, which uses some third-party APIs or has some critical configuration. The data for those is sensitive and must not be made public, so you have set up your project such that whoever starts your app provides all that sensitive data manually, in a configuration file or through environment variables, and your app uses that.
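To make the environment-variable approach concrete, here is a minimal sketch of an app reading its sensitive settings at startup and refusing to run if any are missing. The setting names and the `load_settings` helper are hypothetical, chosen only for illustration; your app will have its own.

```python
import os

# Hypothetical setting names, used only for illustration.
REQUIRED_SETTINGS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "DATABASE_URL")


def load_settings(environ=os.environ, required=REQUIRED_SETTINGS):
    """Return the required settings from the environment, or fail loudly."""
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {name: environ[name] for name in required}
```

Failing at startup, rather than on first use, means whoever launches the app finds out right away that a value was not provided.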

All your sensitive info is off the codebase, known only to one or more authorized people, and things work fine UNTIL your product starts to grow and you face the following problems.

That is to say, the offline method is not scalable. When this happens and there are multiple versions of the configuration files, you need to find a way to store them securely and manage them better, while also ensuring their availability and integrity. You can’t keep all of them offline anymore.

The intuitive and probably the best option is to put them in a private, secure location (accessible only by authorized personnel such as DevOps engineers) and, whenever the app is instantiated, pull all the data from there.
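As a rough sketch of that pull step, assume the settings for one environment are kept as a single JSON object in the private location. The `pull_settings` and `read_private_file` names are hypothetical, and the file-based fetcher is only a stand-in: in practice you would swap in whatever client your private store needs.

```python
import json
from pathlib import Path


def read_private_file(location):
    """Stand-in fetcher: read the config from a path only deployers can access."""
    return Path(location).read_bytes()


def pull_settings(location, fetch=read_private_file):
    """Fetch the JSON config stored at `location` and return it as a dict."""
    raw = fetch(location)
    settings = json.loads(raw)
    # A config that is not a JSON object is almost certainly the wrong file.
    if not isinstance(settings, dict):
        raise ValueError("Expected a JSON object of settings")
    return settings
```

Keeping the fetcher as a parameter means the deployment script does not care where the data lives; only the function passed as `fetch` changes when the storage does.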

The above is the basic idea of how to automate deployment and configuration of one or more servers while keeping the sensitive data managed securely in one place. Of course, it’ll need some work on our end to modify our deployment script accordingly. To give you a better understanding, I’ll explain what we did in one of my projects-

Note:-

A repository is better in terms of modification and history management of the sensitive data itself, which is not fully available with S3. The benefit mainly shows when you have lots and lots of sensitive data and multiple DevOps engineers who may be making changes to the same key(s) simultaneously, although this is improbable in most projects.


Now let’s discuss what I wrote earlier - committing any sensitive data to a version control system is a big mistake. Because-

With all that said, there are multiple tools, such as truffleHog, which search a repo's full history for sensitive info. Use one if you are not confident that your repo is clean.
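To give a feel for what such a scan looks for, here is a tiny sketch that flags strings shaped like AWS access key IDs. This is not how truffleHog works internally; it only checks one commonly used pattern ("AKIA" followed by 16 upper-case letters or digits) in whatever text you hand it, and `find_key_ids` is a hypothetical name.

```python
import re

# Commonly used pattern for long-term AWS access key IDs (an assumption,
# not an exhaustive rule): "AKIA" plus 16 upper-case letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def find_key_ids(text):
    """Return (line_number, match) pairs for every suspected key ID in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in AWS_KEY_ID.findall(line):
            hits.append((number, match))
    return hits
```

Running this over the output of `git log -p` would cover past commits as well as the current files, but a dedicated tool checks far more than this single pattern.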

*****
Written by Saurabh Goyal on 19 July 2017