Written by Colin Rand October 19, 2020
Secure Remote Access – An Engineer’s Pain
Engineering teams need secure access to dynamic hosts, services, and applications to productively do their jobs. VPNs, falling short of today’s security requirements with their “one size fits all” strategy, are often at the core of serious usability, manageability, and security issues. Lately, Zero Trust is all the buzz, and for good reason. In this post I'm going to discuss some remote access challenges felt by engineering teams that are beautifully solved with a Zero Trust solution.
While access challenges cause pain and suffering to all end users, they can and do present serious issues for development teams. And engineers, being smart and loving a challenge, often work around those issues. Here are a few personal anecdotes that highlight what goes wrong in the pits of engineering when remote access fails us. Do any of these sound familiar?
One time I worked in an environment that was really locked down. No developers had access to production, no development environments were accessible without a VPN, etc. All the usual suspects. An enterprising developer who wanted to do some prototyping work from home decided that the VPN was too troublesome, so of course the dev just copied “his” source code, uploaded it to Google Drive, downloaded it onto his personal workstation at home, and... need I go on? The lesson: the desire to be productive was treated as more important than pesky security policy, and a big security hole was created as a result.
Another time a developer, having heard about new policies coming they didn't want to deal with, set up their own private bastion host in production. Of course, they didn't tell anyone, and soon after ended up leaving the company’s employment. They later told a former colleague what they had done over drinks, laughing about how they could still get into production anytime they wanted.
I’ve even had developers insist they couldn't be on call because they couldn't get onto the VPN from home. Yeah, the rest of the team always loves that excuse.
No More Excuses
Different teams have different remote access needs. All security teams think through the process of what resources are being protected, their sensitivity, and what is at risk of misuse. They have sophisticated means for analyzing risk profiles, but suffer with a blunt tool for handling the needs of the modern remote-first engineer. These design decisions become tradeoffs for what work needs to be done: criticality and time sensitivity of the task vs. the risk that is introduced. Yesterday we were concerned about 'where' the work needed to be done. Today that is irrelevant; it's anywhere and everywhere.
All Teams Look Alike
Go into a modern software engineering organization and you will see many teams and activities being performed. To name a few:
- Site Reliability Engineering (SRE)
- Apps & Services
- Data Engineering
- Data Analytics
Each team needs to be reviewed from a security perspective to determine the least-privilege access it needs to perform its role. Each needs its resources protected, its devices secured, and its identities validated. All that is confirmed? Good, they can perform their critical work. Safety first!
If only it were that easy. Each team has many similarities at a high level, but get into the details and their needs begin to diverge, often widely.
What is different about them?
Let's look at what's the same. They all have a wide assortment of 'things' they need to access that require protection. These 'things' include various TCP services (such as SSH), web apps and APIs (internally hosted or in the public cloud), SaaS, and oh yeah, throw in Kubernetes too.
The type of access each team needs is quite different. Perhaps your SRE needs access to production environments to see why a load balancer is misbehaving, but does the on-call developer supporting them also need this access? The DevOps team wants access to the build and development tools, such as the git and build servers, plus cloud environments, but should they have full access to production?
Another team, QA, needs to replicate issues found in production in production-like environments. They may need access to the hosts the services run on, or perhaps the databases themselves. But do they get access to the build tooling? What if the QA team is a subcontractor?
Each access decision requires discussion and design. What was previously one size fits all now works for none.
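One way to picture the per-team decisions above is as an explicit, default-deny policy table. The sketch below is only an illustration: the team names, resource names, and the specific grants are hypothetical placeholders, not a recommendation and not how any particular product models access.

```python
# Hypothetical teams and resources, chosen only for illustration.
# Each team is granted an explicit set of resources; nothing is implied.
POLICY: dict[str, frozenset[str]] = {
    "sre": frozenset({"prod-hosts", "prod-load-balancer"}),
    "devops": frozenset({"git-server", "build-server", "cloud-dev"}),
    "qa": frozenset({"staging-hosts", "staging-db"}),
}


def is_allowed(team: str, resource: str) -> bool:
    """Default deny: access is granted only if it is explicitly listed."""
    return resource in POLICY.get(team, frozenset())


if __name__ == "__main__":
    for team, resource in [
        ("sre", "prod-hosts"),
        ("qa", "build-server"),
        ("contractor-qa", "staging-db"),  # unknown team falls through to deny
    ]:
        verdict = "allow" if is_allowed(team, resource) else "deny"
        print(f"{team} -> {resource}: {verdict}")
```

Writing the grants down this way makes each open question from the text (should DevOps reach production? does a subcontracted QA team get the build tooling?) a visible, reviewable line rather than a side effect of network placement.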
When thinking about the design, fine-grained controls need to be implemented for each team, considering the sensitivity of the activity. Is production access needed, or is production data needed but not the rest of the infrastructure? The traditional hard boundaries of physical networks are now messy.
Let's look at a data engineering scenario. A production warehouse will have collection, aggregation, and analysis workloads. This might be implemented as a combination of cloud infrastructure, third-party SaaS tools, and internally developed applications. When a new engineer is onboarded, security factors to consider with regard to access control include whether their device is compromised and whether its disk is encrypted. Do you want to allow the engineer to pull sensitive data onto such a device, not knowing the state of its security? Perhaps a better path is allowing them to access a reporting UI from a personal device, with no data-level queries allowed. That might be a good alignment of risk vs. task disruption.
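The data engineering tradeoff above can be sketched as a small decision function that maps device posture to an access tier. This is a minimal sketch under stated assumptions: the `Device` fields, the `managed` signal, and the three tiers are hypothetical names invented for illustration, and a real deployment would take these signals from its own device-trust tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    DENY = "deny"
    REPORTING_UI_ONLY = "reporting-ui-only"
    DATA_QUERIES = "data-queries"


@dataclass(frozen=True)
class Device:
    # All three fields are assumed posture signals, named for illustration.
    managed: bool         # company-issued and enrolled, vs. a personal device
    disk_encrypted: bool
    compromised: bool


def warehouse_access(device: Device) -> Access:
    """Map device posture to the most access we are willing to grant."""
    if device.compromised:
        return Access.DENY
    if device.managed and device.disk_encrypted:
        # Trusted device: sensitive data may be pulled and queried.
        return Access.DATA_QUERIES
    # Personal or unencrypted device: reporting UI only, no data-level queries.
    return Access.REPORTING_UI_ONLY


if __name__ == "__main__":
    personal_laptop = Device(managed=False, disk_encrypted=True, compromised=False)
    print(warehouse_access(personal_laptop).value)
```

The point of the tiers is that the answer is not simply allow or deny: a device in an unknown state can still get a lower-risk path to the work, which is the risk vs. task-disruption balance described above.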
Each team has its own ecosystem of tools, each with its own quirks. (It's all software built on software after all.) Each time a different remote access strategy is involved, the engineer gets frustrated as more security workarounds are deployed, making for an increasingly fragile system that is more cumbersome to use. Want to eliminate shared passwords on that internally hosted service that doesn't have SAML support? Want to make sure a particular API is accessed only by devices that are deemed secure?
Oh, and don't forget Compliance! We'll cover what security engineering, cloud engineering, 3rd party, and offshore teams need and a whole host of details around IT Engineering in future posts.
Is it easy?
Is security easy? No. Is achieving “Zero Trust” easy? Certainly not at the boil-the-ocean level, but the good news is that a value-adding project with some sensible constraints is totally achievable. And doing so results in scalable identity-based access that factors in device health and security.
Step one is coming to grips with the challenge and deciding now is the time to take it on. Our engineering teams have been able to use the Banyan Security Remote Access Platform to eliminate VPN use in our own engineering efforts. Having done this myself, my recommendation is to tackle a small project, perhaps just a few SSH hosts, maybe GitHub, or perhaps just getting better visibility into your devices. Understanding the challenge is the first step on the path and nothing beats a little hands-on prototyping. Hit me up if you're interested!