The BeyondCorp paper explicitly mentions that device state is taken into consideration when granting access to a user, i.e. that the device is identified and controlled, not just the user. It seems to me like an important part of the BeyondCorp access model; otherwise, wouldn't this just be an SSO portal?
You are correct. The solution presented is not BeyondCorp but rather an SSO implementation that adds authentication to the internal application.
For BeyondCorp, the solution essentially:
* Must be Layer 7 protocol- and access-privilege-aware (achieved with an identity-aware access proxy).
* Promotes authorization as opposed to authentication only.
* Should be able to enforce security policies (time, location, context, 2FA).
* Must be aware of the security state of the user device (a rough sketch of such a policy check follows below).
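To make that concrete, here is a minimal sketch (all type names, thresholds, and the allowed-country list are invented for illustration, not taken from the BeyondCorp papers) of the kind of per-request decision an identity-aware access proxy makes, combining user identity, request context, and device state rather than authentication alone:

```typescript
// Hypothetical policy check an identity-aware access proxy might run per request.
// All fields and thresholds are illustrative.

interface AccessRequest {
  user: { id: string; groups: string[]; mfaPassed: boolean };
  device: { managed: boolean; diskEncrypted: boolean; patchAgeDays: number };
  context: { sourceCountry: string; hour: number }; // derived from IP / timestamp
  resource: { requiredGroup: string; sensitivity: "low" | "high" };
}

function authorize(req: AccessRequest): { allow: boolean; reason: string } {
  // Authentication alone is not enough: authorization is decided per resource.
  if (!req.user.groups.includes(req.resource.requiredGroup)) {
    return { allow: false, reason: "user lacks required group" };
  }
  // Contextual policy: 2FA, location (example allow-list), time, etc.
  if (!req.user.mfaPassed) return { allow: false, reason: "2FA not satisfied" };
  if (!["NL", "US"].includes(req.context.sourceCountry)) {
    return { allow: false, reason: "unapproved country" };
  }
  // Device security state: managed, encrypted, recently patched.
  if (req.resource.sensitivity === "high") {
    const d = req.device;
    if (!d.managed || !d.diskEncrypted || d.patchAgeDays > 30) {
      return { allow: false, reason: "device posture below policy" };
    }
  }
  return { allow: true, reason: "policy satisfied" };
}
```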
Shameless plug: Check out our zero trust service access project TRASA (https://github.com/seknox/trasa). It's free and open source and addresses many of the requirements outlined by BeyondCorp.
Heh. Though I am not an expert on the topic, I can recommend a few things. First, there are three directions the industry is heading with the "zero trust" thing:
(1) Zero trust access (like BeyondCorp; protects applications and services when a user, user credentials, or user devices are compromised)
(2) Network micro-segmentation (contains the impact when one network segment is compromised; dynamic network assignment)
(3) Zero trust browsing (protects users from getting infected by malicious content served by trusted but compromised websites)
Honestly, I am only more familiar with zero trust access, and for this I can recommend you first read "BeyondCorp: A New Approach to Enterprise Security" [0] by Google. The trend was kickstarted by that paper.
Azure AD provides a hook for this through Conditional Access, which will block sign-in to an application if your device isn't compliant with security policies or updates (or if you are logging in from an unapproved country).[0] Google provides something similar through Context-Aware Access, but I don't know if it goes as deep (Google used Puppet in the original paper to get device state info).[1]
Large enterprise deployments of phones or company-owned desktops/laptops, etc. very commonly include what would be called "network admission control" software. The device needs to meet a certain defined state of patch level/service pack/antivirus scan/other things (like GPO registry settings on a Windows machine) before being allowed to sign on.
it's all well and good to theoretically say that smaller companies should adopt a 'beyondcorp' type approach. but at a certain point in the threat model on the client device (keystroke loggers plus tools that send screenshots somewhere else, as found in black hat remote access tools/botnet tools), you need to have specialists in endpoint/workstation device security keeping on top of threats and defining the security policy.
what sketches me out about this particular article is that they're essentially trusting any client endpoint device that has the 2FA hardware token and a working browser. you could have a totally screwed up windows 10 laptop riddled with some very nasty RATs that would work fine to use the 2FA authentication tool and sign in to their service with chrome. there's nothing about verifying the state of the software and the trustworthiness of the operating system of the client device, which might be accessing very sensitive internal information.
i see literally nothing in that article about inspecting or trusting the state of the operating system or software on the client device. does it have a bunch of malicious browser plugins? who knows. is it running a remote desktop tool that's linked to somewhere else? who knows. is it infected with an advanced remote access tool? who knows. is it six months out of date on windows updates? who knows...
the article's assertion that a vpn-based approach is like an eggshell is false in my opinion. you should not have an environment where simple vpn auth allows you into the squishy inner center of private data. a belt-and-suspenders approach is needed.
Indeed, and the industry term for this is posture assessment. And many companies take this a step further and permit access only with organization-issued equipment, even if you possess authentication credentials.
Having 100% company-owned equipment allows you to do other common-sense things like:
a) full-disk encryption with key escrow for recovery by the admin team
b) storage of a crypto public/private key pair on the laptop's disk: for instance, an OpenVPN key file created on a company-owned PKI server, deployed onto the laptop as part of its provisioning process, and unique to both the human and that particular piece of hardware
c) you can use the same crypto key pair in the client device's local storage, if not for something like OpenVPN, then for other authentication purposes identifying that particular user and hardware (see the sketch after this list)
d) obviously, have the device trust your own internal PKI's root CA for access to purely-intranet resources. getting a company root CA trusted by the browsers in a BYOD environment is a pain in the ass.
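For item (c), a rough sketch of the idea: the provisioning process leaves a per-user, per-device private key on the laptop, and the server verifies a signed challenge against the public key it enrolled for that user/hardware pair. The example generates a throwaway key pair so it runs as-is; in a real deployment the private key would be the one your PKI deployed at provisioning time.

```typescript
// Sketch of per-user, per-device key authentication (item c). In a real deployment the
// private key is written to the laptop at provisioning time and the public key enrolled
// on the server; here we generate a throwaway pair so the example runs on its own.
import { generateKeyPairSync, createSign, createVerify, randomBytes } from "node:crypto";

// Stand-in for the key pair created by the company PKI during provisioning.
const { publicKey, privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });

// Client side: prove possession of the device key by signing a server-issued nonce.
function signChallenge(nonce: Buffer): Buffer {
  const signer = createSign("sha256");
  signer.update(nonce);
  return signer.sign(privateKey);
}

// Server side: verify against the public key enrolled for this (user, hardware) pair.
function verifyChallenge(nonce: Buffer, signature: Buffer): boolean {
  const verifier = createVerify("sha256");
  verifier.update(nonce);
  return verifier.verify(publicKey, signature);
}

// Server generates a fresh nonce per login attempt.
const nonce = randomBytes(32);
console.log("device authenticated:", verifyChallenge(nonce, signChallenge(nonce)));
```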
What does Google do in this regard? I don't work there, so I'm curious to hear about their solution for endpoint security.
From what I've heard, they don't allow sensitive data on laptops in the first place—you mostly SSH into your desktop or a cloud machine. That's probably not enough to solve the issues you described, so I wonder what else they do.
I would be shocked if they don't have a whole team of people keeping up on the threat models for Windows, macOS, and Linux client endpoint devices, and creating the equivalent of Windows Active Directory registry pushes plus other software loads to guarantee the condition of an endpoint device.
Otherwise, how do you know an endpoint device (assuming it's on a network segment with a default route out to the internet, or somebody is in work-from-home mode) isn't running a persistent video-recording session feeding something like a real-time mirror of the screen over a VNC-over-SSH tunnel to some third party?
At smaller companies I have even seen stories of a person who was hired as a fully remote developer by $software_corp and proceeded to set up a remote desktop tool and subcontract their entire job to a person in $developing_country, at a significant profit margin.
Never mind the fact that HDCP has been broken for ages and any random Chinese capture card will ignore it and any Chinese HDMI splitter will strip it; if the purpose is to cheat the system, you can just point a camera at a screen (perfect video quality isn't a requirement here).
Note this is for operations/privileged access in high risk environments vs. standard issue desktops. Saw a little bit of this up close a few years ago, seemed quite solid and well thought out.
Is posture assessment just asking the device to tell you it's OK? For the ignorant like me, it seems a motivated and capable adversary could have an insecure device report an OK posture.
Posture assessment is often built into modern VPN clients. The actual procedure varies by organization and can sometimes be updated by pushing new validation procedures to the client. It's unlikely to be as simple as "run this file on disk (which an attacker could trivially replace) and check the exit code."
My company uses configuration management to install a client certificate on company laptops. The "internal frontend" proxy checks for this client certificate in addition to AD credentials + Duo.
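For flavor, a minimal sketch of that kind of check using Node's built-in TLS support (certificate paths and the upstream address are placeholders, and in practice this would live in nginx or whatever proxy you already run): the proxy only completes the TLS handshake for clients presenting a certificate signed by the internal CA, then forwards the request to the app.

```typescript
// Sketch of an "internal frontend" proxy requiring a company-issued client certificate.
// Certificate paths and the upstream address are placeholders.
import { readFileSync } from "node:fs";
import https from "node:https";
import http from "node:http";

const server = https.createServer(
  {
    key: readFileSync("server-key.pem"),
    cert: readFileSync("server-cert.pem"),
    ca: readFileSync("internal-ca.pem"), // CA that signs device client certs
    requestCert: true,                   // ask every client for a certificate
    rejectUnauthorized: true,            // drop handshakes without a valid one
  },
  (req, res) => {
    // At this point the client presented a cert chained to the internal CA.
    // AD credentials + Duo are still enforced by the SSO layer behind this.
    const upstream = http.request(
      { host: "127.0.0.1", port: 8080, path: req.url, method: req.method, headers: req.headers },
      (upstreamRes) => {
        res.writeHead(upstreamRes.statusCode ?? 502, upstreamRes.headers);
        upstreamRes.pipe(res);
      }
    );
    req.pipe(upstream);
  }
);

server.listen(443);
```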
This is a great question! I hoped this would get brought up, as it is very important. I decided against covering it in this blog post as I felt it was already fairly long, but the tl;dr is that I see two incremental ways to add authorization to this setup:
1. Cognito has something called "Adaptive Authentication" that will compute risk scores for each login based on IP, device info, etc. You can customize in the AWS console how risk-tolerant you want to be.
2. You can go the fully-managed approach, which is what we are implementing at Transcend now. The idea is that you'd use an MDM like Fleetsmith to install a TLS cert onto each managed device, and then validate that cert on each request in the auth portal. There are lots of cool ways (we use the Vanta agent) to verify that a user's device is "good" to authenticate with.
I'd like to write more about option 2, but I try to keep these blog posts as technology-agnostic as I can, and my experience is fairly limited right now to Vanta + Fleetsmith.
A lot of companies that care deeply about security are moving to this “trust no one” approach which has the added benefit for end users of allowing access to “secure internal sites” over the plain old internet. If done right this can all be a big boost for security and improved end user experience. That said, the old “you need to be on the VPN” approach is going to stick around for some time.
For sure, VPNs will always be used. I think it'll take a BeyondCorp SaaS company for this to really take off (or for it to become a more "managed" auth method from the big cloud providers).
At Transcend we are able to do it because we had an early focus on protecting our internal apps, but obviously it's a lot harder to migrate hundreds of services than to start out with a newer approach.
I loved not having to use a VPN back when I worked at Google though, and am glad to see that the open source world is starting to offer some tools to play around with.
I mean, yes, if you have billions to dedicate to building a leading-class security team. Not all organizations have that money, and not all organizations need to take that approach; some do and some need to.
I read part of the article, but I'm confused. How is this different from making all our internal servers public and using Okta or Auth0 for sign-in?
I wouldn't do that because any of those servers could have a security vulnerability that we're not aware of, so I feel like this must protect against that somehow, but I'm just not fully understanding what it does.
The main difference is that all of these websites are public behind one big proxy (the ALB), not public on their own.
The security concerns are centralised in one place, not 10.
That's not to say the ALB can't have a bug or a misconfiguration that renders it wide open. But that's probably true for a VPN as well.
And the point of this is that, while application security is still important, it at least makes all those vulnerabilities post-auth, which is a huge improvement.
The poor man's version of this is to put all your services behind an nginx reverse proxy with HTTP Basic auth (and TLS of course). For personal/small scale operations, this is a great way to almost completely eliminate your attack surface, if you have single-digit users and they can be trusted. Everyone running webapps personally should prefer this over, or in addition to, app-specific login systems.
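Roughly the same idea, sketched as a tiny Node proxy instead of the usual two lines of nginx config (auth_basic + proxy_pass), just to show the moving parts; the credentials and upstream port are placeholders, and TLS termination is assumed to happen in front of it:

```typescript
// Poor man's authenticating reverse proxy: HTTP Basic auth in front of one upstream app.
// Credentials and ports are placeholders; serve this behind TLS only.
import http from "node:http";
import { timingSafeEqual } from "node:crypto";

const EXPECTED = Buffer.from("Basic " + Buffer.from("alice:correct-horse").toString("base64"));

function authorized(header: string | undefined): boolean {
  if (!header) return false;
  const got = Buffer.from(header);
  // Constant-time comparison of the whole Authorization header.
  return got.length === EXPECTED.length && timingSafeEqual(got, EXPECTED);
}

http
  .createServer((req, res) => {
    if (!authorized(req.headers.authorization)) {
      res.writeHead(401, { "WWW-Authenticate": 'Basic realm="internal"' });
      return res.end();
    }
    // Forward the authenticated request to the webapp on localhost.
    const upstream = http.request(
      { host: "127.0.0.1", port: 3000, path: req.url, method: req.method, headers: req.headers },
      (r) => {
        res.writeHead(r.statusCode ?? 502, r.headers);
        r.pipe(res);
      }
    );
    req.pipe(upstream);
  })
  .listen(8443);
```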
This encourages copy-and-pasted authentication JavaScript from service to service.
The ALB approach from the article at least centralizes the SSO dance in one place, but a typo in the Terraform would still be very hard to detect.
The BeyondCorp approach Google uses, as far as I know, relies on sophisticated proxy servers in front of ALL protected services to ensure the very tricky aspects like posture assessments, zero day patching, logging, rate limiting and other security best practices are handled in one place.
With a scattershot approach, companies may not be open to a VPN exploit anymore, but may have opened themselves up to many more individual exploits and much slower reaction times when an exploit is found.
Perhaps I don't see what you see, but this is server-side JavaScript (CloudFront calling a Lambda@Edge function - similar to Cloudflare Workers).
What's particularly scary about it?
As for "a typo in terraform would be very hard to detect" - perhaps, yes, assuming it didn't fail outright.
To mitigate that, I'd expect anyone deploying this for real to protect anything valuable to ensure unit tests are written for the JavaScript and to have code reviews of any security-sensitive code like this.
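For context, here is a stripped-down sketch of what such a Lambda@Edge viewer-request handler looks like (the cookie name, the validation stub, and the login URL are placeholders, not the article's actual code, which presumably validates Cognito tokens); hasValidSession is exactly the kind of function you'd want unit tests and review on:

```typescript
// Stripped-down Lambda@Edge viewer-request handler: let authenticated requests through,
// redirect everyone else to the SSO login page. Cookie name, validation logic, and the
// login URL are placeholders.
import type { CloudFrontRequestEvent, CloudFrontRequestResult } from "aws-lambda";

const LOGIN_URL = "https://auth.example.com/login";

function hasValidSession(cookieHeader: string | undefined): boolean {
  // A real implementation verifies a signed token (e.g. a Cognito JWT),
  // not just the presence of a cookie.
  return cookieHeader?.includes("session=") ?? false;
}

export const handler = async (event: CloudFrontRequestEvent): Promise<CloudFrontRequestResult> => {
  const request = event.Records[0].cf.request;
  const cookie = request.headers["cookie"]?.[0]?.value;

  if (hasValidSession(cookie)) {
    return request; // pass the request through to the origin
  }

  return {
    status: "302",
    statusDescription: "Found",
    headers: {
      location: [{ key: "Location", value: `${LOGIN_URL}?redirect_uri=${encodeURIComponent(request.uri)}` }],
    },
  };
};
```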
I don't think individual development teams should be hosting this code in their own services. Instead, they should declaratively specify the rules on what roles can do what, and rely on another layer to honor those requirements.
How’s this work with things that aren’t websites? Do you have to throw a proxy in front of e.g., your database server that keeps track of which IPs have already authenticated over the web?
the IAP is an HTTP proxy, so you need a way to send non-HTTP traffic. this might require client modifications (not everything is proxy-aware), and you can't always modify the source.
some protocols are UDP and latency-sensitive, which doesn't work well enough tunneled.
You run a service behind the HTTP proxy, or another proxy with a more suitable protocol like SSH, which can speak the required protocols (or just blindly forward TCP) across production. You run a CLI tool that binds a local port and forwards to this service.
In some ways this is a poor man's VPN server, but it can be smarter: with protocol support, you can combine the identity of the connected developer with application-level data (e.g. this is an INSERT statement) to make AuthZ decisions.
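A rough sketch of the client half of that, assuming the production-side proxy terminates mutual TLS and blindly forwards TCP (hostnames, ports, and cert paths are made up): the CLI listens on a local port and pipes each connection through an authenticated tunnel, so e.g. `psql -h 127.0.0.1 -p 15432` just works while the proxy can log and authorize per developer.

```typescript
// Sketch of a port-forwarding CLI: listen locally, pipe each connection through a
// mutually authenticated TLS tunnel to a TCP-forwarding proxy in production.
// Hostnames, ports, and cert paths are made up for illustration.
import net from "node:net";
import tls from "node:tls";
import { readFileSync } from "node:fs";

const LOCAL_PORT = 15432; // psql connects here as if it were the real database

net
  .createServer((local) => {
    // One authenticated tunnel per incoming local connection.
    const tunnel = tls.connect({
      host: "db-proxy.corp.example.com",
      port: 443,
      key: readFileSync("dev-key.pem"),  // developer identity presented to the proxy
      cert: readFileSync("dev-cert.pem"),
    });
    local.pipe(tunnel);
    tunnel.pipe(local);
    local.on("error", () => tunnel.destroy());
    tunnel.on("error", () => local.destroy());
  })
  .listen(LOCAL_PORT, "127.0.0.1");
```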
At Transcend, we use a bastion host (like others have mentioned). The key difference that we do that I don't think has been covered is that our bastion only makes outgoing connections, and has no open ports to the world.
Using the AWS SSM managed service, we can create bastions that have no ingress at all.
Household name companies routinely get owned because a pen tester finds a network drop in a conference room, or a smart thermostat that merely needs internet access becomes a beachhead. There's apparently a whole generation of IT that believes security is about what is and isn't allowed on "the corporate network."
In my tech career the office WiFi has never been more privileged than the coffee shop across the street, just faster.
We have a very similar setup with Cognito, GSuite and ALBs.
For CLI/API access we also have API Gateway, which allows us to authenticate with JWT tokens that we can issue via Cognito.
It's not perfect:
* This setup only deals with authentication. There is no authorization at all, i.e. ANYONE with an org Gmail account can get in.
* There is no real SSO. Say you have an application behind this proxy that you need to log in to. The proxy will pass you through to its login page, where you need to log in (again) via whatever is configured there. That's not to say it's impossible to solve, since you have enough info in the headers/cookies passed from the ALB to sort something out with a custom solution, but it takes time.
* You need to be very careful with your OAuth config. With GSuite, as an example, you can very easily configure the OAuth client to authenticate ANY @gmail user instead of just your @company users (see the sketch below).
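One way to guard against that last pitfall, sketched with the google-auth-library package (the client ID and domain are placeholders): verify the ID token server-side and reject anything whose `hd` (hosted domain) claim isn't your Workspace domain, rather than trusting the OAuth consent screen configuration alone.

```typescript
// Sketch: don't trust the OAuth client config alone; verify the Google ID token and
// check its `hd` (hosted domain) claim so random @gmail.com accounts are rejected.
// CLIENT_ID and ALLOWED_DOMAIN are placeholders.
import { OAuth2Client } from "google-auth-library";

const CLIENT_ID = "1234567890-abc.apps.googleusercontent.com";
const ALLOWED_DOMAIN = "company.com";
const client = new OAuth2Client(CLIENT_ID);

export async function verifyCorpUser(idToken: string): Promise<string> {
  const ticket = await client.verifyIdToken({ idToken, audience: CLIENT_ID });
  const payload = ticket.getPayload();
  if (!payload || payload.hd !== ALLOWED_DOMAIN) {
    throw new Error("token is not from the company GSuite domain");
  }
  return payload.email ?? "";
}
```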
This naming is needlessly confusing. BeyondCorp is a name invented by Google to describe its internal approach to security, and now it's a Google Cloud product. Why would Transcend reuse this name? It sounds as if Transcend is a middleman reselling a GCP product.
Apologies for the confusion! I picked BeyondCorp because it seems to be referenced as a general cloud term rather than anything Google Cloud specific.
Close but not close enough. Take an ALB, pair it with your SSO solution (AWS SSO, OneLogin, Okta, etc.). Add Duo and make sure to validate the devices as well. You need half, if not less, of the infrastructure outlined in this article.
I haven't implemented this extensively, but there are sometimes problems with setting this up in a corporate environment where there's a need to do things in a uniform way.
- Often, if you use a central database such as Active Directory for your internal users, you may need to set up your OAuth endpoint (say, an AWS Cognito User Pool) and then have an AD admin allow that user pool as a relying party. This means there is some lag time in setting this up, and it isn't done with automation. So if you want to spin up an application on a custom domain with a temporary user pool, test something, and then destroy it later, it's probably not going to work without some custom workarounds. No need for this with a traditional VPN.
- If you have 50 different apps, with 50 different URLs, you're going to need to do the above 50 times. Also have 50 staging and dev portals? Same deal. Now try managing changes to all of those at once. The manual steps, plus all the integration with all the product teams, now means this whole thing is becoming burdensome. This is a lot more work than just "use this VPN client and whitelist these CIDRs".
- If you're working for one of those weird companies that loves to do split-horizon DNS with lots of custom internal-only domains, guess what? Probably not gonna work without a VPN.
- Onboarding can be complicated. You now need to help your users manage their accounts, such as password resets, being a part of different domains, using a supported device, MFA registration, troubleshooting internet connections, using a supported CLI tool for non-web-interface APIs, etc. Versus just saying "are you connected to the VPN?".
- A VPN allows a central network team to manage access control across the entire network (or wherever that network team manages networks). BeyondCorp needs to be managed for all of your products by one team, or you may end up with an uneven and difficult process of supporting users across teams. A lot of companies (maybe most) are just not set up to allow independent product teams to manage internal user access. And even then, they might manage authorization, but probably not both AuthN + AuthZ.
- If you have a single domain serving multiple apps in multiple URLs requiring different authorization, things can get more complex.
Ultimately I think most of the protocols used for BeyondCorp today have too many drawbacks to say we could all drop our VPNs in support of it. We'll probably need at least another generation of protocols and management workflows in order for it to become the new norm.
A genuine question on this topic: does that mean we should have SSH ports open to the world too, if we have some form of 2FA? Or am I misreading the point of BeyondCorp?
Zero Trust is about the conditions inside the perimeter; BeyondCorp is about ingress through the perimeter. Things may be wide open on the interior side of the proxy. Or things may be locked down tight even inside the VPN.
I would think of BeyondCorp as end users accessing services through an application-layer perimeter from a public network, instead of directly from a private network.
The network where the services actually sit becomes, in effect, even more private.