Container Platform Security at Cruise

  1. Container Platform Security

  1. Authentication
  2. Authorization
  3. Secrets
  4. Encryption

To understand how these domains interact with one another, we first need to look at identity. An identity is the representation of a person or program interacting with a system. Identities take one of two types, users or services, depending on their use case. Both types include a compound unique identifier and a set of credentials made up of multiple factors.

For identity management, we leverage Okta as our Identity Provider (IdP). Okta enables a Single Sign-On (SSO) experience for users between systems with Multi-Factor Authentication (MFA). Okta isn’t required for GKE or Kubernetes; we could have used another IdP or manually managed users within GCP itself. However, Okta provides integration points and management tools that make it easier to secure a wide variety of systems.

Authentication is the means by which we confirm that an identity is who it claims to be. Together, identifiers and credentials can be used to distinguish one identity from another and to establish non-repudiation: high-confidence authenticity, proof of origin, and proof of integrity.

  1. Something you have (ownership factor)
  2. Something you are (inherence factor; most common with user identities)
  3. Somewhere you are (location factor)

Figure: Multi-factor Authentication
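The factor types listed above can be modeled in a few lines. The following is a minimal Python sketch, not a description of how Okta or GCP evaluate MFA: the names Factor, Credential, and is_multi_factor are hypothetical, and the KNOWLEDGE member (a password, for example) is an assumption added for completeness.

```python
from dataclasses import dataclass
from enum import Enum


class Factor(Enum):
    KNOWLEDGE = "knowledge"   # e.g. a password (assumed; not in the list above)
    OWNERSHIP = "ownership"   # something you have
    INHERENCE = "inherence"   # something you are
    LOCATION = "location"     # somewhere you are


@dataclass(frozen=True)
class Credential:
    name: str       # hypothetical label, e.g. "password" or "hardware-key"
    factor: Factor


def is_multi_factor(credentials, minimum=2):
    """Return True when the credentials cover at least `minimum` distinct factor types."""
    return len({c.factor for c in credentials}) >= minimum
```

Two credentials of the same type (a password and a PIN, say) count as a single factor here, which is the point of counting distinct factor types rather than credentials.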

Google has invested heavily in OAuth2, so it may come as no surprise that GCP relies on it for both user and service authentication. For users authenticating to GCP, this means authenticating with a password and a second factor through an associated IdP. Behind the scenes, this does one of two things, depending on whether the user is authenticating manually via a browser or programmatically via GCP’s CLI (gcloud) or API.

  1. Programs: The newer OIDC protocol is used for programmatic interactions. The user or service identity logs in with its credentials and Google generates a signed access token for use in subsequent interactions. The OIDC access token is the basis for API and CLI authentication, analogous to the SAML assertion stored in the browser flow. For terminal access, most users use the gcloud CLI, which handles the OIDC authentication flow and caches the access token.
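To illustrate what caching an access token involves, here is a small Python sketch. It is not how gcloud is implemented: TokenCache, the fetch callback (assumed to return a token and its lifetime in seconds), and the 30-second skew are all hypothetical choices made for the example.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],   # returns (token, lifetime in seconds)
        clock: Callable[[], float] = time.time,
        skew: float = 30.0,                       # refresh this many seconds early
    ):
        self._fetch = fetch
        self._clock = clock
        self._skew = skew
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Fetch a new token on first use or when the cached one is about to expire.
        if self._token is None or self._clock() >= self._expires_at - self._skew:
            self._token, lifetime = self._fetch()
            self._expires_at = self._clock() + lifetime
        return self._token
```

Injecting the clock keeps the expiry logic testable without waiting for a real token to age.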

Once authenticated with the gcloud CLI, GKE users can use it to fetch kubectl credentials, giving them access to the Cruise PaaS through kubectl, the Kubernetes CLI, provided their identity has the required role bindings. This means users only have to manage their GCP credentials and can generate Kubernetes credentials on demand.
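As a rough sketch of that on-demand step, the Python snippet below wraps the gcloud container clusters get-credentials command. The helper names and the cluster, project, and zone values are placeholders, not Cruise's actual configuration, and the wrapper assumes gcloud is installed and already authenticated.

```python
import subprocess


def get_credentials_command(cluster, project, zone):
    """Build the gcloud invocation that writes kubectl credentials for a GKE cluster."""
    return [
        "gcloud", "container", "clusters", "get-credentials", cluster,
        "--project", project,
        "--zone", zone,
    ]


def fetch_kubectl_credentials(cluster, project, zone):
    """Run the command; raises CalledProcessError if gcloud exits with an error."""
    subprocess.run(get_credentials_command(cluster, project, zone), check=True)


# Example with placeholder values:
# fetch_kubectl_credentials("example-cluster", "example-project", "us-west1-a")
```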

Recently, Google introduced GKE Workload Identity, which allows Kubernetes service accounts (SAs) to act as GCP SAs, so that pods can authenticate with GCP. This replaces the legacy pattern of using GCE instance metadata, which gave every pod on a node access to the same GCP SA credentials.
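The mapping between the two kinds of service account can be sketched in Python. The annotation key and the member format below follow GKE's Workload Identity conventions as I understand them; the function names and every project, namespace, and account value are hypothetical placeholders.

```python
def workload_identity_member(project_id, namespace, ksa_name):
    """IAM member string identifying a Kubernetes SA under Workload Identity."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def annotated_service_account(namespace, ksa_name, gsa_email):
    """Kubernetes ServiceAccount manifest (as a dict) pointing at a GCP SA."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": ksa_name,
            "namespace": namespace,
            "annotations": {"iam.gke.io/gcp-service-account": gsa_email},
        },
    }
```

The member string would typically be granted the roles/iam.workloadIdentityUser role on the GCP SA, so that only pods running as that specific Kubernetes SA can impersonate it, rather than every pod on the node.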

Authorization is the means by which we enforce what an authenticated identity may access. There are many types of access control, but within the context of container platforms, we typically use Role-Based Access Control (RBAC).

Figure: Groups, Permissions, and Role Based Access Control (RBAC)
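A minimal Python sketch of the RBAC idea, assuming a simplified model in which a role is a set of (resource, verb) permissions and a binding maps a subject to role names. The role names, subjects, and the is_allowed helper are illustrative only and do not reflect Kubernetes' or GCP's real evaluation logic.

```python
# Hypothetical roles: each is a set of (resource, verb) permissions.
ROLES = {
    "viewer": {("pods", "get"), ("pods", "list")},
    "deployer": {("deployments", "create"), ("deployments", "update")},
}

# Hypothetical bindings: subject -> names of the roles bound to it.
BINDINGS = {
    "alice@example.com": {"viewer", "deployer"},
    "bob@example.com": {"viewer"},
}


def is_allowed(bindings, roles, subject, resource, verb):
    """Allow the request only if some role bound to the subject grants it."""
    return any(
        (resource, verb) in roles.get(role, set())
        for role in bindings.get(subject, ())
    )
```

Access is denied by default: a subject with no bindings, or a role name with no definition, grants nothing.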

Putting identities into groups makes it easier to bind permissions & roles without repeatedly assigning the same roles & permissions to each individual identity. Groups are generally a resource type provided by an IdP; for integration with GCP and GKE, we use Google groups provided by G Suite. In most authentication flows, group membership is a field located within the credential itself (such as a JWT’s claims), or is a property that’s possible to query against the associated IdP.
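When group membership travels inside the credential, resolving roles can look roughly like the Python sketch below. It decodes a JWT payload without verifying the signature, which is acceptable only for illustration; a real system must verify the token first. The claim name "groups" and the helpers jwt_claims and roles_for are assumptions, since claim names vary by IdP.

```python
import base64
import json


def jwt_claims(token):
    """Decode a JWT payload WITHOUT verifying its signature (illustration only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)   # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def roles_for(claims, group_bindings, groups_claim="groups"):
    """Union of the roles bound to every group listed in the claims."""
    roles = set()
    for group in claims.get(groups_claim, []):
        roles |= group_bindings.get(group, set())
    return roles
```

Binding roles to groups rather than to individuals means that adding someone to a group in the IdP is enough to grant access, with no per-user role assignment.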

As mentioned earlier, GKE integrates with GCP and G Suite to provide authentication, identity management, group management, and authorization within GCP.

Figure: RBACSync high level workflow and example config

Secrets can be anything you want to keep private, but in the context of container platforms, they are mostly credentials: tokens, passwords, certificates, encryption keys, etc. Kubernetes comes with its own secret storage and injection mechanism, which is especially valuable for bootstrapping, but the built-in secrets solution is generally insufficient when platforms span multiple clusters.

  1. Authorization that supports RBAC and group membership.

  1. Fetches secrets needed by the workload
  2. Writes the secrets to an in-memory volume (to avoid leaking to persistent storage)
  3. Shares the volume with the workload container
  4. Updates the secret at runtime, when it changes in Vault (optional)

Figure: Secrets injection with Vault and Daytona
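The injection steps above can be sketched in Python. This is not Daytona's implementation or Vault's API: inject_secrets and the fetch_secret callback are hypothetical, a plain dict stands in for Vault, and the target directory is assumed to be a memory-backed volume shared with the workload container (for example, a Kubernetes emptyDir with medium: Memory).

```python
import os
import tempfile
from pathlib import Path


def inject_secrets(fetch_secret, names, volume_dir):
    """Fetch each named secret and write it as an owner-only file in volume_dir."""
    volume = Path(volume_dir)
    written = {}
    for name in names:
        value = fetch_secret(name)   # fetch the secret from the store
        path = volume / name
        # Create the file with 0600 permissions so only the owner can read it.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(value)
        written[name] = path
    return written


if __name__ == "__main__":
    fake_vault = {"db-password": "s3cr3t"}   # stand-in for a real secret store
    with tempfile.TemporaryDirectory() as shared_volume:
        paths = inject_secrets(fake_vault.__getitem__, ["db-password"], shared_volume)
        print(sorted(paths))
```

Calling inject_secrets again after a secret changes overwrites the file in place, which is one simple way to approximate the optional runtime update.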

Encryption is a broad topic, but we can break it down into two categories:

  1. Encryption in Transit
  2. Encryption at Rest

Figure: Encryption in Transit & Encryption at Rest

One of the more challenging parts of securing the PaaS has been ensuring that all of our services communicate securely. This typically means using Transport Layer Security (TLS).

  1. Kubernetes API
  2. Kubelet API
  3. Workload Ingress
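From a client's side, communicating over TLS in a secure manner mostly means verifying the server certificate and refusing old protocol versions. Below is a minimal Python sketch using only the standard library ssl module; the function name strict_client_context and the optional ca_file parameter (a private CA bundle) are illustrative and not part of Cruise's platform.

```python
import ssl


def strict_client_context(ca_file=None):
    """Client-side TLS context: verify certificates and hostnames, require TLS 1.2+."""
    # create_default_context enables certificate and hostname verification.
    context = ssl.create_default_context(cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

A context like this can be passed to standard library clients such as http.client.HTTPSConnection via its context argument.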

In transit, we can assume the public internet is not implicitly trustworthy, and if Zero Trust best practices tell us anything, we probably shouldn’t trust our private intranet either. Taking this a step further, implicitly trusting the people with access to our physical hardware (and their virtual cloud analogs) is also undesirable. With this in mind, we know that the following data resides on persistent storage, so we want to encrypt it at rest:

  1. Kubernetes Node Disks
  2. Kubernetes Service Account Credentials
  3. Workload Secrets
  4. Workload Volumes

Security concerns affect everything we do, but this post is already longer than most people will read and we do still have a self-driving car to build…

  • Platform Hardening (SecurityContext, Node Metadata Protection, Network Policies, Pod Security Policies)
  • Secure Supply Chains (Trusted Image Building, Vulnerability Scanning, Attestation)
  • Patch Management
  • Zero Trust Networking

In the next blog post of the series, we will take a look at some of the networking challenges that come with building a platform. Stay tuned for more about observability and deployment after that!

