Cross-cloud identities between GCP and AWS from GKE and/or EKS
When doing multi-cloud you will often run into at least one situation where a workload running in one cloud needs to call the other cloud provider's APIs. For example, you need a Pod in GCP GKE to be able to call AWS S3, or you need a Pod in AWS EKS to call Google Pub/Sub.
I recently ran into this situation and it took longer than I expected — due to the documentation not being clear and discoverable (at least to me). So, I figured I’d write up a blog post with some examples to help anyone else who finds themselves in a similar situation.
In short, the big and pleasant surprise for me here was that Google Cloud identities (including GCP IAM Service Accounts) are trusted 'out of the box' by AWS STS's web identity (OIDC) federation, alongside Login with Amazon and Facebook logins. So, you don't need to create an OIDC provider per cluster or anything; you just need to drop the following Trust Policy on any AWS IAM Role, with the audience being the unique OIDC ID number assigned to each GCP IAM Service Account.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Federated": "accounts.google.com" },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "accounts.google.com:aud": "aud-value"
        }
      }
    }
  ]
}
So, the easiest path to let GKE Pods call AWS seems to be to give them a GCP IAM service account identity through their process (Workload Identity Federation) — and then also give that same Google service account access to assume the relevant AWS IAM Roles too. This gives the GKE Pod access to both clouds in one fell swoop! And you’d usually want a GKE Pod to use GCP APIs in addition to AWS ones anyway (rather than just one or the other)…
For the other direction, giving an EKS Pod in AWS access to GCP, you configure access to a GCP service account (their equivalent of an AWS IAM Role assumption) nearly the same way as you would for a GKE Pod: via GCP's Workload Identity Federation.
And, interestingly, GCP provides clear documentation for EKS and AKS workloads to get access to Google Cloud in their documentation (linked above) — but AWS doesn’t provide similar documentation on how to give GKE and AKS access to AWS (that I could find anyway). It speaks to how Google is positioning itself effectively as the easy “other cloud” in a multi-cloud environment. But that is the topic for another blog post!
Cloud Identities for Pods in EKS and GKE for their own respective clouds
Before I get into how to do this cross-cloud, I figured I’d start with how to give a Pod an identity within the cloud where it lives (EKS to AWS and GKE to GCP).
AWS Elastic Kubernetes Service (EKS)
With AWS EKS, there are two ways to give a Pod an AWS IAM identity.
- IAM Roles for Service Accounts (IRSA): this was the first way to do it. It requires you to create an OpenID Connect (OIDC) provider for each cluster and then tell AWS IAM to trust the Kubernetes ServiceAccounts in that cluster so they can obtain tokens via the Security Token Service (STS). It also adds a mutating admission controller so that, by putting an annotation on a Kubernetes ServiceAccount, it will automatically add the right environment variables and mount the STS tokens into the Pods that use that ServiceAccount.
- Pod Identities: this is a newer alternative to IRSA that AWS now offers which works without needing the per-cluster OIDC endpoint(s). It was built in response to three common challenges with IRSA:
– Firstly, the team provisioning EKS might not have enough access to AWS IAM to manage its OIDC providers (which, in many organizations, are managed by another team); Pod Identities removes the need to create those.
– Secondly, there is a limit on the size of an IAM Role's Trust Policy that caps you at trusting about 5 IRSA OIDC and Kubernetes ServiceAccount pairs per IAM Role, as well as a limit of 100 OIDC providers per AWS account; Pod Identities doesn't have those limits.
– And, finally, binding IAM Trust Policies to per-cluster OIDC providers made moving workloads between clusters more difficult (you had to update the Trust Policies of all the IAM Roles used by the workloads on that cluster); Pod Identities doesn't have that issue. Plus it has a nice UI in the AWS EKS Console (which IRSA doesn't have).
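To make the Pod Identities option above concrete, here is a sketch of the AWS CLI calls involved. The cluster, namespace, ServiceAccount, and role names are hypothetical placeholders, and this assumes the role's Trust Policy already allows the pods.eks.amazonaws.com service principal to call sts:AssumeRole and sts:TagSession:

```shell
# Install the EKS Pod Identity Agent add-on (once per cluster)
aws eks create-addon \
  --cluster-name my-cluster \
  --addon-name eks-pod-identity-agent

# Map a Kubernetes ServiceAccount to an AWS IAM Role; note that,
# unlike IRSA, no per-cluster OIDC provider is required
aws eks create-pod-identity-association \
  --cluster-name my-cluster \
  --namespace my-namespace \
  --service-account my-service-account \
  --role-arn arn:aws:iam::111122223333:role/my-pod-role
```

Any Pod using that ServiceAccount in that namespace then gets the role's credentials automatically via the agent.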
GCP Google Kubernetes Engine (GKE)
Google actually works similarly to AWS IRSA here. Instead of an OIDC Identity Provider in AWS IAM, they use something similar they call a Workload Identity Pool in GCP’s IAM. And, instead of assuming an AWS IAM Role, you are assuming a GCP Service Account. But, basically, you are having Google Cloud trust the Kubernetes ServiceAccounts in the GKE cluster(s) via an OIDC Identity Provider for that cluster for the purposes of ‘assuming’ a GCP Service Account identity.
This is all fairly well documented here.
But how can a pod in GCP’s GKE assume an IAM Role in AWS?
There are two ways:
- You can have AWS trust the ServiceAccounts on a specific GKE cluster by adding an OpenID Connect (OIDC) Provider (https://cloud.google.com/kubernetes-engine/multi-cloud/docs/aws/how-to/use-workload-identity-aws)
- You can leverage the fact that AWS STS already implicitly trusts all Google identities — https://aws.amazon.com/blogs/security/access-aws-using-a-google-cloud-platform-native-workload-identity/.
The first option is fairly straightforward and similar to AWS EKS's IAM Roles for Service Accounts (IRSA) described above. The downside vs. AWS IRSA is that you need to add the various environment variables and a projected volume for the token to each Pod manually for it to work (whereas IRSA in EKS has a mutating admission controller that adds all of those for you when you put the annotation on the ServiceAccount).
What I found more interesting, though, is the second option. It is less well known and much simpler to configure on the AWS side, as it doesn't require any per-cluster OIDC providers to be configured (which would need to live in a particular AWS account, making any required cross-account access within AWS complicated).
The GKE Pod to GCP IAM SA to AWS IAM Role Path
What we’re about to do is as follows:
- Give a GKE Pod access to a GCP IAM Service Account (via their Workload Identity Federation). That gives it access to Google Cloud APIs.
- Then give that GCP IAM Service Account's OIDC JWT (which the Pod can now retrieve via a simple curl to the metadata endpoint) access to assume an AWS IAM Role (via AWS STS's AssumeRoleWithWebIdentity against a Google identity). That gives it access to AWS APIs.
GKE Pod access to a GCP IAM Service Account
Firstly we create a GCP IAM Service Account (gke-to-aws-test) — note the long Unique ID number which we’ll need in a minute as our OIDC Audience:
Next we’ll verify that Workload Identity is turned on in our GKE cluster (it is by default with GKE Autopilot here) and what the namespace is (this is the equivalent of an OIDC endpoint with EKS IRSA — but it can be shared by multiple GKE clusters in GCP):
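If you prefer the command line to the console for that check, something like the following should print the cluster's workload identity pool (the cluster name and region here are placeholders):

```shell
# Prints e.g. project-435400.svc.id.goog if Workload Identity is enabled,
# or nothing if it is not
gcloud container clusters describe my-cluster \
  --region us-central1 \
  --format="value(workloadIdentityConfig.workloadPool)"
```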
Now we need to give a Kubernetes ServiceAccount in a particular GKE namespace access to our GCP IAM Service Account (via the role Workload Identity User). The naming convention for the principal is:
<Workload identity namespace>[<K8s Namespace Name>/<K8s ServiceAccount Name>]
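As a concrete sketch, granting that Workload Identity User role with gcloud looks roughly like this (the project, namespace, and account names match this post's examples; adjust for your environment):

```shell
# Allow the K8s ServiceAccount gke-service-account (in namespace gke-to-aws-test)
# to impersonate the GCP IAM Service Account via Workload Identity
gcloud iam service-accounts add-iam-policy-binding \
  gke-to-aws-test@project-435400.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:project-435400.svc.id.goog[gke-to-aws-test/gke-service-account]"
```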
Now all we need for a Pod to get access to this GCP Service Account automatically is to put an annotation on the K8s Service Account (in that Namespace) that is assigned to it and it’ll auto-magically have it at runtime:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gke-service-account
  namespace: gke-to-aws-test
  annotations:
    iam.gke.io/gcp-service-account: gke-to-aws-test@project-435400.iam.gserviceaccount.com
GCP IAM Service Account access to an AWS IAM Role
The neat thing here is that Google Cloud Identities, including IAM Service Accounts, are all OIDC-based — and can get a JSON Web Token (JWT) by just curling a particular endpoint.
If I kubectl exec into a Pod with this service account and run the following curl against the metadata endpoint I get an accounts.google.com jwt back!
curl -sH "Metadata-Flavor: Google" "http://metadata/computeMetadata/v1/instance/service-accounts/default/identity?audience=108840315781302076417&format=full&licenses=FALSE"
eyJhbGciOiJSUzI1NiIsImtpZCI6ImVlYzUzNGZhNWI4Y2FjYTIwMWNhOGQwZmY5NmI1NGM1NjIyMTBkMWUiLCJ0eXAiOiJKV1QifQ.eyJhdWQiOiIxMDg4NDAzMTU3ODEzMDIwNzY0MTciLCJhenAiOiIxMDg4NDAzMTU3ODEzMDIwNzY0MTciLCJlbWFpbCI6ImdrZS10by1hd3MtdGVzdEBwcm9qZWN0LTQzNTQwMC5pYW0uZ3NlcnZpY2VhY2NvdW50LmNvbSIsImVtYWlsX3ZlcmlmaWVkIjp0cnVlLCJleHAiOjE3MzkyNzY2NDksImlhdCI6MTczOTI3MzA0OSwiaXNzIjoiaHR0cHM6Ly9hY2NvdW50cy5nb29nbGUuY29tIiwic3ViIjoiMTA4ODQwMzE1NzgxMzAyMDc2NDE3In0.BR8MqZhSPX8uWSQirc8VVAaaVNvL7a8_L_KOqEMldVQf4KNng5qPoAEK2_QEL_ERJWeJYZZCT-YSqr1Wtpt7SRzOp52hW2nw4GYDtfaD-ij4DD12ob-V3_2SXO785e7TVntHC28NrUOqe5xhlAQtpKkbyVmux_KKnH9hYvtpGJjJv1ZQjtuMIN1KhrMzl9Y2T0v8xx2vDZhcYEWE3lUNgO8peKmY7dVGKpeSCQQbUCNQeM_9fY8fn1-vRDe7LTD4vjo4QI80YYPrmculBH0IYr0Xc6esg8CWqHohR8zdKaxa0DImMooLsAPGAgC2hNw8IRj4Fy6b3PxNH3H1QucC0Q
And if I go to https://jwt.io and paste it in I can decode it. In the payload you'll see the audience (aud) is the unique ID of our GCP IAM service account and the email is its full ID. We'll use that aud in our AWS IAM Trust Policy.
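If you'd rather not paste a token into a website, you can decode the payload locally too. Here is a small sketch that base64url-decodes the middle segment of a JWT; note that it only decodes and does not verify the signature:

```shell
# Decode (but do NOT verify!) the payload segment of a JWT
decode_jwt_payload() {
  local payload
  # The payload is the second dot-separated segment
  payload=$(echo "$1" | cut -d '.' -f2)
  # Convert base64url to standard base64 characters
  payload=$(echo "$payload" | tr '_-' '/+')
  # Re-add the padding that JWTs strip off
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  echo "$payload" | base64 -d
}
```

Running it against the token above prints the same JSON payload that jwt.io shows.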
NOTE: I have since deleted this GCP IAM service account and removed any access that it had to be safe.
For our last trick, you can just give this JWT straight to the AWS Security Token Service (STS) and assume an AWS IAM Role with it!
First we'll need to add the Trust Policy from the top of this post to an AWS IAM Role (with aud-value replaced by our service account's unique ID). It says that the Google identity with that unique aud ID is allowed to AssumeRoleWithWebIdentity. That's all we need (literally no other AWS changes), and I can put this Trust Policy on any AWS IAM Role I want in any of my AWS accounts. This seems to be a remnant of early AWS (web identity federation launched in 2013), which appears to have been built as an alternative to AWS IAM Users before enterprise SSO became as easy and common as it is today.
Now all we need to do is just pass this jwt that we curled from inside our Pod to AWS STS via the AWS CLI. We can do that end-to-end with this script:
GCP_OAUTH_AUD="108840315781302076417"
AWS_ROLE_ARN="arn:aws:iam::281031839323:role/gke-to-aws-test"
jwt_token=$(curl -sH "Metadata-Flavor: Google" "http://metadata/computeMetadata/v1/instance/service-accounts/default/identity?audience=${GCP_OAUTH_AUD}&format=full&licenses=FALSE")
credentials=$(aws sts assume-role-with-web-identity --role-arn "$AWS_ROLE_ARN" --role-session-name "$GCP_OAUTH_AUD" --web-identity-token "$jwt_token" | jq '.Credentials' | jq '.Version=1')
echo "$credentials"
And I'll get back the required credentials from STS to act as that role. But now I need to feed those to the AWS CLI and SDKs to use somehow.
For that, I can tell the AWS CLI and SDKs to run that script for me every time I want to call AWS by specifying the credential_process parameter in the ~/.aws/config file.
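As a sanity check of that JSON shaping: credential_process expects a flat JSON object with a Version field, and the jq pipeline in the script above turns the nested STS response into exactly that. Here is a sketch with a fake STS response (all values are placeholders, not real credentials):

```shell
# A fake response with the same shape as assume-role-with-web-identity's output
sts_response='{"Credentials":{"AccessKeyId":"ASIAEXAMPLE","SecretAccessKey":"EXAMPLEKEY","SessionToken":"EXAMPLETOKEN","Expiration":"2030-01-01T00:00:00Z"}}'

# Extract .Credentials and add Version=1, the shape credential_process requires
echo "$sts_response" | jq '.Credentials' | jq '.Version=1'
```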
And I can even mount both that script and the AWS config file into my Pod at runtime via ConfigMaps too — and then any AWS CLI or SDK commands will just work out-of-the-box without changing the container image! Note that the AWS CLI will need to be in the container image for this to work (as the script needs to run the aws sts CLI) — but if that is an issue you could move this approach to a sidecar within the Pod fairly easily as well.
apiVersion: v1
kind: ConfigMap
metadata:
  name: credentials-sh
  namespace: gke-to-aws-test
data:
  credentials.sh: |-
    #!/bin/bash
    yum install jq -y &> /dev/null
    jwt_token=$(curl -sH "Metadata-Flavor: Google" "http://metadata/computeMetadata/v1/instance/service-accounts/default/identity?audience=${GCP_OAUTH_AUD}&format=full&licenses=FALSE")
    # The empty credentials_script profile stops this aws call from recursively
    # re-invoking this script via the default profile's credential_process
    credentials=$(aws sts assume-role-with-web-identity --profile credentials_script --role-arn $AWS_ROLE_ARN --role-session-name $GCP_OAUTH_AUD --web-identity-token $jwt_token | jq '.Credentials' | jq '.Version=1')
    echo $credentials
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-config
  namespace: gke-to-aws-test
data:
  config: |-
    [default]
    credential_process = /root/credentials.sh
    [profile credentials_script]
---
apiVersion: v1
kind: Pod
metadata:
  name: awscli
  namespace: gke-to-aws-test
spec:
  serviceAccount: gke-service-account
  containers:
    - name: awscli
      image: amazon/aws-cli:latest
      # Just spin & wait forever so we can kubectl exec in
      command: ["/bin/bash", "-c", "--"]
      args: ["while true; do sleep 30; done;"]
      volumeMounts:
        - name: credentials-sh
          mountPath: /root
        - name: aws-config
          mountPath: /root/.aws
      env:
        - name: GCP_OAUTH_AUD
          value: "108840315781302076417"
        - name: AWS_ROLE_ARN
          value: "arn:aws:iam::281031839323:role/gke-to-aws-test"
  volumes:
    - name: credentials-sh
      configMap:
        name: credentials-sh
        defaultMode: 0777
        items:
          - key: credentials.sh
            path: credentials.sh
    - name: aws-config
      configMap:
        name: aws-config
        items:
          - key: config
            path: config
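Once that Pod is running, a quick way to verify the whole chain end-to-end is to exec in and ask STS who the Pod is (the names here match the manifests above):

```shell
# Should print the assumed-role ARN for gke-to-aws-test if everything worked
kubectl exec -n gke-to-aws-test awscli -- aws sts get-caller-identity
```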
To Recap
I just gave my Pod running in GKE access to both GCP and AWS via one K8s ServiceAccount, seamlessly and simultaneously. If I put both clouds' CLIs/SDKs in my container image then I could use both at once, with the authentication done automatically using secure short-lived credentials that are rotated for me.
This is the start of a series of blog posts I’ll be doing re: the differences between GKE and EKS
I personally started with Kubernetes in the cloud on AWS EKS and, so, come from that background/perspective. Lately, though, I have been exploring GCP and GKE in a multi-cloud context with AWS.
So, I’ll be continuing on a series of blog posts exploring the differences between the two — helping both AWS/EKS people understand GCP/GKE and (hopefully) vice versa.
Up next is networking — https://jason-umiker.medium.com/eks-vs-gke-networking-e1dd397fe86d.