Hey hey!
In the past few days, I've had a chance to look over the AKS-AAD integrations (which use guard under the hood).
This was part of some research I'm doing on implementing non-interactive logins for AAD-enabled clusters (both legacy and managed).
Specifically, I'm looking at auth via service principals, passing a client_credentials-flow token, which holds the service principal's object id claim and no UPN claim.
Managed AAD clusters have it mostly solved (à la kubelogin): one only needs to issue a token for the multi-tenant AKS server app, and any directory entity can be used in this flow. The groups JWT claim is considered, along with any overage data.
However, in case of overage, MS Graph is consulted, and the request fails for SPNs, as MS Graph returns a 404 when fetching group memberships for service principals.
This is because the Graph API endpoint currently used only supports retrieving memberships for users (and not service principals).
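To make the overage case concrete, here is a minimal Go sketch of how I understand the decision, not guard's actual code. It assumes the documented AAD behaviour where an overage token drops the `groups` claim and instead carries a `_claim_names` entry keyed by `groups`. The `claims` type, `hasGroupOverage`, `groupsFor` and the `fetch` callback are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// claims holds only the JWT fields relevant to group resolution.
// Hypothetical type; guard's own claim handling may look different.
type claims struct {
	ObjectID   string            `json:"oid"`
	Groups     []string          `json:"groups"`
	ClaimNames map[string]string `json:"_claim_names"`
}

// hasGroupOverage reports whether the token points at an external
// source for groups instead of listing them inline.
func hasGroupOverage(c claims) bool {
	_, ok := c.ClaimNames["groups"]
	return ok
}

// groupsFor uses the inline groups claim when it is present and only
// calls fetch (e.g. an MS Graph lookup) when the token signals overage.
func groupsFor(c claims, fetch func(oid string) ([]string, error)) ([]string, error) {
	if hasGroupOverage(c) {
		return fetch(c.ObjectID)
	}
	return c.Groups, nil
}

func main() {
	// Sample payload of an overage token for an SPN (made-up values).
	payload := []byte(`{"oid":"spn-oid","_claim_names":{"groups":"src1"}}`)
	var c claims
	if err := json.Unmarshal(payload, &c); err != nil {
		panic(err)
	}
	groups, err := groupsFor(c, func(oid string) ([]string, error) {
		// Stand-in for the Graph call; this is the step that 404-s for SPNs today.
		return []string{"group-from-graph"}, nil
	})
	fmt.Println(groups, err)
}
```

The point of the sketch is that SPNs only hit the failing Graph path once they are in enough groups to trigger overage, which is why the managed flow works for most of them.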
As for legacy clusters, the groups JWT claim isn't considered at all:
MS Graph is always consulted to fetch group memberships for the given object id.
Specifically, the "/users/id/getMemberGroups" endpoint is used.
When an SPN oid is passed, the API above returns a 404 in the same manner, which fails the request altogether.
I believe that MS Graph had no API for retrieving groups for any given entity back in the day, but now one does exist: "/directoryObjects/id/getMemberGroups".
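As a rough illustration of the call against the newer endpoint, here is a small Go sketch that only builds the request. It assumes the v1.0 Graph base URL and a `securityEnabledOnly: false` body; I haven't checked which value guard sends today. `memberGroupsURL` and `newMemberGroupsRequest` are hypothetical helper names, not guard functions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// graphBase is an assumption; sovereign clouds use a different host.
const graphBase = "https://graph.microsoft.com/v1.0"

// memberGroupsURL targets directoryObjects rather than users, so the
// same path works for both user and service principal object ids.
func memberGroupsURL(base, objectID string) string {
	return fmt.Sprintf("%s/directoryObjects/%s/getMemberGroups", base, url.PathEscape(objectID))
}

// newMemberGroupsRequest builds the POST request without sending it.
func newMemberGroupsRequest(base, objectID, token string) (*http.Request, error) {
	body, err := json.Marshal(map[string]bool{"securityEnabledOnly": false})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, memberGroupsURL(base, objectID), bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder object id and token, for illustration only.
	req, err := newMemberGroupsRequest(graphBase, "00000000-0000-0000-0000-000000000000", "<token>")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

If this holds up, the change on the guard side would mostly be a path swap, since the request body and response shape are the same for both endpoints as far as I can tell.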
I was wondering if it would be a good idea to migrate to the new endpoint. This would enable non-interactive login flows to legacy clusters and fix the flow for managed integrations for SPNs assigned to many groups.
Otherwise, it might make sense to make the MS Graph call best effort, returning blank groups in case of error.
It's a bit unfortunate that failing to retrieve groups fails the auth attempt altogether:
when reaching that code path, we have a verified JWT at hand with some object id,
so it might make sense to pass it onward and check for any direct k8s role mapping.
Another suggestion might be to flip the flag which considers the given groups claim (on the AKS side) for legacy clusters, closing the disparity between the two integrations.
Would love to hear your two cents on this @weinong