OAuth 2.1, JWTs, Service-to-Service Auth, and Identity in Cell Architecture
OAuth 2.1 consolidates OAuth 2.0 best practices and deprecates insecure patterns. It's the foundation of modern authorization.
| Change | OAuth 2.0 | OAuth 2.1 | Why |
|---|---|---|---|
| PKCE | Optional | Required for all clients | Prevents authorization code interception |
| Implicit Grant | Allowed | Removed | Tokens in URLs are insecure |
| Password Grant | Allowed | Removed | Exposes credentials to client |
| Refresh Tokens | Bearer tokens | Sender-constrained or rotated on each use (public clients) | Limits damage from a stolen refresh token |
| Redirect URIs | Loose matching | Exact match required | Prevents open redirect attacks |
PKCE (Proof Key for Code Exchange): the client generates a random code_verifier and sends its hash (code_challenge) in the auth request. When exchanging the code, it sends the original verifier. The auth server hashes the verifier and checks that it matches the stored challenge. An attacker who intercepts the auth code can't use it without the verifier.
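The verifier/challenge relationship described above can be sketched with the standard library alone. This is a minimal illustration of the S256 method from RFC 7636; the function names are made up for this example, and a real client would normally rely on its OAuth library.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    # 32 random bytes become a 43-character base64url string (RFC 7636 allows 43-128).
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    # S256 method: BASE64URL(SHA256(verifier)) with the padding stripped.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_verifies(stored_challenge: str, presented_verifier: str) -> bool:
    # At the token exchange step the auth server recomputes the challenge
    # from the presented verifier and compares it to the one it stored.
    recomputed = make_code_challenge(presented_verifier)
    return hmac.compare_digest(stored_challenge.encode(), recomputed.encode())


verifier = make_code_verifier()            # kept secret by the client
challenge = make_code_challenge(verifier)  # sent in the authorization request
```

Someone who intercepts the authorization code only ever saw the challenge, and inverting SHA-256 to recover the verifier is not feasible.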
Device Authorization Grant: for devices without browsers (smart TVs, CLI tools). The user authenticates on a separate device using a code.
"OAuth 2.1 is the latest authorization standard that consolidates best practices from OAuth 2.0. The biggest changes are: PKCE is now mandatory for all clients—not just mobile—which prevents authorization code interception attacks. The implicit and password grants are removed because they're inherently insecure. For user authentication, you use Authorization Code with PKCE. For service-to-service, you use Client Credentials. The key insight is that OAuth is about authorization, not authentication—it answers 'what can this token access?' not 'who is this user?' For identity, you layer OpenID Connect on top, which adds an ID token containing user claims."
eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWUsImlhdCI6MTUxNjIzOTAyMn0.POstGetfAytaZS82wHcjoTyoqhMyxXiWdR7Nn7A29DNSl0EiXLdwJ6xC6AfgZWF1bOsS_TuYI3OG85AmiExREkrS6tDfTQ2B3WXlrr-wp5AokiRbz3_oB4OxG-W9KcEEbDRcZc0nH3L7LzYptiy1PtAylQGxHTWZXtGz4ht0bAecBgmpdgXMguEIcoqPJ1n3pIWk_dUZegpqx0Lka21H6XxUTxiy8OcaarA8zdnPUnV6AmNP3ecFawIFYdvJB_cm-GvpCSbr8G8y_Mllj8f4x9nBH8pQux89_6gUY618iYv7tuPWBFfEbLxtF2pZS6YC1aSfLQxeNe8djT9YjpvRZA
Header: {"alg": "RS256", "typ": "JWT"}
Payload: {"sub": "1234567890", "name": "John Doe", "admin": true, "iat": 1516239022}
Signature: [Base64 encoded signature]
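To make the three-part structure concrete, here is a small sketch that decodes the header and payload of the example token. It deliberately does not verify the signature, so it is only suitable for inspection and debugging; the helper names are illustrative.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def peek(token: str):
    """Decode header and payload WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))


# First two segments of the example token; the signature is replaced by a placeholder.
token = (
    "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWUsImlhdCI6MTUxNjIzOTAyMn0."
    "signature-omitted"
)
header, payload = peek(token)
print(header)   # {'alg': 'RS256', 'typ': 'JWT'}
print(payload)  # {'sub': '1234567890', 'name': 'John Doe', 'admin': True, 'iat': 1516239022}
```

Anyone holding the token can do this, which is why claims should never contain secrets.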
| Claim | Name | Purpose | Example |
|---|---|---|---|
| iss | Issuer | Who created the token | https://auth.twilio.com |
| sub | Subject | Who the token represents | user:AC123456 |
| aud | Audience | Intended recipient | https://api.twilio.com |
| exp | Expiration | When token expires | 1699900000 (Unix timestamp) |
| iat | Issued At | When token was created | 1699896400 |
| jti | JWT ID | Unique token identifier | abc-123-def |
| scope | Scope | Permissions granted | messages:read voice:write |
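As a rough sketch of how a service might check these registered claims after the signature has been verified, consider the function below. The exception name, the 30-second leeway, and the function signature are assumptions made for illustration, not part of any particular library.

```python
import time
from typing import Optional


class InvalidClaims(Exception):
    """Raised when a verified token carries unacceptable claims (hypothetical name)."""


def validate_claims(claims: dict, *, issuer: str, audience: str,
                    now: Optional[float] = None, leeway: int = 30) -> None:
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise InvalidClaims("unexpected issuer")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # aud may be a string or a list
    if audience not in audiences:
        raise InvalidClaims("token not intended for this audience")
    if "exp" not in claims or now >= claims["exp"] + leeway:
        raise InvalidClaims("token expired")
    if "nbf" in claims and now < claims["nbf"] - leeway:
        raise InvalidClaims("token not yet valid")


claims = {
    "iss": "https://auth.twilio.com",
    "sub": "user:AC123456",
    "aud": "https://api.twilio.com",
    "exp": 1699900000,
    "iat": 1699896400,
}
# Passes silently when checked at a time inside the token's validity window.
validate_claims(claims, issuer="https://auth.twilio.com",
                audience="https://api.twilio.com", now=1699898000)
```

The small leeway tolerates clock skew between the issuer and the validating service.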
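Symmetric (HMAC, one shared secret both signs and verifies): HS256, HS384, HS512
Asymmetric (private key signs, public key verifies): RS256, ES256, PS256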
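For the symmetric case, signing and verification can be sketched with only the standard library. This is an illustrative HS256 implementation, not a replacement for a maintained JWT library; the asymmetric algorithms need a cryptography package and are not shown.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    # The signing input is base64url(header) + "." + base64url(payload).
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_hs256(token: str, secret: bytes) -> bool:
    try:
        signing_input, signature_b64 = token.rsplit(".", 1)
    except ValueError:
        return False  # no "." separator at all
    expected = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(_b64url(expected).encode(), signature_b64.encode())


token = sign_hs256({"sub": "user:AC123456", "scope": "messages:read"}, b"shared-secret")
```

Note that the verifier always recomputes an HMAC-SHA256 regardless of what the token header claims, so a token cannot talk it into a different algorithm. The drawback of the symmetric family is that every verifier holds the secret and could therefore also mint tokens.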
| Aspect | JWT (Self-Contained) | Opaque Token (Reference) |
|---|---|---|
| Validation | Local (check signature + claims) | Requires auth server lookup |
| Revocation | Hard (must wait for expiry or use blocklist) | Easy (delete from store) |
| Size | Large (contains all claims) | Small (just a reference) |
| Privacy | Claims visible (base64, not encrypted) | Claims hidden server-side |
| Latency | Low (no network call) | Higher (introspection call) |
| Best For | Service-to-service, short-lived | User sessions, sensitive data |
Validate iss, aud, exp, and nbf on every request.
JWKS (JSON Web Key Set) allows services to fetch public keys for JWT verification without sharing secrets.
// JWKS Endpoint: https://auth.twilio.com/.well-known/jwks.json
{
  "keys": [
    {
      "kty": "RSA",
      "kid": "key-2024-01",        // Key ID - matches JWT header
      "use": "sig",                // Signature verification
      "alg": "RS256",
      "n": "0vx7agoebGcQSuu...",   // RSA modulus
      "e": "AQAB"                  // RSA exponent
    },
    {
      "kty": "RSA",
      "kid": "key-2024-02",        // New key for rotation
      "use": "sig",
      "alg": "RS256",
      "n": "1b3aJif8sdjf...",
      "e": "AQAB"
    }
  ]
}
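A verifier picks the right key by matching the kid in the token header against the key set. The sketch below shows that lookup and how the base64url-encoded n and e fields map to integers. The function names are illustrative, and the actual RSA signature check (which needs a cryptography library) is left out.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def select_key(jwks: dict, token: str) -> dict:
    """Return the JWK whose kid matches the token header."""
    header = json.loads(b64url_decode(token.split(".")[0]))
    for key in jwks["keys"]:
        if key.get("kid") == header.get("kid") and key.get("use", "sig") == "sig":
            # Refuse a token whose alg disagrees with the algorithm the key is pinned to.
            if key.get("alg") not in (None, header.get("alg")):
                raise ValueError("token alg does not match key alg")
            return key
    # An unknown kid usually means rotation happened: refetch the JWKS once, then reject.
    raise KeyError(f"no signing key with kid={header.get('kid')!r}")


def rsa_public_numbers(jwk: dict):
    # n and e are big-endian unsigned integers, base64url-encoded.
    n = int.from_bytes(b64url_decode(jwk["n"]), "big")
    e = int.from_bytes(b64url_decode(jwk["e"]), "big")
    return n, e
```

The common exponent "AQAB" decodes to the bytes 01 00 01, which is 65537. Publishing the new key in the set before signing with it lets verifiers pick it up without coordination.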
"JWTs are self-contained—they carry all claims in the token itself, signed cryptographically. Services can validate them locally by checking the signature and claims without calling the auth server. This is great for low-latency service-to-service auth. The trade-off is revocation: you can't invalidate a JWT before expiry without maintaining a blocklist. Opaque tokens are just random strings that reference server-side session data. They require a lookup on every request, but you get instant revocation and better privacy since claims aren't exposed. My rule of thumb: use short-lived JWTs (5-15 minutes) for service-to-service with refresh tokens for renewal, and opaque tokens for user-facing sessions where revocation matters. In a cell-based architecture, JWTs are particularly valuable because services within a cell can validate locally without cross-cell calls to an auth server."
In a microservices architecture, services need to authenticate each other. This is fundamentally different from user authentication.
Simple but limited. Best for external API access, not internal service-to-service.
| Aspect | API Keys | OAuth Tokens |
|---|---|---|
| Rotation | Manual, disruptive | Automatic via refresh |
| Scoping | Usually all-or-nothing | Fine-grained scopes |
| Expiration | Typically long-lived | Short-lived + refresh |
| Revocation | Requires key regeneration | Instant |
spiffe://twilio.com/cell/enterprise-us-east-1/service/messaging
Perfect for Kubernetes environments where pods need identity without managing secrets.
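A receiving service can turn a SPIFFE ID like the one above into an authorization decision. The sketch below assumes the path layout /cell/<cell>/service/<service> seen in the example; that layout and the function names are assumptions for illustration, since SPIFFE itself only fixes the scheme and trust domain.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> dict:
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError("not a SPIFFE ID")
    segments = parts.path.strip("/").split("/")
    # Assumed path convention: /cell/<cell>/service/<service>
    if len(segments) != 4 or segments[0] != "cell" or segments[2] != "service":
        raise ValueError("unexpected SPIFFE path layout")
    return {"trust_domain": parts.netloc, "cell": segments[1], "service": segments[3]}


def may_call(caller_id: str, allowed_services: set, local_cell: str) -> bool:
    # Allow only callers from the same cell whose service is on the allow-list.
    ident = parse_spiffe_id(caller_id)
    return ident["cell"] == local_cell and ident["service"] in allowed_services


caller = "spiffe://twilio.com/cell/enterprise-us-east-1/service/messaging"
print(parse_spiffe_id(caller))
# {'trust_domain': 'twilio.com', 'cell': 'enterprise-us-east-1', 'service': 'messaging'}
```

In practice the ID would be read from the peer's verified X.509 certificate (SVID), never from a request header.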
Why: Principle of least privilege. Downstream services only get permissions they need.
Risk: Confused deputy problem. Service B might misuse the user's permissions.
Why: Best of both worlds. External security (revocable opaque), internal performance (local JWT validation).
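When a service trades a user token for a narrower one, the request can follow the OAuth 2.0 Token Exchange format (RFC 8693). The sketch below only builds the form body and the narrowed scope string; the helper names are illustrative, and sending the request to a token endpoint is left out.

```python
from urllib.parse import urlencode

# Identifiers defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def narrowed_scope(user_scopes: str, needed: set) -> str:
    # Request only scopes the user token already has AND the downstream call needs.
    return " ".join(sorted(set(user_scopes.split()) & needed))


def token_exchange_body(user_token: str, user_scopes: str, needed: set, audience: str) -> str:
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": user_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,  # the downstream service the new token is meant for
        "scope": narrowed_scope(user_scopes, needed),
    })


body = token_exchange_body(
    user_token="user-token-placeholder",
    user_scopes="messages:read voice:write",
    needed={"messages:read"},
    audience="https://api.twilio.com",
)
```

Because the downstream token carries only messages:read, a bug or compromise in the receiving service cannot be used to exercise the user's voice:write permission.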
"I layer multiple mechanisms. At the transport layer, mTLS ensures only trusted services can communicate—this is your zero-trust foundation. At the application layer, I use short-lived JWTs via client credentials grant. Each service has its own identity and requests tokens scoped to what it needs. For user context propagation, I prefer token exchange over forwarding—when Service A calls Service B on behalf of a user, it exchanges the user token for a new token with narrower scope. This prevents the confused deputy problem where Service B might misuse the user's full permissions. In Kubernetes, I'd use SPIFFE/SPIRE for workload identity, which eliminates secrets management entirely—pods get cryptographic identity from the platform. The key principle is defense in depth: mTLS for transport, tokens for application-level authz, and always scope down permissions at each hop."
| Decision | Choice | Rationale |
|---|---|---|
| Identity data location | Global (DynamoDB Global Tables) | Customers authenticate once, access any region |
| Token validation | Local (JWKS cached in cell) | No cross-cell latency on every request |
| Token lifetime | Short (5-15 min) | Limits blast radius of compromised token |
| Token contains cell_id | Yes | Prevents token from being used in wrong cell |
| API Key → Cell mapping | Cached at edge (Redis) | 95% cache hit, ~5ms latency |
| Failure | Impact | Mitigation |
|---|---|---|
| Global Identity Service down | New auth fails, existing JWTs still valid | Multi-region deployment, circuit breaker to cache |
| JWKS endpoint unreachable | Can't validate new tokens | Cache JWKS aggressively (hours), fallback to stale |
| Redis cache miss storm | DynamoDB overload | Request coalescing, circuit breaker |
| Key compromise | Attacker can forge tokens | Key rotation via JWKS, short token lifetime |
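The "cache aggressively, fall back to stale" mitigation for an unreachable JWKS endpoint might look like the following. The class name, the injected fetch callable, and the one-hour default TTL are assumptions for this sketch.

```python
import time
from typing import Callable, Optional


class JwksCache:
    def __init__(self, fetch: Callable[[], dict], ttl_seconds: float = 3600,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch, self._ttl, self._clock = fetch, ttl_seconds, clock
        self._keys: Optional[dict] = None
        self._fetched_at = 0.0

    def get(self) -> dict:
        fresh = self._keys is not None and self._clock() - self._fetched_at < self._ttl
        if fresh:
            return self._keys
        try:
            self._keys = self._fetch()
            self._fetched_at = self._clock()
        except Exception:
            if self._keys is None:
                raise  # nothing cached yet: fail closed
            # Endpoint unreachable: keep serving the stale key set.
        return self._keys
```

Serving stale keys trades a slower reaction to key rotation for the ability to keep validating tokens during an identity outage. After a key compromise that trade-off reverses, so a production version would likely cap how long stale keys are accepted.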
"Identity is the one truly global service in our cell-based architecture. Customer accounts, API credentials, and the account-to-cell mapping live in DynamoDB Global Tables, replicated across regions. When a request arrives, the Cell Router validates credentials against the global identity service and looks up which cell owns that customer. It then generates a short-lived JWT—5 minutes—containing the account ID, cell ID, and scoped permissions. The request routes to the correct cell with this JWT. Inside the cell, services validate the JWT locally using cached JWKS keys—no calls back to the identity service. This is critical for latency and isolation. If identity were cell-local, you'd need cross-cell calls or duplicate credentials everywhere. The key insight is separating authentication (global) from authorization (cell-local). The JWT carries enough context that cells can authorize locally. For token compromise, the short lifetime limits blast radius, and we can rotate signing keys via JWKS without coordinating with cells."
| Capability | Description | Twilio Relevance |
|---|---|---|
| Passwordless Auth | Magic links, OTP, biometrics | Integrates with Twilio Verify for OTP delivery |
| OAuth/SSO | Social login, enterprise SSO | B2B customers need SSO for their users |
| Session Management | Token lifecycle, device fingerprinting | Fraud prevention across channels |
| MFA | Multiple second factors | Twilio delivers SMS/Voice factors |
| Connected Apps | Third-party app authorization | AI agents authorizing access to Twilio resources |
AI agents need to act on behalf of users—send messages, access data, make calls. Traditional auth assumes a human in the loop. How do you:
"Twilio sees AI agents as the next major platform shift—like mobile was. When AI agents act on behalf of users, you need a new identity model. Traditional OAuth assumes a human approving each action. With AI, you need: agent authentication—proving Claude is really Claude, not an impersonator; delegated authorization—the user grants the agent permission to send messages, but only to certain contacts, only for this task; scoped tokens that expire when the task is done; and a full audit trail. Stytch provides this infrastructure. The strategic insight is that identity becomes the control point for AI-to-human communication. If you're building AI agents that need to call, message, or email humans, you need both the communication APIs and the identity layer to authorize those actions. Twilio now owns both. It's a platform play: the company that controls how AI agents authenticate will control a huge piece of the AI economy."
A: "I'd use a two-tier system. At the edge, customers authenticate with Account SID and Auth Token—essentially an API key pair. The Cell Router validates these against a global identity service using DynamoDB Global Tables for low-latency lookups. Once validated, I generate a short-lived JWT containing the account ID, assigned cell, and scoped permissions. This JWT travels with the request through the cell. Inside the cell, services validate the JWT locally using JWKS—no calls back to identity. This gives us the best of both worlds: simple API key UX for customers, but modern token-based auth internally with scoping and expiration. The API keys themselves are hashed with Argon2 in storage and can be rotated without downtime by supporting multiple active keys per account."
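The "hashed in storage, multiple active keys" idea can be sketched as below. Argon2 is not in the Python standard library, so this example swaps in hashlib.scrypt, another memory-hard function, purely to stay self-contained; the record layout and function names are assumptions.

```python
import hashlib
import hmac
import os


def hash_key(api_key: str, salt: bytes) -> bytes:
    # scrypt stands in for Argon2 here; both are memory-hard KDFs.
    return hashlib.scrypt(api_key.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)


def new_record(api_key: str) -> dict:
    salt = os.urandom(16)
    return {"salt": salt, "hash": hash_key(api_key, salt)}


def verify(api_key: str, active_records: list) -> bool:
    # Several records may be active at once, so a new key can be rolled out
    # before the old one is retired.
    return any(
        hmac.compare_digest(hash_key(api_key, rec["salt"]), rec["hash"])
        for rec in active_records
    )


old, new = new_record("old-auth-token"), new_record("new-auth-token")
active = [old, new]   # rotation window: both keys accepted
```

Retiring the old key is then just removing its record from the active list, with no downtime for clients that already switched.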
A: "This is where short token lifetime is your friend. With 5-minute JWTs, you're never more than 5 minutes from automatic revocation. For immediate revocation—like a compromised account—I use a deny list approach: a small, fast-path check against a bloom filter of revoked token IDs (jti claims). The bloom filter gives false positives but never false negatives, so worst case you re-authenticate valid tokens. For account-level revocation, I increment a 'token generation' counter in the account record. Tokens include this generation; if it doesn't match current, the token is invalid. This scales because it's a single integer comparison, not a list lookup. The deny list is eventually consistent—replicated via Kafka to all cells within seconds."
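The two revocation checks in this answer can be sketched together: a bloom filter over revoked jti values and a per-account generation counter. The claim name gen, the filter sizing, and the class layout are hypothetical choices for illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size, self.k = size_bits, num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k bit positions by hashing the item with k different prefixes.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def is_revoked(claims: dict, revoked_jtis: BloomFilter, current_generation: int) -> bool:
    if claims.get("gen") != current_generation:       # account-level revocation
        return True
    return revoked_jtis.might_contain(claims["jti"])  # token-level revocation


revoked = BloomFilter()
revoked.add("abc-123-def")
```

A false positive only forces a valid caller to re-authenticate, while a revoked jti is never let through, which is the safe direction for this check to fail.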
A: "Within a cell, I use mTLS as the baseline—every service presents a certificate, and only services with valid certs can communicate. This is handled by the service mesh, transparent to application code. On top of mTLS, I use JWTs for authorization. When the Messaging service calls the Delivery service, it includes the original user JWT plus its own service identity. The Delivery service validates both: mTLS proves it's really the Messaging service calling; the JWT proves what account and permissions this request has. For service-to-service calls that aren't on behalf of a user—like batch jobs—I use the client credentials grant with service-specific scopes. Each service has minimum necessary permissions. Tokens are cached but short-lived, and we use token refresh to avoid thundering herd on expiry."
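One way to read "cached but short-lived, refreshed to avoid a thundering herd" is a per-process token cache that renews at a randomized point before expiry. The class below is a sketch under that assumption; fetch_token is a hypothetical callable that performs the client credentials request and returns the token with its lifetime in seconds.

```python
import random
import time
from typing import Callable, Optional, Tuple


class ServiceTokenCache:
    def __init__(self, fetch_token: Callable[[], Tuple[str, float]],
                 clock: Callable[[], float] = time.time):
        self._fetch, self._clock = fetch_token, clock
        self._token: Optional[str] = None
        self._refresh_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._refresh_at:
            self._token, lifetime = self._fetch()
            # Renew somewhere between 80% and 95% of the lifetime, chosen at random,
            # so instances that started together do not all refresh at the same moment.
            self._refresh_at = now + lifetime * random.uniform(0.80, 0.95)
        return self._token
```

With a 300-second token, each instance refreshes between 240 and 285 seconds after issuance, spreading load on the token endpoint and never presenting an expired token.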
A: "Identity is one of our most critical global services, so it's designed for extreme availability. DynamoDB Global Tables give us multi-region active-active with automatic failover. The Cell Router has circuit breakers—if identity is slow or failing, it fails open for existing cached mappings but closed for new accounts. Existing JWTs remain valid until expiry, so authenticated sessions continue working. JWKS is cached aggressively—we can validate tokens for hours without reaching the identity service. For new authentication, we have a degraded mode: rate-limited, using a stale read from the nearest replica. We accept slightly stale data over complete unavailability. The key insight is separating the 'new authentication' path—which can degrade—from the 'validate existing token' path—which must always work locally."
A: "The token is the isolation boundary. Every JWT contains the account ID and cell assignment. When a request enters a cell, the first thing we check is: does this token's cell_id match this cell? If not, reject immediately—this prevents requests from being misrouted or replayed to the wrong cell. Within the cell, every database query includes account_id in the WHERE clause—enforced at the ORM layer, not optional. We use row-level security in PostgreSQL as a second layer: even if application code has a bug, the database won't return other tenants' data. For defense in depth, our service mesh logs all cross-service calls with account_id, and we have anomaly detection for unusual access patterns—like one account suddenly accessing data from many other accounts. The token creates the logical boundary; infrastructure enforces it."
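The first two layers of this answer, the cell_id check and account-scoped queries, can be sketched as follows. SQLite stands in for PostgreSQL so the example runs anywhere, row-level security is not shown, and the table and function names are made up for illustration.

```python
import sqlite3


def enforce_cell(claims: dict, local_cell_id: str) -> str:
    # Reject tokens issued for another cell before doing any work.
    if claims.get("cell_id") != local_cell_id:
        raise PermissionError("token was issued for a different cell")
    return claims["account_id"]


def list_messages(db: sqlite3.Connection, claims: dict, local_cell_id: str) -> list:
    account_id = enforce_cell(claims, local_cell_id)
    # The tenant filter comes from the verified token, never from request input.
    rows = db.execute(
        "SELECT body FROM messages WHERE account_id = ? ORDER BY id", (account_id,)
    )
    return [body for (body,) in rows]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, account_id TEXT, body TEXT)")
db.executemany(
    "INSERT INTO messages (account_id, body) VALUES (?, ?)",
    [("AC123456", "hello"), ("AC999999", "other tenant's data")],
)
claims = {"account_id": "AC123456", "cell_id": "enterprise-us-east-1"}
print(list_messages(db, claims, "enterprise-us-east-1"))  # ['hello']
```

Routing every data access through a helper like this is one way to make the account filter non-optional at the application layer, with database-level policies as the backstop.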
Key points to hit:
Example angle: "We were implementing MFA for our API and security wanted mandatory hardware keys. Product showed data that 40% of developers would churn. I proposed tiered MFA—TOTP for standard accounts, hardware keys for accounts with PCI data. Security accepted because we scoped the risk."
Show:
| Metric | Recommended Value | Why |
|---|---|---|
| Access token lifetime | 5-15 minutes | Limits blast radius of compromise |
| Refresh token lifetime | 24 hours - 30 days | Balances UX with security |
| JWKS cache TTL | 1-24 hours | Survives identity outages |
| Key rotation frequency | 90 days | Compliance requirement, limits exposure |
| API key hash algorithm | Argon2id | Memory-hard, resists GPU attacks |
| JWT signature algorithm | RS256 or ES256 | Asymmetric for distributed validation |