
How To Create Multi-Step Forms With Vanilla JavaScript And CSS


Multi-step forms are a good choice when your form is large and has many controls. No one wants to scroll through a super-long form on a mobile device. By grouping controls on a screen-by-screen basis, we can improve the experience of filling out long, complex forms.

But when was the last time you developed a multi-step form? Does that even sound fun to you? There’s so much to think about and so many moving pieces that need to be managed that I wouldn’t blame you for resorting to a form library or even some type of form widget that handles it all for you.

But doing it by hand can be a good exercise and a great way to polish the basics. I’ll show you how I built my first multi-step form, and I hope you’ll not only see how approachable it can be but maybe even spot areas to make my work even better.

We’ll walk through the structure together. We’ll build a job application form, which I think many of us can relate to these days. I’ll scaffold the baseline HTML, CSS, and JavaScript first, and then we’ll look at considerations for accessibility and validation.

I’ve created a GitHub repo for the final code if you want to refer to it along the way.

Our job application form has four sections, the last of which is a summary view, where we show the user all their answers before they submit them. To achieve this, we divide the HTML into four sections, each identified with an ID, and add navigation at the bottom of the page. I’ll give you that baseline HTML in the next section.

Since the user will move through the sections, we’ll also include a visual indicator of which step they are on and how many steps are left. This indicator can be simple dynamic text that updates according to the active step, or a fancier progress-bar-style indicator. We’ll do the former to keep things simple and focused on the multi-step nature of the form.

We’ll focus more on the logic, but I will provide the code snippets and a link to the complete code at the end.

Let’s start by creating a folder to hold our pages. Then, create an index.html file and paste the following into it:

Looking at the code, you can see three sections and the navigation group. The sections contain form inputs but no native form validation. This gives us better control over displaying error messages, because native form validation is only triggered when you click the submit button.

Next, create a styles.css file and paste this into it:

Open up the HTML file in the browser, and you should get something like the two-column layout in the following screenshot, complete with the current page indicator and navigation.

Now, create a script.js file in the same directory as the HTML and CSS files and paste the following JavaScript into it:

This script defines a method that shows and hides sections depending on the formSteps values, which correspond to the IDs of the form sections. It also updates stepInfo with the currently active section of the form. This dynamic text acts as a progress indicator for the user.

It then adds logic that waits for the page to load and attaches click events to the navigation buttons to enable cycling through the different form sections. If you refresh your page, you will see that the multi-step form works as expected.

Let’s dive deeper into what the JavaScript code above is doing. In the updateStepVisibility() function, we first hide all the sections to have a clean slate:

formSteps.forEach((step) => {
  document.getElementById(step).style.display = "none";
});

Then, we show the currently active section:

document.getElementById(formSteps[currentStep]).style.display = "block";

Next, we update the text that indicates progress through the form:

stepInfo.textContent = `Step ${currentStep + 1} of ${formSteps.length}`;

Finally, we hide the Previous button if we are at the first step and hide the Next button if we are at the last section:

navLeft.style.display = currentStep === 0 ? "none" : "block";
navRight.style.display = currentStep === formSteps.length - 1 ? "none" : "block";

Let’s look at what happens when the page loads. We first hide the Previous button as the form loads on the first section:

document.addEventListener("DOMContentLoaded", () => {
  navLeft.style.display = "none";
  updateStepVisibility();
  // The navigation click handlers below are registered inside this same
  // listener, which is closed after they have been added.

Then we grab the Next button and add a click event that conditionally increments the current step count and calls the updateStepVisibility() function, which displays the new section:

navRight.addEventListener("click", () => {
  if (currentStep < formSteps.length - 1) {
    currentStep++;
    updateStepVisibility();
  }
});

Finally, we grab the Previous button and do the same thing but in reverse. Here, we are conditionally decrementing the step count and calling the updateStepVisibility():

navLeft.addEventListener("click", () => {
  if (currentStep > 0) {
    currentStep--;
    updateStepVisibility();
  }
});

Have you ever spent a good 10+ minutes filling out a form only to submit it and get vague errors telling you to correct this and that? I prefer it when a form tells me right away that something’s amiss so that I can correct it before I ever get to the Submit button. That’s what we’ll do in our form.

Our principle is to clearly indicate which controls have errors, give meaningful error messages, and clear the errors as the user takes the necessary actions. Let’s add some validation to our form. First, let’s grab the necessary input elements and add this to the existing ones:

const nameInput = document.getElementById("name");
const idNumInput = document.getElementById("idNum");
const emailInput = document.getElementById("email");
const birthdateInput = document.getElementById("birthdate");
const documentInput = document.getElementById("document");
const departmentInput = document.getElementById("department");
const termsCheckbox = document.getElementById("terms");
const skillsInput = document.getElementById("skills");

Then, add a function to validate the steps:

Here, we check whether each required input has a value and whether the email input contains a valid email address, and we set the isValid boolean accordingly. We also call a showError() function, which we haven’t defined yet.
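To make that concrete, here is a minimal sketch of what validateStep() could look like. The grouping of inputs per step follows the section layout described above, but the exact checks and error messages are my assumptions; the full version is in the linked repository.

// A sketch of validateStep(); exact messages and checks are assumptions.
function validateStep(step) {
  let isValid = true;

  if (step === 0) {
    // Personal details: name, ID number, email, birthdate
    if (nameInput.value.trim() === "") {
      showError(nameInput, "Name is required");
      isValid = false;
    }
    if (idNumInput.value.trim() === "") {
      showError(idNumInput, "ID number is required");
      isValid = false;
    }
    if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(emailInput.value.trim())) {
      showError(emailInput, "Please enter a valid email address");
      isValid = false;
    }
    if (birthdateInput.value === "") {
      showError(birthdateInput, "Birthdate is required");
      isValid = false;
    }
  } else if (step === 1) {
    // Experience: CV upload and department
    if (documentInput.files.length === 0) {
      showError(documentInput, "Please upload your CV");
      isValid = false;
    }
    if (departmentInput.value === "") {
      showError(departmentInput, "Please select a department");
      isValid = false;
    }
  } else if (step === 2) {
    // Final step: the skills field is optional, the terms checkbox is not
    if (!termsCheckbox.checked) {
      showError(termsCheckbox, "You must accept the terms");
      isValid = false;
    }
  }

  return isValid;
}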

Paste this code above the validateStep() function:

function showError(input, message) {
  const formControl = input.parentElement;
  const errorSpan = formControl.querySelector(".error-message");
  input.classList.add("error");
  errorSpan.textContent = message;
}

Now, add the following styles to the stylesheet:

If you refresh the form, you will see that the buttons do not take you to the next section till the inputs are considered valid:
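For that to happen, the Next button's click handler needs to call validateStep() before advancing. One way to wire that up (the exact wiring in the final code may differ) is to adjust the earlier handler:

navRight.addEventListener("click", () => {
  // Only advance when the current step's inputs pass validation.
  if (currentStep < formSteps.length - 1 && validateStep(currentStep)) {
    currentStep++;
    updateStepVisibility();
  }
});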

Finally, we want to add real-time error handling so that the errors go away when the user starts inputting the correct information. Add this function below the validateStep() function:
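As a minimal sketch of what that function could look like (the function name and the way validity is re-checked are my assumptions):

function addClearErrorListeners(input) {
  // Re-check the control on input and change events and clear its error
  // once it has a value again (or is checked, for the terms checkbox).
  ["input", "change"].forEach((eventType) => {
    input.addEventListener(eventType, () => {
      const hasValue =
        input.type === "checkbox" ? input.checked : input.value.trim() !== "";
      if (hasValue) {
        clearError(input);
      }
    });
  });
}

const fieldsToWatch = [
  nameInput, idNumInput, emailInput, birthdateInput,
  documentInput, departmentInput, termsCheckbox,
];
fieldsToWatch.forEach(addClearErrorListeners);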

This function listens for input and change events on each control and clears the error once the input is no longer invalid, by calling a clearError() function. Paste the clearError() function below the showError() one:

function clearError(input) {
  const formControl = input.parentElement;
  const errorSpan = formControl.querySelector(".error-message");
  input.classList.remove("error");
  errorSpan.textContent = "";
}

And now the errors clear when the user types in the correct value:

The multi-step form now handles errors gracefully. If you do decide to keep the errors until the end of the form, then at the very least jump the user back to the first erroring form control and show some indication of how many errors they need to fix, as sketched below.
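If you went that route, a rough sketch of that fallback could look like this (assuming the steps are section elements whose IDs match the entries in formSteps, and that showError() has tagged invalid controls with the .error class):

function jumpToFirstError() {
  const invalidInputs = document.querySelectorAll(".error");
  if (invalidInputs.length === 0) return;

  // Switch to the step that contains the first invalid control...
  const section = invalidInputs[0].closest("section");
  if (section && formSteps.includes(section.id)) {
    currentStep = formSteps.indexOf(section.id);
    updateStepVisibility();
  }

  // ...then focus it and tell the user how much is left to fix.
  invalidInputs[0].focus();
  stepInfo.textContent += ` (${invalidInputs.length} fields need attention)`;
}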

In a multi-step form, it is valuable to show the user a summary of all their answers at the end before they submit and to offer them an option to edit their answers if necessary. The person can’t see the previous steps without navigating backward, so showing a summary at the last step gives assurance and a chance to correct any mistakes.

Let’s add a fourth section to the markup to hold this summary view and move the submit button within it. Paste this just below the third section in index.html:

Then update the formSteps array in your JavaScript to read:

const formSteps = ["one", "two", "three", "four"];

Finally, add the following classes to styles.css:

.summary-section {
  display: flex;
  align-items: center;
  gap: 10px;
}

.summary-section p:first-child {
  width: 30%;
  flex-shrink: 0;
  border-right: 1px solid var(--secondary-color);
}

.summary-section p:nth-child(2) {
  width: 45%;
  flex-shrink: 0;
  padding-left: 10px;
}

.edit-btn {
  width: 25%;
  margin-left: auto;
  background-color: transparent;
  color: var(--primary-color);
  border: .7px solid var(--primary-color);
  border-radius: 5px;
  padding: 5px;
}

.edit-btn:hover {
  border: 2px solid var(--primary-color);
  font-weight: bolder;
  background-color: transparent;
}

Now, add the following to the top of the script.js file where the other consts are:

const nameVal = document.getElementById("name-val");
const idVal = document.getElementById("id-val");
const emailVal = document.getElementById("email-val");
const bdVal = document.getElementById("bd-val");
const cvVal = document.getElementById("cv-val");
const deptVal = document.getElementById("dept-val");
const skillsVal = document.getElementById("skills-val");
const editButtons = {
  "name-edit": 0,
  "id-edit": 0,
  "email-edit": 0,
  "bd-edit": 0,
  "cv-edit": 1,
  "dept-edit": 1,
  "skills-edit": 2
};

Then add this function in script.js:

function updateSummaryValues() {
  nameVal.textContent = nameInput.value;
  idVal.textContent = idNumInput.value;
  emailVal.textContent = emailInput.value;
  bdVal.textContent = birthdateInput.value;

  const fileName = documentInput.files[0]?.name;
  if (fileName) {
    const extension = fileName.split(".").pop();
    const baseName = fileName.split(".")[0];
    const truncatedName = baseName.length > 10 ? baseName.substring(0, 10) + "..." : baseName;
    cvVal.textContent = `${truncatedName}.${extension}`;
  } else {
    cvVal.textContent = "No file selected";
  }

  deptVal.textContent = departmentInput.value;
  skillsVal.textContent = skillsInput.value || "No skills submitted";
}

This dynamically inserts the input values into the summary section of the form, truncates the file names, and offers a fallback text for the input that was not required.

Then update the updateStepVisibility() function to call the new function:

function updateStepVisibility() {
  formSteps.forEach((step) => {
    document.getElementById(step).style.display = "none";
  });

  document.getElementById(formSteps[currentStep]).style.display = "block";
  stepInfo.textContent = `Step ${currentStep + 1} of ${formSteps.length}`;
  if (currentStep === 3) {
    updateSummaryValues();
  }

  navLeft.style.display = currentStep === 0 ? "none" : "block";
  navRight.style.display = currentStep === formSteps.length - 1 ? "none" : "block";
}

Finally, add this to the DOMContentLoaded event listener:

Object.keys(editButtons).forEach((buttonId) => {
  const button = document.getElementById(buttonId);
  button.addEventListener("click", (e) => {
    currentStep = editButtons[buttonId];
    updateStepVisibility();
  });
});

Running the form, you should see that the summary section shows all the inputted values and allows the user to edit any before submitting the information:

And now, we can submit our form:

form.addEventListener("submit", (e) => {
  e.preventDefault();

  if (validateStep(2)) {
    alert("Form submitted successfully!");
    form.reset();
    currentStep = 0;
    updateStepVisibility();
  }
});

Our multi-step form now allows the user to edit and see all the information they provide before submitting it.

Making multi-step forms accessible starts with the basics: using semantic HTML. This is half the battle. It is closely followed by using appropriate form labels.

Other ways to make forms more accessible include giving enough room to elements that must be clicked on small screens and giving meaningful descriptions to the form navigation and progress indicators.

Offering feedback to the user is an important part of it; it’s better not to auto-dismiss user feedback after a certain amount of time, but to allow the user to dismiss it themselves. Paying attention to contrast and font choice is important, too, as they both affect how readable your form is.

Let’s make the following adjustments to the markup for more technical accessibility:

  1. Add aria-required="true" to all inputs except the skills one. This lets screen readers know the fields are required without relying on native validation.
  2. Add role="alert" to the error spans. This helps screen readers know to give it importance when the input is in an error state.
  3. Add role="status" aria-live="polite" to the .stepInfo element. This helps screen readers understand that the step info tracks a changing state, and setting aria-live to polite indicates that changes do not need to be announced immediately.

In the script file, replace the showError() and clearError() functions with the following:

function showError(input, message) {
  const formControl = input.parentElement;
  const errorSpan = formControl.querySelector(".error-message");
  input.classList.add("error");
  input.setAttribute("aria-invalid", "true");
  input.setAttribute("aria-describedby", errorSpan.id);
  errorSpan.textContent = message;
}

function clearError(input) {
  const formControl = input.parentElement;
  const errorSpan = formControl.querySelector(".error-message");
  input.classList.remove("error");
  input.removeAttribute("aria-invalid");
  input.removeAttribute("aria-describedby");
  errorSpan.textContent = "";
}

Here, we programmatically add and remove attributes that explicitly tie the input with its error span and show that it is in an invalid state.

Finally, let’s move focus to the first input of each section; add the following code to the end of the updateStepVisibility() function:

const currentStepElement = document.getElementById(formSteps[currentStep]);
const firstInput = currentStepElement.querySelector(
  "input, select, textarea"
);

if (firstInput) {
  firstInput.focus();
}

And with that, the multi-step form is much more accessible.

There we go, a four-part multi-step form for a job application! As I said at the top of this article, there’s a lot to juggle — so much so that I wouldn’t fault you for looking for an out-of-the-box solution.

But if you have to hand-roll a multi-step form, hopefully now you see it’s not a death sentence. There’s a happy path that gets you there, complete with navigation and validation, without turning away from good, accessible practices.

And this is just how I approached it! Again, I took this on as a personal challenge to see how far I could get, and I’m pretty happy with it. But I’d love to know if you see additional opportunities to make this even more mindful of the user experience and considerate of accessibility.

Here are some relevant links I referred to when writing this article:


seddonym/import-linter: Import Linter allows you to define and enforce rules for the internal and external imports within your Python project.


Publishing a simple client-side JavaScript package to npm with GitHub Actions


Here's what I learned about publishing a single file JavaScript package to NPM for my Prompts.js project.

The code is in simonw/prompts-js on GitHub. The NPM package is prompts-js.

A simple single file client-side package

For this project, I wanted to create an old-fashioned JavaScript file that you could include in a web page using a <script> tag. No TypeScript, no React JSX, no additional dependencies, no build step.

I also wanted to ship it to NPM, mainly so it would be magically available from various CDNs.

I think I've boiled that down to about as simple as I can get. Here's the package.json file:

{
  "name": "prompts-js",
  "version": "0.0.4",
  "description": "async alternatives to browser alert() and prompt() and confirm()",
  "main": "index.js",
  "homepage": "https://github.com/simonw/prompts-js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "Simon Willison",
  "license": "Apache-2.0",
  "repository": {
    "type": "git",
    "url": "git+https://github.com/simonw/prompts-js.git"
  },
  "keywords": [
    "alert",
    "prompt",
    "confirm",
    "async",
    "promise",
    "dialog"
  ],
  "files": [
    "index.js",
    "README.md",
    "LICENSE"
  ]
}

That "scripts.test" block probably isn't necessary. The keywords are used when you deploy to NPM, and the files block tells NPM which files to include in the package.

The "repository" block is used by NPM's provenance statements. Don't worry too much about these - they're only needed if you use the npm publish --provenance option later on.

Really the three most important keys here are "name", which needs to be a unique name on NPM, "version" and that "main" key. I set "main" to index.js.

All that's needed now is that index.js file - and optionally the README.md and LICENSE files if we want to include them in the package. The README.md ends up displayed on the NPM listing page so it's worth including.

Here's my index.js file. It starts and ends like this (an IIFE, an immediately invoked function expression):

const Prompts = (function () {
  // ...
  return { alert, confirm, prompt };
})();
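Each of the returned functions resolves a Promise when the user dismisses the dialog. The real implementation lives in the repo; here is a hedged sketch of the general shape one of them can take using a <dialog> element (the details are my assumptions, not the library's exact code):

function alert(message) {
  return new Promise((resolve) => {
    // Build a modal <dialog> with the message and an OK button.
    const dialog = document.createElement("dialog");
    const text = document.createElement("p");
    text.textContent = message;
    const form = document.createElement("form");
    form.method = "dialog"; // submitting the form closes the dialog
    const ok = document.createElement("button");
    ok.textContent = "OK";
    form.appendChild(ok);
    dialog.append(text, form);

    // Resolve once the dialog closes, then clean it up.
    dialog.addEventListener("close", () => {
      dialog.remove();
      resolve();
    });

    document.body.appendChild(dialog);
    dialog.showModal();
  });
}

Calling code can then await Prompts.alert("Hello!") and continue once the user clicks OK.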

Publishing to NPM

With these pieces in place, running npm publish in the root of the project will publish the package to NPM - after first asking you to sign into your NPM account.

Automating this with GitHub Actions

I use GitHub Actions that trigger on any release to publish all of my Python projects to PyPI. I wanted to do the same for this JavaScript project.

I found this example in the GitHub documentation which gave me most of what I needed. This is in .github/workflows/publish.yml:

name: Publish Package to npmjs
on:
  release:
    types: [published]
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20.x'
          registry-url: 'https://registry.npmjs.org'
      - run: npm publish --provenance --access public
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

There's that --provenance option which only works if you have the repository block set up in your package.json.

This needs a secret called NPM_TOKEN to be set up in the GitHub repository settings.

It took me a few tries to get this right. It needs to be a token created on the NPM website using the Access Tokens menu item, then Generate New Token -> Classic Token. As far as I can tell the new "Granular Access Token" format doesn't work for this as it won't allow you to create a token that never expires, and I never want to have to remember to update the secret in the future.

An "Automation" token should do the trick here - it bypasses 2-factor authentication when publishing.

Set that in GitHub Actions as a secret called NPM_TOKEN and now you can publish a new version of your package to NPM by doing the following:

  1. Update the version number in package.json
  2. Create a new release on GitHub with a tag that matches the version number

Simple trick to save environment and money when using GitHub Actions


We recently onboarded Nikita Sivukhin as a new member of our Engineering team at Turso. He immediately started making meaningful contributions to our Native Vector Search, but something else prompted me to write this article. In addition to working on his main task, Nikita started to poke around our codebase and fix anything he found worth tackling. This is a great proactive approach which I highly recommend to any software engineer. One thing Nikita improved was our GitHub Actions setup, to avoid running jobs that are no longer needed. This is great because GitHub Actions not only consume electricity when they run, but also either cost money when used for private repositories or count against a usage quota for open source projects.

What's the problem

We use GitHub Actions for our CI/CD at Turso, both on open source projects and on private ones. Among other things, we run GitHub Actions on our Pull Requests. Some of those actions are pretty heavy and can take a considerable amount of time. Rust compilation has its share, but we also run all sorts of tests, from unit tests to end-to-end tests. It isn't uncommon for a Pull Request to be updated before CI/CD has finished for the previous version. Unfortunately, GitHub does not cancel GitHub Actions runs for a stale version of the code, and those tasks keep running until they either fail or fully finish. This is a problem because those old CI/CD runs consume resources like electricity and GitHub Actions runners even though no one is interested in the outcome of the run any more.

Solution

This problem can be easily solved in a universal way. If you're running your GitHub Actions on the pull_request trigger, you just need to add the following snippet to the definition of your GitHub workflow:

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

And voilà, GitHub will start to cancel all old GitHub Actions runs that become stale after a new version of the Pull Request is uploaded. You can see the solution in a wider context in Nikita's Pull Request that added this to the LibSQL GitHub repository.

Effects

As a consequence of this change, you will start seeing a new result type on your GitHub Actions summary page. There will be not only a green circle with a tick and a red circle with an X, but also a grey octagon with an exclamation point, which means a task was cancelled. Below is a screenshot from the GitHub Actions summary page of the LibSQL repository.

During the first week after Nikita's Pull Request had been merged, 56 tasks were cancelled in LibSQL repository alone.

Conclusion

I hope that this short article was able to convince you that if you're using GitHub Actions for your CI/CD, you can easily become more environmentally friendly and possibly save some money on your GitHub bill.


Brendan Gregg's Blog


Imagine halving the resource costs of AI and what that could mean for the planet and the industry -- based on extreme estimates such savings could reduce the total US power usage by over 10% by 2030¹. At Intel we've been creating a new analyzer tool to help reduce AI costs called AI Flame Graphs: a visualization that shows an AI accelerator or GPU hardware profile along with the full software stack, based on my CPU flame graphs. Our first version is available to customers in the Intel Tiber AI Cloud as a preview for the Intel Data Center GPU Max Series (previously called Ponte Vecchio). Here is an example:


Simple example: SYCL matrix multiply microbenchmark

(Click for interactive SVG.) The green frames are the actual instructions running on the AI or GPU accelerator, aqua shows the source code for these functions, and red (C), yellow (C++), and orange (kernel) show the CPU code paths that initiated these AI/GPU programs. The gray "-" frames just help highlight the boundary between CPU and AI/GPU code. The x-axis is proportional to cost, so you look for the widest things and find ways to reduce them.


Layers

This flame graph shows a simple program for SYCL (a high-level C++ language for accelerators) that tests three implementations of matrix multiply, running them with the same input workload. The flame graph is dominated by the slowest implementation, multiply_basic(), which doesn't use any optimizations, consumes 72% of stall samples, and is shown as the widest tower. On the right are two thin towers for multiply_local_access() at 21%, which replaces the accessor with a local variable, and multiply_local_access_and_tiling() at 6%, which also adds matrix tiling. The towers are getting smaller as optimizations are added.

This flame graph profiler is a prototype based on Intel EU stall profiling for hardware profiling and eBPF for software instrumentation. It's designed to be easy and low-overhead, just like a CPU profiler. You should be able to generate a flame graph of an existing AI workload whenever you want, without having to restart anything or launch additional code via an interposer.

Instruction-offset Profiling

This is not the first project to build an AI profiler, or even something called an AI Flame Graph; however, others I've seen focus on tracing CPU stacks and timing accelerator execution, but don't profile the instruction offsets running on the accelerator, or do profile them but via expensive binary instrumentation. I wanted to build AI flame graphs that work like CPU flame graphs: easy to use, negligible cost, production safe, and showing everything. A daily tool for developers, with most of the visualization in the language of the developer: source code functions.

This has been an internal AI project at Intel for the past year. Intel was already investing in this space, building the EU stall profiler capability for the Intel Data Center GPU Max Series that provides an approximation of HW instruction sampling. I was lucky to have Dr. Matthew (Ben) Olson, an Intel AI engineer who has also worked on eBPF performance tooling (processwatch) as well as memory management research, join my team and do most of the development work. His background has helped us power through difficulties that seemed insurmountable. We've also recently been joined by Dr. Brandon Kammerdiener (coincidentally another graduate of the University of Tennessee, like Ben), who also has eBPF and memory internals experience, and has been helping us take on harder and harder workloads. And Gabriel Muñoz just joined today to help with releases. Now that our small team has shown that this is possible, we'll be joined by other teams at Intel to develop this further.

We could have built a harder-to-use and higher-overhead version months ago using Intel GTPin but for widespread adoption it needs minimal overhead and ease of use so that developers don't hesitate to use this daily and to add it to deployment pipelines.

What's a Flame Graph?

A flame graph is a visualization I invented in 2011 for showing sampled code stack traces. It has become the standard for CPU profiling and analysis, helping developers quickly find performance improvements and eliminate regressions. A CPU flame graph shows the "big picture" of running software, with x-axis proportional to CPU cost. The example picture on the right summarizes how easy it can be to go from compute costs to responsible code paths. Prior to flame graphs, it could take hours to understand a complex profile by reading through hundreds of pages of output. Now it takes seconds: all you have to do is look for the widest rectangles.

Flame graphs have had worldwide adoption. They have been the basis for five startups so far, have been adopted in over thirty performance analysis products, and have had over eighty implementations.

My first implementation of flame graphs took a few hours on a Wednesday night after work. The real effort has been in the decade since, where I worked with different profilers, runtimes, libraries, kernels, compilers, and hypervisors to get flame graphs working properly in different environments, including fixing stack walking and symbolization. Earlier this year I posted about the final missing piece: Helping distros enable frame pointers so that profiling works across standard system libraries.

Similar work is necessary for AI workloads: fixing stacks and symbols and getting profiling to work for different hardware, kernel drivers, user-mode drivers, frameworks, runtimes, languages, and models. A lot more work, too, as AI analysis has less maturity than CPU analysis.

Searching Samples

If you are new to flame graphs, it's worth mentioning the built-in search capability. In the earlier example, most of the stall samples are caused by sbid: software scoreboard dependency. As that may be a unique search term, you can run search (Ctrl-F, or click "Search") on "sbid" and it will highlight it in magenta:

Search also shows the total number of stack samples that contained sbid in the bottom right: 78.4%. You can search for any term in the flame graph: accelerator instructions, source paths, function names, etc., to quickly calculate the percentage of stacks where it is present (excluding vertical overlap) helping you prioritise performance work.

Note that the samples are EU stall-based, which means theoretical performance wins can take the percentages down to zero. This is different to the timer-based samples typically used in CPU profiling. Stalls mean you focus on the pain, the parts of the code that aren't making forward progress, but you aren't seeing resource usage by unstalled instructions. I'd like to support timer-based samples in the future as well, so we can have both views.

Who will use this?

At a recent golang conference, I asked the audience of 200+ to raise their hands if they were using CPU flame graphs. Almost every hand went up. I know of companies where flame graphs are a daily tool that developers use to understand and tune their code, reducing compute costs. This will become a daily tool for AI developers.

My employer will use this as well for evaluation analysis, to find areas to tune to beat competitors, as well as to better understand workload performance to aid design.

Why is AI profiling hard?

Consider CPU instruction profiling: This is easy when the program and symbol table are both in the file system and in a standardized file format (such as ELF) as is the case with native compiled code (C). CPU profiling gets hard for JIT-compiled code, like Java, as instructions and symbols are dynamically generated and placed in main memory (the process heap) without following a universal standard. For such JITted code we use runtime-specific methods and agents to retrieve snapshots of the heap information, which is different for each runtime.

AI workloads also have different runtimes (and frameworks, languages, user-mode drivers, compilers, etc.), any of which can require special tinkering to get their CPU stacks and symbols to work. These CPU stacks are shown as the red, orange, and yellow frames in the AI Flame Graph. For some AI workloads it's easy to get these frames working; others (like PyTorch) are a lot more work.

But the real challenge is instruction profiling of actual GPU and AI accelerator programs -- shown as the aqua and green frames -- and correctly associating them with the CPU stacks beneath them. Not only may these GPU and AI programs not exist in the file system, but they may not even exist in main memory! Even for running programs. Once execution begins, they may be deallocated from main memory and only exist in special accelerator memory, beyond the direct reach of OS profilers and debuggers. Or within reach, but only through a prohibitively high-overhead HW-specific debugger interface.

There's also no /proc representation for these programs either (I've been proposing building an equivalent) so there's no direct way to even tell what is running and what isn't, and all the other /proc details. Forget instruction profiling, even ps(1) and all the other process tools do not work.

It's been a mind-bending experience, revealing what gets taken for granted because it has existed in CPU land for decades: A process table. Process tools. Standard file formats. Programs that exist in the file system. Programs running from main memory. Debuggers. Profilers. Core dumping. Disassembling. Single stepping. Static and dynamic instrumentation. Etc. For GPUs and AI, this is all far less mature. It can make the work exciting at times, when you think something is impossible and then find or devise a way.

Fortunately we have a head start as some things do exist. Depending on the runtime and kernel driver, there are debug interfaces where you can list running accelerator programs and other statistics, as used by tools like intel_gpu_top(1). You can kill -9 a GPU workload using intel_gpu_abrt(1). Some interfaces can even generate basic ELF files for the running accelerator programs that you can try to load in a debugger like gdb(1). And there is support for GPU/AI program disassembly, if you can get your hands on the binary. It feels to me like GPU/AI debugging, OS style, is about two years old. Better than zero, but still early on, and lots more ahead of us. A decade, at least.

What do AI developers think of this?

We've shown AI Flame Graphs to other AI developers at Intel and a common reaction is to be a bit puzzled, wondering what to do with it. AI developers think about their bit of code, but with AI Flame Graphs they can now see the entire stack for the first time, including the HW, and many layers they don't usually think about or don't know about. It basically looks like a pile of gibberish with their code only a small part of the flame graph.


CPU Flame Graph Implementations

This reaction is similar to people's first experiences with CPU flame graphs, which show parts of the system that developers and engineers typically don't work on, such as runtime internals, system libraries, and kernel internals. Flame graphs are great at highlighting the dozen or so functions that matter the most, so it becomes a problem of learning what those functions do across a few different code bases, which are typically open source. Understanding a dozen such functions can take a few hours or even a few days -- but if this leads to a 10% or 2x cost win, it is time well spent. And the next time the user looks at a flame graph, they start saying "I've seen that function before" and so on. You can get to the point where understanding the bulk of a CPU flame graph takes less than a minute: look for the widest tower, click to zoom, read the frames, done.

I'm encouraged by the success of CPU flame graphs, with over 80 implementations and countless real world case studies. Sometimes I'm browsing a performance issue I care about on github and hit page down and there's a CPU flame graph. They are everywhere.

I expect AI developers will also be able to understand AI Flame Graphs in less than a minute, but to start with people will be spending a day or more browsing code bases they didn't know were involved. Publishing case studies of found wins will also help people learn how to interpret them, and also help explain the value.

What about PyTorch?

Another common reaction we've had is that AI developers are using PyTorch, and initially we didn't support it as it meant walking Python stacks, which isn't trivial. But prior work has been done there (to support CPU profiling) and after a lot of tinkering we now have the first PyTorch AI Flame Graph:


PyTorch frames in pink

(Click for interactive SVG.) The PyTorch functions are at the bottom and are colored pink. This example runs oneDNN kernels that are JIT-generated, and don't have a source path, so that layer just reads "jit". Getting all the other layers included was a real pain to get going, but an important milestone. We think if we can do PyTorch we can do anything.

In this flame graph, we show PyTorch running the Llama 2 7B model using the Intel Extensions for PyTorch (IPEX). This flame graph shows the origin of the GPU kernel execution all the way back to the Python source code shown in pink. Most samples are from a stack leading up to a gemm_kernel (matrix multiply) shown in aqua, which like the previous example has many stalls due to software scoreboarding.

There are two instructions (0xa30 and 0xa90) that combined are 27% of the entire profile. I expect someone will ask: Can't we just click on instructions and have it bring up a disassembly view with full source? Yes, that should be possible, but I can't answer how we're going to provide this yet. Another expected question I can't yet answer: Since there are now multiple products providing AI auto-tuning of CPU workloads using CPU flame graphs (including Intel Granulate), can't we have AI auto-tuning of AI workloads using AI Flame Graphs?

First Release: Sometimes hard and with moderate overhead

Getting AI Flame Graphs to work with some workloads is easy, but others are currently hard and cost moderate overhead. It's similar to CPU profiling, where some workloads and languages are easy to profile, whereas others need various things fixed. Some AI workloads use many software dependencies that need various tweaks and recompilation (e.g., enabling frame pointers so that stack walking works) making setup time consuming. PyTorch is especially difficult and can take over a week of OS work to be ready for AI Flame Graphs. We will work on getting these tweaks changed upstream in their respective repositories, something involving teams inside and outside of Intel, and is a process I'd expect to take at least a year. During that time AI workloads will gradually become easier to flame graph, and with lower-overhead as well.

I'm reminded of eBPF in the early days: You had to patch and recompile the kernel and LLVM and Clang, which could take multiple days if you hit errors. Since then all the eBPF dependency patches have been merged, and default settings changed, so that eBPF "just works." We'll get there with AI Flame Graphs too, but right now it's still those early days.

The changes necessary for AI Flame Graphs are really about improving debugging in general, and are a requirement for Fast by Friday: A vision where we can root-cause analyze anything in five days or less.

Availability

AI Flame Graphs will first become available on the Intel Tiber AI Cloud as a preview feature for the Intel Data Center GPU Max Series. If you are currently deployed there you can ask through the Intel service channel for early access. As for if or when it will support other hardware types, be in other Intel products, be officially launched, be open source, etc., these involve various other teams at Intel and they need to make their own announcements before I can discuss them here.

Conclusions

Finding performance improvements for AI data centers of just fractions of a percent can add up to planetary savings in electricity, water, and money. If AI flame graphs have the success that CPU flame graphs have had, I'd expect finding improvements of over 10% will be common, and 50% and higher will eventually be found*. But it won't be easy in these early days as there are still many software components to tweak and recompile, and software layers to learn about that are revealed in the AI flame graph.

In the years ahead I imagine others will build their own AI flame graphs that look the same as this one, and there may even be startups selling them, but if they use more difficult-to-use and higher-overhead technologies I fear they could turn companies off the idea of AI flame graphs altogether and prevent them from finding sorely needed wins. This is too important to do badly. AI flame graphs should be easy to use, cost negligible overhead, be production safe, and show everything. Intel has proven it's possible.

Disclaimer

* This is a personal blog post that makes personal predictions but not guarantees of possible performance improvements. Feel free to take any claim with a grain of salt, and feel free to wait for an official publication and public launch by Intel on this technology.

¹ Based on halving the Arm CEO Rene Haas' estimate of 20-25% quoted in Taking a closer look at AI's supposed energy apocalypse by Kyle Orland of ArsTechnica.

Thanks

Thanks to everyone at Intel who have helped us make this happen. Markus Flierl has driven this project and made it a top priority, and Greg Lavender has expressed his support. Special thanks to Michael Cole, Matthew Roper, Luis Strano, Rodrigo Vivi, Joonas Lahtinen, Stanley Gambarin, Timothy Bauer, Brandon Yates, Maria Kraynyuk, Denis Samoylov, Krzysztof Raszknowski, Sanchit Jain, Po-Yu Chen, Felix Degrood, Piotr Rozenfeld, Andi Kleen, and all of the other coworkers that helped clear things up for us, and thanks in advance for everyone else who will be helping us in the months ahead.

My final thanks is to the companies and developers who do the actual hands-on work with flame graphs, collecting them, examining them, finding performance wins, and applying them.
You are helping save the planet.


BadRAM: Historic side channel undermines Confidential Computing in the cloud
