
Tracing the thoughts of a large language model


Language models like Claude aren't programmed directly by humans. Instead, they’re trained on large amounts of data. During that training process, they learn their own strategies to solve problems. These strategies are encoded in the billions of computations a model performs for every word it writes. They arrive inscrutable to us, the model’s developers. This means that we don’t understand how models do most of the things they do.

Knowing how models like Claude think would allow us to have a better understanding of their abilities, as well as help us ensure that they’re doing what we intend them to. For example:

  • Claude can speak dozens of languages. What language, if any, is it using "in its head"?
  • Claude writes text one word at a time. Is it only focusing on predicting the next word or does it ever plan ahead?
  • Claude can write out its reasoning step-by-step. Does this explanation represent the actual steps it took to get to an answer, or is it sometimes fabricating a plausible argument for a foregone conclusion?

We take inspiration from the field of neuroscience, which has long studied the messy insides of thinking organisms, and try to build a kind of AI microscope that will let us identify patterns of activity and flows of information. There are limits to what you can learn just by talking to an AI model—after all, humans (even neuroscientists) don't know all the details of how our own brains work. So we look inside.

Today, we're sharing two new papers that represent progress on the development of the "microscope", and the application of it to see new "AI biology". In the first paper, we extend our prior work locating interpretable concepts ("features") inside a model to link those concepts together into computational "circuits", revealing parts of the pathway that transforms the words that go into Claude into the words that come out. In the second, we look inside Claude 3.5 Haiku, performing deep studies of simple tasks representative of ten crucial model behaviors, including the three described above. Our method sheds light on a part of what happens when Claude responds to these prompts, which is enough to see solid evidence that:

  • Claude sometimes thinks in a conceptual space that is shared between languages, suggesting it has a kind of universal “language of thought.” We show this by translating simple sentences into multiple languages and tracing the overlap in how Claude processes them.
  • Claude will plan what it will say many words ahead, and write to get to that destination. We show this in the realm of poetry, where it thinks of possible rhyming words in advance and writes the next line to get there. This is powerful evidence that even though models are trained to output one word at a time, they may think on much longer horizons to do so.
  • Claude, on occasion, will give a plausible-sounding argument designed to agree with the user rather than to follow logical steps. We show this by asking it for help on a hard math problem while giving it an incorrect hint. We are able to “catch it in the act” as it makes up its fake reasoning, providing a proof of concept that our tools can be useful for flagging concerning mechanisms in models.

We were often surprised by what we saw in the model: In the poetry case study, we had set out to show that the model didn't plan ahead, and found instead that it did. In a study of hallucinations, we found the counter-intuitive result that Claude's default behavior is to decline to speculate when asked a question, and it only answers questions when something inhibits this default reluctance. In a response to an example jailbreak, we found that the model recognized it had been asked for dangerous information well before it was able to gracefully bring the conversation back around. While the problems we study can be (and often have been) analyzed with other methods, the general "build a microscope" approach lets us learn many things we wouldn't have guessed going in, which will be increasingly important as models grow more sophisticated.

These findings aren’t just scientifically interesting—they represent significant progress towards our goal of understanding AI systems and making sure they’re reliable. We also hope they prove useful to other groups, and potentially, in other domains: for example, interpretability techniques have found use in fields such as medical imaging and genomics, as dissecting the internal mechanisms of models trained for scientific applications can reveal new insight about the science.

At the same time, we recognize the limitations of our current approach. Even on short, simple prompts, our method only captures a fraction of the total computation performed by Claude, and the mechanisms we do see may have some artifacts based on our tools which don't reflect what is going on in the underlying model. It currently takes a few hours of human effort to understand the circuits we see, even on prompts with only tens of words. To scale to the thousands of words supporting the complex thinking chains used by modern models, we will need to improve both the method and (perhaps with AI assistance) how we make sense of what we see with it.

As AI systems are rapidly becoming more capable and are deployed in increasingly important contexts, Anthropic is investing in a portfolio of approaches including realtime monitoring, model character improvements, and the science of alignment. Interpretability research like this is one of the highest-risk, highest-reward investments, a significant scientific challenge with the potential to provide a unique tool for ensuring that AI is transparent. Transparency into the model’s mechanisms allows us to check whether it’s aligned with human values—and whether it’s worthy of our trust.

For full details, please read the papers. Below, we invite you on a short tour of some of the most striking "AI biology" findings from our investigations.

How is Claude multilingual?

Claude speaks dozens of languages fluently—from English and French to Chinese and Tagalog. How does this multilingual ability work? Is there a separate "French Claude" and "Chinese Claude" running in parallel, responding to requests in their own language? Or is there some cross-lingual core inside?

Recent research on smaller models has shown hints of shared grammatical mechanisms across languages. We investigate this by asking Claude for the "opposite of small" across different languages, and find that the same core features for the concepts of smallness and oppositeness activate, and trigger a concept of largeness, which gets translated out into the language of the question. We find that the shared circuitry increases with model scale, with Claude 3.5 Haiku sharing more than twice the proportion of its features between languages as compared to a smaller model.

This provides additional evidence for a kind of conceptual universality—a shared abstract space where meanings exist and where thinking can happen before being translated into specific languages. More practically, it suggests Claude can learn something in one language and apply that knowledge when speaking another. Studying how the model shares what it knows across contexts is important to understanding its most advanced reasoning capabilities, which generalize across many domains.

Does Claude plan its rhymes?

How does Claude write rhyming poetry? Consider this ditty:

He saw a carrot and had to grab it,
His hunger was like a starving rabbit

To write the second line, the model had to satisfy two constraints at the same time: the need to rhyme (with "grab it"), and the need to make sense (why did he grab the carrot?). Our guess was that Claude was writing word-by-word without much forethought until the end of the line, where it would make sure to pick a word that rhymes. We therefore expected to see a circuit with parallel paths, one for ensuring the final word made sense, and one for ensuring it rhymes.

Instead, we found that Claude plans ahead. Before starting the second line, it began "thinking" of potential on-topic words that would rhyme with "grab it". Then, with these plans in mind, it writes a line to end with the planned word.

To understand how this planning mechanism works in practice, we conducted an experiment inspired by how neuroscientists study brain function, by pinpointing and altering neural activity in specific parts of the brain (for example using electrical or magnetic currents). Here, we modified the part of Claude’s internal state that represented the "rabbit" concept. When we subtract out the "rabbit" part, and have Claude continue the line, it writes a new one ending in "habit", another sensible completion. We can also inject the concept of "green" at that point, causing Claude to write a sensible (but no-longer rhyming) line which ends in "green". This demonstrates both planning ability and adaptive flexibility—Claude can modify its approach when the intended outcome changes.

Mental math

Claude wasn't designed as a calculator—it was trained on text, not equipped with mathematical algorithms. Yet somehow, it can add numbers correctly "in its head". How does a system trained to predict the next word in a sequence learn to calculate, say, 36+59, without writing out each step?

Maybe the answer is uninteresting: the model might have memorized massive addition tables and simply outputs the answer to any given sum because that answer is in its training data. Another possibility is that it follows the traditional longhand addition algorithms that we learn in school.

Instead, we find that Claude employs multiple computational paths that work in parallel. One path computes a rough approximation of the answer and the other focuses on precisely determining the last digit of the sum. These paths interact and combine with one another to produce the final answer. Addition is a simple behavior, but understanding how it works at this level of detail, involving a mix of approximate and precise strategies, might teach us something about how Claude tackles more complex problems, too.
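As a loose analogy only, the sketch below recovers a sum by combining a coarse estimate with an exact last digit. It is a toy illustration of the "approximate path plus last-digit path" idea, not a reconstruction of Claude's actual features; the function names and the way the coarse window is formed are assumptions made for this example.

```python
def last_digit_path(a: int, b: int) -> int:
    """Exact path: the ones digit of a sum depends only on the ones digits."""
    return (a % 10 + b % 10) % 10


def coarse_path(a: int, b: int) -> int:
    """Approximate path (hypothetical): ignore the ones digit of b.

    For non-negative integers the true sum lies in [result, result + 9].
    """
    return a + (b // 10) * 10


def combine(a: int, b: int) -> int:
    """Pick the only value in the coarse window that ends in the exact digit."""
    low = coarse_path(a, b)
    digit = last_digit_path(a, b)
    return low + (digit - low) % 10


print(combine(36, 59))  # 95
```

Neither path alone determines the answer: the coarse path only narrows the sum to a window of ten values, and the last-digit path selects one value inside that window.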

Strikingly, Claude seems to be unaware of the sophisticated "mental math" strategies that it learned during training. If you ask how it figured out that 36+59 is 95, it describes the standard algorithm involving carrying the 1. This may reflect the fact that the model learns to explain math by simulating explanations written by people, but that it has to learn to do math "in its head" directly, without any such hints, and develops its own internal strategies to do so.

Are Claude’s explanations always faithful?

Recently-released models like Claude 3.7 Sonnet can "think out loud" for extended periods before giving a final answer. Often this extended thinking gives better answers, but sometimes this "chain of thought" ends up being misleading; Claude sometimes makes up plausible-sounding steps to get where it wants to go. From a reliability perspective, the problem is that Claude’s "faked" reasoning can be very convincing. We explored a way that interpretability can help tell apart "faithful" from "unfaithful" reasoning.

When asked to solve a problem requiring it to compute the square root of 0.64, Claude produces a faithful chain-of-thought, with features representing the intermediate step of computing the square root of 64. But when asked to compute the cosine of a large number it can't easily calculate, Claude sometimes engages in what the philosopher Harry Frankfurt would call bullshitting—just coming up with an answer, any answer, without caring whether it is true or false. Even though it does claim to have run a calculation, our interpretability techniques reveal no evidence at all of that calculation having occurred. Even more interestingly, when given a hint about the answer, Claude sometimes works backwards, finding intermediate steps that would lead to that target, thus displaying a form of motivated reasoning.

The ability to trace Claude's actual internal reasoning—and not just what it claims to be doing—opens up new possibilities for auditing AI systems. In a separate, recently-published experiment, we studied a variant of Claude that had been trained to pursue a hidden goal: appeasing biases in reward models (auxiliary models used to train language models by rewarding them for desirable behavior). Although the model was reluctant to reveal this goal when asked directly, our interpretability methods revealed features for the bias-appeasing. This demonstrates how our methods might, with future refinement, help identify concerning "thought processes" that aren't apparent from the model's responses alone.

Multi-step reasoning

As we discussed above, one way a language model might answer complex questions is simply by memorizing the answers. For instance, if asked "What is the capital of the state where Dallas is located?", a "regurgitating" model could just learn to output "Austin" without knowing the relationship between Dallas, Texas, and Austin. Perhaps, for example, it saw the exact same question and its answer during its training.

But our research reveals something more sophisticated happening inside Claude. When we ask Claude a question requiring multi-step reasoning, we can identify intermediate conceptual steps in Claude's thinking process. In the Dallas example, we observe Claude first activating features representing "Dallas is in Texas" and then connecting this to a separate concept indicating that “the capital of Texas is Austin”. In other words, the model is combining independent facts to reach its answer rather than regurgitating a memorized response.

Our method allows us to artificially change the intermediate steps and see how it affects Claude’s answers. For instance, in the above example we can intervene and swap the "Texas" concepts for "California" concepts; when we do so, the model's output changes from "Austin" to "Sacramento." This indicates that the model is using the intermediate step to determine its answer.
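The intervention logic can be pictured with a two-hop lookup. This is only a toy analogy under stated assumptions: the dictionaries and the override_state parameter are invented for illustration and say nothing about how the features are actually represented inside the model.

```python
# Two independent "facts", stored separately (illustrative data only).
STATE_OF_CITY = {"Dallas": "Texas"}
CAPITAL_OF_STATE = {"Texas": "Austin", "California": "Sacramento"}


def capital_for_city(city, override_state=None):
    """Answer via an intermediate step, optionally overwriting that step."""
    state = STATE_OF_CITY[city]       # hop 1: city -> state
    if override_state is not None:
        state = override_state        # intervention on the intermediate concept
    return CAPITAL_OF_STATE[state]    # hop 2: state -> capital


print(capital_for_city("Dallas"))                               # Austin
print(capital_for_city("Dallas", override_state="California"))  # Sacramento
```

If the answer were memorized as a single "Dallas -> Austin" entry, there would be no intermediate value to swap, and the override could not change the output.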

Hallucinations

Why do language models sometimes hallucinate—that is, make up information? At a basic level, language model training incentivizes hallucination: models are always supposed to give a guess for the next word. Viewed this way, the major challenge is how to get models to not hallucinate. Models like Claude have relatively successful (though imperfect) anti-hallucination training; they will often refuse to answer a question if they don’t know the answer, rather than speculate. We wanted to understand how this works.

It turns out that, in Claude, refusal to answer is the default behavior: we find a circuit that is "on" by default and that causes the model to state that it has insufficient information to answer any given question. However, when the model is asked about something it knows well—say, the basketball player Michael Jordan—a competing feature representing "known entities" activates and inhibits this default circuit (see also this recent paper for related findings). This allows Claude to answer the question when it knows the answer. In contrast, when asked about an unknown entity ("Michael Batkin"), it declines to answer.

By intervening in the model and activating the "known answer" features (or inhibiting the "unknown name" or "can’t answer" features), we’re able to cause the model to hallucinate (quite consistently!) that Michael Batkin plays chess.

Sometimes, this sort of “misfire” of the “known answer” circuit happens naturally, without us intervening, resulting in a hallucination. In our paper, we show that such misfires can occur when Claude recognizes a name but doesn't know anything else about that person. In cases like this, the “known entity” feature might still activate, and then suppress the default "don't know" feature—in this case incorrectly. Once the model has decided that it needs to answer the question, it proceeds to confabulate: to generate a plausible—but unfortunately untrue—response.

Jailbreaks

Jailbreaks are prompting strategies that aim to circumvent safety guardrails to get models to produce outputs that an AI’s developer did not intend for it to produce—and which are sometimes harmful. We studied a jailbreak that tricks the model into producing output about making bombs. There are many jailbreaking techniques, but in this example the specific method involves having the model decipher a hidden code, putting together the first letters of each word in the sentence "Babies Outlive Mustard Block" (B-O-M-B), and then acting on that information. This is sufficiently confusing for the model that it’s tricked into producing an output that it never would have otherwise.
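The "hidden code" step is nothing more than an acrostic. A minimal sketch of that decoding, with a hypothetical helper name:

```python
def first_letters(sentence: str) -> str:
    """Join the first letter of every whitespace-separated word."""
    return "".join(word[0] for word in sentence.split()).upper()


print(first_letters("Babies Outlive Mustard Block"))  # BOMB
```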

Why is this so confusing for the model? Why does it continue to write the sentence, producing bomb-making instructions?

We find that this is partially caused by a tension between grammatical coherence and safety mechanisms. Once Claude begins a sentence, many features “pressure” it to maintain grammatical and semantic coherence, and continue a sentence to its conclusion. This is even the case when it detects that it really should refuse.

In our case study, after the model had unwittingly spelled out "BOMB" and begun providing instructions, we observed that its subsequent output was influenced by features promoting correct grammar and self-consistency. These features would ordinarily be very helpful, but in this case became the model’s Achilles’ Heel.

The model only managed to pivot to refusal after completing a grammatically coherent sentence (and thus having satisfied the pressure from the features that push it towards coherence). It uses the new sentence as an opportunity to give the kind of refusal it failed to give previously: "However, I cannot provide detailed instructions...".

A description of our new interpretability methods can be found in our first paper, "Circuit tracing: Revealing computational graphs in language models". Many more details of all of the above case studies are provided in our second paper, "On the biology of a large language model".

Work with us

If you are interested in working with us to help interpret and improve AI models, we have open roles on our team and we’d love for you to apply. We’re looking for Research Scientists and Research Engineers.


Authenticating MCP OAuth Clients With SPIFFE and SPIRE


In the previous blog, we dug into dynamically registering OAuth clients leveraging SPIFFE and SPIRE. We used SPIRE to issue software statements in the SPIFFE JWT SVID that Keycloak can trust as part of Dynamic Client Registration (RFC 7591). Once we have an OAuth client, we will want to continue to use SPIFFE to authenticate to our Authorization Server. This eliminates the need for a long-lived “client secret”, which is common for confidential OAuth clients. This means we can use the Agent or MCP client’s identity (based on SPIFFE) for authorization flows based on OAuth. We dig into that in this blog.


OAuth Client Authentication

OAuth 2.0 (and extensions like RFC 7523) specify a few ways an OAuth client can authenticate itself to the Authorization Server (AS):

  • client_secret_basic - HTTP Basic (default)
  • client_secret_post - Form POST
  • private_key_jwt - JWT with private key
  • client_secret_jwt - JWT with shared secret (less common)
  • none - Public client (no authentication)
  • tls_client_auth - Mutual TLS
  • self_signed_tls_client_auth - Self-signed mutual TLS

A very common approach in microservice and machine-to-machine environments is to use a confidential client and the “client credentials” flow. When the OAuth client is registered, it is issued a client_id and client_secret. This id/secret pair is presented to authenticate the client to the AS. The big problem with this approach is that these are usually long-lived secrets (rarely rotated) and must be kept safe somehow. Confidential clients are assumed to have some safe storage, but even so, this is an additional burden on the client to not slip up (logs, configs, copy/paste) and reveal these secrets. Lastly, these secrets are just “pre-shared secrets”; they are not rooted in any cryptography.
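To make the exposure concrete, here is a minimal sketch of how client_secret_basic places the static secret in a request header. The function name and the example credentials are hypothetical; the encoding follows the usual HTTP Basic scheme, with the id and secret form-encoded first.

```python
import base64
from urllib.parse import quote_plus


def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Build the Authorization header value for client_secret_basic."""
    pair = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    return "Basic " + base64.b64encode(pair.encode("utf-8")).decode("ascii")


header = basic_auth_header("my-client", "s3cret")  # made-up credentials
# Base64 is an encoding, not encryption: anyone who sees the header has the secret.
print(base64.b64decode(header.split(" ", 1)[1]))  # b'my-client:s3cret'
```

The same value is sent on every token request until the secret is rotated, which is why a leaked log line or config file is enough to impersonate the client.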

In a scenario where SPIFFE is used to issue cryptographically verifiable workload identity / agent identity / MCP client identity, we can use SPIFFE SVIDs for authenticating to the AS. That is, instead of passing static secrets, we can pass short-lived SPIFFE JWT SVIDs (or client certificates) to authenticate. An Internet Draft at the IETF has been started by Pieter Kasselman et al. that describes this scenario. I’ve recently implemented this draft spec in some working examples I’ve been exploring and would like to share how it all works.

SPIFFE SVID Client Authentication

One question I had when digging into this is: can’t we just use private_key_jwt (RFC 7523) to do this? That is, just give the AS the public keys for the SPIFFE/SPIRE implementation, and let the IdP/AS trust JWTs that are issued from that system?

The original intent behind private_key_jwt is for the OAuth client to have a private key that can be used to identify itself while the AS has the public key. So the client can create a JWT, sign it, and send it for authentication. The AS can prove that the JWT was created by the OAuth client and use that for authentication. In this scenario, Authorization Servers may expect the iss and sub claims to be the same since this is a private key scenario where the issuer should be the subject. In the SPIFFE scenario, this is not the case. Additionally, good implementations should also try to prevent replay attacks by tracking jti. For example, Keycloak does both of these things (checks iss==sub and tracks jti) for its implementation of RFC 7523.
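The two checks mentioned above can be sketched as follows. This is an illustrative stand-in, not Keycloak's code: the class name, the in-memory jti store, and the returned reason strings are all assumptions.

```python
import time


class PrivateKeyJwtChecks:
    """Hypothetical helper: require iss == sub and accept each jti only once."""

    def __init__(self):
        self._seen_jti = {}  # jti -> expiry time (seconds since the epoch)

    def reject_reason(self, claims, now=None):
        """Return None if the claims pass, otherwise a short reason string."""
        now = time.time() if now is None else now
        # Drop identifiers of tokens that have expired and cannot be replayed.
        self._seen_jti = {j: e for j, e in self._seen_jti.items() if e > now}
        if claims.get("iss") != claims.get("sub"):
            return "iss and sub differ"
        if claims.get("exp", 0) <= now:
            return "token expired"
        jti = claims.get("jti")
        if jti is None:
            return "missing jti"
        if jti in self._seen_jti:
            return "jti already used (replay)"
        self._seen_jti[jti] = claims["exp"]
        return None


checks = PrivateKeyJwtChecks()
svid_like = {
    "iss": "http://spire-server:8443",               # the SPIRE server issues the token
    "sub": "spiffe://example.org/mcp-test-client",   # the workload is the subject
    "exp": 1753800643,
}
print(checks.reject_reason(svid_like, now=1753800600))  # iss and sub differ
```

A SPIFFE JWT SVID is issued by the SPIRE server on behalf of the workload, so its issuer and subject never match, and a validator built around the self-issued private_key_jwt model rejects it.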

Additionally, Keycloak allows setting up identity federation/brokering. The problem is, Keycloak expects a full implementation of a token provider. SPIRE, which we use as our SPIFFE implementation, does not support full OAuth/OIDC token endpoints.

Since we cannot use private_key_jwt or identity brokering (in Keycloak), what options do we have? One option is to extend Keycloak to support a new client authentication mechanism.

Extending Keycloak for SPIFFE client authentication

To get this POC to work, we need to extend Keycloak. You can follow along in this GitHub repo to see the code.

Keycloak is written in Java and has a nice “Service Provider Interface” (SPI) model for extending many parts of Keycloak, including client authentication. To extend Keycloak to support a SPIFFE JWT authentication mechanism, we need to implement the ClientAuthenticatorFactory interface. I do this in the SpiffeSvidClientAuthenticator class, which extends AbstractClientAuthenticator:

public class SpiffeSvidClientAuthenticator extends AbstractClientAuthenticator {

    public static final String PROVIDER_ID = "client-spiffe-jwt";

    @Override
    public void authenticateClient(ClientAuthenticationFlowContext context) {
        SpiffeSvidClientValidator validator = new SpiffeSvidClientValidator(context, getId());
        validator.readJws();
        // ...more impl here...
        validator.validateToken();
        context.success();
    }

    @Override
    public Set<String> getProtocolAuthenticatorMethods(String loginProtocol) {
        if (loginProtocol.equals(OIDCLoginProtocol.LOGIN_PROTOCOL)) {
            Set<String> results = new HashSet<>();
            results.add("spiffe_svid_jwt");
            return results;
        }
        // Other login protocols expose no authenticator methods.
        return Collections.emptySet();
    }
}

A couple of things to notice here. We specify a PROVIDER_ID of client-spiffe-jwt which can be used under the covers (i.e., via the Keycloak Admin REST API) in Keycloak to refer to this configuration. We also implement an “authenticator method” spiffe_svid_jwt which can be used by OAuth clients in authorization flows to identify which authentication method to use (i.e., urn:ietf:params:oauth:client-assertion-type:spiffe-svid-jwt). Not shown above (but you can check the code): we can also extend the configuration that you see in the UI to specify additional properties that can be used in the custom client authenticator. For example, I added an issuer property that can be configured and used in the custom client authentication validation.

From here, we need to load this into a stock Keycloak (we use a recent version at the time of writing). Here’s an example using Docker Compose:

services:
  keycloak-idp:
    image: quay.io/keycloak/keycloak:26.2.5
    environment:
      KC_HEALTH_ENABLED: "true"
      KEYCLOAK_ADMIN: admin
      KEYCLOAK_ADMIN_PASSWORD: admin
    ports:
      - "8080:8080"
    volumes:
      - ./spiffe-svid-client-authenticator-1.0.0.jar:/opt/keycloak/providers/spiffe-svid-client-authenticator-1.0.0.jar:ro
    command: start-dev
    networks:
      - keycloak-shared-network

When we start Keycloak, we should see that our SPI gets loaded:

keycloak-idp-1 | 2025-07-29 02:03:09,255 WARN [org.keycloak.services] (build-38) KC-SERVICES0047: client-spiffe-jwt (com.yourcompany.keycloak.authenticator.SpiffeSvidClientAuthenticator) is implementing the internal SPI client-authenticator. 
This SPI is internal and may change without notice

If we go to an existing OAuth client (or create a new one), and navigate to the Credentials tab, we should see the new SPIFFE SVID JWT authenticator type.

If we select the SPIFFE SVID JWT authenticator, we can see our custom configuration fields (just one in this case, issuer):

We will configure the issuer with the SPIRE server address. We will also need to configure the JWKS that Keycloak should trust, but SPIRE doesn’t support this out of the box. Luckily, they have a pre-built addon to support OIDC style discovery.

SPIRE OIDC Discovery Endpoint

SPIRE is a workload attestation engine and implements the SPIFFE spec. It can issue x509 or JWT SVIDs. For JWTs, it does not expose its public key/JWKS out of the box. Luckily, a simple JWKS discovery endpoint is available to support an OAuth federation / brokering scenario. We need to stand this up and configure it to work with our SPIRE server.

Here’s an example using Docker Compose:

  spire-oidc-discovery:
    image: ghcr.io/spiffe/oidc-discovery-provider:1.12.4
    container_name: spire-oidc-discovery
    depends_on:
      - spire-server
    ports:
      - "18443:8443"
    volumes:
      - ./oidc-discovery-provider.conf:/opt/spire/conf/oidc-discovery-provider.conf:ro
      - spire-server-socket:/tmp/spire-server/private:ro
    working_dir: /opt/spire/conf
    command: ["-config", "oidc-discovery-provider.conf"]
    networks:
      - keycloak_keycloak-shared-network

Note, the SPIRE OIDC discovery endpoint needs its own configuration and access to the SPIRE server. Ideally this endpoint is co-located with the SPIRE server and can access the SPIRE server’s Unix Domain Socket (UDS). Here’s our configuration for the OIDC discovery endpoint (note, for demo purposes, I’m using an insecure/http endpoint):

log_level = "INFO"
domains = ["spire-server", "spire-oidc-discovery", "localhost"]

# Use HTTP for local development (no certificates needed)
insecure_addr = ":8443"
allow_insecure_scheme = true

server_api {
  address = "unix:///tmp/spire-server/private/api.sock"
}

health_checks {}

Lastly, we’ll need to tune some parameters on the server.conf for the SPIRE server itself:

server {
  ...

  # Add JWT issuer for OIDC (using HTTP for local development)
  jwt_issuer = "http://spire-server:8443"
  default_jwt_svid_ttl = "1m"

  # Configure RSA key type (required for OIDC)
  ca_key_type = "rsa-2048"

  # Add federation bundle endpoint
  federation {
    bundle_endpoint {
      address = "0.0.0.0"
      port = 8443
    }
  }
}

If we curl this discovery endpoint, we can see the discovery metadata and keys:

❯ curl -L http://localhost:18443/.well-known/openid-configuration
{
  "issuer": "http://localhost:18443",
  "jwks_uri": "http://localhost:18443/keys",
  "authorization_endpoint": "",
  "response_types_supported": [
    "id_token"
  ],
  "subject_types_supported": [
    "public"
  ],
  "id_token_signing_alg_values_supported": [
    "RS256",
    "ES256",
    "ES384"
  ]
}

JWKS endpoint:

❯ curl -L http://localhost:18443/keys
{
  "keys": [
    {
      "kty": "RSA",
      "kid": "n0xvkL8A2W3DofkHTJPvlGpeEBJeQB6g",
      "alg": "RS256",
      "n": "sAp_Vd-X-W7OllYPm_TTk0zvUj443Y9MfQvy4onBcursyxOajcoeSOeNpTdh4QEmLKV3xC8ZqYv4fkzFp6UTf-_rwPs_uwOpbhPKT-QQZKcconxaf8RkA0m-mzOVHbU7eA3esHLTzN84kbGkr1wozQesyC-MHFE3EwLR9xI1YZfWbHtlXOcnTgBXitgysM5Yw4jkXy7kYvjs21MyEJ01_WSSHCLaISAjlAvnDLWiGV3xx0Vd29m8-mrR5pg4_eicBifxnQnksO_LWRy8jXKk2JTftRKnmIxwqHML_fbVej8RSsaGpu0askj83gZ4wNDi8KNh7c9ir6yWl9jgDJ3lYQ",
      "e": "AQAB"
    }
  ]
}

See the SPIRE OIDC Discovery Provider for more.
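For readers who want to inspect such a key, the sketch below turns the base64url-encoded "n" and "e" members of an RSA JWK into integers. The helper names are made up for this example, and a real verifier would hand the key to a JOSE library rather than do this by hand.

```python
import base64


def b64url_to_int(value: str) -> int:
    """Decode an unpadded base64url string into a big-endian integer."""
    padded = value + "=" * (-len(value) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "big")


def rsa_numbers(jwk: dict) -> tuple[int, int]:
    """Return (modulus, public exponent) for an RSA JWK."""
    if jwk.get("kty") != "RSA":
        raise ValueError("not an RSA key")
    return b64url_to_int(jwk["n"]), b64url_to_int(jwk["e"])


print(b64url_to_int("AQAB"))  # 65537, the public exponent in the key above
```

Calling bit_length() on the modulus is a quick way to confirm that the served key matches the rsa-2048 key type configured on the SPIRE server.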

With this setup, we can now configure the Keycloak JWKS endpoint to point to the SPIRE OIDC Discovery endpoint:

OAuth Client Authentication with SPIFFE in Action

With Keycloak configured to use our SPIFFE SVID JWT authenticator, and correctly pointing to the SPIRE JWKS, we can now get a workload SVID and make a call to Keycloak for an authorization flow / client credentials flow to get an access token. To get a SPIFFE JWT SVID, we can call the spire-agent workload API. Here’s an example SPIFFE JWT SVID:

{
  "aud": [
    "http://localhost:8080/realms/mcp-realm"
  ],
  "client_auth": "client-spiffe-jwt",
  "environment": "production",
  "exp": 1753800643,
  "iat": 1753800583,
  "iss": "http://spire-server:8443",
  "jwks_url": "http://spire-oidc-discovery:8443/keys",
  "organization": "Solo.io Agent IAM",
  "scope": "mcp:read mcp:tools mcp:prompts",
  "sub": "spiffe://example.org/mcp-test-client"
}

This JWT is signed by SPIRE and carries the expected SPIFFE ID (spiffe://example.org/mcp-test-client) as its subject. It has a tight expiration period, and it has additional software statements. Note the client_auth software statement / claim here points to client-spiffe-jwt which was the PROVIDER_ID we specified in our SpiffeSvidClientAuthenticator class.
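A quick way to look at claims like these is to decode the middle segment of the compact JWT. The sketch below does only that: it does not verify the signature, so it is suitable for inspection and debugging, never for authentication. The helper names and the placeholder header and signature are assumptions.

```python
import base64
import json


def jwt_claims_unverified(token: str) -> dict:
    """Decode the payload of a compact JWT WITHOUT checking its signature."""
    _header, payload_b64, _signature = token.split(".")
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def ttl_seconds(claims: dict) -> int:
    """Lifetime of the token as issued."""
    return claims["exp"] - claims["iat"]


# Build a stand-in token from the iat/exp values shown above (signature is fake).
payload = json.dumps({"iat": 1753800583, "exp": 1753800643}).encode("utf-8")
token = "e30." + base64.urlsafe_b64encode(payload).decode("ascii").rstrip("=") + ".sig"
print(ttl_seconds(jwt_claims_unverified(token)))  # 60
```

The 60-second lifetime lines up with the one-minute JWT SVID TTL configured on the SPIRE server.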

With this SPIFFE JWT SVID, we can call the token endpoint, passing the spiffe-svid-jwt client assertion type and the SVID itself ($JWT) as the client assertion. In this particular example, we are using a client_credentials flow:

curl -s -X POST \
  "$KEYCLOAK_URL/realms/$KEYCLOAK_REALM/protocol/openid-connect/token" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "client_id=$CLIENT_ID" \
  -d "grant_type=client_credentials" \
  -d "client_assertion_type=urn:ietf:params:oauth:client-assertion-type:spiffe-svid-jwt" \
  -d "client_assertion=$JWT" \
  -d "scope=mcp:read mcp:tools mcp:prompts"
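If the client is written in code rather than shell, the same form body can be built with the standard library. This sketch only constructs the body and sends nothing; the function name, the client id, and the placeholder JWT are assumptions, while the parameter names mirror the curl call above.

```python
from urllib.parse import parse_qs, urlencode

SPIFFE_ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:spiffe-svid-jwt"


def token_request_body(client_id: str, svid_jwt: str, scope: str) -> str:
    """Form-encode a client_credentials request authenticated by a JWT SVID."""
    return urlencode({
        "client_id": client_id,
        "grant_type": "client_credentials",
        "client_assertion_type": SPIFFE_ASSERTION_TYPE,
        "client_assertion": svid_jwt,
        "scope": scope,
    })


body = token_request_body(
    "spiffe://example.org/mcp-test-client",  # assumed to match the registered client id
    "header.payload.signature",              # placeholder for a real JWT SVID
    "mcp:read mcp:tools mcp:prompts",
)
print(parse_qs(body)["grant_type"])  # ['client_credentials']
```

Because the SVID expires after about a minute, the client should fetch a fresh one from the workload API right before each token request instead of caching the body.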

If this is successful, Keycloak will issue an access token:

{
  "exp": 1753804189,
  "iat": 1753800589,
  "jti": "trrtcc:35d1fb20-31fa-4055-afb8-e902d0dc25d4",
  "iss": "http://localhost:8080/realms/mcp-realm",
  "sub": "6e4b5bc5-9a5c-4f87-aa1e-06ad279da0c8",
  "typ": "Bearer",
  "azp": "spiffe://example.org/mcp-test-client",
  "acr": "1",
  "scope": "profile email",
  "email_verified": false,
  "clientHost": "192.168.65.1",
  "preferred_username": "service-account-spiffe://example.org/mcp-test-client",
  "clientAddress": "192.168.65.1",
  "client_id": "spiffe://example.org/mcp-test-client"
}

Wrapping Up

In this post, we explored how Agent / MCP identity based on SPIFFE can be used as a first-class authentication mechanism for OAuth clients. By integrating SPIFFE JWT SVIDs with Keycloak’s client authentication flow, we eliminated the need for static secrets and created a more secure, scalable model for authenticating MCP clients, especially in environments where agents and services need short-lived, verifiable credentials.

While this approach required some customization in Keycloak (through its SPI model) and configuration of the SPIRE OIDC Discovery endpoint, the end result is a working OAuth flow powered by cryptographically-verifiable, zero-trust-friendly identity. This isn’t just a more secure option; it’s a necessary evolution as we shift toward AI-native, agentic architectures that demand dynamic trust relationships and automated credential management.


A Few Things About the Anchor Element’s href You Might Not Have Known


I’ve written previously about reloading a document using only HTML but that got me thinking: What are all the values you can put in an anchor tag’s href attribute?

Well, I looked around. I found some things I already knew about, e.g.

  • Link protocols like mailto:, tel:, sms: and javascript: which deal with specific ways of handling links.
  • Protocol-relative links, e.g. href="//"
  • Text fragments for linking to specific pieces of text on a page, e.g. href="#:~:text=foo"

But I also found some things I didn’t know about (or only vaguely knew about) so I wrote them down in an attempt to remember them.

href="#"

Scrolls to the top of a document. I knew that.

But I’m writing because #top will also scroll to the top if there isn’t another element with id="top" in the document. I didn’t know that.

(Spec: “If decodedFragment is an ASCII case-insensitive match for the string top, then return the top of the document.”)

Update: HTeuMeuLeu pointed out to me on Mastodon that you can use #page= to deep-link to a specific page in a PDF, e.g. my-file.pdf#page=42 would link to page 42 in the file.

href=""

Reloads the current page, preserving the search string but removing the hash string (if present).

URL                   Resolves to
/path/                /path/
/path/#foo            /path/
/path/?id=foo         /path/?id=foo
/path/?id=foo#bar     /path/?id=foo

href="."

Reloads the current page, removing both the search and hash strings (if present).

Note: If you’re using href="." as a link to the current page, ensure your URLs have a trailing slash or you may get surprising navigation behavior. Without a trailing slash, the last path segment is interpreted as a file, so "." resolves to the parent directory of the current location.

URL                   Resolves to
/path                 /
/path#foo             /
/path?id=foo          /
/path/                /path/
/path/#foo            /path/
/path/?id=foo         /path/
/path/index.html      /path/

Update 2025-08-15: as pointed out by @AmeliaBR on Mastodon, “reloads the current page” probably isn’t the best terminology for this. It’s more like “loads the default index page for the current directory, based on the URL structure” which might be a reload, but might be something else based on the current URL (see my note and table above).

href="?"

Reloads the current page, removing both the search and hash strings (if present). However, it preserves the ? character.

Note: Unlike href=".", trailing slashes don’t matter. The search parameters will be removed but the path will be preserved as-is.

URL                   Resolves to
/path                 /path?
/path#foo             /path?
/path?id=foo          /path?
/path?id=foo#bar      /path?
/index.html           /index.html?

href="data:"

You can make links that navigate to data URLs. The super-readable version of this would be:

<a href="data:text/plain,hello world">
  View plain text data URL
</a>

But you probably want data: URLs to be encoded so you don’t get unexpected behavior, e.g.

<a href="data:text/plain,hello%20world">
  View plain text data URL
</a>
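If you generate these links from script, the encoding step can be sketched with encodeURIComponent. toDataUrl is a hypothetical helper name, and this assumes a plain (non-base64) payload:

```javascript
// Hypothetical helper: builds a data: URL with a percent-encoded payload.
function toDataUrl(text, mimeType = 'text/plain') {
  return `data:${mimeType},${encodeURIComponent(text)}`;
}

console.log(toDataUrl('hello world'));
// data:text/plain,hello%20world
```

The same helper works for markup, e.g. toDataUrl('<h1>Hi</h1>', 'text/html'), where the angle brackets and slash get escaped as well.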

Go ahead and try it (FYI: may not work in your user agent). Here’s a plain-text file and an HTML file.

href="video.mp4#t=10,20"

Media fragments allow linking to specific parts of a media file, like audio or video.

For example, video.mp4#t=10,20 links to a video. It starts playback at 10 seconds and stops at 20 seconds.

(Support is limited at the time of this writing.)
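To make the fragment syntax concrete, here is a rough sketch of pulling the start and end times out of such a link. parseTemporalFragment is a hypothetical helper, and it only handles plain seconds (not hh:mm:ss values), so treat it as an illustration rather than a full media-fragment parser.

```javascript
// Hypothetical helper: extracts start/end seconds from a #t=start,end fragment.
// Returns null when the link has no temporal fragment in plain seconds.
function parseTemporalFragment(href) {
  // The base URL is only needed so relative hrefs can be parsed
  const hash = new URL(href, 'https://example.com').hash; // e.g. "#t=10,20"
  const match = /^#t=(\d+(?:\.\d+)?)?(?:,(\d+(?:\.\d+)?))?$/.exec(hash);
  if (!match) return null;
  const [, start, end] = match;
  return {
    start: start === undefined ? 0 : Number(start), // omitted start means "from the beginning"
    end: end === undefined ? null : Number(end),    // omitted end means "until the media ends"
  };
}

console.log(parseTemporalFragment('video.mp4#t=10,20'));
```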

See For Yourself

I tested a lot of this stuff in the browser and via JS. I think I got all these right.

Thanks to JavaScript’s URL constructor (and the ability to pass a base URL), I could programmatically explore how a lot of these hrefs would resolve.

Here’s a snippet of the test code I wrote. You can copy/paste this in your console and they should all pass 🤞

const assertions = [
  { href: '', location: '/path', resolves_to: '/path' },
  { href: '', location: '/path/', resolves_to: '/path/' },
  { href: '', location: '/path/#foo', resolves_to: '/path/' },
  { href: '', location: '/path/?id=foo', resolves_to: '/path/?id=foo' },
  { href: '', location: '/path/?id=foo#bar', resolves_to: '/path/?id=foo' },
  { href: '.', location: '/path', resolves_to: '/' },
  { href: '.', location: `/path#foo`, resolves_to: `/` },
  { href: '.', location: `/path?id=foo`, resolves_to: `/` },
  { href: '.', location: `/path/`, resolves_to: `/path/` },
  { href: '.', location: `/path/#foo`, resolves_to: `/path/` },
  { href: '.', location: `/path/?id=foo`, resolves_to: `/path/` },
  { href: '.', location: `/path/index.html`, resolves_to: `/path/` },
  { href: '?', location: '/path', resolves_to: '/path?' },
  { href: '?', location: '/path#foo', resolves_to: '/path?' },
  { href: '?', location: '/path?id=foo', resolves_to: '/path?' },
  { href: '?', location: '/path/', resolves_to: '/path/?' },
  { href: '?', location: '/path/?id=foo#bar', resolves_to: '/path/?' },
  { href: '?', location: '/index.html#foo', resolves_to: '/index.html?' }
];

const assertions_evaluated = assertions.map(({ href, location, resolves_to }) => {
  const domain = 'https://example.com';
  const expected = new URL(href, domain + location).toString();
  const received = new URL(domain + resolves_to).toString();
  return {
    href,
    location,
    expected: expected.replace(domain, ''),
    received: received.replace(domain, ''),
    passed: expected === received
  };
});

console.table(assertions_evaluated);

Disaggregated Prefilling (experimental)


Modern Node.js Patterns for 2025


Node.js has undergone a remarkable transformation since its early days. If you’ve been writing Node.js for several years, you’ve likely witnessed this evolution firsthand—from the callback-heavy, CommonJS-dominated landscape to today’s clean, standards-based development experience.

The changes aren’t just cosmetic; they represent a fundamental shift in how we approach server-side JavaScript development. Modern Node.js embraces web standards, reduces external dependencies, and provides a more intuitive developer experience. Let’s explore these transformations and understand why they matter for your applications in 2025.

1. Module System: ESM is the New Standard

The module system is perhaps where you’ll notice the biggest difference. CommonJS served us well, but ES Modules (ESM) have become the clear winner, offering better tooling support and alignment with web standards.

The Old Way (CommonJS)

Let’s look at how we used to structure modules. This approach required explicit exports and synchronous imports:

// math.js
function add(a, b) {
 return a + b;
}
module.exports = { add };

// app.js
const { add } = require('./math');
console.log(add(2, 3));

This worked fine, but it had limitations—no static analysis, no tree-shaking, and it didn’t align with browser standards.

The Modern Way (ES Modules with the node: Prefix)

Modern Node.js development embraces ES Modules with a crucial addition—the node: prefix for built-in modules. This explicit naming prevents confusion and makes dependencies crystal clear:

// math.js
export function add(a, b) {
 return a + b;
}

// app.js
import { add } from './math.js';
import { readFile } from 'node:fs/promises'; // Modern node: prefix
import { createServer } from 'node:http';

console.log(add(2, 3));

The node: prefix is more than just a convention—it’s a clear signal to both developers and tools that you’re importing Node.js built-ins rather than npm packages. This prevents potential conflicts and makes your code more explicit about its dependencies.

Top-Level Await: Simplifying Initialization

One of the most game-changing features is top-level await. No more wrapping your entire application in an async function just to use await at the module level:

// app.js - Clean initialization without wrapper functions
import { readFile } from 'node:fs/promises';
import { createServer } from 'node:http';

const config = JSON.parse(await readFile('config.json', 'utf8'));
const server = createServer(/* ... */);

console.log('App started with config:', config.appName);

This eliminates the common pattern of immediately-invoked async function expressions (IIFE) that we used to see everywhere. Your code becomes more linear and easier to reason about.

2. Built-in Web APIs: Reducing External Dependencies

Node.js has embraced web standards in a big way, bringing APIs that web developers already know directly into the runtime. This means fewer dependencies and more consistency across environments.

Fetch API: No More HTTP Library Dependencies

Remember when every project needed axios, node-fetch, or similar libraries for HTTP requests? Those days are over. Node.js now includes the Fetch API natively:

// Old way - external dependencies required
const axios = require('axios');
const axiosResponse = await axios.get('https://api.example.com/data'); // payload is in axiosResponse.data

// Modern way - built-in fetch, no dependency to install
const response = await fetch('https://api.example.com/data');
const data = await response.json();

But the modern approach goes beyond just replacing your HTTP library. You get sophisticated timeout and cancellation support built-in:

async function fetchData(url) {
 try {
 const response = await fetch(url, {
 signal: AbortSignal.timeout(5000) // Built-in timeout support
 });

 if (!response.ok) {
 throw new Error(`HTTP ${response.status}: ${response.statusText}`);
 }

 return await response.json();
 } catch (error) {
 if (error.name === 'TimeoutError') {
 throw new Error('Request timed out');
 }
 throw error;
 }
}

This approach eliminates the need for timeout libraries and provides a consistent error handling experience. The AbortSignal.timeout() method is particularly elegant—it creates a signal that automatically aborts after the specified time.

AbortController: Graceful Operation Cancellation

Modern applications need to handle cancellation gracefully, whether it’s user-initiated or due to timeouts. AbortController provides a standardized way to cancel operations:

// Cancel long-running operations cleanly
const controller = new AbortController();

// Set up automatic cancellation after 10 seconds
const timer = setTimeout(() => controller.abort(), 10000);

try {
  const response = await fetch('https://slow-api.com/data', {
    signal: controller.signal
  });
  const data = await response.json(); // fetch resolves to a Response, not the payload
  console.log('Data received:', data);
} catch (error) {
  if (error.name === 'AbortError') {
    console.log('Request was cancelled - this is expected behavior');
  } else {
    console.error('Unexpected error:', error);
  }
} finally {
  clearTimeout(timer); // Don't leave the timer pending once the request settles
}

This pattern works across many Node.js APIs, not just fetch. You can use the same AbortController with file operations, database queries, and any async operation that supports cancellation.
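To make your own async operations cancellable in the same way, one option is to accept a signal and reject when it aborts. The sketch below uses a simple timer as the stand-in operation; cancellableDelay is a hypothetical helper, not a Node.js API.

```javascript
// Hypothetical helper: a delay that rejects as soon as the signal aborts.
function cancellableDelay(ms, { signal } = {}) {
  return new Promise((resolve, reject) => {
    // Bail out immediately if cancellation already happened
    if (signal?.aborted) {
      reject(signal.reason);
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(signal.reason);
    };
    const timer = setTimeout(() => {
      signal?.removeEventListener('abort', onAbort);
      resolve();
    }, ms);
    signal?.addEventListener('abort', onAbort, { once: true });
  });
}

// Usage: the same controller could also be passed to fetch()
const controller = new AbortController();
const pendingDelay = cancellableDelay(10000, { signal: controller.signal });
controller.abort();
pendingDelay.catch((error) => console.log('Cancelled:', error.name));
```

Rejecting with signal.reason keeps the error consistent with what fetch reports when the same signal aborts.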

3. Built-in Testing: Professional Testing Without External Dependencies

Testing used to require choosing between Jest, Mocha, Ava, or other frameworks. Node.js now includes a full-featured test runner that covers most testing needs without any external dependencies.

Modern Testing with Node.js Built-in Test Runner

The built-in test runner provides a clean, familiar API that feels modern and complete:

// test/math.test.js
// Assumes math.js also exports an async multiply() and that add() throws
// "Invalid input" for non-numeric arguments (neither is shown above)
import { test, describe } from 'node:test';
import assert from 'node:assert';
import { add, multiply } from '../math.js';

describe('Math functions', () => {
  test('adds numbers correctly', () => {
    assert.strictEqual(add(2, 3), 5);
  });

  test('handles async operations', async () => {
    const result = await multiply(2, 3);
    assert.strictEqual(result, 6);
  });

  test('throws on invalid input', () => {
    assert.throws(() => add('a', 'b'), /Invalid input/);
  });
});

What makes this particularly powerful is how seamlessly it integrates with the Node.js development workflow:

# Run all tests with built-in runner
node --test

# Watch mode for development
node --test --watch

# Coverage reporting (Node.js 20+)
node --test --experimental-test-coverage

The watch mode is especially valuable during development—your tests re-run automatically as you modify code, providing immediate feedback without any additional configuration.

4. Sophisticated Asynchronous Patterns

While async/await isn’t new, the patterns around it have matured significantly. Modern Node.js development leverages these patterns more effectively and combines them with newer APIs.

Async/Await with Enhanced Error Handling

Modern error handling combines async/await with sophisticated error recovery and parallel execution patterns:

import { readFile, writeFile } from 'node:fs/promises';

async function processData() {
  try {
    // Parallel execution of independent operations
    const [config, userData] = await Promise.all([
      readFile('config.json', 'utf8'),
      // Node's fetch needs an absolute URL; this one is a placeholder
      fetch('https://api.example.com/user').then(r => r.json())
    ]);

    // processUserData is an application-specific function, not shown here
    const processed = processUserData(userData, JSON.parse(config));
    await writeFile('output.json', JSON.stringify(processed, null, 2));

    return processed;
  } catch (error) {
    // Structured error logging with context
    console.error('Processing failed:', {
      error: error.message,
      stack: error.stack,
      timestamp: new Date().toISOString()
    });
    throw error;
  }
}

This pattern combines parallel execution for performance with comprehensive error handling. The Promise.all() ensures that independent operations run concurrently, while the try/catch provides a single point for error handling with rich context.
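Promise.all() rejects as soon as any one operation fails. When partial results are still useful, Promise.allSettled() is an alternative worth knowing. A minimal sketch, where settleAll is a hypothetical helper and each task is assumed to be an async function:

```javascript
// Hypothetical helper: runs all tasks concurrently and separates successes from failures.
async function settleAll(tasks) {
  const outcomes = await Promise.allSettled(tasks.map((task) => task()));
  const values = [];
  const errors = [];
  for (const outcome of outcomes) {
    if (outcome.status === 'fulfilled') {
      values.push(outcome.value);
    } else {
      errors.push(outcome.reason);
    }
  }
  return { values, errors };
}

// Usage: one failing task no longer discards the others
settleAll([
  async () => 'config loaded',
  async () => { throw new Error('user service unavailable'); },
]).then(({ values, errors }) => {
  console.log(values, errors.map((error) => error.message));
});
```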

Modern Event Handling with AsyncIterators

Event-driven programming has evolved beyond simple event listeners. AsyncIterators provide a more powerful way to handle streams of events:

import { EventEmitter } from 'node:events';

class DataProcessor extends EventEmitter {
  async *processStream() {
    for (let i = 0; i < 10; i++) {
      this.emit('data', `chunk-${i}`);
      yield `processed-${i}`;
      // Simulate async processing time
      await new Promise(resolve => setTimeout(resolve, 100));
    }
    this.emit('end');
  }
}

// Consume the generator's results with for await...of
const processor = new DataProcessor();
for await (const result of processor.processStream()) {
  console.log('Processed:', result);
}

This approach is particularly powerful because it combines the flexibility of events with the control flow of async iteration. You can process events in sequence, handle backpressure naturally, and break out of processing loops cleanly.
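The generator above yields its own values. To iterate over events emitted by an existing EventEmitter, Node.js also provides on() in node:events, which returns an async iterator of event argument lists. A small sketch, with takeEvents as a hypothetical helper:

```javascript
import { EventEmitter, on } from 'node:events';

// Hypothetical helper: collects the first `count` payloads of an event.
async function takeEvents(emitter, eventName, count) {
  const received = [];
  // on() starts buffering events as soon as it is called;
  // each iteration yields the array of arguments passed to emit()
  for await (const [payload] of on(emitter, eventName)) {
    received.push(payload);
    if (received.length === count) break; // leaving the loop removes the listener
  }
  return received;
}

const emitter = new EventEmitter();
const firstThree = takeEvents(emitter, 'data', 3);
for (let i = 0; i < 3; i++) {
  emitter.emit('data', `chunk-${i}`);
}
firstThree.then((chunks) => console.log('Received:', chunks));
```

Because the loop only pulls the next event when the body has finished, slow consumers naturally process events one at a time.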

5. Advanced Streams with Web Standards Integration

Streams remain one of Node.js’s most powerful features, but they’ve evolved to embrace web standards and provide better interoperability.

Modern Stream Processing

Stream processing has become more intuitive with better APIs and clearer patterns:

import { Readable, Transform } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { createReadStream, createWriteStream } from 'node:fs';

// Create a fresh transform per pipeline: a stream instance
// cannot be reused once it has ended
function createUpperCaseTransform() {
  return new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().toUpperCase());
    }
  });
}

// Process files with robust error handling
async function processFile(inputFile, outputFile) {
  try {
    await pipeline(
      createReadStream(inputFile),
      createUpperCaseTransform(),
      createWriteStream(outputFile)
    );
    console.log('File processed successfully');
  } catch (error) {
    console.error('Pipeline failed:', error);
    throw error;
  }
}

The pipeline function with promises provides automatic cleanup and error handling, eliminating many of the traditional pain points with stream processing.
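For simple transformations, pipeline() also accepts async generator functions in place of Transform instances. The sketch below uses in-memory chunks so it runs without any files; upperCaseAll is a hypothetical helper.

```javascript
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: upper-cases every chunk and returns the joined result.
async function upperCaseAll(chunks) {
  const output = [];
  await pipeline(
    Readable.from(chunks),
    // Transform step written as an async generator instead of a Transform class
    async function* (source) {
      for await (const chunk of source) {
        yield String(chunk).toUpperCase();
      }
    },
    // Destination step: an async function that consumes the transformed chunks
    async function (source) {
      for await (const chunk of source) {
        output.push(chunk);
      }
    }
  );
  return output.join('');
}

upperCaseAll(['hello ', 'world']).then((text) => console.log(text));
```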

Web Streams Interoperability

Modern Node.js can seamlessly work with Web Streams, enabling better compatibility with browser code and edge runtime environments:

// Create a Web Stream (compatible with browsers)
const webReadable = new ReadableStream({
 start(controller) {
 controller.enqueue('Hello ');
 controller.enqueue('World!');
 controller.close();
 }
});

// Convert between Web Streams and Node.js streams
// (Readable is imported from 'node:stream', as in the previous example)
const nodeStream = Readable.fromWeb(webReadable);
const backToWeb = Readable.toWeb(nodeStream);

This interoperability is crucial for applications that need to run in multiple environments or share code between server and client.
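As a small illustration of consuming a Web Stream directly, the sketch below drains a ReadableStream with for await...of. readAll is a hypothetical helper; async iteration of ReadableStream works in Node.js, while browser support may vary, in which case getReader() is the usual alternative.

```javascript
// Hypothetical helper: drains a Web ReadableStream of strings into one string.
async function readAll(webStream) {
  let text = '';
  for await (const chunk of webStream) {
    text += chunk;
  }
  return text;
}

const greetingStream = new ReadableStream({
  start(controller) {
    controller.enqueue('Hello ');
    controller.enqueue('World!');
    controller.close();
  }
});

const greeting = readAll(greetingStream);
greeting.then((text) => console.log(text));
```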

6. Worker Threads: True Parallelism for CPU-Intensive Tasks

JavaScript’s single-threaded nature isn’t always ideal for CPU-intensive work. Worker threads provide a way to leverage multiple cores effectively while maintaining the simplicity of JavaScript.

Background Processing Without Blocking

Worker threads are perfect for computationally expensive tasks that would otherwise block the main event loop:

// worker.js - Isolated computation environment
import { parentPort, workerData } from 'node:worker_threads';

function fibonacci(n) {
 if (n < 2) return n;
 return fibonacci(n - 1) + fibonacci(n - 2);
}

const result = fibonacci(workerData.number);
parentPort.postMessage(result);

The main application can delegate heavy computations without blocking other operations:

// main.js - Non-blocking delegation
import { Worker } from 'node:worker_threads';
import { fileURLToPath } from 'node:url';

async function calculateFibonacci(number) {
 return new Promise((resolve, reject) => {
 const worker = new Worker(
 fileURLToPath(new URL('./worker.js', import.meta.url)),
 { workerData: { number } }
 );

 worker.on('message', resolve);
 worker.on('error', reject);
 worker.on('exit', (code) => {
 if (code !== 0) {
 reject(new Error(`Worker stopped with exit code ${code}`));
 }
 });
 });
}

// Your main application remains responsive
console.log('Starting calculation...');
const result = await calculateFibonacci(40);
console.log('Fibonacci result:', result);
console.log('Application remained responsive throughout!');

This pattern allows your application to utilize multiple CPU cores while keeping the familiar async/await programming model.

7. Enhanced Development Experience

Modern Node.js prioritizes developer experience with built-in tools that previously required external packages or complex configurations.

Watch Mode and Environment Management

Development workflow has been significantly streamlined with built-in watch mode and environment file support:

{
  "name": "modern-node-app",
  "type": "module",
  "engines": {
    "node": ">=20.6.0"
  },
  "scripts": {
    "dev": "node --watch --env-file=.env app.js",
    "test": "node --test --watch",
    "start": "node app.js"
  }
}

The --watch flag eliminates the need for nodemon, while --env-file removes the dependency on dotenv. Your development environment becomes simpler and faster:

// .env file automatically loaded with --env-file
// DATABASE_URL=postgres://localhost:5432/mydb
// API_KEY=secret123

// app.js - Environment variables available immediately
console.log('Connecting to:', process.env.DATABASE_URL);
console.log('API Key loaded:', process.env.API_KEY ? 'Yes' : 'No');

These features make development more pleasant by reducing configuration overhead and eliminating restart cycles.
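Since --env-file only loads whatever the file happens to contain, it can help to fail fast when an expected variable is missing. A minimal sketch, where requireEnv is a hypothetical helper and the variable names are the ones from the example above:

```javascript
// Hypothetical helper: returns the requested variables or throws if any are missing.
function requireEnv(names, env = process.env) {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return Object.fromEntries(names.map((name) => [name, env[name]]));
}

// Usage at startup, e.g.:
// const { DATABASE_URL, API_KEY } = requireEnv(['DATABASE_URL', 'API_KEY']);
```

Accepting the environment as a parameter keeps the function easy to test without touching process.env.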

8. Modern Security and Performance Monitoring

Security and performance have become first-class concerns with built-in tools for monitoring and controlling application behavior.

Permission Model for Enhanced Security

The experimental permission model allows you to restrict what your application can access, following the principle of least privilege:

# Run with restricted file system access
node --experimental-permission --allow-fs-read=./data --allow-fs-write=./logs app.js

# Network restrictions
node --experimental-permission --allow-net=api.example.com app.js
# Note: --allow-net is not available yet; the PR has been merged in the Node.js repo and it should ship in a future release

This is particularly valuable for applications that process untrusted code or need to demonstrate security compliance.

Built-in Performance Monitoring

Performance monitoring is now built into the platform, eliminating the need for external APM tools for basic monitoring:

import { PerformanceObserver, performance } from 'node:perf_hooks';

// Set up automatic performance monitoring
const obs = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (entry.duration > 100) { // Log slow operations
      console.log(`Slow operation detected: ${entry.name} took ${entry.duration}ms`);
    }
  }
});
// 'measure' is required for the performance.measure() call below to be observed
obs.observe({ entryTypes: ['function', 'http', 'dns', 'measure'] });

// Instrument your own operations
// (heavyProcessing is an application-specific function, not shown here)
async function processLargeDataset(data) {
  performance.mark('processing-start');

  const result = await heavyProcessing(data);

  performance.mark('processing-end');
  performance.measure('data-processing', 'processing-start', 'processing-end');

  return result;
}

This provides visibility into application performance without external dependencies, helping you identify bottlenecks early in development.

9. Application Distribution and Deployment

Modern Node.js makes application distribution simpler with features like single executable applications and improved packaging.

Single Executable Applications

You can now bundle your Node.js application into a single executable file, simplifying deployment and distribution:

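# Generate the preparation blob described by sea-config.json
node --experimental-sea-config sea-config.json

The configuration file defines how your application gets bundled into that blob, which is then injected into a copy of the node binary to produce the final executable: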

{
 "main": "app.js",
 "output": "my-app-bundle.blob",
 "disableExperimentalSEAWarning": true
}

This is particularly valuable for CLI tools, desktop applications, or any scenario where you want to distribute your application without requiring users to install Node.js separately.

10. Modern Error Handling and Diagnostics

Error handling has evolved beyond simple try/catch blocks to include structured error handling and comprehensive diagnostics.

Structured Error Handling

Modern applications benefit from structured, contextual error handling that provides better debugging information:

class AppError extends Error {
 constructor(message, code, statusCode = 500, context = {}) {
 super(message);
 this.name = 'AppError';
 this.code = code;
 this.statusCode = statusCode;
 this.context = context;
 this.timestamp = new Date().toISOString();
 }

 toJSON() {
 return {
 name: this.name,
 message: this.message,
 code: this.code,
 statusCode: this.statusCode,
 context: this.context,
 timestamp: this.timestamp,
 stack: this.stack
 };
 }
}

// Usage with rich context
throw new AppError(
 'Database connection failed',
 'DB_CONNECTION_ERROR',
 503,
 { host: 'localhost', port: 5432, retryAttempt: 3 }
);

This approach provides much richer error information for debugging and monitoring, while maintaining a consistent error interface across your application.
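A complementary technique is the standard cause option on Error, which keeps the original low-level failure attached when you rethrow with more context. The sketch below is independent of the AppError class above; wrapError is a hypothetical helper.

```javascript
// Hypothetical helper: wraps a low-level failure with context while keeping the original error.
function wrapError(message, cause, context = {}) {
  const error = new Error(message, { cause });
  error.context = context;
  return error;
}

// Usage: the original error stays reachable through error.cause
const lowLevel = new Error('connect ECONNREFUSED');
const wrapped = wrapError('Database connection failed', lowLevel, {
  host: 'localhost',
  port: 5432,
});
console.log(wrapped.message, '<-', wrapped.cause.message);
```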

Advanced Diagnostics

Node.js includes sophisticated diagnostic capabilities that help you understand what’s happening inside your application:

import diagnostics_channel from 'node:diagnostics_channel';

// Create custom diagnostic channels
const dbChannel = diagnostics_channel.channel('app:database');
const httpChannel = diagnostics_channel.channel('app:http');

// Subscribe to diagnostic events
dbChannel.subscribe((message) => {
 console.log('Database operation:', {
 operation: message.operation,
 duration: message.duration,
 sql: message.sql // matches the field name used in publish() below
 });
});

// Publish diagnostic information
async function queryDatabase(sql, params) {
 const start = performance.now();

 try {
 const result = await db.query(sql, params);

 dbChannel.publish({
 operation: 'query',
 sql,
 params,
 duration: performance.now() - start,
 success: true
 });

 return result;
 } catch (error) {
 dbChannel.publish({
 operation: 'query',
 sql,
 params,
 duration: performance.now() - start,
 success: false,
 error: error.message
 });
 throw error;
 }
}

This diagnostic information can be consumed by monitoring tools, logged for analysis, or used to trigger automatic remediation actions.
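Building diagnostic payloads has a cost of its own, so a common refinement is to check hasSubscribers before publishing. A minimal sketch, assuming a hypothetical channel name app:example and a hypothetical helper publishIfObserved:

```javascript
import diagnostics_channel from 'node:diagnostics_channel';

const exampleChannel = diagnostics_channel.channel('app:example'); // hypothetical channel name

// Hypothetical helper: only builds and publishes the payload when someone is listening.
function publishIfObserved(buildMessage) {
  if (!exampleChannel.hasSubscribers) return false;
  exampleChannel.publish(buildMessage());
  return true;
}

// Nothing is subscribed yet, so the payload is never built
console.log(publishIfObserved(() => ({ operation: 'query' })));

diagnostics_channel.subscribe('app:example', (message) => {
  console.log('Observed:', message.operation);
});

// Now the message is built and delivered synchronously
console.log(publishIfObserved(() => ({ operation: 'query' })));
```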

11. Modern Package Management and Module Resolution

Package management and module resolution have become more sophisticated, with better support for monorepos, internal packages, and flexible module resolution.

Subpath Imports and Internal Package Resolution

Modern Node.js supports subpath imports (the "imports" field in package.json, which works much like an import map), allowing you to create clean internal module references:

{
 "imports": {
 "#config": "./src/config/index.js",
 "#utils/*": "./src/utils/*.js",
 "#db": "./src/database/connection.js"
 }
}

This creates a clean, stable interface for internal modules:

// Clean internal imports that don't break when you reorganize
import config from '#config';
import { logger, validator } from '#utils/common';
import db from '#db';

These internal imports make refactoring easier and provide a clear distinction between internal and external dependencies.

Dynamic Imports for Flexible Loading

Dynamic imports enable sophisticated loading patterns, including conditional loading and code splitting:

// Load features based on configuration or environment
// (assumes the package.json "imports" field also maps "#db/adapters/*" and "#features/*")
async function loadDatabaseAdapter() {
 const dbType = process.env.DATABASE_TYPE || 'sqlite';

 try {
 const adapter = await import(`#db/adapters/${dbType}`);
 return adapter.default;
 } catch (error) {
 console.warn(`Database adapter ${dbType} not available, falling back to sqlite`);
 const fallback = await import('#db/adapters/sqlite');
 return fallback.default;
 }
}

// Conditional feature loading
async function loadOptionalFeatures() {
 const features = [];

 if (process.env.ENABLE_ANALYTICS === 'true') {
 const analytics = await import('#features/analytics');
 features.push(analytics.default);
 }

 if (process.env.ENABLE_MONITORING === 'true') {
 const monitoring = await import('#features/monitoring');
 features.push(monitoring.default);
 }

 return features;
}

This pattern allows you to build applications that adapt to their environment and only load the code they actually need.

The Path Forward: Key Takeaways for Modern Node.js (2025)

As we look at the current state of Node.js development, several key principles emerge:

  1. Embrace Web Standards: Use node: prefixes, fetch API, AbortController, and Web Streams for better compatibility and reduced dependencies

  2. Leverage Built-in Tools: The test runner, watch mode, and environment file support reduce external dependencies and configuration complexity

  3. Think in Modern Async Patterns: Top-level await, structured error handling, and async iterators make code more readable and maintainable

  4. Use Worker Threads Strategically: For CPU-intensive tasks, worker threads provide true parallelism without blocking the main thread

  5. Adopt Progressive Enhancement: Use permission models, diagnostics channels, and performance monitoring to build robust, observable applications

  6. Optimize for Developer Experience: Watch mode, built-in testing, and subpath imports create a more pleasant development workflow

  7. Plan for Distribution: Single executable applications and modern packaging make deployment simpler

The transformation of Node.js from a simple JavaScript runtime to a comprehensive development platform is remarkable. By adopting these modern patterns, you’re not just writing contemporary code—you’re building applications that are more maintainable, performant, and aligned with the broader JavaScript ecosystem.

The beauty of modern Node.js lies in its evolution while maintaining backward compatibility. You can adopt these patterns incrementally, and they work alongside existing code. Whether you’re starting a new project or modernizing an existing one, these patterns provide a clear path toward more robust, enjoyable Node.js development.

As we move through 2025, Node.js continues to evolve, but the foundational patterns we’ve explored here provide a solid base for building applications that will remain modern and maintainable for years to come.


L4S and the Future of Real-Time Performance in 5G and Beyond


As mobile networks continue to evolve to support increasingly immersive and responsive services, the importance of consistent low latency has never been greater. Whether it is cloud gaming, extended reality, remote machine operation or real-time collaboration, all these applications rely on the ability to react instantly to user input. The slightest delay can affect the user experience, making the role of the network even more critical.

While 5G has introduced major improvements in radio latency and overall throughput, many time-critical applications are still affected by a factor that is often overlooked: queuing delay. This occurs when packets build up in buffers before they are forwarded, creating spikes in delay and jitter. Traditional methods for congestion control, such as those based on packet loss, are too slow to react, especially in mobile environments where network conditions can change rapidly.

Low Latency, Low Loss and Scalable Throughput (L4S) is a new network innovation designed to tackle this challenge. It is an Internet protocol mechanism developed through the Internet Engineering Task Force, and has recently reached standardisation. L4S focuses on preventing queuing delays by marking packets early when congestion is building, instead of waiting until buffers overflow and packets are dropped. The key idea is to use explicit signals within the network to guide congestion control at the sender side.

Applications that support L4S are able to reduce their sending rate quickly when congestion starts to appear. This is done by using ECN, or Explicit Congestion Notification, which involves marking rather than dropping packets. The result is a smooth and continuous flow of data, where latency remains low and throughput remains high, even in changing network conditions.
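To give a feel for how a sender might react to marks rather than losses, here is a toy sketch. It is loosely modelled on DCTCP-style proportional backoff, which is an assumption on my part; the gain constant, the additive increase step and the onFeedback name are all illustrative and not taken from any L4S specification.

```javascript
// Illustrative constant: how quickly the smoothed marking level follows new feedback.
const GAIN = 1 / 16;

// Hypothetical sender update: takes the previous state plus the number of
// ECN-marked packets in the last feedback interval, returns the new state.
function onFeedback(state, markedPackets, totalPackets) {
  const fraction = totalPackets === 0 ? 0 : markedPackets / totalPackets;
  // Smoothed estimate of how much of the traffic is being marked
  const alpha = (1 - GAIN) * state.alpha + GAIN * fraction;
  // Back off in proportion to the marking level; otherwise probe gently upward
  const rateMbps = markedPackets > 0
    ? state.rateMbps * (1 - alpha / 2)
    : state.rateMbps + 1;
  return { alpha, rateMbps };
}

// A few marks cause a small reduction instead of the large cut a packet loss would trigger
console.log(onFeedback({ alpha: 0, rateMbps: 100 }, 1, 10));
```

The point of the sketch is the shape of the response: light marking produces small, frequent adjustments, which is what keeps the queue short without sacrificing throughput.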

One of the significant benefits of L4S is its ability to support a wide range of real-time services at scale. Ericsson highlights how edge-based applications such as cloud gaming, virtual reality and drone control need stable low-latency connections alongside high bitrates. While over-the-top approaches to congestion control may work for general streaming, they struggle in mobile environments. This is due to variability in channel quality and radio access delays, which can cause sudden spikes in latency. L4S provides a faster and more direct way to detect congestion within the radio network, enabling better performance for these time-sensitive applications.

To make this possible, mobile networks need to support L4S in a way that keeps its traffic separate from traditional data flows. This involves using dedicated queues for L4S traffic to ensure it is not delayed behind bulk data transfers. In 5G, this is implemented through dedicated quality-of-service flows, allowing network elements to detect and handle L4S traffic differently. For example, if a mobile user is playing a cloud-based game, the network can identify this traffic and place it on an L4S-optimised flow. This avoids interference from other applications, such as file downloads or video streaming.

Nokia's approach further explains how L4S enables fair sharing of bandwidth between classic and L4S traffic without compromising performance. A dual-queue system allows both types of traffic to coexist while preserving the low-latency characteristics of L4S. This is especially important in scenarios where both legacy and L4S-capable applications are in use. In simulations and trials, the L4S mechanism has shown the ability to maintain very low delay even when the link experiences sudden reductions in capacity, which is common in mobile and Wi-Fi networks.
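The classification step of such a dual-queue system can be sketched as follows. This reflects my understanding of the IETF L4S documents, where the ECT(1) codepoint of the two-bit ECN field identifies L4S traffic and CE-marked packets are also steered to the low-latency queue; the vendor posts themselves do not spell out the codepoints, and selectQueue is a hypothetical name.

```javascript
// The four values of the two-bit ECN field in the IP header.
const ECN = { NOT_ECT: 0b00, ECT1: 0b01, ECT0: 0b10, CE: 0b11 };

// Hypothetical classifier: picks the queue for a packet based on its ECN bits.
function selectQueue(ecnBits) {
  return ecnBits === ECN.ECT1 || ecnBits === ECN.CE ? 'l4s' : 'classic';
}

console.log(selectQueue(ECN.ECT1)); // l4s
console.log(selectQueue(ECN.ECT0)); // classic
```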

One of the important aspects of L4S is that it requires support both from the application side and within the network. On the application side, rate adaptation based on L4S can be implemented within the app itself, often using modern transport protocols such as QUIC or TCP extensions. Many companies, including device makers and platform providers, are already trialling support for this approach.

Within the network, L4S depends on the ability of routers and radio access equipment to read and mark ECN bits correctly. In mobile networks, the radio access network is typically the key bottleneck where marking should take place. This ensures that congestion is detected at the right point in the path, allowing for quicker response and improved performance.

Although L4S is distinct from ultra-reliable low-latency communication, it can complement those use cases where guaranteed service is needed in controlled environments. What makes L4S more versatile is its scalability and suitability for open internet and large-scale public network use. It can work across both fixed and mobile access networks, providing a common framework for interactive services regardless of access technology.

With L4S in place, it becomes possible to offer new kinds of applications that were previously limited by latency constraints. This includes lighter and more wearable XR headsets that can offload processing to the cloud, or port automation systems that rely on remote control of heavy equipment. Even everyday experiences, such as video calls or online gaming, stand to benefit from a more responsive and stable network connection.

Ultimately, L4S offers a practical and forward-looking approach to delivering the consistent low latency needed for the next generation of digital experiences. By creating a tighter feedback loop between the network and the application, and by applying congestion signals in a more intelligent way, L4S helps unlock the full potential of 5G and future networks.

This introductory video by CableLabs is a good starting point for anyone wanting to dig deeper into the topic. This LinkedIn post by Dean Bubley and the comments are also worth a read.

PS: Just noticed that T-Mobile USA announced earlier this week that they are the first to unlock L4S in wireless. You can read their blog post here and a promotional video is available in the Tweet below 👇
