matvp

A JIT Encryption Proxy for Streaming

Last Friday, during a hackathon at DPG Media, we set out to answer a question that had been nagging us for a while: do we really need to store a separate encrypted copy of every asset for every encryption scheme and key configuration?

Turns out, we don't. In a single day, we built a proxy service in Bun that sits between S3 (+ CDN) and the player, intercepts CMAF segments, and encrypts them just-in-time with DRM protection. The key ingredient? A format-agnostic manifest model called HAM.

Why JIT Encryption?

The traditional approach to DRM is straightforward: encrypt your assets upfront and store the encrypted copies. But the math gets ugly fast. You need to support CENC for pre-2020 TVs, CBCS with multiple key systems, single-key for simple setups, multi-key for premium content. Each combination is a separate encrypted copy of the same asset sitting on S3. Multiply that across a catalog of thousands of titles and you're looking at serious storage costs.

JIT encryption sidesteps this entirely. You store one plain (unencrypted) copy of each asset on S3, and a lightweight proxy encrypts segments on the way out based on what the requesting device needs. The manifest still advertises DRM protection to the player, so from the client's perspective, nothing changes. The scope of the hackathon was VOD, though the same approach applies to live streams too.

Why This Took a Day, Not a Month

This project builds on HAM, a format-agnostic manifest model I wrote about previously. HAM handles manifest generation, CPIX key resolution, and format conversion to DASH and HLS. That means the proxy itself is only about 75 lines of meaningful logic. We wrote the whole thing with Claude Code, and never once had to consult the DASH or HLS specs. No SegmentTemplate attributes, no EXT-X-MAP tags, no playlist formatting rules - just presentations, tracks, and segments. That's a massive reduction in complexity when implementing features like this.

The part that matters here: DRM protection lives at the switchingSet level. Each switching set can carry its own protection object with a key ID, encryption scheme, and key system metadata (Widevine PSSH boxes, PlayReady PRO, etc.).

// A protected switching set in HAM
{
  protection: {
    scheme: "cenc",
    defaultKid: "650F6A2E-A661-4BC9-8787-2A86D362C0BA",
    keySystems: {
      "com.widevine.alpha": { pssh: "AAAAenBzc2gA..." }
    }
  },
  tracks: [
    { type: "video", bandwidth: 400000, initUrl: "https://cdn.com/init.mp4", segments: [...] },
    { type: "video", bandwidth: 800000, initUrl: "https://cdn.com/init_hd.mp4", segments: [...] }
  ]
}

All tracks in the same switching set share one protection config, so when we need to encrypt a segment, we know exactly which key to use based on the switching set it belongs to.

Starting Simple: Single Key Encryption

For the first iteration, we kept things straightforward. One key encrypts everything. The flow:

  1. Load the HAM manifest and a CPIX document (an industry standard for exchanging encryption keys)
  2. Rewrite segment URLs so players fetch through our proxy
  3. When a segment request comes in, fetch the clear segment from S3, encrypt it, and serve it

Rewriting the Manifest

First, HAM gives us applyCpixToHam which merges CPIX key information into the manifest. This injects protection metadata (PSSH boxes, key system info) into each switching set that needs encryption:

import { applyCpixToHam, hamToDash } from "@matvp91/ham";

// Apply CPIX keys to the manifest, this injects protection
// metadata (PSSH, key systems) into the HAM switchingSets.
const ham = applyCpixToHam(rawHam, cpixText);

Now we need to rewrite segment URLs so the player fetches them through our proxy instead of directly from S3. We only touch protected tracks since unprotected ones (like plain audio) don't need encryption:

// "https://bucket.s3.amazonaws.com/video/seg_001.m4s" becomes "/proxy/video/seg_001.m4s"
for (const presentation of ham.presentations) {
  for (const selectionSet of presentation.selectionSets) {
    for (const switchingSet of selectionSet.switchingSets) {
      if (!switchingSet.protection) {
        // Unprotected sets can be fetched directly from S3.
        continue;
      }
      for (const track of switchingSet.tracks) {
        track.initUrl = `/proxy${new URL(track.initUrl).pathname}`;
        for (const segment of track.segments) {
          segment.url = `/proxy${new URL(segment.url).pathname}`;
        }
      }
    }
  }
}

When a player requests /manifest.mpd, we call hamToDash(ham) and it produces a valid DASH manifest with all the ContentProtection elements in place. The player sees a fully DRM-protected asset, even though the segments haven't been encrypted yet.

The Proxy Route

When the player requests a segment through /proxy/*, we need to figure out if it requires encryption. This is where HAM's structure really shines. We walk the manifest to find which switching set (and therefore which protection config) owns a given segment path:

function createSegmentInfoLookup(ham: Ham) {
  // Closure so the HAM reference is captured once at startup.
  return (path: string) => {
    for (const presentation of ham.presentations) {
      for (const selectionSet of presentation.selectionSets) {
        for (const switchingSet of selectionSet.switchingSets) {
          // Unprotected switching sets don't need encryption, skip them.
          if (!switchingSet.protection) {
            continue;
          }
          if (trackContainsSegment(switchingSet.tracks, path)) {
            // We need both: protection tells us which key to use,
            // presentation tells us which origin to fetch from.
            return {
              protection: switchingSet.protection,
              presentation,
            };
          }
        }
      }
    }
    return null;
  };
}

If we get a match, we know the segment is protected and we have everything we need to encrypt it. If not, it's an unprotected segment (like a plain audio track) and we pass it through as-is.
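The trackContainsSegment helper isn't shown above. A minimal sketch, assuming tracks expose the initUrl and segments[].url fields from the earlier HAM example and that the URLs have already been rewritten to /proxy paths, could look like this. The Track and Segment interfaces are simplified stand-ins, not HAM's real types:

```typescript
// Simplified stand-ins for HAM's track shape (assumption, not the library's types).
interface Segment {
  url: string;
}
interface Track {
  initUrl: string;
  segments: Segment[];
}

// True when `path` is the init segment or any media segment of one of the tracks.
// Assumes both sides are already in rewritten form, e.g. "/proxy/video/seg_001.m4s".
function trackContainsSegment(tracks: Track[], path: string): boolean {
  return tracks.some(
    (track) =>
      track.initUrl === path ||
      track.segments.some((segment) => segment.url === path),
  );
}
```

This walks every segment on every request, which is fine for a prototype. For long assets, a Map from path to switching set built once at startup would likely be the better trade-off.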

Encrypting with mp4encrypt

Once the lookup tells us a segment needs encryption, we need to resolve the actual key. HAM's resolvePrivateContentKeyFromCpix handles this, but there's a format mismatch: CPIX stores keys in base64 while mp4encrypt expects hex. The key ID also has dashes that mp4encrypt doesn't want.

import { resolvePrivateContentKeyFromCpix } from "@matvp91/ham";

function resolveKey(protection: Protection, cpix: string) {
  // HAM resolves the right key from CPIX based on the key ID.
  const { key, iv } = resolvePrivateContentKeyFromCpix(protection, cpix);

  // mp4encrypt expects hex, CPIX gives us base64.
  return {
    key: base64ToHex(key),
    iv: base64ToHex(iv),
  };
}
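The base64ToHex and stripDashes helpers are small enough to write by hand. The bodies below are a sketch of what they might look like, not the code from the hackathon; the names simply match the calls in these snippets:

```typescript
// Decode base64 to raw bytes, then print every byte as two hex digits.
function base64ToHex(base64: string): string {
  const binary = atob(base64); // one character per decoded byte
  let hex = "";
  for (let i = 0; i < binary.length; i++) {
    hex += binary.charCodeAt(i).toString(16).padStart(2, "0");
  }
  return hex;
}

// "650F6A2E-A661-4BC9-..." -> "650f6a2ea6614bc9..."
// Lowercasing is a cosmetic choice so the KID matches base64ToHex output.
function stripDashes(uuid: string): string {
  return uuid.replace(/-/g, "").toLowerCase();
}
```

In Bun or Node, Buffer.from(value, "base64").toString("hex") does the same conversion in one line.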

With the key resolved, we can encrypt the segment:

function encryptSegment(data: ArrayBuffer, protection: Protection, cpix: string) {
  const { key, iv } = resolveKey(protection, cpix);
  // mp4encrypt expects KID without dashes, but HAM stores it as a UUID.
  const kid = stripDashes(protection.defaultKid);
  // Match the encryption scheme the player expects.
  const method = protection.scheme === "cbcs" ? "MPEG-CBCS" : "MPEG-CENC";

  // Calls mp4encrypt under the hood.
  return encrypt(data, method, key, iv, kid);
}

Under the hood, encrypt shells out to Bento4's mp4encrypt:

mp4encrypt \
  --method MPEG-CENC \
  --key 1:<hex-key>:<hex-iv> \
  --property 1:KID:<hex-kid> \
  input.mp4 \
  output.mp4

With single-key encryption, this was enough. Every protected segment gets the same key, same IV, same KID. It worked, and we had a proof of concept running within a couple of hours.
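For reference, here is one way the arguments for that mp4encrypt call could be assembled. This is a sketch under assumptions: the function name and option shape are made up for illustration, and track ID 1 is assumed because that is what the command above uses. The --fragments-info option is mp4encrypt's way of reading track info from a separate file when the input only contains fragments.

```typescript
type Method = "MPEG-CENC" | "MPEG-CBCS";

// Hypothetical option bag; all key material is hex, KID without dashes.
interface Mp4EncryptOptions {
  method: Method;
  key: string;
  iv: string;
  kid: string;
  input: string;
  output: string;
  fragmentsInfo?: string; // path to the clear init segment, for media fragments
  trackId?: number; // assumed to be 1 when omitted
}

// Builds the argv array (without the "mp4encrypt" binary name itself).
function buildMp4EncryptArgs(options: Mp4EncryptOptions): string[] {
  const trackId = options.trackId ?? 1;
  const args = [
    "--method", options.method,
    "--key", `${trackId}:${options.key}:${options.iv}`,
    "--property", `${trackId}:KID:${options.kid}`,
  ];
  if (options.fragmentsInfo) {
    args.push("--fragments-info", options.fragmentsInfo);
  }
  args.push(options.input, options.output);
  return args;
}
```

Passing the result as an array, for example Bun.spawn(["mp4encrypt", ...args]), avoids going through a shell, so nothing in the paths or keys needs quoting.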

Going Multi-Key

Single key is a good start, but major studios often require different keys per video resolution. SD might use one key with a software-based security level (like Widevine L3), while HD and UHD use separate keys that require hardware-backed decryption (Widevine L1). This is where multi-key support comes in.

HAM's model maps naturally to this. Protection lives on the switching set, so putting each resolution tier in its own switching set, or each presentation as in the example below, gives it its own protection config and key ID. The CPIX document carries all the keys, indexed by KID.

Presentation A (pre-roll):
  video switching set -> protection { kid: "aaa-...", scheme: "cbcs" }

Presentation B (main content):
  video switching set -> protection { kid: "bbb-...", scheme: "cbcs" }

CPIX document:
  kid "aaa-..." -> key: <base64>, iv: <base64>
  kid "bbb-..." -> key: <base64>, iv: <base64>

The lookup function we built earlier already handles this. It returns the specific protection object for each segment, so resolveKey naturally picks the right key from CPIX. No special multi-key logic needed. And since protection is per switching set, we have full control over what gets encrypted and what doesn't. Want to encrypt the main feature but leave ads and bumpers in the clear? Just don't add protection to those switching sets. Want everything encrypted with different keys? Add a key per presentation to the CPIX. The proxy adapts to whatever the HAM describes.
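One thing multi-key makes easier to get wrong: a KID referenced in the manifest but missing from the CPIX only surfaces when a player requests that segment. A startup check along these lines would catch it earlier. This is a sketch, not part of the prototype; the interfaces are trimmed-down stand-ins for HAM's types, and how you list the KIDs present in the CPIX is left out and passed in as plain strings:

```typescript
// Trimmed-down stand-ins for the HAM model (assumption, not the library's types).
interface Protection { defaultKid: string; scheme: string }
interface SwitchingSet { protection?: Protection }
interface SelectionSet { switchingSets: SwitchingSet[] }
interface Presentation { selectionSets: SelectionSet[] }
interface Ham { presentations: Presentation[] }

// Compare KIDs regardless of dashes or letter case.
const normalizeKid = (kid: string): string => kid.replace(/-/g, "").toLowerCase();

// Returns every KID the manifest needs that the CPIX does not provide.
function findMissingKids(ham: Ham, cpixKids: Iterable<string>): string[] {
  const available = new Set(Array.from(cpixKids, normalizeKid));
  const missing = new Set<string>();
  for (const presentation of ham.presentations) {
    for (const selectionSet of presentation.selectionSets) {
      for (const switchingSet of selectionSet.switchingSets) {
        if (!switchingSet.protection) continue; // clear content needs no key
        const kid = normalizeKid(switchingSet.protection.defaultKid);
        if (!available.has(kid)) missing.add(kid);
      }
    }
  }
  return Array.from(missing);
}
```

Failing fast at startup when the returned list is non-empty is cheaper than a player stalling on a segment that can't be encrypted.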

Per-Presentation Origins

The wrinkle we didn't anticipate: different presentations can live on different origins. A pre-roll ad might come from one S3 bucket, while the main content sits in another. We need to resolve the right origin before fetching upstream.

function createOriginLookup(ham: Ham) {
  // Each presentation may live on a different S3 bucket / CDN origin,
  // so we need to know where to fetch segments from per presentation.
  const origins = new Map<Presentation, string>();
  for (const presentation of ham.presentations) {
    const url = presentation.selectionSets[0].switchingSets[0].tracks[0].segments[0].url;
    origins.set(presentation, new URL(url).origin);
  }
  return origins;
}

This has to run before we rewrite URLs to /proxy/*. The startup order matters and we spent a solid chunk of time debugging this before landing on the right sequence:

const ham = applyCpixToHam(rawHam, cpixText);   // 1. Inject DRM metadata
const origins = createOriginLookup(ham);          // 2. Capture origins (URLs still absolute)
alterHam(ham);                                    // 3. NOW rewrite URLs to /proxy (the loop from "Rewriting the Manifest")
const findSegmentInfo = createSegmentInfoLookup(ham);

Swapping steps 2 and 3 means you're trying to extract origins from URLs that have already been rewritten to relative /proxy paths, and new URL() throws on a relative path without a base. We learned that the hard way.
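At request time the captured origin is used in the other direction: the /proxy path from the player has to be turned back into an absolute upstream URL. A minimal sketch, assuming the /proxy prefix from the rewrite step (toUpstreamUrl is a hypothetical name, not a HAM function):

```typescript
const PROXY_PREFIX = "/proxy";

// Maps "/proxy/video/seg_001.m4s" plus an origin back to the upstream URL.
function toUpstreamUrl(requestPath: string, origin: string): string {
  if (!requestPath.startsWith(`${PROXY_PREFIX}/`)) {
    throw new Error(`Not a proxied path: ${requestPath}`);
  }
  // Drop the "/proxy" prefix and resolve the rest against the origin.
  const upstreamPath = requestPath.slice(PROXY_PREFIX.length);
  return new URL(upstreamPath, origin).toString();
}
```

In the proxy route this would be called with the origin stored for the presentation that the segment lookup returned.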

PSSH Box Injection

One more challenge with init segments. For the player to trigger license acquisition, the init segment needs PSSH (Protection System Specific Header) boxes embedded in it. HAM already carries these as base64-encoded blobs in protection.keySystems, so we extract them and pass them along during encryption:

function encryptInitSegment(data: ArrayBuffer, protection: Protection, cpix: string) {
  // ... (see encryptSegment)

  // Extract PSSH boxes from HAM's keySystems (Widevine, PlayReady, etc).
  // Each PSSH box is a binary blob: [size][pssh][version+flags][systemId][dataSize][data]
  // The return type is an array of { systemId: string, data: Buffer }
  const psshBoxes = parsePsshBoxes(protection.keySystems);

  // Encrypt the init segment and inject the PSSH boxes so the player
  // knows how to acquire a license when it encounters this segment.
  // Same argument order as in encryptSegment, plus the PSSH options.
  return encrypt(data, method, key, iv, kid, { psshBoxes });
}
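To make the box layout in that comment concrete, here is a sketch of parsing a single PSSH box from HAM's base64 string. It is not the hackathon's parsePsshBoxes; the function names are illustrative, and it assumes one complete box with a regular 32-bit size field. Version 1 boxes carry a KID list before the data, which the sketch skips over.

```typescript
interface PsshBox {
  systemId: string; // 32 hex characters, no dashes
  data: Uint8Array; // system-specific payload
}

function base64ToBytes(base64: string): Uint8Array {
  return Uint8Array.from(atob(base64), (char) => char.charCodeAt(0));
}

// Layout: [size:4]["pssh":4][version:1][flags:3][systemId:16]
//         (version > 0 only: [kidCount:4][kids:16 each]) [dataSize:4][data]
function parsePsshBox(box: Uint8Array): PsshBox {
  const view = new DataView(box.buffer, box.byteOffset, box.byteLength);
  const isPssh = view.getUint32(4) === 0x70737368; // "pssh" in ASCII
  if (!isPssh || view.getUint32(0) !== box.byteLength) {
    throw new Error("Expected exactly one complete pssh box");
  }
  const version = view.getUint8(8);
  let offset = 12; // past size, type, version and flags
  const systemId = Array.from(box.subarray(offset, offset + 16), (byte) =>
    byte.toString(16).padStart(2, "0"),
  ).join("");
  offset += 16;
  if (version > 0) {
    const kidCount = view.getUint32(offset);
    offset += 4 + kidCount * 16; // skip the KID list
  }
  const dataSize = view.getUint32(offset);
  offset += 4;
  if (offset + dataSize > box.byteLength) {
    throw new Error("pssh data runs past the end of the box");
  }
  return { systemId, data: box.subarray(offset, offset + dataSize) };
}
```

For a Widevine box, systemId comes out as edef8ba979d64acea3c827dcd51d21ed, the Widevine system ID with its dashes removed.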

For media segments (non-init), mp4encrypt needs the corresponding init segment as a reference to understand the track structure. Init segments are small, so fetching them alongside the media segment adds negligible overhead. In theory, since VOD segments are immutable, you could aggressively cache the encrypted output on a CDN so the proxy only encrypts each segment once.
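We didn't build the caching part, but an in-process variant of the same idea is easy to sketch. Note this is a swapped-in technique, not the CDN setup described above: a small memoizing wrapper around whatever function fetches and encrypts a segment. The names and the 500-entry default are made up for illustration.

```typescript
type EncryptAtPath = (path: string) => Promise<Uint8Array>;

// Wraps an encrypt function so each path is encrypted at most once while cached.
// Concurrent requests for the same path share one in-flight promise.
function withSegmentCache(encryptAtPath: EncryptAtPath, maxEntries = 500): EncryptAtPath {
  const cache = new Map<string, Promise<Uint8Array>>();
  return (path) => {
    const cached = cache.get(path);
    if (cached) return cached;

    const pending = encryptAtPath(path).catch((error) => {
      cache.delete(path); // don't keep failures around
      throw error;
    });
    cache.set(path, pending);

    if (cache.size > maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest entry.
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    return pending;
  };
}
```

Keying on the path alone only holds while one path always maps to one key and scheme. If the scheme is chosen per device, it would have to become part of the cache key.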

What We Ended Up With

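A Bun + Hono service that loads a HAM manifest and a CPIX document at startup, serves a DASH manifest that advertises DRM protection, and encrypts protected CMAF segments on request: it fetches the clear segment from the right origin, resolves the key for its switching set, and runs it through mp4encrypt, injecting PSSH boxes into init segments.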

The whole thing runs in a single process, with no external dependencies beyond mp4encrypt. HAM's clean structure meant most of the "hard" DRM plumbing was already solved. The proxy just had to walk the model, resolve keys, and call mp4encrypt. And the best part: one plain asset on S3 serves every device and encryption scheme combination.

What's Next

This was a hackathon prototype, and there's room to grow. Key rotation at segment boundaries, support for multiple DRM license servers, and proper load testing under concurrent requests are all on the list. And while we scoped the hackathon to VOD, the same approach works for live streams too. The core idea holds: treat encryption as a proxy concern, keep one plain copy on storage, and suddenly a lot of flexibility opens up without the storage bill to match.

Stay tuned.