One Model for DASH and HLS
Disclaimer: HAM is currently in private alpha. It is unclear whether the current implementation will result in an open-source release.
If you've ever worked on a streaming platform, you've felt this pain: you build a feature for DASH, test it, ship it, and then build it again for HLS. Multi-CDN failover? Twice. DRM key rotation? Twice. Track filtering, dynamic ad insertion, manifest manipulation? All twice.
CMAF was supposed to fix this. And it did, for containers. A CMAF segment works identically whether a DASH manifest or an HLS playlist points to it. The media processing pipeline is truly unified. But the manifests themselves? Still two completely different formats. Two parsers, two generators, two sets of bugs.
We kept asking ourselves: if the segments are the same, why are we maintaining two separate manifest toolchains?
That question led us to HAM, a format-agnostic model for adaptive streaming manifests: build features once, output DASH and HLS. Below, we'll cover the model itself, then walk through the CLI and the TypeScript library.
Building a unified model over streaming formats has been a long-standing goal of mine. HAM is that model, inspired by community-driven fundamentals.
What is HAM?
HAM (the Hypothetical Application Model) is a format-agnostic JSON representation of adaptive streaming content. It's based on the CMAF spec (ISO/IEC 23000-19, clause 6) and the DASH-HLS Interoperability Specification. The idea is straightforward: represent streaming concepts once (quality ladders, DRM, segment addressing), independently of format syntax. DASH and HLS become output formats, not separate worlds.
Think of HAM as an intermediate language. DASH and HLS are dialects of the same underlying streaming model. HAM is that model, made explicit.
The data hierarchy maps closely to what both DASH and HLS already express:
Ham
└── Presentation            a time window (DASH Period)
    └── SelectionSet        mutually exclusive options (language choices)
        └── SwitchingSet    a quality ladder (DASH AdaptationSet)
            └── Track       a single variant (resolution, bitrate, codec)
                └── Segment one chunk of media (URL + duration)
A Presentation is a time-based division: the main content, an ad break, a bumper. A SelectionSet groups alternatives the player chooses between (English audio vs. Spanish audio). A SwitchingSet is an ABR quality ladder where the player switches between its tracks based on bandwidth. And each Track is a concrete variant with resolved segment URLs ready for playback.
Tracks are typed. A video track has width, height, and frameRate. An audio track has channels and sampleRate. A text track is just that. TypeScript's discriminated unions make this type-safe at compile time:
function logTrack(track: Track) {
  if (track.type === "video") {
    console.log(`${track.width}x${track.height} @ ${track.bandwidth}bps`);
  }
  if (track.type === "audio") {
    console.log(`${track.channels}ch ${track.language}`);
  }
}
Why this matters
The value isn't in the data model itself. It's in what you can build on top of it.
Build once, output everywhere. A track filter written against HAM works for both DASH and HLS output. DRM key injection, multi-period stitching, bandwidth capping: all implemented once. When we add a new feature, it works for both formats immediately.
One test suite. Round-trip tests convert DASH to HAM and back, verifying lossless fidelity (see the sketch at the end of this section). The same HAM feeds HLS generation tests. No more validating the same logic twice against two different spec documents.
AI-friendly. Language models can reason about HAM's clean JSON structure (quality ladders, segment lists, protection schemes) without needing to parse XML namespaces or M3U8 tag syntax. More on this below.
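Here's a minimal sketch of what such a round-trip test might look like, assuming a vitest setup; the fixture path and source URL are made up, and dashToHam / hamToDash are the conversion functions covered later in this post:

import { readFileSync } from "node:fs";
import { expect, test } from "vitest";
import { dashToHam, hamToDash } from "@matvp91/ham";

test("DASH survives a round trip through HAM", () => {
  // Illustrative fixture path and source URL.
  const mpd = readFileSync("fixtures/input.mpd", "utf8");
  const sourceUrl = "https://cdn.example.com/manifest.mpd";

  const ham = dashToHam(mpd, { sourceUrl });
  const regenerated = hamToDash(ham);
  const hamAgain = dashToHam(regenerated, { sourceUrl });

  // Lossless means both HAM objects describe exactly the same content.
  expect(hamAgain).toEqual(ham);
});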
What this unlocks in practice
Server-side ad insertion. Stitch ad content into the main stream by merging presentations. One operation, and both DASH and HLS output get the ads in the right place.
Subscription tiers. Cap video bandwidth for free-tier users by filtering tracks above a threshold. Premium users get the full quality ladder. Same HAM source, different filters.
Multi-CDN failover. Swap CDN URLs across all segments in one pass (sketched after this list). No need to rewrite XML templates or M3U8 playlists separately.
Regional content packaging. Strip tracks or presentations by language or content window. Package region-specific manifests from a single global HAM source.
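To make the multi-CDN case concrete, here's roughly what that one pass could look like. It assumes only the hierarchy shown earlier (presentations → selectionSets → switchingSets → tracks → segments); the HamLike type and the rewriteCdn helper are ours for illustration, not library APIs:

// Minimal structural type for this sketch; the real types ship with @matvp91/ham.
type HamLike = {
  presentations: {
    selectionSets: {
      switchingSets: {
        tracks: { segments: { url: string }[] }[];
      }[];
    }[];
  }[];
};

// Illustrative helper, not a library API: repoint every segment URL at a
// failover CDN origin while keeping path and query intact.
function rewriteCdn(ham: HamLike, newOrigin: string): void {
  for (const presentation of ham.presentations) {
    for (const selectionSet of presentation.selectionSets) {
      for (const switchingSet of selectionSet.switchingSets) {
        for (const track of switchingSet.tracks) {
          for (const segment of track.segments) {
            const url = new URL(segment.url);
            segment.url = `${newOrigin}${url.pathname}${url.search}`;
          }
        }
      }
    }
  }
}

Run it once before serializing, and both the regenerated MPD and the generated playlists pick up the new origin.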
The hidden complexity of streaming manifests
Most developers think of DASH and HLS as "manifest files that point to video chunks." That's true at a high level, but the reality underneath is surprisingly complex.
DASH: XML with layers of indirection
A DASH manifest (MPD) isn't just a list of URLs. To describe a single video quality, you need four levels of nested XML elements:
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"
     xmlns:cenc="urn:mpeg:cenc:2013">
  <Period duration="PT10S">
    <AdaptationSet mimeType="video/mp4">
      <SegmentTemplate media="video_$RepresentationID$_$Time$.m4s"
                       timescale="25" />
      <Representation id="1" bandwidth="5000000"
                      width="1920" height="1080" />
      <!-- ... more quality levels ... -->
    </AdaptationSet>
  </Period>
</MPD>
A few things to notice. The XML namespaces (urn:mpeg:dash:schema:mpd:2011, urn:mpeg:cenc:2013) are required for the document to be valid. Properties inherit downward: a Representation inherits the mimeType from its parent AdaptationSet, which inherits timing from its parent Period. You can't understand a representation without reading all three of its parent elements. And the segment URLs aren't real URLs. They're templates with variables like $RepresentationID$ and $Time$ that need to be expanded at runtime using timescale arithmetic.
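To give a feel for that expansion step, here's a deliberately simplified sketch; real DASH tooling also handles $Number$, printf-style padding like $Time%08d$, and SegmentTimeline arithmetic:

// Simplified sketch of DASH media-template expansion.
function expandMediaTemplate(
  template: string,             // e.g. "video_$RepresentationID$_$Time$.m4s"
  representationId: string,
  timeInTimescaleUnits: number, // segment start time expressed in @timescale units
): string {
  return template
    .replaceAll("$RepresentationID$", representationId)
    .replaceAll("$Time$", String(timeInTimescaleUnits));
}

expandMediaTemplate("video_$RepresentationID$_$Time$.m4s", "1", 96);
// -> "video_1_96.m4s"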
HLS: simple on the surface
An HLS playlist looks deceptively simple:
#EXTM3U
#EXT-X-TARGETDURATION:4
#EXT-X-MAP:URI="init.mp4"
#EXTINF:4.000,
segment-001.m4s
#EXTINF:4.000,
segment-002.m4s
But the simplicity hides real complexity. Tags are positional: they apply to the next URI in the file, creating an implicit state machine. A master playlist references multiple media playlists by filename, with no inline structure. And HLS has its own distinct way of signaling DRM, codec information, and timeline discontinuities that doesn't map cleanly to how DASH does it.
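A toy version of that state machine, just to show why a tag and the URI below it can't be parsed in isolation (this is nowhere near what a production parser like hls.js has to cover):

interface ParsedSegment {
  url: string;
  duration: number; // seconds
}

// Each #EXTINF applies to the *next* URI line, so the parser carries state
// from one line to the next.
function parseSegments(playlist: string): ParsedSegment[] {
  const segments: ParsedSegment[] = [];
  let pendingDuration: number | null = null;

  for (const raw of playlist.split("\n")) {
    const line = raw.trim();
    if (line.startsWith("#EXTINF:")) {
      pendingDuration = parseFloat(line.slice("#EXTINF:".length));
    } else if (line && !line.startsWith("#") && pendingDuration !== null) {
      segments.push({ url: line, duration: pendingDuration });
      pendingDuration = null;
    }
  }
  return segments;
}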
Two different ways of thinking
These aren't just different file formats. They're fundamentally different ways of representing the same content. DASH uses hierarchical XML with property inheritance and template-based addressing. HLS uses flat text playlists with positional semantics and direct URLs. Bridging them means reconciling mismatches in timing models, DRM signaling, segment addressing, and content grouping.
Why this is an information overload problem
The MPEG-DASH spec spans 330 pages. HLS adds 60 more. CMAF adds another 142. That's over 500 pages of interconnected technical specifications that must be cross-referenced for every structural decision.
Designing a format that unifies both requires holding all three specs in mind simultaneously: where they align, where they conflict, where one is expressive and the other is ambiguous. A human simply cannot keep 500+ pages of interconnected technical detail in working memory at once. The information overload makes it nearly impossible to spot the patterns and conflicts needed to design a clean abstraction.
Production parsers demonstrate this complexity. Shaka Player's DASH parser requires 7,600+ lines of code, and that's a one-way parser that only reads manifests. hls.js needs 1,800+ lines for HLS. A conversion system like HAM requires bidirectional transformation plus a normalization layer.
Why AI changes the equation
About 70% of this project was AI-assisted. That's not a gimmick; it's a reflection of the problem's nature. An AI can hold the full spec context, systematically implement each transformation, and catch edge cases through test-driven iteration. It can cross-reference the DASH spec, the HLS spec, and the CMAF spec in a single pass, something that would take a human constant context-switching.
And HAM's clean JSON structure is inherently more tractable for AI than raw XML or M3U8. Compare the DASH XML above to its HAM equivalent:
{
  "type": "video",
  "width": 1920,
  "height": 1080,
  "bandwidth": 5000000,
  "codecs": ["avc1.640028"],
  "segments": [
    { "url": "https://cdn.example.com/video_1_0.m4s", "duration": 4000 },
    { "url": "https://cdn.example.com/video_1_96.m4s", "duration": 4000 }
  ]
}
No namespaces. No inheritance. No template variables. All URLs are fully resolved. All durations are in integer milliseconds. An AI (or a human) can read, reason about, and transform this without any format-specific knowledge.
Using the CLI
The following sections are aimed at developers. If you're here for the concepts, the story above covers the essentials.
The ham CLI converts between formats and manipulates manifests. Here's the workflow: convert to HAM, transform the JSON, convert to your target format.
Convert a DASH manifest to HAM
ham dash-to-ham https://cdn.example.com/manifest.mpd output.ham.json
The CLI prints a summary of what it parsed:
HAM (3 presentations, 927 segments)
├── Presentation 0 (10.000s)
│   └── SelectionSet 0
│       ├── SwitchingSet 0 · video, 5 tracks
│       │   ├── 412x232 avc1.64000d 400kbps 25fps
│       │   ├── 640x360 avc1.64001e 800kbps 25fps
│       │   ├── 960x540 avc1.64001f 1600kbps 25fps
│       │   ├── 1280x720 avc1.640020 3200kbps 25fps
│       │   └── 1920x1080 avc1.640028 5000kbps 25fps
│       └── SwitchingSet 1 · audio nld, 1 track
│           └── nld 2ch mp4a.40.2 192kbps 48000Hz
├── Presentation 1 (677.640s)
│   └── ...
└── Presentation 2 (10.000s)
    └── ...
Convert HAM back to DASH or HLS
ham ham-to-dash output.ham.json regenerated.mpd
ham ham-to-hls output.ham.json hls-output/
The HLS command generates a master playlist and individual media playlists (video-0.m3u8, audio-5.m3u8, etc.) in the output directory.
Filter tracks with jq
The edit command lets you transform HAM using jq filters. Remove all audio tracks:
# filter.jq
.presentations[].selectionSets[].switchingSets[].tracks |= [.[] | select(.type != "audio")]
ham edit input.ham.json --filter-file filter.jq --output no-audio.ham.json
Keep only English audio:
.presentations[].selectionSets[].switchingSets[].tracks |= [
  .[] | select(.type != "audio" or .language == "eng")
]
Cap video bandwidth at 2 Mbps:
.presentations[].selectionSets[].switchingSets[].tracks |= [
  .[] | select(.type != "video" or .bandwidth <= 2000000)
]
Apply DRM keys
If you have a CPIX document with your encryption keys:
ham apply-cpix keys.cpix input.ham.json --output protected.ham.json
This injects Widevine, PlayReady, or FairPlay protection data into the HAM structure, ready to be serialized as DASH ContentProtection elements or HLS EXT-X-SESSION-KEY tags.
Using the library
For programmatic workflows, the TypeScript library gives you the same capabilities in code.
Basic conversions
import { dashToHam, hamToDash, hamToHls } from "@matvp91/ham";
// DASH to HAM
const mpd = loadFile("input.mpd");
const ham = dashToHam(mpd, { sourceUrl: "https://cdn.example.com/manifest.mpd" });
// HAM to DASH
const xml = hamToDash(ham);
saveFile("output.mpd", xml);
// HAM to HLS
const { master, medias } = hamToHls(ham);
saveFile("master.m3u8", master.text);
for (const media of medias) {
  saveFile(media.fileName, media.text);
}
The sourceUrl option tells the parser how to resolve relative URLs in the manifest. All URLs in HAM are fully resolved, with no templates or relative paths. This keeps downstream logic simple.
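The resolution itself is ordinary URL semantics; a quick illustration with made-up paths:

// A relative segment reference from the manifest, resolved against sourceUrl.
const sourceUrl = "https://cdn.example.com/vod/manifest.mpd";
const absolute = new URL("video_1_0.m4s", sourceUrl).toString();
// -> "https://cdn.example.com/vod/video_1_0.m4s"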
Merging presentations
Stitching a pre-roll ad onto the main content is just array concatenation:
import { normalizeHam } from "@matvp91/ham";
const ad = loadHam("ad.ham.json");
const main = loadHam("main.ham.json");
const { result } = normalizeHam({
  presentations: [...ad.presentations, ...main.presentations],
});
Each presentation becomes a DASH Period or an HLS discontinuity boundary. The normalization step ensures tracks are aligned across presentations.
Applying DRM
CPIX is the industry standard for communicating DRM keys. One function call:
import { applyCpixToHam } from "@matvp91/ham";
const cpixXml = loadFile("keys.cpix");
const result = applyCpixToHam(ham, cpixXml);
This parses the CPIX document, extracts key IDs and PSSH boxes, and attaches Protection objects to the matching switching sets in HAM.
Putting it together: a fixture server
Here's where it gets interesting. We built a small Hono server that serves both DASH and HLS from a single HAM source, on the fly.
import { Hono } from "hono";
import { findCommonPresentationPrefix, hamToDash, hamToHls } from "@matvp91/ham";

const app = new Hono();

app.get("/manifest.mpd", (c) => {
  return c.body(hamToDash(ham), 200, {
    "Content-Type": "application/dash+xml",
  });
});

app.get("/master.m3u8", (c) => {
  const { master } = hamToHls(ham);
  return c.body(master.text, 200, {
    "Content-Type": "application/vnd.apple.mpegurl",
  });
});

app.get("/:playlist{.+\\.m3u8$}", (c) => {
  const fileName = c.req.param("playlist");
  const { medias } = hamToHls(ham);
  const media = medias.find((m) => m.fileName === fileName);
  if (!media) return c.notFound();
  return c.body(media.text, 200, {
    "Content-Type": "application/vnd.apple.mpegurl",
  });
});
One HAM object. Two format endpoints. The server even filters presentations before serving, using findCommonPresentationPrefix() to identify and remove ad content by its URL pattern:
ham.presentations = ham.presentations.filter((p) => {
  const prefix = findCommonPresentationPrefix(p);
  return !prefix.includes("/tm/");
});
This is the promise of HAM in practice: manipulate content at the semantic level, then let the format converters handle the syntax. No XML templates. No M3U8 tag assembly. Just data in, manifests out.
What's next
HAM is exploratory work. The core DASH to HAM round-trip is lossless, and HAM to HLS generation covers the major use cases. Open questions remain around HLS parsing (HLS to HAM), live edge handling, and deeper multi-period scenarios, all of which are on the roadmap.
The foundation is solid: a clean abstraction that turns format-specific complexity into a tractable data problem. If you're building streaming infrastructure and tired of maintaining parallel codebases, we think it's worth a look.
Standing on the shoulders of, in no particular order, the people and projects that made this possible:
- The streaming team at DPG Media for letting us loose with AI tooling.
- CML (@littlespex, @cjpillsbury & co), the original HAM implementation that inspired this project.
- Qualabs for laying the fundamentals of HAM through community-driven work.
- Shaka team for support with the playback side of things.
- The broader ecosystem: SVTA, DASH-IF, the teams behind the DASH and HLS specs and the Interoperability spec, and the many others too numerous to name individually.