WebM Encryption

Last modified: 2016-09-19
Author: Frank Galligan

  • Objective
  • Background
    • 1.0 Definitions
      • 1.1 AES
      • 1.2 Block Cipher
      • 1.3 Counter Block
      • 1.4 CTR
      • 1.5 Initialization Vector
      • 1.6 Live Streaming
      • 1.7 CENC
      • 1.8 VOD
    • 2.0 Use Cases
      • 2.1 Playback of Encrypted Content Over a Network
      • 2.2 Playback of Encrypted Content from a Storage Medium
      • 2.3 Out of Order Decryption
    • 3.0 Goals
      • 3.1 Primary Goals
  • Design
    • 4.0 WebM Common Encryption with Integrity Checking
      • 4.1 Common Encryption Format
      • 4.2 New Matroska/WebM Elements
      • 4.3 Supported Matroska Encryption Elements
      • 4.4 Unencrypted Block Format
      • 4.5 Full-sample Encrypted Block Format
      • 4.6 Subsample Encrypted Block Format
        • 4.6.1 Sample Partitions
      • 4.7 Signal Byte Format
      • 4.8 Initialization Vector
        • 4.8.1 Incrementing Initialization Vector
      • 4.9 CTR Counter Block Format Generation
      • 4.10 Excess Key Stream Data
      • 4.11 Examples
        • 4.11.1 Three Encrypted Frames
      • 4.12 Fast Startup Recommendation
    • 5.0 Lacing
    • 6.0 Revision History

Objective

Define a mechanism for supporting AES encryption in the WebM video container specification.

Background

There is a W3C proposal to add extensions for encrypted media. For WebM to be supported by those extensions, it needs a system-independent way of encrypting files.

Matroska has support for encrypting certain elements with AES (ContentEncryption element), but does not define how they are encrypted.

1.0 Definitions

1.1 AES

Advanced Encryption Standard

1.2 Block Cipher

An encryption algorithm that works on fixed length blocks of data.

1.3 Counter Block

This is the block used to generate the keystream with AES-CTR.

1.4 CTR

A mode of AES encryption that uses Counter Blocks to generate a key stream that is then XORed with the plaintext to produce the ciphertext.

1.5 Initialization Vector

A non-secret, fixed-size auxiliary input to a cryptographic algorithm, used to prevent certain classes of attacks.

1.6 Live Streaming

Media that is captured and sent to users at a specific time.

1.7 CENC

MPEG Common Encryption (ISO/IEC 23001-7)

1.8 VOD

Video on demand. Previously recorded media files that are watched when a user decides to watch them.

2.0 Use Cases

2.1 Playback of Encrypted Content Over a Network

In this use case, a content distributor wants to serve protected content to users. The users want to watch the encrypted content, while also seeking to other times within the media.

2.2 Playback of Encrypted Content from a Storage Medium

In this use case, the user wants to play back the encrypted content from local storage.

2.3 Out of Order Decryption

In this use case, encrypted frames may arrive at a client out of order. The client may want to decrypt the frames as soon as they arrive. An example of this use case is WebRTC, which decodes out-of-order video frames.

3.0 Goals

3.1 Primary Goals

3.1.1 Use the smallest possible number of encryption parameter combinations, ideally one.

3.1.2 Add as little overhead to the stream data as possible.

3.1.3 Support seeking within VOD files.

3.1.4 Minimize added latency after a seek.

3.1.5 Support live streaming.

3.1.6 Strive for compatibility with CENC.

3.1.7 Lowest possible startup latency.

Design

4.0 WebM Common Encryption with Integrity Checking

Having one common encryption format for WebM benefits both the delivery side and client consumption.

4.1 Common Encryption Format

The WebM common encryption algorithm is AES. The key size is 128 bits. Information on how the blocks are encrypted is stored in the Track element and interleaved with the Block’s data.

4.2 New Matroska/WebM Elements

A master element named ContentEncAESSettings is added as a sub-element of the ContentEncryption element, which contains elements representing the features of AES. ContentEncAESSettings contains one sub element. AESSettingsCipherMode conveys the block cipher mode used with the AES encryption. AESSettingsCipherMode contains one value, CTR.

Element Name            L  ID        D  T  Description
ContentEncryption       5  [50][35]  -  m  Settings describing the encryption used. MUST be present if the value of ContentEncodingType is 1 and absent otherwise.
ContentEncAESSettings*  6  [47][E7]  -  m  Settings describing the encryption algorithm used. If ContentEncAlgo != 5 this MUST be absent.
AESSettingsCipherMode*  7  [47][E8]  1  u  The cipher mode used in the encryption. Predefined values: 1 - CTR

* = Addition to Matroska
L = Level
ID = Matroska/WebM Element ID
D = Default
T = Type (m = master element, u = unsigned integer)

With these new elements, clients should be able to decrypt frames encrypted with AES.

4.3 Supported Matroska Encryption Elements

The following Matroska elements and values are added to the WebM specification.

  • ContentEncryption
  • ContentEncAlgo (Supported AES value = 5)
  • ContentEncKeyID
  • ContentEncAESSettings
  • AESSettingsCipherMode (Supported CTR value = 1)

4.4 Unencrypted Block Format

The payload of unencrypted Blocks is comprised of two parts. The first part is the Signal Byte. The last part is frame data.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Signal Byte  |                                               |
+-+-+-+-+-+-+-+-+                                               |
:               Bytes 1..N of unencrypted frame                 :
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4.5 Full-sample Encrypted Block Format

The payload of a Full-sample Encrypted Block is comprised of three parts. The first part is the Signal Byte. The second part is the IV. The last part of an Encrypted Block payload is frame data. The only part of the Block that is encrypted is the frame data.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Signal Byte  |                                               |
+-+-+-+-+-+-+-+-+             IV                                |
|                                                               |
|               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               |                                               |
|-+-+-+-+-+-+-+-+                                               |
:               Bytes 1..N of encrypted frame                   :
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
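
The two layouts above can be told apart by the Signal Byte alone. The following is a minimal sketch, not a reference implementation: the function name `split_block_payload` is hypothetical, the 8-byte IV size is taken from Sections 4.8.1 and 4.9, and the bit masks assume the diagram in Section 4.7 numbers bits from the most significant end (so E is 0x01 and P is 0x02).

```python
IV_SIZE = 8  # bytes; the IV forms the upper half of the 16-byte Counter Block


def split_block_payload(payload: bytes):
    """Split a Block payload into (iv, frame_data).

    iv is None when the E bit is clear, i.e. the frame is unencrypted.
    Assumption: E is the least significant bit of the Signal Byte.
    """
    if not payload:
        raise ValueError("empty Block payload")
    signal = payload[0]
    if signal & 0x01 == 0:
        # Unencrypted Block: frame data directly follows the Signal Byte.
        return None, payload[1:]
    if signal & 0x02:
        raise ValueError("subsample (P bit) payloads are not handled here")
    if len(payload) < 1 + IV_SIZE:
        raise ValueError("truncated IV")
    # Full-sample encrypted Block: Signal Byte, IV, then encrypted frame data.
    return payload[1:1 + IV_SIZE], payload[1 + IV_SIZE:]
```

The returned frame data is still ciphertext when an IV is present; decryption needs the key and the Counter Block described in Section 4.9.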

4.6 Subsample Encrypted Block Format

The Subsample Encrypted Block format extends the Full-sample format by setting a "partitioned" (P) bit in the Signal Byte. If this bit is set, the EncryptedBlock header shall include an 8-bit integer indicating the number of sample partitions (dividers between clear/encrypted sections), and a series of 32-bit integers in big-endian encoding indicating the byte offsets of such partitions.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Signal Byte  |                                               |
+-+-+-+-+-+-+-+-+             IV                                |
|                                                               |
|               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               |num_partitions |     Partition 0 offset ->     |
|-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-|
|     -> Partition 0 offset     |              ...              |
|-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-|
|             ...               |     Partition n-1 offset ->   |
|-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-|
|     -> Partition n-1 offset   |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
|                    Clear/encrypted sample data                |
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
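
As an illustration of the header layout above, the sketch below reads the IV, the partition count, and the big-endian 32-bit offsets. The function name `parse_subsample_header` is hypothetical, and the bit masks (E = 0x01, P = 0x02) assume the Signal Byte diagram in Section 4.7 numbers bits from the most significant end.

```python
import struct


def parse_subsample_header(payload: bytes):
    """Return (iv, partition_offsets, sample_data) for a payload with E and P set."""
    if len(payload) < 10:
        raise ValueError("payload too short for Signal Byte, IV and partition count")
    if payload[0] & 0x03 != 0x03:
        raise ValueError("both the E and P bits must be set")
    iv = payload[1:9]                 # 8-byte IV
    num_partitions = payload[9]       # 8-bit partition count
    end = 10 + 4 * num_partitions
    if len(payload) < end:
        raise ValueError("truncated partition offset table")
    # Each offset is a 32-bit unsigned integer in big-endian byte order.
    offsets = list(struct.unpack(f">{num_partitions}I", payload[10:end]))
    return iv, offsets, payload[end:]
```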

4.6.1 Sample Partitions

The samples shall be partitioned into alternating clear and encrypted sections, always starting with a clear section. Generally for n clear/encrypted sections there shall be n-1 partition offsets. However, if it is required that the first section be encrypted, then the first partition shall be at byte offset 0 (indicating a zero-size clear section), and there shall be n partition offsets.

Please refer to the "Sample Encryption" description of the "Common Encryption" section of the VP Codec ISO Media File Format Binding Specification for more detail on how subsample encryption is implemented.
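
The alternation rule can be sketched as follows. This is an illustration under stated assumptions: the name `partition_sections` is hypothetical, offsets are assumed to be relative to the first byte of sample data, and zero-size sections (such as the empty leading clear section) are dropped from the result.

```python
def partition_sections(offsets, sample_size):
    """Return [(start, end, encrypted), ...] covering the sample data.

    Sections alternate clear/encrypted, starting with clear.
    Assumption: offsets are byte positions within the sample data.
    """
    bounds = [0, *offsets, sample_size]
    sections = []
    for index in range(len(bounds) - 1):
        start, end = bounds[index], bounds[index + 1]
        if end < start:
            raise ValueError("partition offsets must be non-decreasing and within the sample")
        if end > start:  # skip zero-size sections
            sections.append((start, end, index % 2 == 1))
    return sections
```

For example, offsets `[4, 10]` on a 12-byte sample give three sections from two offsets, while offsets `[0, 6]` on a 10-byte sample give two sections from two offsets, with the first one encrypted.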

4.7 Signal Byte Format

 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|X|   RSV   |P|E|
+-+-+-+-+-+-+-+-+

Extension bit (X)
    If set, another signal byte will follow this byte. Reserved for future expansion (currently MUST be set to 0).
RSV bits (RSV)
    Bits reserved for future use. MUST be set to 0 and MUST be ignored.
Encrypted bit (E)
    If set, the Block MUST contain an IV immediately followed by an encrypted frame. If not set, the Block MUST NOT include an IV and the frame MUST be unencrypted. The unencrypted frame MUST immediately follow the Signal Byte.
Partitioned bit (P)
    Used to indicate that the sample has subsample partitions. If set, the IV will be followed by a num_partitions byte, and num_partitions 32-bit partition offsets. This bit can only be set if the E bit is also set.
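
A reader might decode the flags along these lines. This is a sketch, assuming the diagram numbers bits from the most significant end (X = 0x80, P = 0x02, E = 0x01); the names `SignalByte` and `parse_signal_byte` are hypothetical, and rejecting a set X bit is one possible policy rather than something the text mandates.

```python
from typing import NamedTuple

X_BIT = 0x80  # extension (assumed mask)
P_BIT = 0x02  # partitioned (assumed mask)
E_BIT = 0x01  # encrypted (assumed mask)


class SignalByte(NamedTuple):
    encrypted: bool
    partitioned: bool


def parse_signal_byte(value: int) -> SignalByte:
    """Decode the E and P flags; RSV bits are ignored as required."""
    if value & X_BIT:
        raise ValueError("extension bit set; additional signal bytes are not defined")
    encrypted = bool(value & E_BIT)
    partitioned = bool(value & P_BIT)
    if partitioned and not encrypted:
        raise ValueError("P bit set without E bit")
    return SignalByte(encrypted, partitioned)
```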

4.8 Initialization Vector

The IV MUST be unique for every frame for a given key. The IV SHOULD start with a random value on the first encrypted frame.

4.8.1 Incrementing Initialization Vector

The IV MUST be incremented by 1 for every encrypted frame. The IV MUST be stored as a raw stream of bytes. When incrementing, the IV should be treated as an unsigned 64-bit number; i.e., if the IV value of the current encrypted frame is 0xFFFFFFFFFFFFFFFF, then the IV value of the next encrypted frame should be 0.
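
The wraparound can be sketched as below. The function name `next_iv` is hypothetical, and interpreting the stored bytes as a big-endian integer is an assumption; the text only says the IV is a raw stream of bytes.

```python
def next_iv(iv: bytes) -> bytes:
    """Return the IV for the next encrypted frame, wrapping at 2**64."""
    if len(iv) != 8:
        raise ValueError("IV must be 8 bytes")
    # Assumption: the raw IV bytes are read as a big-endian unsigned integer.
    value = (int.from_bytes(iv, "big") + 1) & 0xFFFFFFFFFFFFFFFF
    return value.to_bytes(8, "big")
```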

4.9 CTR Counter Block Format Generation

The Counter Block Format generation is only valid if the stream has ContentEncAlgo=5 and AESSettingsCipherMode=1. If the stream has values different from these, Counter Block Format generation MUST NOT be used.

Every encrypted frame MUST reinitialize the decryptor with a unique Counter Block. Each Counter Block MUST be unique within the same stream for the same encryption key. All Counter Blocks MUST be 16 bytes.

The most significant 8 bytes of the Counter Block are the IV, which is set from the IV data in the encrypted Block. The least significant 8 bytes are the Block Counter, which is initialized to 0.
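
Building the Counter Block is a simple concatenation. The sketch below uses a hypothetical function name, `counter_block`, and assumes the Block Counter is serialized big-endian so that it occupies the least significant bytes.

```python
def counter_block(iv: bytes, block_counter: int = 0) -> bytes:
    """Return the 16-byte Counter Block: 8-byte IV followed by 8-byte Block Counter."""
    if len(iv) != 8:
        raise ValueError("IV must be 8 bytes")
    if not 0 <= block_counter < 2 ** 64:
        raise ValueError("Block Counter must fit in 64 bits")
    return iv + block_counter.to_bytes(8, "big")
```

The resulting value would be passed to an AES-CTR implementation as the initial counter for the frame, with a fresh call for every encrypted frame.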

4.10 Excess Key Stream Data

After encrypting a frame there may be excess key stream data. This data MUST be discarded before the next frame is encrypted.
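
Excess key stream arises because AES produces key stream in 16-byte blocks while frames have arbitrary sizes. As a small illustration (the name `keystream_usage` is hypothetical):

```python
AES_BLOCK_SIZE = 16  # bytes of key stream produced per Counter Block


def keystream_usage(frame_size: int):
    """Return (blocks_generated, excess_bytes) for one frame."""
    if frame_size < 0:
        raise ValueError("frame size must be non-negative")
    blocks = -(-frame_size // AES_BLOCK_SIZE)        # ceiling division
    excess = blocks * AES_BLOCK_SIZE - frame_size    # unused key stream bytes
    return blocks, excess
```

A 20-byte frame consumes two key stream blocks and leaves 12 bytes unused; those bytes are dropped, and the next frame starts again from Block Counter 0 with its own IV.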

4.11 Examples

4.11.1 Three Encrypted Frames

IV = 0xFFFFFFFFFFFFFFFE
Block Counter = 0x0000000000000000
Counter Block = 0xFFFFFFFFFFFFFFFE0000000000000000

IV = 0xFFFFFFFFFFFFFFFF
Block Counter = 0x0000000000000000
Counter Block = 0xFFFFFFFFFFFFFFFF0000000000000000

IV = 0x0000000000000000
Block Counter = 0x0000000000000000
Counter Block = 0x00000000000000000000000000000000

4.12 Fast Startup Recommendation

Acquiring keys for decryption may take longer than some clients deem acceptable. To speed up startup, it is recommended to create Tracks in which the first few frames are unencrypted.
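
On the writing side, this recommendation could look like the sketch below. Everything named here is hypothetical: `build_payloads`, the `clear_lead` parameter, and the `encrypt` callback, which stands in for a real AES-CTR routine. The Signal Byte values assume E = 0x01, and the IV is assumed to be serialized big-endian.

```python
from typing import Callable, Iterable, Iterator

IV_MASK = 0xFFFFFFFFFFFFFFFF


def build_payloads(frames: Iterable[bytes], clear_lead: int, first_iv: int,
                   encrypt: Callable[[bytes, bytes], bytes]) -> Iterator[bytes]:
    """Yield Block payloads, leaving the first clear_lead frames unencrypted."""
    iv = first_iv & IV_MASK
    for index, frame in enumerate(frames):
        if index < clear_lead:
            yield b"\x00" + frame                 # Signal Byte with E clear
            continue
        iv_bytes = iv.to_bytes(8, "big")
        yield b"\x01" + iv_bytes + encrypt(iv_bytes, frame)
        iv = (iv + 1) & IV_MASK                   # only encrypted frames consume an IV
```

A player can start decoding the clear frames immediately while the key request is still in flight.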

5.0 Lacing

Lacing is not supported.

6.0 Revision History

Version Comment
1.1 Add subsample encrypted block and partitioning scheme.
1.0 Initial public release.
0.5 Changed storing of IV values to be a raw stream of bytes.
0.4 Removed HMAC.
0.3 Frames may be encrypted or unencrypted. Adding signal byte to every frame. Adding Use Cases.
0.2 Changing IV prepended to every frame.
0.1 First released revision. All frames encrypted. HMAC prepended to every frame. IV derived from Block timestamp.