ASN.1 BER Basics
I've noticed a lack of short and easy-to-read literature and examples on how ASN.1 encoding works, so I've decided to write something up. There are several different forms of ASN.1 encoding; a description of the different encodings may be found on Wikipedia. This document describes BER. Note that in BER there may be multiple different ways to encode the same data. DER is basically BER with the restriction that only one representation of each value is allowed, usually the shortest one in whole octets.
Everything in ASN.1 BER is encoded as a tag-length-value triplet (referred to as TLV going forward). Structures are formed by embedding additional TLVs inside the value portion of a TLV. The resulting structure is a tree.
The first octet or octets of a TLV represent the tag. The bits of the first octet are interpreted as:
Bits 7-6 - Class. 00=Universal, 01=Application, 10=Context, 11=Private.
Bit 5 - Constructed. If 1 then the value itself contains nested TLVs.
Bits 4-0 - Tag number. If this is 11111 then the octets following the first one contain the tag number instead. Those octets are decoded by appending the 7 lowest bits of an octet to a bit string, then repeating that process for the next octet if the highest bit is set.
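The bit layout above can be sketched in a few lines of Python. This is only an illustration: the function name decode_tag and its return shape are made up for this example, and it does no bounds or error checking.

```python
def decode_tag(data, offset=0):
    """Decode one BER tag starting at data[offset].

    Returns (tag_class, constructed, number, next_offset).
    Hypothetical helper for illustration; no error handling.
    """
    first = data[offset]
    offset += 1
    tag_class = first >> 6            # bits 7-6: 0=Universal .. 3=Private
    constructed = bool(first & 0x20)  # bit 5
    number = first & 0x1F             # bits 4-0
    if number == 0x1F:
        # Long form: append 7 bits per octet, continue while the high bit is set.
        number = 0
        while True:
            octet = data[offset]
            offset += 1
            number = (number << 7) | (octet & 0x7F)
            if not octet & 0x80:
                break
    return tag_class, constructed, number, offset
```

For instance, the single octet 0xA0 decodes to class 2 (Context), constructed, tag number 0, while the three octets 5F 81 48 decode to class 1 (Application), primitive, tag number 200.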
The tag tells you what the value represents. Universal tags identify the standard ASN.1 data types, which Wikipedia has a nice summary of. In terms of encoding/decoding a tree, the only tags of importance are universal tags 16 (0x10, sequence) and 17 (0x11, set), which encode the same way. These two tags (which will also have the constructed bit set) indicate that the value contains a sequential list of TLVs. Representations of specific data types are not covered here.
After the tag, the next octet or octets give the length, which is the number of octets that make up the value portion of the TLV. If the highest bit of the first length octet is clear, then the lower 7 bits are the length. If the highest bit is set, then the lower 7 bits are the number of octets after the current one that make up the length, most significant octet first.
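As a rough sketch of the length rule, here is one way to read a length in Python. The name decode_length is hypothetical. The sketch reads the extra length octets most significant first, and it rejects the octet 0x80, which BER reserves for an indefinite-length form that this article does not cover.

```python
def decode_length(data, offset=0):
    """Decode a BER length starting at data[offset].

    Returns (length, next_offset). Hypothetical helper for illustration.
    """
    first = data[offset]
    offset += 1
    if not first & 0x80:
        # Short form: the lower 7 bits are the length itself.
        return first, offset
    count = first & 0x7F  # number of length octets that follow
    if count == 0:
        raise ValueError("indefinite length is not handled in this sketch")
    length = int.from_bytes(data[offset:offset + count], "big")
    return length, offset + count
```

This also shows why BER allows several encodings of the same data: 0E, 81 0E, and 82 00 0E all decode to a length of 14.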
Immediately following the length are the octets that make up the value.
Most technical documentation for ASN.1 structures uses a human-readable textual notation that has a direct translation to a TLV tree. That notation is not covered here. Instead, some simple structures will be described and a corresponding ASN.1 BER representation will be provided.
Here is a sequence containing an integer of value 119 (hex 0x77) and the string "Greatness" (whose ASCII encoding is 0x47 0x72 0x65 0x61 0x74 0x6E 0x65 0x73 0x73):
30 0E - Sequence of length 14.
02 01 - Integer of length 1.
77 - Value 119.
1B 09 - General string of length 9.
47 72 65 61 74 6E 65 73 73 - Value Greatness.
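A breakdown like this can be checked mechanically with a small recursive parser. The Python below is a minimal sketch, not a complete BER decoder: the function name parse and the tuple layout of each node are invented for this example, only definite lengths are handled, and the sequence is written with identifier octet 0x30 (universal tag 16 with the constructed bit set).

```python
def parse(data, offset=0, end=None):
    """Parse consecutive TLVs in data[offset:end].

    Returns a list of (tag_class, constructed, number, value) tuples, where
    value is a nested list for constructed TLVs and raw bytes otherwise.
    Hypothetical helper for illustration; definite lengths only.
    """
    if end is None:
        end = len(data)
    nodes = []
    while offset < end:
        # Tag
        first = data[offset]
        offset += 1
        tag_class = first >> 6
        constructed = bool(first & 0x20)
        number = first & 0x1F
        if number == 0x1F:
            number = 0
            while True:
                octet = data[offset]
                offset += 1
                number = (number << 7) | (octet & 0x7F)
                if not octet & 0x80:
                    break
        # Length
        length = data[offset]
        offset += 1
        if length & 0x80:
            count = length & 0x7F
            if count == 0:
                raise ValueError("indefinite length is not handled")
            length = int.from_bytes(data[offset:offset + count], "big")
            offset += count
        # Value: recurse when the constructed bit is set
        value_end = offset + length
        if constructed:
            value = parse(data, offset, value_end)
        else:
            value = data[offset:value_end]
        nodes.append((tag_class, constructed, number, value))
        offset = value_end
    return nodes


example = bytes.fromhex("30 0E 02 01 77 1B 09 47 72 65 61 74 6E 65 73 73")
tree = parse(example)
```

Here tree holds a single universal, constructed node with tag number 16 whose value is a list of two primitive nodes: tag 2 with the one-octet value 0x77, and tag 27 with the nine octets of "Greatness".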
A common practice is to use a pair of tags, one nested inside the other, to specify both what a piece of data is used for and what type it is. Context tags are used for this. Let's say we want to store a structure that has a name, a telephone number, and the structure from the previous example in it. We will label the name as 0, the telephone number as 1, and the other data as 2. In the example below, the name "Bob" and the number "180012345678" are used.
30 22 - Sequence of length 34.
A0 05 - Context 0 of length 5.
1B 03 - General string of length 3.
42 6F 62 - Value Bob.
A1 07 - Context 1 of length 7.
02 05 - Integer of length 5.
29 E9 92 69 4E - Value 180012345678.
A2 10 - Context 2 of length 16.
30 0E 02 01 77 1B 09 47 72 65 61 74 6E 65 73 73 - Structure is broken down in the previous example.
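Going the other direction, the nested structure can be built from the inside out by wrapping each value in its tag and length. The Python below is a sketch under a few assumptions: all helper names (tlv, integer, general_string, context, sequence) are invented for this example, the sequence identifier octet is 0x30, context tags are limited to numbers below 31, and integers are assumed to be non-negative and stored as big-endian two's complement in the fewest octets.

```python
def encode_length(n):
    """Short form below 128, otherwise 0x80 | count followed by the octets."""
    if n < 0x80:
        return bytes([n])
    octets = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(octets)]) + octets


def tlv(identifier, value):
    """Wrap value in a single-octet tag and its length."""
    return bytes([identifier]) + encode_length(len(value)) + value


def integer(n):
    # Assumption: n >= 0. The extra bit keeps the sign bit clear.
    size = n.bit_length() // 8 + 1
    return tlv(0x02, n.to_bytes(size, "big"))


def general_string(text):
    return tlv(0x1B, text.encode("ascii"))


def context(number, inner):
    # Constructed context tag: class bits 10, constructed bit 1.
    return tlv(0xA0 | number, inner)


def sequence(*items):
    return tlv(0x30, b"".join(items))


inner = sequence(integer(119), general_string("Greatness"))
record = sequence(
    context(0, general_string("Bob")),
    context(1, integer(180012345678)),
    context(2, inner),
)
hex_dump = " ".join(f"{octet:02X}" for octet in record)
```

With these assumptions, hex_dump starts with 30 22 A0 05 1B 03 42 6F 62 and ends with the 16 octets of the earlier sequence, 36 octets in total.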