MIME Base64 Encoding/Decoding

Email messages, their headers and their attachments are frequently subjected to a form of encoding that converts all data into a sequence of printable characters that can be represented on screen or on paper. This is known as MIME encoding. A detailed discussion of such issues can be found here. A commonly used technique for accomplishing this is known as Base64 encoding. In Base64, all data in the original byte stream is mapped onto the alphanumeric characters, i.e.

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789

and the two characters + and /. Taking six bits at a time from the original data gives a number in the range 0..63. This number is treated as the index of a character in the reduced character set table shown below.

Index  Character    Index  Character    Index  Character    Index  Character
  0    A             16    Q             32    g             48    w
  1    B             17    R             33    h             49    x
  2    C             18    S             34    i             50    y
  3    D             19    T             35    j             51    z
  4    E             20    U             36    k             52    0
  5    F             21    V             37    l             53    1
  6    G             22    W             38    m             54    2
  7    H             23    X             39    n             55    3
  8    I             24    Y             40    o             56    4
  9    J             25    Z             41    p             57    5
 10    K             26    a             42    q             58    6
 11    L             27    b             43    r             59    7
 12    M             28    c             44    s             60    8
 13    N             29    d             45    t             61    9
 14    O             30    e             46    u             62    +
 15    P             31    f             47    v             63    /
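As a minimal sketch, the table can be built in Python from its four runs of characters rather than typed out by hand. The names ALPHABET and index_to_char are hypothetical and chosen only for this illustration.

```python
import string

# The 64-character table: A-Z (0..25), a-z (26..51), 0-9 (52..61), then '+' and '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def index_to_char(index: int) -> str:
    """Return the Base64 character for a 6-bit value in the range 0..63."""
    if not 0 <= index <= 63:
        raise ValueError("index must be in the range 0..63")
    return ALPHABET[index]


if __name__ == "__main__":
    # Spot checks against the table above.
    print(index_to_char(19))  # T
    print(index_to_char(37))  # l
    print(index_to_char(63))  # /
```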

The operation of Base64 encoding is best understood by studying a simple example. Consider the encoding of the word MIME, illustrated below.

Character    M        I        M        E        #0       #0
Hex Code     4D       49       4D       45       00       00
Binary Code  01001101 01001001 01001101 01000101 00000000 00000000
MIME Code    010011 010100 100101 001101 010001 010000 000000 000000
MIME Value   19     20     37     13     17     16     0      0
MIME Char    T      U      l      N      R      Q      A      A
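The rows of the example can be reproduced with a short Python sketch. It is only an illustration, not the downloadable source code, and the function names are hypothetical. It pads the input with zero bytes to a multiple of three, as the table does with #0, and then reads off six bits at a time. Note that the table shows the raw mapping. In the standard MIME form, the trailing characters that come purely from padding are written as = instead, so the final encoded text for MIME is TUlNRQ==.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def six_bit_values(data: bytes) -> list:
    """Zero-pad data to a multiple of 3 bytes and split it into 6-bit values."""
    padded = data + b"\x00" * ((-len(data)) % 3)
    values = []
    for i in range(0, len(padded), 3):
        # Combine three bytes into one 24-bit group, most significant byte first.
        group = (padded[i] << 16) | (padded[i + 1] << 8) | padded[i + 2]
        for shift in (18, 12, 6, 0):
            values.append((group >> shift) & 0x3F)
    return values


def encode_raw(data: bytes) -> str:
    """Map every 6-bit value to its character, padding included (the 'MIME Char' row)."""
    return "".join(ALPHABET[v] for v in six_bit_values(data))


def encode_standard(data: bytes) -> str:
    """As encode_raw, but characters produced only by padding become '='."""
    padding = (-len(data)) % 3
    chars = encode_raw(data)
    return chars[: len(chars) - padding] + "=" * padding


if __name__ == "__main__":
    print(six_bit_values(b"MIME"))   # [19, 20, 37, 13, 17, 16, 0, 0]
    print(encode_raw(b"MIME"))       # TUlNRQAA
    print(encode_standard(b"MIME"))  # TUlNRQ==
```

Building each 24-bit group with explicit shifts keeps the result independent of the byte order of the host machine.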

While the technique is largely self-explanatory, there are several points that require emphasis.

The reader will be able to perform a similar analysis for data with a length of 3n + 2. Decoding is relatively simple. However, some precautions are required to ensure that the string provided for decoding contains no characters outside the Base64 subset. Once again, byte order constraints may call for some extra gymnastics. These issues are discussed at length in the comments in the source code available for download from the link below.
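The precautions mentioned above might look as follows in Python. This is a hedged sketch and not the downloadable source code: the names DECODE_TABLE and decode are hypothetical, and the strict policy of rejecting every character outside the table is an assumption. Practical MIME decoders usually skip line breaks and other whitespace instead of rejecting them.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
DECODE_TABLE = {ch: i for i, ch in enumerate(ALPHABET)}


def decode(text: str) -> bytes:
    """Decode a Base64 string, rejecting any character outside the Base64 subset."""
    stripped = text.rstrip("=")
    padding = len(text) - len(stripped)
    if len(text) % 4 != 0 or padding > 2:
        raise ValueError("malformed Base64 input")
    for ch in stripped:
        # An '=' anywhere but the end is also caught here, since it is not in the table.
        if ch not in DECODE_TABLE:
            raise ValueError(f"invalid Base64 character: {ch!r}")
    # 'A' has index 0, so it stands in for the six zero bits of each padding position.
    symbols = stripped + "A" * padding
    out = bytearray()
    for i in range(0, len(symbols), 4):
        group = 0
        for ch in symbols[i:i + 4]:
            group = (group << 6) | DECODE_TABLE[ch]
        # Emit the 24-bit group most significant byte first, whatever the host byte order.
        out += group.to_bytes(3, "big")
    # Drop the bytes that exist only because of padding.
    return bytes(out[: len(out) - padding])


if __name__ == "__main__":
    print(decode("TUlNRQ=="))  # b'MIME'
```

Validating the whole string before any decoding starts means a bad character is reported without producing partial output.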

Download