An ID3 Parser in Factor
Sunday, May 24, 2009
A new contributor, Tim Wawrzynczak, wrote an ID3 parser as his first Factor program a couple of months back. The code looked pretty good, so it was easy to refactor the way ID3v1 and ID3v2 tags are represented and to add some utility words for managing directories of MP3s. The library still needs some work, but now it can take a directory tree and recursively parse all of the ID3 headers. I also realized that Factor’s mmap implementation always tried to map a file read/write, so for ID3 parsing I added a read-only mmap. The finished code is here.
ID3v1 format
ID3 tags come in two flavors: the old ID3v1 format and the newer ID3v2 one. An ID3v1 tag has a fixed length of 128 bytes and, if present, occupies the last 128 bytes of an MP3 file. The tag begins with the magic bytes “TAG” (3 bytes), followed by the song title (30 bytes), artist (30 bytes), album (30 bytes), year (4 bytes), comment (30 bytes), and genre (1 byte). The problem with this approach is that every field has a hard length limit and there is no room for any other kind of data.
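The fixed layout above maps directly onto byte offsets. The following is a minimal sketch, not the library's actual code: the word names (v1-title, unpad, and so on) and the vocabulary name are made up for illustration, and it assumes unused field bytes are padded with zeros.

```factor
USING: kernel math sequences strings ;
IN: id3v1-sketch

! Offsets within the 128-byte tag; "TAG" occupies bytes 0-2.
: v1-title ( tag -- bytes ) [ 3 33 ] dip subseq ;
: v1-artist ( tag -- bytes ) [ 33 63 ] dip subseq ;
: v1-album ( tag -- bytes ) [ 63 93 ] dip subseq ;
: v1-year ( tag -- bytes ) [ 93 97 ] dip subseq ;
: v1-comment ( tag -- bytes ) [ 97 127 ] dip subseq ;
: v1-genre ( tag -- n ) 127 swap nth ;

! Assumption: fields are zero-padded, so drop trailing zero bytes.
: unpad ( bytes -- bytes' ) [ zero? ] trim-tail ;

! Convert a padded field to a string, e.g. v1-title field>string
: field>string ( bytes -- string ) unpad >string ;
```

The offsets add up to the full tag: 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes.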
To implement the parser, we use the Memory-mapped files vocabulary to open the file. You can treat the file as an array of bytes that obeys the sequence protocol. Checking if a file has ID3v1 headers becomes:
: id3v1? ( mmap -- ? )
{ [ length 128 >= ] [ 128 tail-slice* "TAG" head? ] } 1&& ;
The logic here is straightforward: the sequence (mmapped file) is
checked to make sure it’s long enough to contain a header, and then the
last 128 bytes are checked against the magic bytes “TAG”. Since this is
a short-circuit combinator, if the length test fails then the
computation will end early. In this way, you can string together long
computations that use short-circuit “and” and “or”. Factor’s usual and/or
words take two already-evaluated objects, so the short-circuit behavior
is implemented as a library of macros instead.
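As a rough illustration of chaining these combinators, the sketch below pairs 1&& with its counterpart 1|| from the combinators.short-circuit vocabulary. The word names has-id3v1?, has-id3v2? and has-id3? are hypothetical, and the ID3v2 check only looks at the start of the file.

```factor
USING: combinators.short-circuit kernel math sequences ;
IN: id3-check-sketch

! Same test as id3v1? above: long enough, and "TAG" starts the last 128 bytes.
: has-id3v1? ( bytes -- ? )
    { [ length 128 >= ] [ 128 tail-slice* "TAG" head? ] } 1&& ;

! An ID3v2 header is 10 bytes long and starts with "ID3".
: has-id3v2? ( bytes -- ? )
    { [ length 10 >= ] [ "ID3" head? ] } 1&& ;

! 1|| stops at the first quotation that returns a true value.
: has-id3? ( bytes -- ? )
    { [ has-id3v2? ] [ has-id3v1? ] } 1|| ;
```

Each quotation receives its own copy of the input, so the later tests are never run on a sequence that already failed the length check.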
ID3v2 format
The newer standard, ID3v2, has more room for metadata and can store up to 256 MB. Its use is indicated when the MP3 begins with “ID3” or when the bytes “3DI” are present 10 bytes from the end of the file or 10 bytes before the ID3v1 header. In the header, the magic bytes are followed by two bytes for the version, then a flag byte, and finally four bytes for the size of the tag. The size bytes are encoded as a synchsafe number, which means that the top bit of each byte is discarded and the lower seven bits are the data. In our case, 4 × 7 = 28 bits of data are the length, which is why the maximum length is 2^28 bytes, or 256 MB.
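Decoding a synchsafe number amounts to folding over the bytes and keeping seven bits from each. This is a minimal sketch under that reading; the names synchsafe>number and v2-header-size are illustrative and may not match the words used in the library.

```factor
USING: kernel math sequences ;
IN: synchsafe-sketch

! Fold big-endian bytes into one integer, taking 7 bits from each byte.
: synchsafe>number ( bytes -- n )
    0 [ [ 7 shift ] dip 127 bitand bitor ] reduce ;

! Bytes 6-9 of the 10-byte ID3v2 header hold the size.
: v2-header-size ( header -- n )
    [ 6 10 ] dip subseq synchsafe>number ;
```

For example, B{ 0 0 2 1 } synchsafe>number leaves 257 on the stack (2 × 128 + 1), and four bytes of 127 give 2^28 - 1, the largest size the header can express.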
The actual data we want to parse is stored in frames. Each frame has a
tag, which is four ASCII characters, followed by a 4-byte size (a
synchsafe integer as of ID3v2.4, a plain big-endian integer in ID3v2.3),
two bytes of flag data, and the frame data. There are many different
types of frames, but some of the more common ones are tagged TALB (album
title), TIT2 (title), TRCK (track), COMM (comment), and TPE1 (performer).
All of the frames are added to a hashtable keyed by the tag. Looking up
the title frame becomes as easy as "TIT2" find-id3-frame on an ID3 tuple.
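The hashtable idea can be sketched with plain assoc words. This is not the library's find-id3-frame, only an illustration of storing frame data under its four-character tag; frame-id, add-frame and lookup-frame are hypothetical names, and the frame data is passed in already extracted.

```factor
USING: assocs kernel sequences strings ;
IN: id3-frames-sketch

! The first four bytes of a 10-byte frame header are the tag.
: frame-id ( header -- string ) 4 head >string ;

! Store the frame data in the hashtable under its tag.
: add-frame ( data header frames -- )
    [ frame-id ] dip set-at ;

! Returns f when no frame with that tag was stored.
: lookup-frame ( id frames -- data/f ) at ;
```

One consequence of keying by tag is that a later frame with the same tag replaces an earlier one, which matters for frame types that may appear more than once.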
Future work
The ID3v1 “TAG+” format is not supported yet, and apparently it’s hardly ever used. ID3v2 tags are only looked for at the beginning of an MP3 but may be present at the end too as of ID3v2.4. Most ID3 frames are kept as raw bytes rather than parsed into meaningful data. Lastly, writing out ID3 tags is not implemented yet and would be a good first program to write in Factor. Volunteers?