Book: Practical Common Lisp
Extracting Information from an ID3 Tag
Now that you have the basic ability to read and write ID3 tags, you have a lot of directions you could take this code. If you want to develop a complete ID3 tag editor, you'll need to implement specific classes for all the frame types. You'd also need to define methods for manipulating the tag and frame objects in a consistent way (for instance, if you change the value of a string in a text-info-frame, you'll likely need to adjust the size); as the code stands, there's nothing to make sure that happens.[279]
Or, if you just need to extract certain pieces of information about an MP3 file from its ID3 tag—as you will when you develop a streaming MP3 server in Chapters 27, 28, and 29—you'll need to write functions that find the appropriate frames and extract the information you want.
Finally, to make this production-quality code, you'd have to pore over the ID3 specs and deal with the details I skipped over in the interest of space. In particular, some of the flags in both the tag and the frame can affect the way the contents of the tag or frame are read; unless you write some code that does the right thing when those flags are set, there may be ID3 tags that this code won't be able to parse correctly. But the code from this chapter should be capable of parsing nearly all the MP3s you actually encounter.
For now you can finish with a few functions to extract individual pieces of information from an id3-tag. You'll need these functions in Chapter 27 and probably in other code that uses this library. They belong in this library because they depend on details of the ID3 format that the users of this library shouldn't have to worry about.
To get, say, the name of the song of the MP3 from which an id3-tag was extracted, you need to find the ID3 frame with a specific identifier and then extract the information field. And some pieces of information, such as the genre, can require further decoding. Luckily, all the frames that contain the information you'll care about are text information frames, so extracting a particular piece of information mostly boils down to using the right identifier to look up the appropriate frame. Of course, the ID3 authors decided to change all the identifiers between ID3v2.2 and ID3v2.3, so you'll have to account for that.
Nothing too complex: you just need to figure out the right path to get to the various pieces of information. This is a perfect bit of code to develop interactively, much the way you figured out what frame classes you needed to implement. To start, you need an id3-tag object to play with. Assuming you have an MP3 lying around, you can use read-id3 like this:
ID3V2> (defparameter *id3* (read-id3 "Kitka/Wintersongs/02 Byla Cesta.mp3"))
*ID3*
ID3V2> *id3*
#<ID3V2.2-TAG @ #x73d04c1a>
Replace Kitka/Wintersongs/02 Byla Cesta.mp3 with the filename of your MP3. Once you have your id3-tag object, you can start poking around. For instance, you can check out the list of frame objects with the frames function.
ID3V2> (frames *id3*)
(#<TEXT-INFO-FRAME-V2.2 @ #x73d04cca>
#<TEXT-INFO-FRAME-V2.2 @ #x73d04dba>
#<TEXT-INFO-FRAME-V2.2 @ #x73d04ea2>
#<TEXT-INFO-FRAME-V2.2 @ #x73d04f9a>
#<TEXT-INFO-FRAME-V2.2 @ #x73d05082>
#<TEXT-INFO-FRAME-V2.2 @ #x73d0516a>
#<TEXT-INFO-FRAME-V2.2 @ #x73d05252>
#<TEXT-INFO-FRAME-V2.2 @ #x73d0533a>
#<COMMENT-FRAME-V2.2 @ #x73d0543a>
#<COMMENT-FRAME-V2.2 @ #x73d05612>
#<COMMENT-FRAME-V2.2 @ #x73d0586a>)
Now suppose you want to extract the song title. It's probably in one of those frames, but to find it, you need to find the frame with the "TT2" identifier. Well, you can check easily enough to see if the tag contains such a frame by extracting all the identifiers like this:
ID3V2> (mapcar #'id (frames *id3*))
("TT2" "TP1" "TAL" "TRK" "TPA" "TYE" "TCO" "TEN" "COM" "COM" "COM")
There it is, the first frame. However, there's no guarantee it'll always be the first frame, so you should probably look it up by identifier rather than position. That's also straightforward using the FIND function.
ID3V2> (find "TT2" (frames *id3*) :test #'string= :key #'id)
#<TEXT-INFO-FRAME-V2.2 @ #x73d04cca>
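The :test #'string= argument matters here because FIND compares with EQL by default, and two distinct string objects holding the same characters aren't EQL. The following sketch shows the difference. It uses plain conses as hypothetical stand-ins for frame objects, with CAR playing the role of id; the names *demo-frames* and *wanted* and the sample data are made up for illustration.

```lisp
;; Hypothetical stand-ins: each "frame" is a cons of (identifier . text).
(defparameter *demo-frames*
  (list (cons "TT2" "Byla Cesta")
        (cons "TP1" "Some Artist")))

;; A freshly allocated string is never EQL to the strings stored above.
(defparameter *wanted* (copy-seq "TT2"))

;; Default test (EQL): no match, even though the characters are the same.
(find *wanted* *demo-frames* :key #'car)                  ; => NIL

;; STRING= compares the characters, so the entry is found.
(find *wanted* *demo-frames* :key #'car :test #'string=)  ; => ("TT2" . "Byla Cesta")
```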
Now, to get at the actual information in the frame, do this:
ID3V2> (information (find "TT2" (frames *id3*) :test #'string= :key #'id))
"Byla Cesta^@"
Whoops. That ^@ is how Emacs prints a null character. In a maneuver reminiscent of the kludge that turned ID3v1 into ID3v1.1, the information slot of a text information frame, though not officially a null-terminated string, can contain a null, and ID3 readers are supposed to ignore any characters after the null. So, you need a function that takes a string and returns the contents up to the first null character, if any. That's easy enough using the +null+ constant from the binary data library.
(defun upto-null (string)
  (subseq string 0 (position +null+ string)))
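If you want to try upto-null without loading the rest of the library, the following self-contained sketch should do. It assumes +null+ is simply the character with code 0, and *raw* is a made-up test string. Note that when the string contains no null, POSITION returns NIL, and SUBSEQ treats a NIL end as "to the end of the string," so the "if any" case needs no extra code.

```lisp
;; Assumption: +null+ is the character with code 0, as in the binary data library.
(defconstant +null+ (code-char 0))

(defun upto-null (string)
  (subseq string 0 (position +null+ string)))

;; A made-up string with an embedded null followed by junk.
(defparameter *raw*
  (concatenate 'string "Byla Cesta" (string +null+) "junk"))

(upto-null *raw*)         ; => "Byla Cesta"

;; With no null present, the whole string comes back.
(upto-null "Byla Cesta")  ; => "Byla Cesta"
```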
Now you can get just the title.
ID3V2> (upto-null (information (find "TT2" (frames *id3*) :test #'string= :key #'id)))
"Byla Cesta"
You could just wrap that code in a function named song that takes an id3-tag as an argument, and you'd be done. However, the only difference between this code and the code you'll use to extract the other pieces of information you'll need (such as the album name, the artist, and the genre) is the identifier. So, it's better to split up the code a bit. For starters, you can write a function that just finds a frame given an id3-tag and an identifier like this:
(defun find-frame (id3 id)
  (find id (frames id3) :test #'string= :key #'id))
ID3V2> (find-frame *id3* "TT2")
#<TEXT-INFO-FRAME-V2.2 @ #x73d04cca>
Then the other bit of code, the part that extracts the information from a text-info-frame, can go in another function.
(defun get-text-info (id3 id)
  (let ((frame (find-frame id3 id)))
    (when frame (upto-null (information frame)))))
ID3V2> (get-text-info *id3* "TT2")
"Byla Cesta"
Now the definition of song is just a matter of passing the right identifier.
(defun song (id3) (get-text-info id3 "TT2"))
ID3V2> (song *id3*)
"Byla Cesta"
However, this definition of song works only with version 2.2 tags since the identifier changed from "TT2" to "TIT2" between version 2.2 and version 2.3. And all the other frame identifiers changed too. Since the user of this library shouldn't have to know about different versions of the ID3 format to do something as simple as get the song title, you should probably handle those details for them. A simple way is to change find-frame to take not just a single identifier but a list of identifiers like this:
(defun find-frame (id3 ids)
  (find-if #'(lambda (x) (find (id x) ids :test #'string=)) (frames id3)))
Then change get-text-info slightly so it can take one or more identifiers using a &rest parameter.
(defun get-text-info (id3 &rest ids)
  (let ((frame (find-frame id3 ids)))
    (when frame (upto-null (information frame)))))
Then the change needed to allow song to support both version 2.2 and version 2.3 tags is just a matter of adding the version 2.3 identifier.
(defun song (id3) (get-text-info id3 "TT2" "TIT2"))
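To see the multi-identifier lookup work without a real MP3 on hand, you can try it against stand-in objects. The sketch below is only an illustration: demo-tag and demo-frame are hypothetical structures, not the library's classes, defined so that frames, id, and information work as plain accessors. It also leaves out upto-null to stay self-contained. Evaluate it in a scratch package, since it defines functions with the same names as the library's.

```lisp
;; Hypothetical stand-ins for the library's tag and frame classes.
;; (:conc-name nil) makes the accessors plain FRAMES, ID, and INFORMATION.
(defstruct (demo-tag (:conc-name nil)) frames)
(defstruct (demo-frame (:conc-name nil)) id information)

(defun find-frame (id3 ids)
  (find-if #'(lambda (x) (find (id x) ids :test #'string=)) (frames id3)))

(defun get-text-info (id3 &rest ids)
  (let ((frame (find-frame id3 ids)))
    (when frame (information frame))))   ; UPTO-NULL omitted in this sketch

(defun song (id3) (get-text-info id3 "TT2" "TIT2"))

;; One made-up tag per version, plus a tag with no frames at all.
(defparameter *v22*
  (make-demo-tag
   :frames (list (make-demo-frame :id "TP1" :information "Some Artist")
                 (make-demo-frame :id "TT2" :information "Byla Cesta"))))
(defparameter *v23*
  (make-demo-tag
   :frames (list (make-demo-frame :id "TIT2" :information "Byla Cesta"))))
(defparameter *empty* (make-demo-tag :frames '()))

(song *v22*)    ; => "Byla Cesta"
(song *v23*)    ; => "Byla Cesta"
(song *empty*)  ; => NIL
```

Because find-frame returns NIL when no frame matches and get-text-info guards with WHEN, a missing frame yields NIL rather than an error.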
Then you just need to look up the appropriate version 2.2 and version 2.3 frame identifiers for any fields for which you want to provide an accessor function. Here are the ones you'll need in Chapter 27:
(defun album (id3) (get-text-info id3 "TAL" "TALB"))
(defun artist (id3) (get-text-info id3 "TP1" "TPE1"))
(defun track (id3) (get-text-info id3 "TRK" "TRCK"))
(defun year (id3) (get-text-info id3 "TYE" "TYER" "TDRC"))
(defun genre (id3) (get-text-info id3 "TCO" "TCON"))
The last wrinkle is that the way the genre is stored in the TCO or TCON frames isn't always human readable. Recall that in ID3v1, genres were stored as a single byte that encoded a particular genre from a fixed list. Unfortunately, those codes live on in ID3v2: if the text of the genre frame is a number in parentheses, the number is supposed to be interpreted as an ID3v1 genre code. But, again, users of this library probably won't care about that ancient history. So, you should provide a function that automatically translates the genre. The following function uses the genre function just defined to extract the actual genre text and then checks whether it starts with a left parenthesis, decoding the version 1 genre code with a function you'll define in a moment if it does:
(defun translated-genre (id3)
  (let ((genre (genre id3)))
    (if (and genre (char= #\( (char genre 0)))
        (translate-v1-genre genre)
        genre)))
Since a version 1 genre code is effectively just an index into an array of standard names, the easiest way to implement translate-v1-genre is to extract the number from the genre string and use it as an index into an actual array.
(defun translate-v1-genre (genre)
  (aref *id3-v1-genres* (parse-integer genre :start 1 :junk-allowed t)))
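Before filling in the full table, you can check the number-extraction step against a shortened one. With :start 1 and :junk-allowed t, PARSE-INTEGER skips the opening parenthesis and stops at the closing one; if there's no number to parse, it returns NIL instead of signaling an error. The sketch below also shows one possible defensive variant, which isn't part of the library: safe-translate-v1-genre and *demo-genres* are hypothetical names, and falling back to the raw text is just one reasonable policy for codes that are non-numeric (ID3v2.3 also allows references such as "(RX)") or beyond the end of the table.

```lisp
;; Shortened, hypothetical table: just the first five ID3v1 genres.
(defparameter *demo-genres*
  #("Blues" "Classic Rock" "Country" "Dance" "Disco"))

;; Hypothetical defensive variant of translate-v1-genre: returns the
;; original string when the code is missing or outside the table.
(defun safe-translate-v1-genre (genre &optional (table *demo-genres*))
  (let ((code (parse-integer genre :start 1 :junk-allowed t)))
    (if (and code (< -1 code (length table)))
        (aref table code)
        genre)))

;; Primary value of PARSE-INTEGER for a parenthesized code.
(parse-integer "(4)" :start 1 :junk-allowed t)  ; => 4

(safe-translate-v1-genre "(4)")    ; => "Disco"
(safe-translate-v1-genre "(RX)")   ; => "(RX)"   no number to parse
(safe-translate-v1-genre "(200)")  ; => "(200)"  past the end of this short table
```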
Then all you need to do is to define the array of names. The following array of names includes the 80 official version 1 genres plus the genres created by the authors of Winamp:
(defparameter *id3-v1-genres*
  #(
    ;; These are the official ID3v1 genres.
    "Blues" "Classic Rock" "Country" "Dance" "Disco" "Funk" "Grunge"
    "Hip-Hop" "Jazz" "Metal" "New Age" "Oldies" "Other" "Pop" "R&B" "Rap"
    "Reggae" "Rock" "Techno" "Industrial" "Alternative" "Ska"
    "Death Metal" "Pranks" "Soundtrack" "Euro-Techno" "Ambient"
    "Trip-Hop" "Vocal" "Jazz+Funk" "Fusion" "Trance" "Classical"
    "Instrumental" "Acid" "House" "Game" "Sound Clip" "Gospel" "Noise"
    "AlternRock" "Bass" "Soul" "Punk" "Space" "Meditative"
    "Instrumental Pop" "Instrumental Rock" "Ethnic" "Gothic" "Darkwave"
    "Techno-Industrial" "Electronic" "Pop-Folk" "Eurodance" "Dream"
    "Southern Rock" "Comedy" "Cult" "Gangsta" "Top 40" "Christian Rap"
    "Pop/Funk" "Jungle" "Native American" "Cabaret" "New Wave"
    "Psychadelic" "Rave" "Showtunes" "Trailer" "Lo-Fi" "Tribal"
    "Acid Punk" "Acid Jazz" "Polka" "Retro" "Musical" "Rock & Roll"
    "Hard Rock"
    ;; These were made up by the authors of Winamp but backported into
    ;; the ID3 spec.
    "Folk" "Folk-Rock" "National Folk" "Swing" "Fast Fusion"
    "Bebob" "Latin" "Revival" "Celtic" "Bluegrass" "Avantgarde"
    "Gothic Rock" "Progressive Rock" "Psychedelic Rock" "Symphonic Rock"
    "Slow Rock" "Big Band" "Chorus" "Easy Listening" "Acoustic" "Humour"
    "Speech" "Chanson" "Opera" "Chamber Music" "Sonata" "Symphony"
    "Booty Bass" "Primus" "Porn Groove" "Satire" "Slow Jam" "Club"
    "Tango" "Samba" "Folklore" "Ballad" "Power Ballad" "Rhythmic Soul"
    "Freestyle" "Duet" "Punk Rock" "Drum Solo" "A capella" "Euro-House"
    "Dance Hall"
    ;; These were also invented by the Winamp folks but ignored by the
    ;; ID3 authors.
    "Goa" "Drum & Bass" "Club-House" "Hardcore" "Terror" "Indie"
    "BritPop" "Negerpunk" "Polsk Punk" "Beat" "Christian Gangsta Rap"
    "Heavy Metal" "Black Metal" "Crossover" "Contemporary Christian"
    "Christian Rock" "Merengue" "Salsa" "Thrash Metal" "Anime" "Jpop"
    "Synthpop"))
Once again, it probably feels like you wrote a ton of code in this chapter. But if you put it all in a file, or if you download the version from this book's Web site, you'll see it's just not that many lines—most of the pain of writing this library stems from having to understand the intricacies of the ID3 format itself. Anyway, now you have a major piece of what you'll turn into a streaming MP3 server in Chapters 27, 28, and 29. The other major bit of infrastructure you'll need is a way to write server-side Web software, the topic of the next chapter.