Several days ago, I set out on an epic quest to organize my music library of roughly 10,000 songs. It's a daunting task, and one I've put off for a while, because my files are in several different places, in several different formats, with several different ID3 tag and naming standards. And there are more problems too: identifying mislabeled files I got from other places, weeding out corrupted files, and so on. Suffice it to say that it's been an intense few days, with me in a zen-like state of focus trying to get everything in its right place. So here's the short version of how I did it.
The first (major) problem I faced was the massive swaths of files that had incorrect ID3 tags. For this problem I needed a competent ID3 tagger. About a year ago I asked on ubuntu-users what the list's tool of choice was. Suggestions ranged from Quod Libet to Foobar2000 on Wine (outstanding reviews: "best program ever" and "the only program i miss since switching to ubuntu") to gtkpod to EasyTAG (mixed reviews: "tolerable" to "great") to id3v2 (command line based) to normal music players like Amarok and Banshee. There is also a Python module for accessing and editing ID3 tags, if I felt like scripting some things myself.
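For a taste of what scripting this yourself looks like, here is a minimal sketch that uses only the standard library. It is not the Python module mentioned above; it only reads the old ID3v1 block, which is a fixed 128 bytes at the very end of the file starting with the marker "TAG". The function name `read_id3v1` is made up for illustration, and the newer ID3v2 tags that most taggers write need a proper library.

```python
import os


def read_id3v1(path):
    """Return a dict of ID3v1 fields, or None if the file has no ID3v1 tag."""
    if os.path.getsize(path) < 128:
        return None
    with open(path, "rb") as handle:
        # The ID3v1 tag, when present, is the last 128 bytes of the file.
        handle.seek(-128, os.SEEK_END)
        tag = handle.read(128)
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed-width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }
```

Running it over a whole directory and printing the files that come back as `None` is a quick way to spot tracks with no ID3v1 tag at all.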
Eventually I settled on Picard, the tagger that MusicBrainz (a great music metadata site) recommends. Thanks to Ubuntu and the Picard team, installation is as simple as adding the line
    deb http://ftp.musicbrainz.org/pub/musicbrainz/users/luks/ubuntu edgy musicbrainz
to /etc/apt/sources.list (even if you're not running edgy), importing the PGP key with
    wget http://ftp.musicbrainz.org/pub/musicbrainz/users/luks/public.key -O- | sudo apt-key add -
and then running
    sudo aptitude update
    sudo aptitude install picard
Picard can perform only a few operations, but they are very powerful. It talks to the MusicBrainz database to identify tracks in an album-based scheme. It correctly identified and summarily tagged more than 95% of the CDs I threw at it, and I listen to some pretty obscure, out-there music. The ones it couldn't recognize were mostly tracks that didn't have a lot of existing context to go with them, and although I don't know exactly what information Picard looks at to determine the ID3 tags, I suspect that was the problem.
There are a few problems with Picard, however. First of all, the UI sucks. Second, its capabilities are very limited. I would like to see, say, options that move files around on the file system depending on how Picard tags (or fails to tag) them. The albums that have their ID3 tags written by Picard don't seem to be organized in any meaningful way, except in the currently running instance of the program, where they are added to the bottom of the list (which is also annoying, because I had to keep scrolling to the top to add files to tag, then to the bottom to click and drag the files it missed, and back again). More automation would be nice, too, as I found myself performing the same operations over and over again. There are also minor issues, like it opening a new browser tab on each database lookup (leaving me with Firefox open with ~100 or more tabs at some points). Fortunately for the future Picard user, I saw most of these issues acknowledged and marked for future improvement in the documentation.
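The "move files depending on how they were tagged" wish is easy enough to sketch outside of Picard. The snippet below is only an illustration, not anything Picard provides: the function names, the Artist/Album/"NN Title" layout and the `_untagged` folder are all assumptions, and the tags are expected as a plain dict from whatever tag reader you use.

```python
import re
from pathlib import Path


def safe(name):
    """Replace characters that are awkward in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", str(name)).strip(" .")
    return cleaned or "Unknown"


def track_prefix(track):
    """Turn '3' or '3/12' into '03 '; return '' if there is no usable number."""
    first = str(track or "").split("/")[0].strip()
    return f"{int(first):02d} " if first.isdigit() else ""


def target_path(library_root, tags, original_name, untagged_dir="_untagged"):
    """Compute where a file should live, based on its tags (or lack of them)."""
    root = Path(library_root)
    tags = tags or {}
    # Files the tagger could not identify go to a holding folder for manual review.
    if not all(tags.get(key) for key in ("artist", "album", "title")):
        return root / untagged_dir / Path(original_name).name
    extension = Path(original_name).suffix.lower()
    filename = f"{track_prefix(tags.get('track'))}{safe(tags['title'])}{extension}"
    return root / safe(tags["artist"]) / safe(tags["album"]) / filename
```

The function only computes a destination; handing the result to `shutil.move` (after creating the parent folder) would do the actual move.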
The second (major) task ahead of me was conversion of all of my files into the near-universal mp3 format (yes, I know I should be gunning for ogg, but you can't use those files on your iPod... well, without the free firmware you can load onto it... another day, another day). For m4a files, I used the script found at the end of this thread, which happily copied my ID3 tags as well. For ogg files, I would need something more general. At first glance, SoX seemed like a reasonable tool, but it needs to be recompiled for mp3 support. I also found some other scripts, but those didn't copy the ID3 tags, which made them unacceptable for my purposes. I asked in the freenode #ubuntu channel regarding my quandary, and someone mentioned Soundconverter as an option.
Soundconverter is a simple tool, but it gets the job done. I installed it with a simple sudo aptitude install soundconverter. It provides a nice drag-and-drop interface that doesn't require knowledge of ID3 tag layouts or encoding/decoding details. Most of the options are located in the Edit->Preferences menu; the rest of the program is idiot-proof. Just drag the files in and hit Convert. It does do some annoying things, like saving files under URL-encoded names (example: with the "save in the same folder" option on, files from the folder "Cannibal Corpse" were saved in a newly created folder, "Cannibal%20Corpse"), but it generally works well.
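Cleaning up those URL-encoded folders by hand gets old fast. Here is a rough sketch of how it could be scripted, assuming the stray names are plain percent-encoding that `urllib.parse.unquote` can reverse. The function names are made up for illustration, it defaults to a dry run, and it leaves a file alone if something already exists at the decoded location.

```python
import shutil
from pathlib import Path
from urllib.parse import unquote


def decode_part(part):
    """Percent-decode one path component, unless the result would not be a safe name."""
    decoded = unquote(part)
    if "/" in decoded or "\\" in decoded or decoded in ("", ".", ".."):
        return part
    return decoded


def merge_decoded(root, dry_run=True):
    """Move files out of percent-encoded folders into their decoded equivalents."""
    root = Path(root)
    moves = []
    # Collect the file list up front so moving files does not disturb the walk.
    for src in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = src.relative_to(root)
        dst = root.joinpath(*(decode_part(part) for part in relative.parts))
        if dst == src or dst.exists():
            continue  # nothing to decode, or it would overwrite: leave for manual review
        moves.append((src, dst))
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
    return moves
```

Calling `merge_decoded("/path/to/music")` first just returns the planned moves; pass `dry_run=False` once the list looks right. The emptied "%20" folders are left behind and can be removed afterwards.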
Things were a bit more complicated than this, of course. I made nightly backups of my changes to a local FTP server, wrote some Python scripts to help me keep track of where everything was going, and manually edited some outlying tags in Amarok. But now my music library is (almost) officially organized. Success! Now to just incorporate all of my friends' libraries...
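Some cool links I found along the way
Some relevant command line tricks
- find -type f ! -iregex '.*\.\(flac\|ogg\|m4a\|mp3\)' -delete -or -empty -delete (deletes every file that isn't flac, ogg, m4a or mp3, plus anything left empty; run it without the two -delete flags first to see what it would hit)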
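Since that find one-liner deletes things for good, a cautious alternative is to list the candidates first. This is a minimal Python sketch of the same idea, not a drop-in replacement: the function name is made up, it only reports and never deletes, and unlike find it does not look for empty directories.

```python
from pathlib import Path

AUDIO_EXTENSIONS = {".flac", ".ogg", ".m4a", ".mp3"}


def cleanup_candidates(root):
    """Return (non_audio_files, empty_audio_files) under root without deleting anything."""
    non_audio, empty_audio = [], []
    for path in sorted(Path(root).rglob("*")):
        # Only regular files; symlinks and directories are skipped.
        if path.is_symlink() or not path.is_file():
            continue
        if path.suffix.lower() not in AUDIO_EXTENSIONS:
            non_audio.append(path)
        elif path.stat().st_size == 0:
            empty_audio.append(path)
    return non_audio, empty_audio
```

Printing both lists and eyeballing them before deleting anything is a lot less nerve-wracking on a 10,000-song library.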