
You need to give it a little more time. It doesn't sound like you have spent much time looking through the documentation. Don't forget to check the open and closed issues for your hiccups. It's worth pointing out that it's not a mind reader; you do have to put in a little effort for corner cases.

I am not 100% sure what you mean about merge. You can just `cp LATEADDITIONS Music/Artist/Album` and then "reimport the album" with `beet import Music/Artist/Album`.

As far as cd1/cd2 goes, you are only "fairly well screwed" if you cannot type `$ mv cd1/* cd2 ; beet import cd2`. I think there is another way to do this, but I am not a beets guru. I seem to recall there is a discussion of this in one of the issues on GitHub. But let's face it, `mv CD1/* CD2/ ; beet import CD2` is not that difficult.
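If you would rather not risk a plain `mv` overwriting a same-named track on the other disc, here is a rough sketch of a slightly more careful merge. `merge_discs` is a made-up helper, not something beets ships; it only moves files whose names do not already exist in the target and leaves the rest for you to look at.

```shell
# merge_discs SRC DST: move everything from SRC into DST without overwriting.
# Hypothetical helper for illustration; not part of beets.
merge_discs() {
    src=$1
    dst=$2
    [ -d "$src" ] && [ -d "$dst" ] || {
        echo "usage: merge_discs SRC DST" >&2
        return 2
    }
    for f in "$src"/*; do
        [ -e "$f" ] || continue            # empty directory: the glob matched nothing
        name=$(basename "$f")
        if [ -e "$dst/$name" ]; then
            echo "skipped (name clash): $f" >&2
        else
            mv "$f" "$dst/"
        fi
    done
    # Remove SRC only if nothing was left behind.
    rmdir "$src" 2>/dev/null || echo "left in place: $src" >&2
}
```

Usage would then be `merge_discs CD1 CD2 && beet import CD2`.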

If you have a big directory full of unsorted albums there is also this new gem:

    The importer has a new interactive option (*G* for "Group albums"),
    command-line flag (``--group-albums``), and config option
    (:ref:`group_albums`) that lets you split apart albums that are mixed
    together in a single directory. Thanks to geigerzaehler.



I'm going to give it more time, I didn't mean to sound like I was giving up. I spent the last hour reading through the importer code to see how it might handle this case but there's really no way of doing it in the tool (there's an open issue for it [0] [1]).

I used "fairly well screwed" in the context of trying to automate the cleanup of 60,000+ songs. One album isn't an issue - 10,000 or so where you need to keep dipping in and out of the tool to manually move stuff around isn't really ideal.

Just to confirm, since you seem to have some experience - you can reimport an album to clean it up again (say after adding additional files) by calling import on a path in your library? So it's ok to mess with the files in the library and then fix the db later by importing? From what I saw in the code I didn't see anything that would handle removing / moving files, do you know if it handles those cases?

Again, my language was probably a little strong, and mostly only applies to the more extreme situation I'm in.

PS: your other comment about doing the import in 2 steps seems like a bit of a life-saving hint. Will definitely go that way when / if I commit to this.

[0] https://code.google.com/p/beets/issues/detail?id=380

[1] https://github.com/sampsyo/beets/issues/112


The beets reimport functionality should really be thought of as a sanity check / rescan / quality control. So it will recognize the "untracked files", to borrow some git terminology. That covers adding files. As far as removing, moving, or modifying files goes (I am assuming you mean metadata?), take a look at the `beet move`, `beet remove`, and `beet update` commands.
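A small sketch of how those commands might be combined after you have messed with files by hand. The function names are made up; `beet update`, its `-p` (pretend) flag, and `beet move` are real, but check `beet help update` on your version before trusting my memory of them.

```shell
# preview_resync QUERY...: show what beets thinks changed on disk, change nothing.
preview_resync() {
    beet update -p "$@"
}

# resync QUERY...: pull on-disk changes into the library database, then move
# the files to wherever the current path config says they belong.
resync() {
    beet update "$@" && beet move "$@"
}
```

For example, `preview_resync albumartist:Someone` followed by `resync albumartist:Someone` once the preview looks sane.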

What follows is a slightly rambly, not very well edited description of my import process. I apologize for the length/grammar.

I think I was in a similar situation to yours. I started with 5 or 6 big collections of music that had diverged over the years (flac/shn bootlegs, old iTunes installations, xmms libraries, etc.) and a couple of small collections from netbooks:

  /music/itunes
  /music/olditunes
  /music/olditunes-ibookg3
  /music/current
  /music/blah
  ...


Before touching beets I ran rdfind over the iTunes directories and then over the non-iTunes directories. rdfind keeps the copy in the directory listed earliest, so I listed the directories in order of most recently used. This essentially wiped out the oldest directory from the iTunes and non-iTunes sets. I then ran rdfind against all of the directories (once again in order of most recently used). The amount of space I saved was insane.
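Roughly what that looks like as a command, as a sketch rather than my exact invocation. `dedupe` is a made-up wrapper; `-dryrun` and `-deleteduplicates` are real rdfind options. Do the dry run first, since the real run deletes files.

```shell
# dedupe DRYRUN DIR...: run rdfind over DIRs, listed most recently used first.
# DRYRUN is "true" (only report) or "false" (really delete the duplicates).
# rdfind keeps the copy found under the earliest directory on the command line.
dedupe() {
    dryrun=$1
    shift
    rdfind -dryrun "$dryrun" -deleteduplicates true "$@"
}
```

So something like `dedupe true /music/current /music/itunes /music/olditunes`, and then the same line with `false` once the report looks right.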

Once I had manually cleaned up the low-hanging duplicate fruit I did the two-step import. I got hit with the CD1/CD2 quirk the first time I did the two-step import. Because I had the backup directory I just blew away the beets library and started over. This time, before I ran the quiet import, I made a best effort at consolidating the cd1/2 albums. `tree` came in handy for finding the problem albums:

  $ tree -i -f -d --prune /music | grep -i "cd1\|disc1\|cd 1\|disc 1"

I did the `mv cd1/* cd2/` by hand. It did not take as long as I thought it would. I gave it my best effort and reran the quiet import. This solved a ton of the problems. Before doing the interactive import I sorted /music with

  $ du --max-depth=3 -h /music | sort -h

Because I used the move import instead of copying, it was easy to see if there were any recurring problems. The only things that are left are the things beets could not ID by itself. My biggest import hiccup was concerts from archive.org's etree archive; I still have to move them in. I am still deciding how I want to handle the one-hit-wonder songs. Right now my config has the following for paths:

  paths:
      default: $albumartist/$album%aunique{}/$track $title
      singleton: 0xSingles/$genre/$artist/$title
      comp: 0xCompilations/$genre/$album%aunique{}/$track $title

The singleton and compilation defaults are:

    singleton: Non-Album/$artist/$title
    comp: Compilations/$album%aunique{}/$track $title

With those defaults, my non-album and compilation directories would each have ended up with hundreds of subdirectories, hence the extra $genre level in my config.
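To recap the two-step import from earlier in shell form, here is a minimal sketch. The function names are made up; `-q` (quiet, never prompt) and `-l LOGFILE` are real `beet import` flags, and I am assuming the import is set to move files (`move: yes` under `import:` in the config) so that whatever beets could not handle stays behind in the source directory.

```shell
# Step 1: unattended pass. In quiet mode beets does not prompt; by default it
# skips what it cannot match confidently and records that in the log file.
import_quiet() {
    beet import -q -l "$2" "$1"
}

# Step 2: interactive pass over whatever is still sitting in the source tree.
import_interactive() {
    beet import "$1"
}
```

That is, `import_quiet /music ~/beets-import.log`, a look at what is left with the `du` line above, and then `import_interactive /music`.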

Feel free to follow up with any more questions.


ARGH! I just saw this on the man page:

  If you have an album that's split across several directories under a common
  top directory, use the --flat option.  This takes all the music files under the
  directory (recursively) and treats them as a single large album instead of as
  one album per directory. This can help with your more stubborn multi-disc albums.

That might be easier than doing all the `mv cd1 cd2`. I have never used it. It will make the import more interactive, though.
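An untested sketch of how `--flat` could be scripted. It assumes the discs are subdirectories named exactly cd1 / "cd 1" / disc1 / "disc 1" (any case) inside an album directory, and it pairs `--flat` with `-q` so the loop runs unattended; drop `-q` and run it per album if you want the prompts. `import_nested_discs_flat` is a made-up name, and directory names containing newlines will break it.

```shell
# import_nested_discs_flat ROOT: for every first-disc directory under ROOT,
# import its parent directory as one flat album.
import_nested_discs_flat() {
    find "$1" -type d \( -iname 'cd1' -o -iname 'cd 1' \
                      -o -iname 'disc1' -o -iname 'disc 1' \) -print |
    while IFS= read -r disc; do
        album=$(dirname "$disc")
        # stdin comes from /dev/null so beet cannot swallow the list of paths
        beet import -q --flat "$album" </dev/null
    done
}
```

This does nothing useful when the discs sit next to each other directly under the artist directory, because the parent would then be the whole artist.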


Amazing - thanks for taking the time to write up the process you went through. My situation sounds very similar and I can definitely reuse a large chunk of your approach.

I'd planned on doing de-duplication first. In my case I have legacy copies of libraries, but also a lot of duplication within the main iTunes library (due to exploding iTunes dbs some time ago; that was the point I switched to Spotify so my OCD didn't kill me). Should be easy enough to script something to handle that scenario. I did something similar a while ago to sort out my iPhoto lib(s).

I saw the flat thing (I noticed it in the code). I don't think it would work in the particular case I ran into as the two discs are at the same level. As you say though, if I do a bit of grepping I could probably catch and adjust many of those before I start.
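Something like the following is what I have in mind for the grepping, as a first rough pass. `list_disc_dirs` is a name I made up, and the patterns are guesses at how my disc folders are named. It matches on the directory's own name only, so sibling folders like "Album CD1" and "Album CD2" end up next to each other in the sorted output.

```shell
# list_disc_dirs ROOT: print directories whose own name looks like a disc
# folder ("cd1", "Album CD2", "Disc 1", ...), sorted so siblings are adjacent.
# Needs a find that supports -iname (GNU and BSD find both do).
list_disc_dirs() {
    find "$1" -type d \( \
        -iname '*cd[0-9]*'   -o -iname '*cd [0-9]*' -o \
        -iname '*disc[0-9]*' -o -iname '*disc [0-9]*' \) -print | sort
}
```

The output of `list_disc_dirs /music` could then be reviewed by hand or fed into a script that does the moves.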



