
I've found Jesse Ruderman's Lithium very useful for this: http://www.squarefree.com/lithium/

It actually works on pretty much any file type too (with the --chars flag).




For certain values of "works", that's true. It will take a long time and won't produce very good results.

structureshrink more or less came out of frustration with what a bad job most of the existing shrinkers do. In particular, the algorithm that Lithium uses is more or less structureshrink's last-ditch "Welp, everything else has failed, might as well try this" approach.


Lithium'll fail on anything where you need to make correlated deletions, true. But you'd have similar issues with e.g. input that contains sizes, pointers, checksums... none of which are going to work with a generic strategy.

As to speed... well, the kind of problem I was using it on was reducing an excessive dataset that was used by the entire test suite of an app, so each iteration was a complete test run of a minute or two. I was under no illusions that this was going to be fast, so I left the script to do its thing and came back to check on it after lunch. Compared with the time I would otherwise have wasted hacking that data down myself, it was a win.
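
For anyone following along, here is a rough sketch of the kind of chunk-deletion strategy being discussed, and of why correlated deletions defeat it. It is a simplified illustration, not Lithium's or structureshrink's actual code: the names reduce_chunks and balanced_with_x are made up for this example, and the "interesting" callback stands in for whatever test you would really run (in the case above, a full test-suite run per call).

```python
from typing import Callable, List, Sequence


def reduce_chunks(
    items: Sequence[str],
    interesting: Callable[[List[str]], bool],
) -> List[str]:
    """Greedy chunk deletion (hypothetical helper, simplified sketch).

    Try deleting contiguous chunks; keep a deletion whenever the result
    is still "interesting". Halve the chunk size after each pass.
    """
    current = list(items)
    # Start with the largest power of two not exceeding the input length.
    chunk = 1
    while chunk * 2 <= len(current):
        chunk *= 2
    while chunk >= 1:
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + chunk:]
            if interesting(candidate):
                # Deletion kept; index i now points at fresh content.
                current = candidate
            else:
                # This chunk is needed for now; move past it.
                i += chunk
        chunk //= 2
    return current


def balanced_with_x(tokens: List[str]) -> bool:
    """Toy test: parentheses must balance and an 'x' must be present."""
    depth = 0
    for t in tokens:
        if t == "(":
            depth += 1
        elif t == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and "x" in tokens


# Independent requirements: reduces nicely.
print(reduce_chunks(list("abcdefgh"), lambda c: "c" in c and "f" in c))
# ['c', 'f']

# Correlated deletions: "(" and ")" must go together but are not adjacent,
# so no single contiguous chunk can be removed and nothing shrinks.
print("".join(reduce_chunks(list("((x))"), balanced_with_x)))
# ((x))
```

The second case is the failure mode mentioned above: the smaller answer "x" exists, but reaching it requires removing two non-adjacent pieces at once. Each call to the callback is one test run, which is also why this gets slow when a run takes a minute or two.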



