farbarg's comments

There are a lot of tools in this space. Shameless plug to follow.

I helped build (and use) Disdat, which is a simple data versioning tool. It notably doesn't have the metadata capture libraries MLflow has for different model libs, but it's meant to be a lower layer on which that can be built. Thus you won't see particulars about tracking "models" or "experiments", because models/experiments/features/intermediates are all just data thingies (or bundles in Disdat parlance). For the last 2+ years we've used Disdat to track runs and outputs of a custom distributed planning tool, and used Disdat-Luigi (an integration of Disdat with Luigi to automatically consume/produce versioned data) to manage model training and prediction pipelines (some with tens of thousands of artifacts). https://disdat.gitbook.io/disdat-documentation

