Launch HN: Humanloop (YC S20) – A platform to annotate, train and deploy NLP

ZeroCool2u · on July 29, 2020

This looks pretty great, though the SaaS model is an absolute non-starter for my own usage unfortunately. We've been pretty prolific users of Explosion AI's (Makers of SpaCy) Prodigy [1] and actually the interfaces look very similar. What would you say the core differences are between Humanloop and Prodigy?

1: https://prodi.gy/

razcle · on July 29, 2020

Thanks! Prodigy is a good tool and we definitely were inspired by some of their UX decisions. Reducing each decision to a small atomic unit and avoiding context switching makes a lot of sense.

Our starting place is similar to Prodigy in that we also see active learning as a key piece of the puzzle, but we think making active learning work reliably really does require taking parameter uncertainty into account. As far as I know Prodigy doesn't do this. We are also working to make our active learning work at the level of batches and be cost-aware. Often the most valuable examples to label for the model are the most time-consuming for humans, and we work to trade this off.
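To make the idea of parameter uncertainty and cost-awareness concrete, here is a minimal sketch and not Humanloop's actual method. It assumes a small ensemble stands in for samples of the model parameters, scores each unlabeled example by the disagreement between ensemble members (the mutual information, often called BALD), and divides by an estimated annotation cost. The field names `member_probs` and `cost`, and the toy pool, are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mutual_information(member_probs):
    """BALD-style score: H(mean prediction) - mean of per-member entropies.

    It is high when ensemble members disagree (uncertainty about the
    parameters) and zero when every member predicts the same distribution,
    even if that distribution is itself a coin flip (label noise).
    """
    n = len(member_probs)
    n_classes = len(member_probs[0])
    mean = [sum(m[c] for m in member_probs) / n for c in range(n_classes)]
    return entropy(mean) - sum(entropy(m) for m in member_probs) / n


def select_batch(pool, batch_size):
    """Pick the examples with the most information per unit of labeling cost.

    This is a plain greedy top-k; it does not account for redundancy
    between the examples chosen for the same batch.
    """
    ranked = sorted(
        pool,
        key=lambda ex: mutual_information(ex["member_probs"]) / ex["cost"],
        reverse=True,
    )
    return [ex["id"] for ex in ranked[:batch_size]]


# Hypothetical pool: two ensemble members, binary task, cost ~ reading time.
pool = [
    {"id": "a", "member_probs": [[0.9, 0.1], [0.1, 0.9]], "cost": 10.0},
    {"id": "b", "member_probs": [[0.5, 0.5], [0.5, 0.5]], "cost": 1.0},
    {"id": "c", "member_probs": [[0.8, 0.2], [0.3, 0.7]], "cost": 2.0},
    {"id": "d", "member_probs": [[0.99, 0.01], [0.98, 0.02]], "cost": 1.0},
]

batch = select_batch(pool, batch_size=2)
```

In this toy pool, example "a" has the strongest disagreement but is expensive to label, so the cheaper example "c" ranks ahead of it. Example "b" gets a score of zero because the members agree, which is the difference between this score and plain predictive entropy.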

A few other differences are that we do offer a cloud hosted solution so getting set up is much faster and it's more natural for us to be able to accommodate team annotation and quality assurance. By providing a hosted model we also give you the option of deploying features very quickly and continuing to improve them post deployment.

I'd be curious to know the barriers that SaaS introduces for you?

ZeroCool2u · on Aug 3, 2020

Good to know! I work with a lot of highly classified information. Much to my team's frustration, the notion that any of it would be cloud hosted on GCP or AWS was laughable until somewhat recently. It will probably be years until we, and I imagine many institutions like ours, can take advantage of something like this. I appreciate the effort with the VPS / private cloud hosting option, but without an on-prem deployment option, this wouldn't even make it past initial discussions.

gauravsc · on July 29, 2020

https://jacobbuckman.com/2020-01-17-a-sober-look-at-bayesian...

"But in practice, BNNs do generalize to test points, and do seem to output reasonable uncertainty estimates. (Although it’s worth noting that simpler approaches, like ensembles, consistently outperform BNNs.)"

Maxen2020 · on July 31, 2020

It looks awesome!

I see the snorkel logo on the website and they recently also launched snorkel flow for data annotation and model training. There isn't much detail on that, but I wonder is there any advantage humanloop has over that?

On the same track, prodigy also has a prodigy team version that has been getting ready for launch forever. So glad you guys are a few steps ahead.

I am also building a labeling interface myself because I couldn't find the right product for my needs (I have tried tools like label studio, doccano, prodigy, dataturks and ml annotate). They just miss one thing or the other. I really wish there were one place where features like HTML support, hierarchical labels, active learning, batch labeling, project tracking, multi user management and, most important, the UI/UX are all well put together.

razcle · on Aug 6, 2020

Hi, sorry I missed this earlier! We're in many ways complementary to weak labelling techniques (like Snorkel) and are actually working to include them in the tool. We think weak labelling is a great way to overcome cold starts and active learning then helps you improve rapidly.

The big difference between us and Snorkel is our emphasis on active learning and HITL deployment. We think the existing paradigm of ML deployment is very waterfall and slow.

Would love to hear about the annotation interface you're building. Agree there should be one place with all those features! (We're hoping it will be Humanloop ;) ).

Rickasaurus · on July 29, 2020

Is this something we will be able to buy and run on our servers? I don't think we're the only ones wary of working hard to develop IP for a different company.

Also predictions/month pricing is just really challenging and incompatible with many downstream business models. The value has to be really huge to justify that.

razcle · on July 29, 2020

The model you train and data you upload are yours to own, unique to you and don't get shared across users or reused for any other tasks so hopefully you shouldn't feel too much like you're building our IP ;)

In terms of deployment options, we're trying to lead with cloud hosting by default but know that for a lot of people the whole reason they're annotating in house is privacy so we've been exploring deploying in your VPC and for larger enterprises on-prem.

Interested to hear more of your thoughts on the pricing model, this is something we're still iterating on so I'd be interested what you think would be most compatible with your use cases?

ianbutler · on July 29, 2020

Neat how do you compare yourself on the annotation capabilities with Datasaur.ai which launched in the last YC batch?

In terms of training the models for deployment -- do we own the artifact? Can I move that into my own model repository?

Also how do you feel this compares to using fine tuning on a publicly available BERT family model which is already fairly fast and easy not requiring a huge corpus, speaking from experience of recently having done so?

Are the benefits more from the tight feedback loop and already standing infrastructure?

jordn · on July 29, 2020

All great questions!

Datasaur are great. I hope Ivan would think it's fair that I'd describe their current product as a modern, cloud-hosted Brat (https://brat.nlplab.org/ – this remains very popular!) with the features to make that work with teams. As you point out we're focusing on the tight integration of annotation and training enabling you to move faster and iterate on NLP ideas... essentially trying to move a waterfall ML lifecycle to an agile one.

Fine tuning on BERT is the way to go. It's what we do, and that already reduces the data annotation requirements by an order of magnitude. Doing that offline in a notebook is still wanted by some (you can use our tool just as the annotation platform, and download the data and you'll still get the efficiency benefit through active learning) but integrating or deploying that model is still a time-suck. Having the model deployed in the cloud immediately has a load of supplementary benefits (easy to update, can always use the latest models etc) too, we hope.

(edit: typos)

julvo · on July 29, 2020

Firstly, congrats on the launch! Active learning is a super interesting space.

You say it's possible to download the data and use Humanloop for annotation only while still benefitting from active learning. I'm curious about your experience with how much active learning depends on the model. Are the examples that the online model selects for labelling generally also the most useful ones for a different model trained offline?

jordn · on July 29, 2020

Cheers. It's a good thing to be wary of. Poor use of active learning will end up biasing the data according to the model it's trained on – so that data won't be the best X samples to train on a different model. Most of this issue comes from bad active learning selection methods. If you have well calibrated uncertainty estimates and sample for diversity and representativeness too, it's far less of a concern.
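One way to picture sampling for diversity as well as uncertainty is the greedy sketch below. It is an illustration under assumptions, not a description of Humanloop's selection method: each candidate is assumed to carry an `uncertainty` score and an `embedding` vector (both hypothetical fields), and a candidate is penalised by its cosine similarity to anything already picked.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_diverse(candidates, k, redundancy_weight=1.0):
    """Greedily pick k candidates, trading uncertainty against redundancy.

    Score = uncertainty - redundancy_weight * (max similarity to any
    already selected candidate). With redundancy_weight=0 this reduces
    to plain uncertainty sampling.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def score(c):
            redundancy = max(
                (cosine(c["embedding"], s["embedding"]) for s in selected),
                default=0.0,
            )
            return c["uncertainty"] - redundancy_weight * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [c["id"] for c in selected]


# Hypothetical candidates: x1 and x2 are near-duplicates, x3 is different.
candidates = [
    {"id": "x1", "uncertainty": 0.90, "embedding": [1.0, 0.0]},
    {"id": "x2", "uncertainty": 0.85, "embedding": [0.99, 0.1]},
    {"id": "x3", "uncertainty": 0.60, "embedding": [0.0, 1.0]},
]

diverse = select_diverse(candidates, k=2)
uncertainty_only = select_diverse(candidates, k=2, redundancy_weight=0.0)
```

With the penalty switched on, the near-duplicate "x2" is skipped in favour of the less uncertain but different "x3". With the penalty at zero, the two most uncertain examples are chosen even though they are almost the same point, which is the kind of model-specific bias described above.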

ianbutler · on July 29, 2020

Great answers, thank you very much!

flyx · on July 30, 2020

Congrats to the Humanloop team for the launch!

Ivan chiming in from Datasaur here. As jordn pointed out, Datasaur does view itself as a full labeling platform, which encompasses an optimized labeling interface, a workforce management tool in addition to intelligence and active learning. Unlike Humanloop, we are focused solely on the labeling step of the process and do not offer a trained model at the end of the process. Our users have separate pipelines for this. Thanks for the question!

foobaw · on July 29, 2020

#1 and #2, if they work as advertised, are great features, but a lot of other companies have claimed to do this and failed.

One of the biggest problems I have is image annotation using CVAT - the tool works when the task is simple annotation but outputting the annotation data and integrating it has been a pain-point. Also CVAT as a tool is great but is missing a lot of features :/

anthonysarkis · on July 30, 2020

Related: Diffgram is working in the Vision (Image & Video) space. Not NLP yet.

Integration pain points are mentioned often. We are working on solutions here, eg: https://www.youtube.com/watch?v=w7yiW5wpnMg&t=128s Imagine adding bucket event triggers as next step here

Some really exciting features coming soon that make this even better. https://diffgram.readme.io/docs/what-is-diffgram

Can try shared platform and do private install for actual https://diffgram.com/user/new

We would love your feedback on missing features please feel free to email me directly anthony+hn@diffgram.com

yeldarb · on July 30, 2020

Shameless plug: this is exactly why we built Roboflow.

We’re trying to eliminate the one-off python scripts between labeling and training that everyone currently has to reinvent for themselves: https://roboflow.ai

epberry · on July 30, 2020

This workflow is still so complex to get right. Really excited to see more tools for it and try it out ourselves!

At visitorX we're building a fairly large bank of comments and a tagging system and Humanloop looks really great for that.

an_ml_engineer · on July 29, 2020

Cool! I'm curious, how do you compare your service to Scale (scale.com)?

razcle · on July 29, 2020

Hi,

Raza here (one of the other co-founders). Good question! I think our visions are quite different even if our starting points look similar.

Scale has always positioned themselves as an API to human labour and their goal is to abstract the labelling task away from the end user as much as possible. So Scale works really well when you can easily outsource your annotation task.

Our ultimate goal is to try and give domain experts the ability to teach ML models themselves. We're much more focussed on NLP and on tasks that require domain expertise and are hard to outsource. For people where deep domain expertise matters or there are privacy concerns, Scale isn't really an option and we're building tools for them.

On another point, Scale makes its money by charging per annotation so we think they aren't as incentivised to reduce how much you need to label.

thanks!

caiobegotti · on July 29, 2020

Is it English-only or true NLP that would work with multiple languages? Congrats for the launch!

razcle · on July 29, 2020

We wrap a lot of popular frameworks and have implementations of most SOTA models. By default we use a multilingual BERT model so it should work out of the box on different languages.

jeffbarg · on July 29, 2020

Humanloop is such a great name for an AI platform :) Congrats on the launch!

jordn · on July 29, 2020

haha so great to hear! For a while google search kept trying to auto correct it to 'human poop'

alihabib123 · on July 29, 2020

This is really cool! Wish you all the best of luck!!

stuartaxelowen · on July 29, 2020

Do you allow for on-premise inference?

peadarohaodha · on July 29, 2020

Our default deployment option is cloud first for both training and inference at the moment, but we have thought about the ability for users to export a trained model. Either exporting the model parameters in some standardised format, or a compiled predict function, or a docker image that encapsulates a full inference service, etc. So if you could use this kind of export within your application, this would allow on-premise inference. This is something we could probably make available pretty quickly if necessary for your use case.

haffi112 · on July 29, 2020

What type of annotations do you offer?

jordn · on July 29, 2020

Right now document level classification and span tagging within text documents. These can also be combined (as in the landing page screenshot) so that for a given input, you're learning multiple tasks at once as you annotate.

The core of this platform should generally be independent of the data input type and the output labels, so we're building out other annotation options for our business customers. If there's a use case you would like it to support, it would be great to chat jordan[at]humanloop.com :)

hbcondo714 · on July 29, 2020

>> text documents

Congrats on the launch! Would Humanloop be able to support HTML files or URLs? A client of ours has a need to annotate verbose web pages.

razcle · on July 29, 2020

At the moment we don't support the ability to render the HTML but it is something that has come up before. One of the teams we're speaking to wants to classify blog posts and would like to be able to preserve their formatting. If this is something that's important to you we would consider adding it so maybe drop me an email at raza[at]humanloop.com and we can discuss?

hbcondo714 · on July 29, 2020

Thank you for your reply. Yes, preserving the formatting is important for us too.

ml_basics · on July 29, 2020

Great stuff!