Yup, right now we use GROBID, do some post-processing, and combine the output with other extraction techniques. For instance, we use a model to extract document figures [1] so that we can render them in the resulting HTML document.

Also, we're working hard on a new extraction mechanism that should allow us to replace GROBID [2].

There are a lot of really smart people at AI2 working on this. I'm excited to see the resulting improvements and the cool things (like this) that we build with the results!

[1]: https://api.semanticscholar.org/CorpusID:4698432

[2]: https://api.semanticscholar.org/CorpusID:235265639



