Hacker News
Show HN: Pisa – Performant Indexes and Search for Academia (github.com/pisa-engine)
17 points by amallia on Aug 13, 2019 | 9 comments



Does it have the ability to convert PDFs to searchable databases?

I mean, if you have millions of PDF articles, can you search them with a search engine to find appropriate materials?


Yes, you can index any content as long as it is parsable. You can easily plug in a new PDF parser and start indexing your PDF articles...


It would really help if there was an example in the README


Yeah I'm pretty confused as to the use case. How does it allow researchers to "experiment with state-of-the-art techniques" or enable "rapid development"?


Thanks for the feedback!


Why do people choose such names for their products? It's a city, and it's also one of the most widely reported works in education (Programme for International Student Assessment), with millions of news articles. How do you expect to be findable with a name like this?


We rely on different user intents. Search for "PISA engine" or "PISA search" and you will find this tool and not the city. Being found by the keyword `PISA` on Google is not necessarily our main goal. We are not selling anything; the community is just building a product for IR research. In the end, we want to attract interest from skilled people. I don't think it really matters if there is news about the city; we are not targeting the news anyway.

Also, think about `Windows` or `Go`: there is no way to get confused once these words are in the right context.


While poor search recall is a bit ironic for the name of a search engine, I guess it's because Antonio studied at the University of Pisa and later worked as a researcher for the CNR institute for research, again in Pisa. (Me too, hi!)


Good to see you Marko!



