
Please do not try to do this while attempting to mirror Elasticsearch capabilities.

I know what I am talking about. Back in the 2000s I was asked to build a search engine: parsing data from image EXIF information and indexing it into a taxonomy, three levels deep and with counts. In MySQL 3.x.

Before that, the company went through multiple vendors who charged fortunes and were not capable of doing this properly, quite shockingly. One was Autonomy, and that thing just straight up could not do a taxonomy even at the top level.

It was 6 weeks of doing the impossible, writing very fragile SQL queries whose performance literally changed if you rearranged the SELECT columns. We did it, amazingly, but this is not something I will ever do again. Databases are essentially the same as they were then, but search engines have come a long way.

As an intellectual exercise, please go ahead. "You just tokenize and then you are done!"

No, a search engine does a LOT more than just splitting your corpus of text into tokens. Soon after you are "done", new requirements come in. Taxonomy navigation? Multi-language support? Automatic synonyms? Spellcheck with "Did you mean" functionality? Performance at massive scale?
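To make the "just tokenize" position concrete, here is a minimal sketch of what that approach amounts to: a toy in-memory inverted index in Python. The function names and sample documents are made up for illustration and do not come from any real project. It handles exact tokens and nothing else, so plurals, spelling variants, and typos all miss.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Lowercase and split on anything that is not an ASCII letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    # Map each token to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    # AND semantics: a document must contain every query token exactly.
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical sample corpus.
docs = {
    1: "Sunset over the harbour",
    2: "Harbor sunsets in winter",
    3: "Portrait of a sailor",
}
index = build_index(docs)

print(search(index, "sunset"))  # only doc 1: "sunsets" is a different token (no stemming)
print(search(index, "harbor"))  # only doc 2: "harbour" is a different token (no synonyms)
print(search(index, "sunst"))   # nothing: no "Did you mean" for typos
```

Every one of those misses is a feature request waiting to happen, and none of them is solved by the tokenizer.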

You will engineer yourself right into a corner. Just use a search engine for your own sanity.

Finally, there are things for syncing PG and ES data - ZomboDB, PGSync.




I think this comment highlights the two separate discussions going on in this thread. If you're building a customer-facing search engine, avoid reinventing the wheel by leveraging powerful tools like Elasticsearch.

On the flip side, if you're a data analyst or developer with a large database containing one or more text columns, and you want to query them more flexibly than "LIKE/ILIKE" allows, it's probably easier and faster to create an FTS index/table in that database, which gets you 90% of the way there.
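A rough sketch of that difference, using Python's built-in sqlite3 module with an FTS5 virtual table so it runs without a server. This swaps SQLite in for Postgres purely to keep the example self-contained, and it assumes your SQLite build has FTS5 enabled. The table names and rows are invented. In Postgres the analogous pieces would be to_tsvector, to_tsquery, and a GIN index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with a free-text column.
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO notes (id, body) VALUES (?, ?)",
    [
        (1, "Customer reported the index was corrupted after upgrade"),
        (2, "Indexing large tables takes hours"),
        (3, "Upgrade went fine, nothing to report"),
    ],
)

# FTS5 table over the same text; the porter tokenizer adds basic stemming.
conn.execute("CREATE VIRTUAL TABLE notes_fts USING fts5(body, tokenize='porter')")
conn.execute("INSERT INTO notes_fts (rowid, body) SELECT id, body FROM notes")

def like_search(term):
    # Plain substring match: the whole term must appear verbatim.
    rows = conn.execute(
        "SELECT id FROM notes WHERE body LIKE ? ORDER BY id", (f"%{term}%",)
    )
    return [row[0] for row in rows]

def fts_search(query):
    # Full-text match: terms are ANDed, stemmed, and results ordered by relevance.
    rows = conn.execute(
        "SELECT rowid FROM notes_fts WHERE notes_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]

print(like_search("index upgrade"))  # no row contains that exact substring
print(fts_search("index upgrade"))   # finds the row containing both words
print(fts_search("indexing"))        # stemming also matches "index"
```

That covers word-order independence, stemming, and ranking with a few lines of setup, which is most of what an ad hoc analyst query needs.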



