
I've been working on news recommendation for the past few years and totally agree with your point. Often you just don't have enough history for CF to work.

I'd encourage you to go further than tf-idf: you can improve keyword extraction a lot and, based on that, improve your overall recommendations.

A simple, basic approach is to create a taxonomy of "entities" that you know are relevant for you. Often these are so specific that even if they appear only once in the text they have to be considered keywords, whether tf-idf says so or not. Of course, a text that merely mentions the entity in passing would also get it as a keyword, which may be wrong, but most of the time you'll be correct.

As an example, I don't know, "coconut oil soap". Will tf-idf ever surface it? Hard to say. Is it relevant to your business, and thus to your recommendations? I'd think even more so.
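
To make the idea concrete, here is a rough sketch of how the two signals could be combined. Everything in it is an assumption for illustration: the `TAXONOMY` contents, the function names, and the smoothed tf-idf formula are mine, not a description of any particular production system. Taxonomy entities found in the text are always kept, and plain tf-idf fills in the rest.

```python
import math
import re
from collections import Counter

# Hypothetical taxonomy of business-relevant entities (illustrative only).
TAXONOMY = {"coconut oil soap", "shea butter"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_keywords(doc, corpus, top_k=3):
    """Return the top_k single-word terms of doc ranked by a smoothed tf-idf."""
    tokens = tokenize(doc)
    tf = Counter(tokens)
    doc_sets = [set(tokenize(d)) for d in corpus]
    n_docs = len(corpus)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for s in doc_sets if term in s)
        idf = math.log((1 + n_docs) / (1 + df)) + 1  # smoothed idf
        scores[term] = (count / len(tokens)) * idf
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


def taxonomy_entities(doc, taxonomy=TAXONOMY):
    """Return taxonomy entities that occur in doc as whole-word phrases."""
    padded = " " + " ".join(tokenize(doc)) + " "
    return sorted(
        e for e in taxonomy if " " + " ".join(tokenize(e)) + " " in padded
    )


def extract_keywords(doc, corpus, top_k=3):
    """Taxonomy hits come first, regardless of tf-idf; tf-idf terms follow."""
    forced = taxonomy_entities(doc)
    scored = [k for k in tfidf_keywords(doc, corpus, top_k) if k not in forced]
    return forced + scored
```

Calling `extract_keywords(article_text, all_articles)` returns the matched entities first, even if they appear only once. The obvious trade-off is the one mentioned above: a passing mention is enough to trigger a match.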

Happy to chat about this anytime, shoot me an email.



