Reposts are very common on Hacker News. I wonder if incorporating the performance of the same links with differing titles might've yielded a better training set.
HN's posting guidelines ask you not to editorialize titles, so most reposts carry the exact same title (the title of the original article).
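For what it's worth, this is easy to check before deciding either way. A rough sketch, assuming you have submissions as a list of dicts with hypothetical `url`, `title`, and `score` fields (this is not the actual schema of any HN dataset): group by URL and keep only the URLs that were posted under more than one distinct title.

```python
from collections import defaultdict


def title_variant_groups(submissions):
    """Group submissions by URL and keep URLs posted under 2+ distinct titles.

    `submissions` is assumed to be an iterable of dicts with hypothetical
    keys "url", "title", and "score".
    """
    by_url = defaultdict(list)
    for sub in submissions:
        by_url[sub["url"]].append(sub)

    # A URL is only useful for title comparison if its titles actually differ.
    return {
        url: subs
        for url, subs in by_url.items()
        if len({s["title"] for s in subs}) > 1
    }


# Made-up sample data, purely for illustration.
submissions = [
    {"url": "https://example.com/a", "title": "Original title", "score": 3},
    {"url": "https://example.com/a", "title": "Original title", "score": 120},
    {"url": "https://example.com/b", "title": "First wording", "score": 5},
    {"url": "https://example.com/b", "title": "Second wording", "score": 80},
]

groups = title_variant_groups(submissions)
for url, subs in groups.items():
    for s in subs:
        print(url, repr(s["title"]), s["score"])
```

If the resulting dict is tiny relative to the number of reposted URLs, that would back up the point that most reposts reuse the original title, and there wouldn't be much signal to add to the training set.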