Show HN: Summarized Finance / Tech Newsletter Using Natural Language Processing (getthecrypt.com)
93 points by chidog12 on Feb 19, 2019 | 6 comments



Amazing work chidog12. I've tried using TextRank for text summarization before, but the results are mostly hit or miss. I've found that page elements like "Follow this reader" or "Share on Twitter, Facebook" frequently occur in these articles, and because of TextRank's vote-based ranking, they get picked up as high-ranked sentences. Which of the algorithms mentioned on your site were most effective in extracting useful summaries?
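
One way to work around this might be to filter out widget-like sentences before they ever reach the ranker. Below is a minimal sketch, not anything the site is known to use: the pattern list, the `drop_boilerplate` name, and the `min_words` cutoff are all hypothetical and would need tuning per source.

```python
import re

# Hypothetical patterns for follow/share widgets; extend these per site.
BOILERPLATE_PATTERNS = [
    r"^follow this\b",
    r"^share on\b",
]
_BOILERPLATE_RE = re.compile("|".join(BOILERPLATE_PATTERNS), re.IGNORECASE)


def drop_boilerplate(sentences, min_words=4):
    """Return the sentences that do not look like page widgets.

    A sentence is dropped if it has fewer than `min_words` words or if it
    matches one of the (assumed) boilerplate patterns above.
    """
    kept = []
    for sentence in sentences:
        text = sentence.strip()
        if len(text.split()) < min_words:
            continue  # very short fragments are usually buttons or labels
        if _BOILERPLATE_RE.search(text):
            continue  # matches a known widget phrase
        kept.append(text)
    return kept
```

The filtered list can then be handed to whatever summarizer is in use, so the repeated widget text no longer collects votes.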


Sorry for the late reply,

Honestly, we still need more testing.

However, our top 3 tend to be Luhn's Heuristic Method, Latent Semantic Analysis, and TextRank.

On the other hand, we have yet to use a single summary produced solely using SumBasic... I'll need to read into that one more.


I'm curious about your sentiment analysis process. Do you employ a particular threshold for neutrality, or does the team provide input?

No doubt neutral reporting is advantageous; personally I would find it helpful to also see both a highly positive and negative take. Coming at a topic from both high and low may, imho, improve accuracy of comprehension.


Hmm, I really like that last point you've made... both highly positive and negative takes, I'll look into that.

For the sentiment analysis process, we are using Microsoft's API, which returns scores from 0 to 1. For the most part it is pretty good, based on our views on neutrality (which of course have some inherent biases).

0.5 is considered pretty indifferent. However, we allow a range of 0.35 to 0.75 as neutral, and we specify its lean on our end.

On Reddit, we've tested our summaries in the comments as tl;drs, and we actually indicate the article's lean, for example: "This article is neutral with a negative lean on the topic".
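
Roughly, the labeling works like the sketch below. This is a simplified illustration, not our production code: the function name `describe_lean` is made up, and treating exactly 0.5 as plain "neutral" and anything outside the band as outright positive or negative are assumptions.

```python
def describe_lean(score, low=0.35, high=0.75, midpoint=0.5):
    """Map a sentiment score in [0, 1] to a neutrality label.

    Scores inside [low, high] count as neutral, with a lean toward
    whichever side of `midpoint` they fall on. Scores outside the band
    are labeled plainly negative or positive (an assumption).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < low:
        return "negative"
    if score > high:
        return "positive"
    if score < midpoint:
        return "neutral with a negative lean"
    if score > midpoint:
        return "neutral with a positive lean"
    return "neutral"


def tldr_note(score):
    """Build the sentence we append to a tl;dr comment."""
    return f"This article is {describe_lean(score)} on the topic"
```

So a score of 0.4 would come out as "This article is neutral with a negative lean on the topic".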

In finance, top publications do a great job of maintaining neutrality. However, if we ever get into politics... we'll need to take this a little more seriously!

Thanks for the comment.


I wonder how you choose which news or stories to deliver to users? From your recent link[1] it seems like the news is a bit on the short side.

1. http://getthecrypt.com/recent/


lol yea, we are still working on our selection process. Right now it's really just our preferences vs the criteria we've set up using NewsApi.



