
Interesting, how are you planning to collect articles for the news "sagas", and how often do you want to update information? I'd say articles written on the same day as some event will always have more noise than summaries written later.



So I plan to collect every link I can, and then sort them into "events" that are based on something that actually happened.

This just moves the problem to "which links to display prominently for an event". I'm not sure of the details yet, but I'll probably end up with some heuristics for which links are best, plus an option to manually override them. I also want to implement a fair bit of user-controlled filtering (think along the lines of https://pcpartpicker.com/products/motherboard/, except for news articles instead of motherboards).

Organizing links is, I think, best done manually in the end, but with automation to make it easier: for example, scraping RSS feeds and using keywords to suggest where I might want to put each link. I also want to make it possible for users of the site to add content (or, for high-profile articles, just suggest it, because spam sucks).
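To make the keyword idea concrete, here is a rough sketch of how a suggestion step might look. The saga names, keyword sets, and the `suggest_sagas` helper are all made up for illustration, and fetching and parsing the RSS feed is left out. It only ranks candidate sagas; the final placement stays manual.

```python
import re
from collections import Counter

# Hypothetical sagas and keyword sets, invented for this example.
SAGA_KEYWORDS = {
    "fusion-energy": {"fusion", "tokamak", "jet", "plasma"},
    "log4j": {"log4j", "log4shell", "cve"},
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def suggest_sagas(title, summary="", min_hits=1):
    """Return (saga, hits) pairs sorted by keyword hit count, best first.

    This is only a suggestion for a human editor, not an assignment.
    """
    words = Counter(tokenize(title + " " + summary))
    scores = []
    for saga, keywords in SAGA_KEYWORDS.items():
        hits = sum(words[k] for k in keywords)
        if hits >= min_hits:
            scores.append((saga, hits))
    # Highest hit count first; ties broken by saga name for stable output.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))

if __name__ == "__main__":
    print(suggest_sagas("JET tokamak sets fusion energy record"))
```

Plain keyword counting will miss synonyms and produce false positives, which is one reason to keep a human in the loop.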

In some sense this is "realtime", but without any plan to race to publish as quickly as possible. Email updates will be even less immediate: flagging "this change is big enough to send an email" will be a separate manual action. As for when that email gets sent, I'm thinking of letting people subscribe to a saga with daily (default), weekly, or immediately-on-change delivery.
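One possible way to model those subscription options, as a sketch only: the `Frequency` enum, the `email_due` function, and the rule that a subscriber's first email goes out as soon as a change is flagged are my assumptions, not a settled design.

```python
from datetime import datetime, timedelta
from enum import Enum

class Frequency(Enum):
    IMMEDIATE = "immediate"
    DAILY = "daily"      # intended default
    WEEKLY = "weekly"

# Minimum time between two emails for each subscription type.
MIN_GAP = {
    Frequency.IMMEDIATE: timedelta(0),
    Frequency.DAILY: timedelta(days=1),
    Frequency.WEEKLY: timedelta(weeks=1),
}

def email_due(frequency, last_sent, last_flagged_change, now):
    """Return True if a subscriber should be emailed now.

    last_flagged_change is when an editor last manually marked a change
    as email-worthy (None if never); last_sent is when this subscriber
    was last emailed (None if never).
    """
    if last_flagged_change is None:
        return False  # nothing has been flagged yet
    if last_sent is None:
        return True   # assumption: first email goes out right away
    if last_flagged_change <= last_sent:
        return False  # nothing new since the last email
    return now - last_sent >= MIN_GAP[frequency]

if __name__ == "__main__":
    now = datetime(2022, 1, 10, 12, 0)
    print(email_due(Frequency.DAILY, datetime(2022, 1, 10, 6, 0),
                    datetime(2022, 1, 10, 9, 0), now))
```

A periodic job could call this for each subscription; daily and weekly subscribers would then get at most one email per period, covering whatever was flagged since their last one.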


The BBC (in collaboration with The Guardian and PA) created an RDF ontology[0] a few years ago for describing "Story lines", i.e. stories that can span multiple articles over time.

That could be useful to use (or just take inspiration from).

[0] https://web.archive.org/web/20170719191914/http://www.bbc.co... - bbc.co.uk/ontologies is currently offline as a mitigation against the log4j incident, hence the archive.org link


You could run them through sentiment analysis too, categorise sources as left/right/middle, flag obvious content farms and bias, etc. I think it's been done before, but I can't remember any details off the top of my head.


Would be great to have a timeline with "AP reported that 1) ..." with a list of say five facts and then "these 274 sites reported the same facts within the same 24h" or something similar with another early source identified. You could then add data like "Twitter user $USERNAME reported that ..." earlier in the timeline, or what have you.
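A rough sketch of that timeline summary, assuming the facts have already been extracted from each article somehow (that is the hard part and is not shown). The `Report` type and `summarize` function are hypothetical, and "same facts" is simplified here to an exact match on the set of facts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class Report:
    source: str
    published: datetime   # as claimed by the publisher
    facts: frozenset      # assumed to be extracted elsewhere

def summarize(reports, window=timedelta(hours=24)):
    """Return (first_source, sorted_facts, follower_count), or None.

    follower_count is the number of later reports with exactly the same
    facts published within `window` of the earliest report.
    """
    if not reports:
        return None
    ordered = sorted(reports, key=lambda r: r.published)
    first = ordered[0]
    followers = [
        r for r in ordered[1:]
        if r.facts == first.facts and r.published - first.published <= window
    ]
    return first.source, sorted(first.facts), len(followers)

if __name__ == "__main__":
    facts = frozenset({"a", "b"})
    reports = [
        Report("siteA", datetime(2022, 1, 1, 15, 0), facts),
        Report("AP", datetime(2022, 1, 1, 10, 0), facts),
    ]
    print(summarize(reports))
```

Because it trusts the claimed publication time, this is exactly the kind of ordering that doctored timestamps would game.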

I guess if it got popular you'd get news/syndication sites doctoring their publication times to always appear first on the timeline.

Would be useful to look at something like a timeline of political events spanning a decade, or a topic like the JET (Joint European Torus project) with all the stories that made mainstream news.

You could add mashups like what the main entertainment news was, or what the other main stories were, or stock tickers, alongside the news timelines.

I guess focus and where to prune the branching news stories will be hard.

The UK PM's handlers constantly use the technique of burying 'boring' news by having him do something really stupid that is expected to fill news cycles for a few days. Their manipulation of the media has been masterful, pulling in all the learning from the USA; but I digress. The point is that it would be interesting to plot a timeline of apparent buffoonery, or short-lived outrage stories, alongside the actually important also-ran news stories that the buffoonery was there to hide.




