To elaborate on this, RSS-Bridge maintains custom extraction rules for many sites [1], whereas RSS-proxy attempts to do the same for any site using some pretty nifty logic [2]. I tried it on a few pages and it seems to do its job accurately, as long as the HTML is good enough.
Semantic HTML5 can be so helpful, especially for tools like this one, for crawlers, and for accessibility in general. Unfortunately, its use is not enforced by browsers (for good reasons), so many developers never leverage it - and we end up with many custom-built sites falling back to <div>s everywhere, JS click handlers on non-interactive elements, etc. I really wish modern framework docs and guides would point more strongly to accessibility concerns.
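To illustrate why semantic markup matters for this kind of tool, here is a minimal sketch using only Python's standard library. It is not RSS-proxy's actual logic (I haven't reproduced that); the class name, the helper function and the sample markup are all made up for the example. It just assumes each feed item is wrapped in an <article> with a heading and a link inside.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Hypothetical helper: collect a title and first link per <article>."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.items = []           # finished {"title", "link"} dicts
        self._current = None      # item being built while inside <article>
        self._in_heading = False  # True while inside h1/h2/h3

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._current = {"title": "", "link": None}
        elif self._current is not None:
            if tag in self.HEADINGS:
                self._in_heading = True
            elif tag == "a" and self._current["link"] is None:
                # Keep only the first link found inside the article.
                self._current["link"] = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self._in_heading = False
        elif tag == "article" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.items.append(self._current)
            self._current = None

    def handle_data(self, data):
        # Heading text inside an article becomes the item title.
        if self._current is not None and self._in_heading:
            self._current["title"] += data


def extract_items(html):
    """Return one dict per <article> element found in the markup."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up sample page, assumed for illustration only.
SAMPLE = """
<main>
  <article><h2><a href="/posts/1">First post</a></h2><p>Intro text.</p></article>
  <article><h2><a href="/posts/2">Second post</a></h2><p>More text.</p></article>
</main>
"""

if __name__ == "__main__":
    for item in extract_items(SAMPLE):
        print(item["title"], item["link"])
```

Feed the same content as nested <div>s and this finds nothing, which is the point: without semantic elements a generic tool has to guess at repeating structures instead.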
[1] https://github.com/RSS-Bridge/rss-bridge/tree/master/bridges
[2] https://github.com/damoeb/rss-proxy/blob/master/packages/cor...