
Cool, works better than other services like this I've found. I tried it on Ars Technica's review of the Xoom tablet and it found all 10 pages. It didn't find the embedded video, though. Also, all the formatting is stripped, which makes it hard to tell section headers from content paragraphs, and all the images end up in one list off to the side, removed from their original context.

What I'd really love to see is a combination of the RSS API and the article API to produce full article RSS feeds for any site.




Right now, the API just returns the raw text for simplicity's sake, but it would be possible to add an option for returning a bit of HTML structure, which would address the problem of sections, inline images, tables, etc.

The combination of the two APIs is a great idea.


It would be great if you could return a normalized and simplified version of the HTML structure. I know a lot of people who would be interested in this.
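
To make "normalized and simplified" a bit more concrete, here is a rough sketch of one way it could work, using only Python's standard library. This is not how the API does it; the tag whitelist, the `Simplifier` class, and the `simplify_html` function are all made up for illustration. The idea is to keep a small set of structural tags, drop every attribute except an image's `src` and `alt`, and throw away scripts and styles entirely.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist: the thread does not say which tags an API would keep.
ALLOWED_TAGS = {"h1", "h2", "h3", "p", "ul", "ol", "li",
                "table", "tr", "td", "th", "img"}
VOID_TAGS = {"img"}                      # tags that never get a closing tag
KEPT_ATTRS = {"img": {"src", "alt"}}     # every other attribute is dropped
SKIPPED_TAGS = {"script", "style"}       # dropped together with their contents


class Simplifier(HTMLParser):
    """Hypothetical helper: re-emits only whitelisted tags plus text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag itself, keep its text
        kept = KEPT_ATTRS.get(tag, set())
        attr_text = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in kept
        )
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def simplify_html(raw):
    """Return raw HTML reduced to the whitelisted structure."""
    parser = Simplifier()
    parser.feed(raw)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = '<div class="x"><h2 style="color:red">Intro</h2><p>Hello <b>world</b></p></div>'
    print(simplify_html(sample))
```

Headers, paragraphs, lists, tables, and images survive in their original order, so sections and inline images stay in context, while layout wrappers and inline styling disappear.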


Yes, Textile please!



