
I'm a huge wget fan! It's the core tech inside my archiving tool: https://github.com/pirate/bookmark-archiver



Very nice. I've been using wget by itself to archive various government pages, but that approach is reaching the end of its useful life as JS-heavy sites become more prevalent. You use headless Chromium for screenshots; is it possible to use it to execute a page's JS and save the resulting HTML?


Yes, headless Chrome's `--dump-dom` flag lets you dump the <body> HTML after the page loads. I opted not to do that in bookmark-archiver, since gluing it back onto the <head> code to get a working static page was complicated and error-prone.
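
For anyone who wants to try the `--dump-dom` route, here is a minimal sketch in Python. It is not how bookmark-archiver does things; the function names are made up for illustration, and the `chromium-browser` binary name is an assumption that will likely need adjusting for your system (it may be `google-chrome`, `chromium`, or a full path).

```python
import subprocess
from typing import List


def build_dump_dom_command(url: str, chrome_binary: str = "chromium-browser") -> List[str]:
    """Assemble a headless Chrome invocation that prints the rendered DOM to stdout.

    `chrome_binary` is an assumption: change it to match the local install.
    """
    return [
        chrome_binary,
        "--headless",     # run without a visible browser window
        "--disable-gpu",  # commonly passed alongside --headless
        "--dump-dom",     # print the DOM after the page has loaded
        url,
    ]


def dump_dom(url: str, chrome_binary: str = "chromium-browser", timeout: int = 60) -> str:
    """Run headless Chrome and return whatever HTML it prints.

    Raises subprocess.CalledProcessError if Chrome exits with an error,
    and subprocess.TimeoutExpired if the page takes longer than `timeout` seconds.
    """
    result = subprocess.run(
        build_dump_dom_command(url, chrome_binary),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

Calling `dump_dom("https://example.com")` and writing the returned string to a file gives you the post-JS markup. As noted above, turning that into a working static page (stylesheets, scripts, relative links) is the hard part, and this sketch does not attempt it.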



