
I'm not entirely sure I understand what is happening, though I have some grasp of the aim/purpose. I have also pulled pages using curl before. I'm just curious: does this mean you have to analyze their page (e.g. in the console), find the links like the ones you posted above for cloudfront, etc., and then assemble it so you can format it? This is not my field, but I use ublock/adblock plus and it's not always enough. Some videos have an overlay which I swear has a double-click counter where you have to open 2 ads before you can actually push the play button to play the video.

It is interesting to grab data and package it yourself through your own reader but I wonder if it loses the site's design/feel... but you're probably just after the information anyway.




"... I use adblock plus and it's not enough all the time..."

Next time this happens it could be useful to make a submission to HN about it.

We might be able to identify and/or solve the problem.

I do not use an ad blocker, nor do I use python or youtube-dl, yet I never see any ads, and I download all videos before watching them. Indeed, I access the web for the information, not the inconsistent design/feel.

What I posted above illustrates examples of two alternative approaches:

1. Block sources of undesired resources: ads, tracking, etc. The "links" listed are domains used for undesired resources.

2. Only make requests to sources of desired resources: the article and its accompanying images and video. The script extracts only what we want from the html page.
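To make approach #2 concrete, here is a rough sketch in Python, chosen only for brevity. It is not the script referred to above. The class name, the sample page, and the example.com/example.net hosts are all made up for illustration, and it assumes the article text sits in plain `<p>` tags, which real sites may not do.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class WantedResources(HTMLParser):
    """Keep only paragraph text and image/video URLs; ignore everything else."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.paragraphs = []   # article text, one entry per <p>
        self.media = []        # absolute URLs of images and videos
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "p":
            self._in_p = True
            self._buf = []
        elif tag in ("img", "video", "source") and attrs.get("src"):
            # Resolve relative links against the page URL.
            self.media.append(urljoin(self.base_url, attrs["src"]))

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            text = "".join(self._buf).strip()
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)


def extract(html, base_url):
    """Return (paragraphs, media_urls) found in an HTML string."""
    parser = WantedResources(base_url)
    parser.feed(html)
    parser.close()
    return parser.paragraphs, parser.media


# Hypothetical page: scripts and iframes are never looked at, so they are never requested.
SAMPLE = """
<html><body>
<script src="https://ads.example.net/track.js"></script>
<p>First paragraph of the article.</p>
<img src="/images/photo.jpg">
<iframe src="https://ads.example.net/banner"></iframe>
<video><source src="https://media.example.com/clip.mp4"></video>
</body></html>
"""

if __name__ == "__main__":
    paragraphs, media = extract(SAMPLE, "https://news.example.com/story")
    print(paragraphs)
    print(media)
```

The point is that nothing has to be blocked: only the URLs collected in `media` would ever be fetched.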

I use approach #2 more than #1.

As far as I know, ad blockers use blocking (#1) exclusively. They need to maintain a list of domains to block.
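For comparison, approach #1 boils down to something like the sketch below. The blocklist entries and the function name are hypothetical, and this only illustrates the idea of matching a request's host against a domain list; I am not claiming it is how any particular ad blocker works internally.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; a real one would be far longer and need constant upkeep.
BLOCKLIST = {"ads.example.net", "tracker.example.org"}


def is_blocked(url, blocklist=BLOCKLIST):
    """Return True if the URL's host, or any parent domain of it, is on the list."""
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    # Check "cdn.ads.example.net", then "ads.example.net", then "example.net", ...
    return any(".".join(parts[i:]) in blocklist for i in range(len(parts)))


if __name__ == "__main__":
    for url in ("https://cdn.ads.example.net/banner.js",
                "https://news.example.com/story"):
        print(url, is_blocked(url))
```

Any domain missing from the list gets through, which is why the list has to be maintained.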


I wasn't sure if you were serious about posting to HN about adblock, haha. I complain enough as it is about my pathetic life.

In the case of the videos, they're sourced from other domains. It's funny that the site's argument is "Google does this too technically so why are we different?" But I think the embedded iframes that contain the video players have their own ads. I don't know, it's hard to read JS code when it's minified (I'd have to un-minify it).

You mentioned python?

Anyway thanks for the response.



