
A few web crawlers[1] out there follow HTTP redirect headers without checking whether the URL scheme changes (this method is different from OP's but achieves the same goal).

So anyone can create a trap link such as

    <a href="file:///etc/passwd">gold</a>

or

    <a href="trap.html">trap</a>

Once trap.html is requested, the server responds with a redirect carrying the header "Location: file:///etc/passwd".

Then it's just a matter of sitting and waiting for the result to show up wherever that spider displays its indexed results.
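
For anyone writing a crawler, the defensive side of this can be sketched roughly as follows: resolve the Location value against the URL that was just fetched, and refuse to follow it unless the resulting scheme is on an allowlist. This is only a minimal sketch using Python's standard urllib.parse; the names safe_redirect_target and ALLOWED_SCHEMES are made up for illustration and are not taken from Scrapy or any other crawler.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical allowlist: the only schemes this sketch lets a redirect lead to.
ALLOWED_SCHEMES = {"http", "https"}


def safe_redirect_target(current_url, location):
    """Return the absolute redirect target, or None if its scheme is not allowed.

    current_url: the URL that produced the redirect response.
    location:    the raw value of the Location header (may be relative).
    """
    # Resolve relative Location values against the URL we just fetched.
    target = urljoin(current_url, location)
    # urlsplit() lowercases the scheme, so "FILE:" and "file:" are treated alike.
    if urlsplit(target).scheme not in ALLOWED_SCHEMES:
        return None
    return target
```

With this check in place, a response to trap.html carrying "Location: file:///etc/passwd" yields None and the crawler can drop the redirect, while an ordinary relative or http(s) redirect is resolved and followed as usual. The same check would also apply to the href of the first trap link before it is queued.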

[1] https://github.com/scrapy/scrapy/issues/457



