A few web crawlers[1] out there follow HTTP redirect headers and ignore a change in URL scheme (this method is different from the OP's but achieves the same goal).
So anyone can create a trap link such as
<a href="file:///etc/passwd">gold</a>
Or
<a href="trap.html">trap</a>
Once trap.html is requested, the server responds with a redirect carrying the header "Location: file:///etc/passwd".
Then it's just a matter of sitting and waiting for the result to show up wherever that spider publishes its indexed results.
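For anyone writing a crawler, here is a rough sketch of the check that would stop this, assuming you handle redirects yourself instead of letting the HTTP library follow them blindly. The function name safe_redirect_target and the ALLOWED_SCHEMES set are made up for illustration and are not taken from Scrapy or any other crawler.

```python
from urllib.parse import urljoin, urlsplit

# Assumption: this crawler should only ever fetch over HTTP(S).
ALLOWED_SCHEMES = {"http", "https"}


def safe_redirect_target(current_url, location):
    """Resolve a Location header against the current URL.

    Returns the absolute target URL, or None when the redirect
    would leave the allowed schemes (e.g. file://).
    """
    # Location may be relative, so resolve it first.
    target = urljoin(current_url, location)
    if urlsplit(target).scheme.lower() not in ALLOWED_SCHEMES:
        return None
    return target


if __name__ == "__main__":
    # The trap from above: the redirect is refused.
    print(safe_redirect_target("http://example.com/trap.html", "file:///etc/passwd"))
    # An ordinary relative redirect is still followed.
    print(safe_redirect_target("http://example.com/trap.html", "/next.html"))
```

The same check should also be applied to links pulled out of the page itself, since the plain file:// href works the same way if the crawler never looks at the scheme.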
[1] https://github.com/scrapy/scrapy/issues/457