Ask HN: Is Someone Hijacking Google Images?
173 points by squarefoot on March 18, 2019 | 72 comments
I was looking for ideas on how to build a simple network analyzer to test antennas, filters etc., so I typed "network analyzer schematic" (without quotes) into Google Images. It apparently returned some results I was expecting, but clicking on a lot of results from the first page opened subscription-only websites with suspicious names, nagging me to create an account to see the actual images, some of which I'm 100% sure I have already seen on their original authors' websites. Those websites are clearly made by the same entity, and to me it appears they're essentially hijacking Google Images results for profit. Here are some of those results; there are many more on the first page. I had a hard time finding something that returned an actual loadable image or an article without asking for a subscription. Note that they all return URLs containing "spectrum analyzer schematic" although I searched for "network analyzer schematic".

http://yyrfm.microdeo.de/spectrum_analyser_circuit_diagram.php

https://4.twizer.co/rf-spectrum-analyzer-schematic.html

http://5.6.gvapor.nl/rf_spectrum_analyzer_schematic.php

http://18.10.ulrich-temme.de/spectrum_analyzer_schematic_symbols.php

http://4.3.beckman-vitamin-d.de/vcr_tuner_based_rf_spectrum_analyzer_schematic.php

http://18.10.ulrich-temme.de/spectrum_analyzer_schematic_symbols.php

http://2.4.wohnungzumieten.de/gbppr_1_ghz_spectrum_analyzer_second_local_oscillator_schematic.php

http://5.6.gvapor.nl/rf_spectrum_analyzer_schematic.php

http://13.1.starpartybus.com/lm3915_spectrum_analyzer_schematic.php

Edit: it appears those pages are being slowly buried by legit results, but some of them still surface although much deeper.

Examples:

http://3.17.tierarztpraxis-ruffy.de/pna_x_block_diagram.php

http://10.10.artatec-automobile.de/block_diagram_power_antenna_wire.php

http://9.20.wohnungzumieten.de/logic_probe_with_sound_circuit_schematic.php

7.12.gvapor.nl/dds_function_generator_mcu_schematic.php

Note that I searched for the same exact phrase as above.
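The links above share a visible shape: one or two numeric subdomain labels in front of an otherwise ordinary domain, followed by a keyword-stuffed path. A minimal Python sketch that deduplicates a few of them and groups them by the domain left after stripping the numeric labels. The helper names are made up for illustration, and dropping leading all-digit labels is only a rough heuristic, not a real registered-domain lookup.

```python
from collections import defaultdict
from urllib.parse import urlparse

# A few of the URLs quoted in the post (duplicates left in on purpose).
URLS = [
    "http://5.6.gvapor.nl/rf_spectrum_analyzer_schematic.php",
    "http://18.10.ulrich-temme.de/spectrum_analyzer_schematic_symbols.php",
    "http://18.10.ulrich-temme.de/spectrum_analyzer_schematic_symbols.php",
    "http://2.4.wohnungzumieten.de/gbppr_1_ghz_spectrum_analyzer_second_local_oscillator_schematic.php",
    "http://5.6.gvapor.nl/rf_spectrum_analyzer_schematic.php",
    "http://9.20.wohnungzumieten.de/logic_probe_with_sound_circuit_schematic.php",
    "7.12.gvapor.nl/dds_function_generator_mcu_schematic.php",
]


def base_domain(url):
    """Hypothetical helper: drop leading all-digit labels from the host."""
    if "://" not in url:
        # One link in the post has no scheme; add one so urlparse finds the host.
        url = "http://" + url
    labels = urlparse(url).hostname.split(".")
    while len(labels) > 2 and labels[0].isdigit():
        labels.pop(0)
    return ".".join(labels)


def group_by_domain(urls):
    """Map each stripped domain to the sorted, deduplicated URLs seen on it."""
    groups = defaultdict(set)
    for url in urls:
        groups[base_domain(url)].add(url)
    return {domain: sorted(found) for domain, found in groups.items()}


if __name__ == "__main__":
    for domain, found in group_by_domain(URLS).items():
        print(domain, len(found))
```

Grouping this way makes it easier to see that a handful of domains account for many of the suspicious results, each under a different numbered subdomain.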




Isn't this basically what Pinterest is doing? All their images are stolen from elsewhere on the web (and the website behaves extremely obnoxiously when you try to actually view an image without an account), and yet Pinterest results consistently rank higher than the actual primary sources of the stolen images.


Pinterest should not even be allowed to display in Google Images: they force an app download on mobile, then by default hit you with nonsense notifications. Similar deal on desktop.

Google should get better at cataloging pinterest and stock photos in images, both break the normal rules for websites displaying in traditional search.

A recent Googler AMA pointed out improvements in Google Images; I hope that includes crushing this nonsense.


The whole Pinterest thing grinds my gears so much. It shows complete incompetence from Google as they let this kind of bullshit slide. Not to mention that Pinterest doesn't let you see the actual image without signing up. Like, seriously?


Pinterest allows you to create an account using a Google account. I suspect that Google gets a few users from them; that's why they allow it.


"please register so you can see the original image at 10% of its size with no link to the original site!"

How is this legal?



On desktop browsers, right click on the preview and 'View Image...' tends to get you the original full sized image.


Works sometimes, but not every time.


There's a browser extension that brings back the old View Image button for anyone who misses that feature

https://github.com/bijij/ViewImage


That's what 'tends to get' means.


It's not, but since copyright infringement is a civil infraction, not a criminal one, it's up to the victim to sue.


In some places, copyright infringement done as part of trade is a criminal offence.

If these sites are offering paid registration, or are including advertising, they may be committing a criminal offence in some regions.


Pinterest is the absolute worst. They're basically plagiarists at this point. I really don't like rehosters.


Why is Pinterest a plagiarist and Google is not? From your text it seems that Google owned those images in the first place.


The distinction is that Google (usually) is a useful tool; you can get to the source of the image from it.

It’s less about plagiarism, more about a dead end vs. a stepping stone.


Is Pinterest not a useful tool? I would say that it probably generates a lot of value for its hundreds of millions of users, and that value does not come from the source of the image, but of the image itself, and its relation to other similar images.


Pinterest is useful for the purpose you describe but what seems to have happened is they are dominating organic search for images in contravention of google's own guidelines.

All things being equal, Pinterest should have been penalised into non-existence by now?


It's not one bit useful for people searching for an image: Pinterest SEOs its way above other images, only for you to be met with a login wall I'm not gonna fill out.


I've just blocked them on my host... too many times have I fallen down a well of trying to find the origin of some image there. It's just not worth it.


I've had the same problem... It's super frustrating because sometimes an image on Pinterest is _exactly_ what you're looking for, but you need the related context to actually use it...


TinEye reverse image search is fairly useful in this scenario. Go to https://tineye.com/ , upload an image (or paste in a URL), and it'll show you other copies of that image all over the internet.

This doesn't always help you get to the bottom of things, but it is a next step to take.


I've never had that work for a Pinterest image that was giving me trouble.


I don't understand how they haven't been penalized.


They probably have some kind of a deal with Google. Same kind of a deal that LinkedIn has with Google... you can't view most of the LinkedIn profile pages that Google indexes without being logged in. If I served a different page to Google's search bot from pages that everyone else can see, I'd be banned from Google's index. But these big companies have special rules...


Currently I'm not near my computer to verify myself (I am on my phone, before bed), but can you get into walled-but-indexed LinkedIn/Pinterest content with "Googlebot" UA spoofing?

I'd want to assume that their systems are more advanced than that (or if they're really in cahoots, maybe Google crawls them with a unique, secret bot UA to prevent this sort of thing), but I just got done with another thread where an ex-Tumblr dev said all user content was in a single S3 bucket without MFA delete so my gut feeling is lending a bit more weight to the "everything is held together with rubber bands and duct tape and it's only a matter of time before the jig is up" parameter than the "large, powerful companies tend to hire smart people that communicate and design robust systems effectively" one.
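The Googlebot UA experiment described above can be sketched in Python with only the standard library. The crawler string below is the commonly published Googlebot one; the browser string and the function names are just illustrative. Whether LinkedIn or Pinterest key on the User-Agent at all (rather than, say, checking the crawler's IP address) is exactly the open question, so this is a probe, not a known bypass.

```python
import urllib.request

# Commonly published Googlebot User-Agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# An ordinary desktop browser string, used as the baseline (illustrative value).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0"


def build_request(url, user_agent):
    """Build a GET request that presents the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent, timeout=10):
    """Return (HTTP status, body bytes) for url fetched with user_agent."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.status, resp.read()


def compare_user_agents(url):
    """Fetch url twice and report the body size seen under each User-Agent."""
    sizes = {}
    for name, ua in (("googlebot", GOOGLEBOT_UA), ("browser", BROWSER_UA)):
        status, body = fetch(url, ua)
        sizes[name] = (status, len(body))
    return sizes
```

Calling `compare_user_agents()` on a walled page and seeing very different body sizes would hint that the site varies content by User-Agent; identical responses would suggest it checks something else.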


Based on my experience in the nautical industry, where most things were quite literally held together using velcro, metaphorical rubber bands and duct tape seems quite likely for relatively younger web industries.


You might look at pinterest as a giant image classification system, essentially adding more meta data to images for free.

That's perhaps the kind of thing google would find quite useful.


at this point neural classifiers outperform humans


I'm assuming those classifiers have to be trained on something in the first place?


sure but it's not like there is a lack of huge image datasets


There is a Chrome Extension called Unpinterested which removes pinterest from Google Search results.


Google has its own plugin that you can use to block results from search. I have been using it to block Facebook and a few other websites for years https://chrome.google.com/webstore/detail/personal-blocklist...


Unfortunately it doesn't actually seem to work reliably (certainly not for Pinterest), which is why this and other such extensions exist.


https://chrome.google.com/webstore/detail/unpinterested/gefa...

Edit: weird, it just adds "-site:pinterest.*" to all of your google searches. Even non-image searches. That's a very heavy handed approach. Uninstalling.
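What the extension reportedly does, appending an exclusion operator to the query, is easy to reproduce by hand. A small Python sketch, assuming the `-site:pinterest.*` operator quoted above and assuming `tbm=isch` is the URL parameter that selects image search; the function names are hypothetical.

```python
from urllib.parse import urlencode


def exclude_site(query, site_pattern="pinterest.*"):
    """Append a -site: operator to the query unless it is already present."""
    operator = f"-site:{site_pattern}"
    if operator in query.split():
        return query
    return f"{query} {operator}"


def image_search_url(query):
    """Build a Google search URL for the query (tbm=isch assumed to mean Images)."""
    return "https://www.google.com/search?" + urlencode({"q": query, "tbm": "isch"})


if __name__ == "__main__":
    print(image_search_url(exclude_site("network analyzer schematic")))
```

Doing this per search, instead of rewriting the search box on every page, keeps the exclusion limited to the queries where it is actually wanted.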


why would you ever want pinterest results, image search or not?


That was why I installed it. I never come across it on normal Google results, but there have to be better ways of doing it than modifying the DOM search input box directly.


Hmmmm, I dunno man. Sometimes a hammer just gets the job done.


I'm surprised they've made it this long. They shut us (eHow/Demand Media) down pretty quickly once we hit a certain point. Image search needs its own Panda update. Also, part of the problem is that Google no longer links directly to images, so now sites are taking full advantage of it.


I can't believe this didn't get more traction.

I have been noticing a very painful decline in Google results with more content not labeled as ads but clearly placed due to a metric other than usefulness. I routinely find myself multiple pages deep in searches, which never used to happen.

Mark my words, this is purposeful on Google's end, either directly through some kind of ordering affiliate or indirectly through some kind of SEO partnership.


FWIW, Google is about to put more ads[0] in the Image search results

[0] https://www.blog.google/products/ads/shopping-google-images/


That would align their interests with sites where clicking on the image gives a registration wall or paywall, because more users will bounce back to more ads.


It’s affiliate marketing content locking. I did this in the past but not at this scale. My guess is millions are being made by a network or a very smart and ambitious Russian.


I'd say this is just aggressive SEO, with you looking for a specific digital “product” of schematics. I get similar experience in regular search when looking for manuals for appliances and such.

As for ‘spectrum analyzer’ instead of ‘network analyzer’, it's probably Google's wonderful synonym substitution, which considers ‘music’ and ‘noise’ interchangeable, or similarly with ‘Saint Petersburg’ and ‘Moscow.’


Just for comparison I uploaded this image of both duckduckgo & google for the query above: https://imgur.com/a/1Fci80P

The layouts are different & it’s hard to tell if that impacts whether or not the results are sorted by dimension so they can fit the maximum number of images in the viewport; or if they just have slightly different ranking.

Probably a little of both, but image search results are a little harder to compare between search engines — or harder to compare than “plain” results.

My point is, while it’s completely fair to question Google’s results and doubt what is actually producing those results, it’s a lot more scientific to compare their results to at least one of their “competitors.”

Easier said than done, but until there are no comparisons available we should do this (maybe just to produce data so when you search “Which is better Google or ??????” there’s a result).


I think it must be happening on your end. When I search for the same thing I get normal images and none of the domains you list above.


It lasted for a while, then apparently Google started to filter those sites; not the domains though as some of them resurfaced much down the first page. I highly doubt it's on my side as I'm using Waterfox with the usual well known privacy oriented addons, some of which could block suspicious pages from loading but none of them adds content that wasn't in the search results. I just repeated the same search and got these:

http://2.8.combatarms-game.de/pna_x_block_diagram.php

https://two.ineedmorespace.co/2020_radar_block_diagram.php

http://3.17.tierarztpraxis-ruffy.de/pna_x_block_diagram.php

I had to take out doubles as some of these urls were returned identical although I was clicking on different images.


I can reproduce his results instead, using the same search terms. The first results do actually look legit, but after the first 10-20 I see those useless links too. This however doesn't seem to be new, at least for me... I've seen that already when searching manuals for example.


How does it happen on his end, when it is a google search?


Malware (specifically adware) browser extensions or malware on the computer. Edit: I mentioned malware at the ISP, but I guess Google has their TLS in order, so the ISP shouldn't see or be able to mess with the content like they did before.

Not necessarily saying this happened on the user end though, I just outlined how it could happen.


You should try going to www.google.com/ncr first and then see if your results are different.


What is that? It redirects me to www.google.com.


"no country redirect", sets a cookie to prevent you getting the google domain for your locale


and doesn't work anymore like you think it works

it keeps you on google.com

but you still get your market's search results and your language (thx Europe's "right to be forgotten" for the decline of /ncr functionality)


Yeah this happens on search as well, whenever you search for something very specific then the only few pages that show up contain these weird php sites for >50% of the irrelevant results & are mostly cached versions of the original site's content.


I got the same thing here, but had to scroll just a little. It's always the same popup on each site. It leads to a Czech Republic company, herbalfun, which offers downloads of games and movies for $40/mo. It looks like someone may be buying up recently expired domains and uploading a bunch of junk there, and Google is falling for it. Going to just the main site of any of those links shows just a long list of links for various strings.


What they do is basic search result manipulation.

I'm genuinely confused why the crawler/analyzer part of Google doesn't see and blacklist this (Context: I used to work as a search quality rater for Google)

Look at: http://3.17.tierarztpraxis-ruffy.de/pna_x_block_diagram.php Shorten it to: http://3.17.tierarztpraxis-ruffy.de/ (Weblink to admin@gmail.com - haha, funny)

Change it to http://www.tierarztpraxis-ruffy.de/ ... same But: http://webcache.googleusercontent.com/search?q=cache:DHZ9Sdt...

and this is just "blacklist and forget" stuff: Show one to the "user" and the other to the crawler. Maybe Matt should see this.
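The "show one to the user and the other to the crawler" pattern can be checked mechanically once you have both versions of a page, for example the cached copy and what a browser receives. A rough Python sketch that compares the visible words of two HTML documents using Jaccard overlap. The regex tag stripping, the 0.5 threshold, and all function names are assumptions for illustration; nothing here reflects how Google actually detects cloaking.

```python
import re


def visible_text(html):
    """Crudely strip scripts, styles, and tags; return lowercased text."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return " ".join(text.split()).lower()


def jaccard(html_a, html_b):
    """Jaccard overlap of the word sets of two HTML documents (0.0 to 1.0)."""
    words_a = set(visible_text(html_a).split())
    words_b = set(visible_text(html_b).split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def looks_cloaked(crawler_html, user_html, threshold=0.5):
    """Flag the page when the two versions share too few words (arbitrary threshold)."""
    return jaccard(crawler_html, user_html) < threshold
```

A low overlap is only a hint: legitimate sites also vary content by login state or region, so a flagged page still needs a human look.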


This happened to me the other day when looking for a high resolution image of a world map with country labels. For the life of me, I could not find one on Google Image search that led to a directly downloadable file. I finally gave up and used Bing (!) and got what I was looking for immediately (!!).


Yeah this happened to me too! I was looking for a simple outline drawing of the USA, with state-level outlines. All I found were pages and pages of sites letting me buy what I wanted for the low low price of `$morethaniwanttopay`

I don't recall what I did in the end, possibly DuckDuckGo'd it


The same thing is happening with PDFs. Google any popular business or finance book, append "pdf", and see the results for yourself. Google has been gamed by an affiliate marketer; my guess is it’s the same guy who was outed a while ago and had his address shared online. It was a HN headline not long ago.


Surprisingly, many of the above redirect to this page:

https://signup.peltmedia.com/en/html/sf/registration/eone.ht...

Seems like this is the same old "free download, click here" trick: when you click, it says subscribe to download.


I am highly suspicious that google was instrumental in centralizing the web. At some point, it started to increasingly only show results from the same dozen or two websites. At this point, if you want to search for something that is not hosted on one of the popular 'centralized' sites, you are out of luck.


Maybe these are hijacked wordpress sites. Anyway, I'm using this https://www.sdr-kits.net/VA5-Antenna-Analyzer-Kit . It only goes to 600 MHz and is cheap, but my needs are simple.


I got none of these. Maybe someone from google saw this and manually cleaned up the search results.

However, everything after scrolling three pages down is people's faces linking to "Sydex.net People Search"

My guess is "SEO" abuse. Google are always 3 years behind the SEO black hats.


I noticed something similar recently when searching for icon-like files.


Right now, Google Images is useless for icon type images. I've had to switch to using Bing images, which is currently better for smaller, square images that can be used or modified for icons.

Top tip: although Bing Images doesn't have an 'Exact Size...' option, I've found that suffixing your query with <X>x<Y> (e.g. '256x256') seems to work.


Could this be an effect of the Google filter bubble? What if you open an incognito session, disable tracking, and try searching on Google again?


Most of the time you can inspect the element and get the url to the actual image.
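The same idea works outside the browser when the image URL is present in the page source. A minimal Python sketch using the standard library's HTML parser to list `<img>` sources, resolved against the page URL; the class and function names are made up, and it will find nothing on pages that load images via JavaScript or hide them behind a login.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect absolute URLs from the src attribute of <img> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page's own URL.
                self.images.append(urljoin(self.base_url, src))


def image_urls(html, base_url):
    """Return the image URLs found in html, in document order."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.images
```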


i thought google images was for finding pinterest boards and stock photo sites


This is when you open up devtools and remove the pesky obstacles.


Are you searching through Google.com or a different domain suffix?


seems like mostly german results


unfortunately this has been going on with all of google search results since people figured out their algos could be gamed.



