
Sorting by relevance doesn't get you a representative sample - it bubbles PRs that say "amazing project" in the title to the top (which are indeed mostly spam).

If you restrict it to "in:title", you see that there are only 44: https://github.com/search?q=in%3Atitle+amazing+project+creat...

And if you sort by "newest" or "oldest", which provide a more representative sample, you can see that most of them are not spam: https://github.com/search?o=desc&q=amazing+project+created%3...




Why restrict to the title, though? From the parent reply it is obvious that the phrase isn't in the title, yet it is undeniably spam.


I'm trying to get a lower bound on spam PRs filed. If you sort by oldest (instead of by "relevance" as the parent commenter did) and do a spot check, you'll see that a lot of the PRs are legitimate: https://github.com/search?o=asc&q=amazing+project+created%3A...

Certainly, restricting to the title misses plenty of PRs. But his search returns 252 PRs, and I'd estimate that easily fewer than 100 of them are spam.


Typically the PRs are titled something like "Improve docs" and the change is to add "an amazing project" somewhere in the README.md.

Here's a search that finds a lot: https://github.com/search?q=in%3Atitle+improve+created%3A%3E...


This search makes the same mistake I mentioned above - you think it finds a lot because you're sorting by "relevant" and sampling.

Sort by newest and you'll see that the vast majority of these PRs are legitimate: https://github.com/search?o=desc&q=in%3Atitle+improve+create...


The results I got show 4 spam PRs out of 10 on the first page. Same on the second and third pages. 60% legitimate is not a vast majority, and 40% spam is not inconsequential when there are 86k PRs.
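
To put rough numbers on that, here is a minimal sketch of the arithmetic. It uses only the figures quoted above (4 spam out of 10 on each of three pages, 86k PRs in total). The Wilson score interval is my own addition to show how wide the uncertainty is with only 30 PRs checked, and it assumes those 30 behave like a random sample, which three consecutive pages of search results probably do not.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Figures from the comment above: 4 spam PRs out of 10 on each of 3 pages.
spam_seen = 4 * 3
sampled = 10 * 3
total_prs = 86_000

rate = spam_seen / sampled
low, high = wilson_interval(spam_seen, sampled)

print(f"point estimate: {rate:.0%}, about {rate * total_prs:,.0f} spam PRs")
print(f"rough 95% range: {low * total_prs:,.0f} to {high * total_prs:,.0f} spam PRs")
```

Even taken at face value, a 30-PR sample leaves a wide range, so the exact percentage matters less than the order of magnitude.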


That doesn't match my experience. I looked at the first and 5th page from my link. No spam PRs.

https://i.imgur.com/VgOteKy.png

https://i.imgur.com/aAZ8xbo.jpg


Presumably the spam onslaught is slowing down as time passes since the video was posted, so looking at the most recent results will give an ever smaller percentage of spam. If you look at page 100 of the results (for example, currently 18 hours old) there's a lot more spam.

For example: https://imgur.com/a/vxCHvcO
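
If the spam share really does change with how old the results are, one way to avoid arguing over which page to look at is to pick the pages at random before looking. This is a sketch only: `pick_pages` and `pooled_spam_rate` are hypothetical helpers, the 100-page range is taken from the page number mentioned above, and the tallies would still have to be counted by hand.

```python
import random

def pick_pages(max_page, k, seed=0):
    """Choose k distinct result pages uniformly at random from 1..max_page."""
    rng = random.Random(seed)  # fixed seed so others can check the same pages
    return sorted(rng.sample(range(1, max_page + 1), k))

def pooled_spam_rate(tallies):
    """tallies maps page number -> (spam_count, prs_checked), filled in by hand."""
    spam = sum(s for s, _ in tallies.values())
    checked = sum(n for _, n in tallies.values())
    return spam / checked

pages = pick_pages(max_page=100, k=10)
print("pages to spot-check:", pages)
```

Posting the seed along with the per-page tallies lets anyone else repeat the same spot check instead of comparing screenshots of different pages.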



