
It reminds me of the advent of scholarly databases. The main effect I saw was that researchers started using those databases exclusively, sometimes publisher-specific ones (so they were citing from only one publisher!), and missed all the papers that weren't indexed there — in particular a big chunk of the older literature that hadn't yet been OCRed (it's better now, but still not fabulous). This led to so many "we discovered a new X" papers that the older people in conference crowds were always debunking them: "that's been known since at least the '60s."

While these AI tools can clearly help with initial discovery around a subject, it worries me that they will reduce searching in other databases, or digging into a paper's references. It is often enlightening to unravel references and go back in time, only to realize that all the recent papers were basing their work on a false or misunderstood premise. Not to mention the cases where a citation was copied from another paper and either doesn't exist or has nothing to do with the subject. There was a super interesting article about the "mutations" of citations and how, using tools similar to those of genetic analysis, you could generate an evolutionary tree of who copied from whom and introduced slight errors that got reproduced by the next copier.
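For a sense of what that looks like mechanically, here's a minimal sketch (my own toy illustration, not the article's actual method; the citation strings are made up): treat each citation variant as a little genome, use edit distance as a stand-in for mutation count, and build a crude tree by single-linkage clustering.

    # Toy sketch: infer a "copying tree" of citation variants.
    # Edit distance between two variants stands in for mutation count;
    # single-linkage agglomerative clustering stands in for a phylogeny.
    from itertools import combinations

    def edit_distance(a: str, b: str) -> int:
        """Levenshtein distance via dynamic programming."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                 # deletion
                               cur[j - 1] + 1,              # insertion
                               prev[j - 1] + (ca != cb)))   # substitution
            prev = cur
        return prev[-1]

    # Made-up variants of one citation; B mutates a page digit, C inherits it.
    variants = {
        "A": "Smith, J. (1964) J. Chem. Phys. 41, 2345",
        "B": "Smith, J. (1964) J. Chem. Phys. 41, 2346",
        "C": "Smith, J. (1964) J. Chem. Phys. 41, 2346.",
        "D": "Smith, J. (1965) J. Chem. Phys. 41, 2345",
    }

    dist = {frozenset(p): edit_distance(variants[p[0]], variants[p[1]])
            for p in combinations(variants, 2)}

    # Each cluster is (tree-as-nested-string, set of leaf names); repeatedly
    # merge the two clusters with the smallest leaf-to-leaf distance.
    clusters = [(name, {name}) for name in variants]
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: min(dist[frozenset((a, b))]
                                      for a in clusters[ij[0]][1]
                                      for b in clusters[ij[1]][1]))
        merged = ("(" + clusters[i][0] + ", " + clusters[j][0] + ")",
                  clusters[i][1] | clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    print(clusters[0][0])  # prints ((C, (A, B)), D): C attaches next to B,
                           # its likely source, while D sits apart

A real analysis would use proper phylogenetic inference rather than single-linkage clustering, but the shape of the idea is the same.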

edit: various typos




Yes, but even the best scientists aren't born with knowledge of what came before. It has to be discovered, and where the discovery process is broken, it needs to be fixed. On the individual level, "spend hours chasing rumors about the perfect paper that lives in the stacks, find out the physical stacks are on a different continent, and then sit down and struggle through a pile of shitty scans that are more JPEG artifact than text" makes sense, because it's out of scope for a single PhD to fix the academic world. But on the institutional level, the answer that scales isn't to berate grad students for failing to struggle enough with broken discovery/summarization tools; it's to fix the tools. Make better scans, fix the OCR, un-f** the business model that creates a thousand small un-searchable silos of papers -- these things need to be done.


I think this is the paper https://arxiv.org/abs/cond-mat/0212043
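The fun part of that paper is the estimate itself. Here's a toy Monte Carlo of the underlying mechanism (my own hedged sketch, not the authors' actual model; all the probabilities are invented): each new citer either reads the original, occasionally introducing a fresh typo, or copies a random earlier citation verbatim, misprint included. Heavy copying doesn't so much change how many misprints exist as make one identical misprint show up over and over, which is roughly the kind of signal the paper exploits.

    import random
    from collections import Counter

    def largest_misprint_run(n_citers, p_read, p_typo, rng):
        # 0 = a correct citation; any other int = one specific misprint variant
        citations = []
        next_variant = 1
        for _ in range(n_citers):
            if not citations or rng.random() < p_read:
                if rng.random() < p_typo:
                    citations.append(next_variant)   # reader slips: fresh typo
                    next_variant += 1
                else:
                    citations.append(0)              # reader cites correctly
            else:
                citations.append(rng.choice(citations))  # copier: verbatim copy
        counts = Counter(v for v in citations if v != 0)
        return max(counts.values(), default=0)

    rng = random.Random(42)
    for p_read in (0.2, 0.9):
        avg = sum(largest_misprint_run(200, p_read, 0.05, rng)
                  for _ in range(1000)) / 1000
        print(f"p_read={p_read}: largest identical-misprint cluster "
              f"averages {avg:.1f} out of 200 citations")

With mostly copiers (p_read=0.2), one early typo snowballs into a big cluster of identical wrong citations; with mostly readers (p_read=0.9), typos stay as independent one-offs.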


> Our estimate is only about 20% of citers read the original

Oh no

That's basically the same as the percentage of people who actually read a news story before responding to or sharing the headline.


Really? Do we have numbers on that?


Here's an irony for you:

This link: https://insights.rkconnect.com/5-roles-of-the-headline-and-w... says "Only 22% only read the headline of an online news story, according to data from the Reuters Institute for the Study of Journalism."

But following the link to that study gets me to https://reutersinstitute.politics.ox.ac.uk/sites/default/fil... which… if it supports that claim, I can't seem to find where :P


I love it. Thanks!


That's the one.

There is also https://www.researchgate.net/publication/323202394_Opinion_M...

and there was yet another one, but I can't find it.


Can you share the article you’re alluding to?



