Hacker News
How to (seriously) read a scientific paper (2016) (science.org)
145 points by sebg 8 months ago | 62 comments



This misses the best skimming trick I learned from an advisor: After you read the abstract, read the last couple of lines/paragraphs of the introduction. That's where you'll find the authors' best summary of the paper's contributions and novelty. For instance, the first paper that popped up with some random search terms [0]:

    The present paper attempts to provide a structured and comprehensive overview of state-of-the-art automated driving related hardware-software practices. [...] The aim of this paper is to fill this gap in the literature with a thorough survey.

    The remainder of this paper is written in eight sections. Section II is an overview [...] Details [...] are given in Section III. Section IV presents [...] etc.
When you have a meter-high stack of papers for your lit review, comprehending long introductions and conclusions gets too expensive. You want to filter the irrelevant stuff as quickly as possible so you can spend time focused on the couple hundred papers you might actually read and cite.

[0] https://doi.org/10.1109/ACCESS.2020.2983149


Maybe I'm misunderstanding what you mean, but I don't think your trick was missed at all?

The second paragraph says:

>I start by reading the abstract. Then, I skim the introduction [...]

A few paragraphs down, another quote says:

>I always start with title and abstract. [...] I then read the introduction [...]

Another:

>I nearly always read the abstract first [...] I generally skim the introduction, reading its last paragraph

Is there something significantly different about what you are suggesting?


What I'm saying is similar to the third quote, where you start after the abstract by reading the last 1-5 lines of the introduction. Only if you want to proceed further do you dedicate the time to read or skim the entire intro, which can span multiple pages. You save tens of seconds and a lot of thinking this way, plus it helps frame things by giving a concise statement you can reference as you ask "how does what I'm reading build towards the paper's goals?"


In the early-2010s, computer vision papers started putting a bulleted summary of that paper’s contributions at the end of the intro. It’s an enormously helpful (and fairly recent) invention. I’ve been insisting on using that pattern in the papers I write since, not just because it helps readers understand the key ideas better, but because it also forces authors to clarify the main points to themselves.


That train's probably gone with the rise of LLMs? As in, in the 2020s it would more often than not just be delegated to a chatbot and called quits. At least that's my thought.


Maybe it depends on the literature. I almost never read the introduction, unless I am unclear on the motivation for the work from the abstract.

The fundamental problem with abstracts and introductions is that they're sales content. You can't trust them. The only thing you can do is decide if you wish to read further, and see if the paper says what they claim it says. For that, an abstract is sufficient about 99% of the time. Spending lots (or...any amount) of time on the intro before you read the data tables is basically asking to have time stolen from you.

For what it's worth, the proportion of papers where the body doesn't support the claims in the abstract is...high. Like way above 50%, in my experience (approaching nearly 100% for any random paper you find on HN or X.)

My algorithm is:

* read abstract. decide to continue.

* formulate my own version of the experiment in my head -- how would I go about answering this question?

* read all data tables, including supplemental materials. see if anything stands out as weird.

* look at all figures, including supplemental materials. see if anything stands out as weird.

* do I understand the data? if not, search for table/figure references from results, until I do (generally I end up reading most of the results, out-of-order).

* is this paper consistent with the way I'd have approached the problem? If not, is that important? what are the possible methodological flaws? what would they look like in the data?

* re-examine data to look for signs of flaws.

* does the data support the conclusion? if not, why not? what would I do differently or as a follow-up?

* finally, decide if the paper worth reading in detail.


In the literature I spend the most time with (archaeology/anthropology), the abstract is often focused on the broad applicability of the research to the field, which I either don't care about or already know (often both). The last line/paragraph of the introduction contains the specific contributions/findings of the paper. If and only if that's reasonable might it be worth investing the time to read further. I only rarely touch the data tables (assuming they even exist), because I don't want to spend an hour trying to understand whatever software nightmare the probably-not-computer-literate author(s) used and miss the details hidden outside the data in the explanatory text (e.g. samples were processed by throwing away all bits we couldn't identify!) that make the time spent useless.


Yeah, we come from very different fields. But I'll say this:

> the probably-not-computer-literate author(s) used and miss the details hidden outside the data in the explanatory text (e.g. samples were processed by throwing away all bits we couldn't identify!) that make the time spent useless.

This is super common in my own field, and it's top-five indication that the paper is bad. If I can't understand how they manipulated the data, unless I have some specific other interest, the entire paper goes in the trash. There's no excuse.

For example, it's pretty routine in biology and medicine to see high profile papers that invent some wild/complex "statistical pseudo-control", instead of actually doing a good experiment (which is hard, and probably why the question is unpublished). These papers are basically never worth reading.

See also: not using conventional statistical tests, not including uncertainty estimates, not using standard/vanilla models before jumping to crazy stuff. Not to be too glib, but "we entered the data into SAS/Excel/R/Matlab" is like a bright, flashing warning sign that you're about to see a clown show.


Unfortunately, most of the foundational papers in the field did exactly that, usually without explicitly stating so. It was pretty much the standard of practice for all bone analysis prior to ~2010 or so, when microstructure analysis became a practical thing. Similar issues exist in other types of samples, and it's still common today; it was just what came to my mind as the first example.


Yeah it depends heavily on the field and sometimes on the subfield.

Also the approach I use depends on how much I know about the field. If I know less, I weight the introduction more. If I know a lot, I skip almost all the sales and stage setting content. E.g. if it's a psych paper, often I jump directly to the results and only read other content if the results look reasonable. And so on.

So, I guess my only real point is that it's hard to take advice for how best to read a paper from someone (like your advisor) who knows the field much better than you do. You should probably gather as many tricks from the advice as you can, but maybe don't treat it as gospel and do lots of experimenting.


It's certainly true that it depends on the subfield and somewhat on your level of experience in that field. But the approach I outlined works pretty well, generally, and only serves to underscore how bad many papers are.

For example, I'm not going to be able to pick up a paper from CERN and do most of the steps I outlined, but if I did do that, and found an obvious statistical anomaly in the first table, well...now I'd have something interesting to ask about and dig into.


oh yeah, I agree. I was agreeing and expanding on your comment, not disagreeing with it.


Still waiting for an open discussion forum for scientific papers.

Like StackOverflow, but every post is a paper. And there are experts of various gradations that help with questions.


Better yet (IMHO), like HN. In my field, I know where to find papers, etc. I'm interested in other fields, and not every paper; just rank them by a HN-like algorithm.

I said a few days ago that in my dream HN would be mostly papers. That's where the richest, most accurate, most intellectually curious information is - far beyond most things on HN.


Most of the most intellectually curious things I've ever seen on the Internet were not research papers, so I don't know how you can make that claim.


It's a matter of opinion, of course.

But what are you reading? I'm very interested in finding more and better!

I find research papers to be an order of magnitude richer, more accurate, more thoughtful than almost anything else. Everything else is too slow and doesn't answer most of the questions I have. Research papers (and books) not only answer the questions, they take me much further than what I knew existed - what I'd hope for from experts.


I absolutely love a good book, but there are absolute gems of blog posts, videos from conferences, etc you deny yourself that way.


Still need some solutions: How do you find the absolute gems?

That they exist - of course - is no more helpful than pointing someone to a library of 100 million books and telling them there are some gems in there.

I think people are happy to scroll/browse around. Our time is very limited; if we read the 0.001% 'best' (however defined) material, we'd never come close to finishing. I don't want to spend much time on anything remotely average.


What is "remotely average" to you may be completely exceptional to someone else.

I just finished a book about Andrée de Jongh, a 23-year-old Belgian woman who led the Comet Line, one of the largest escape networks in Europe during WW2, helping return some of the men left behind at Dunkirk as well as downed pilots and aircrew.

She personally made ~24 round trips from Brussels, Belgium, to Gibraltar through Nazi-occupied France - a one-way distance of over 1,300 miles, crossing the Pyrenees.

I found the actual book not the best, but the story was amazing. I've since done a deep dive on the Comet Line and more about de Jongh.

Would you have read something like this? What % best would this be to you? I don't think I would be able to find anything similar in a research paper or anything in academia. It's the most interesting thing I've consumed in the last few months.

To answer your main question though:

> How do you find the absolute gems?

I consume a lot, and of those, some are excellent. Some are bad. I wouldn't know though unless I checked myself.


I understand your approach, and I think everyone does that to some extent, and most people do it to a great extent. I try to impose more discipline on my content choices, not always successfully. For me, it's paid off very well.

> What is "remotely average" to you may be completely exceptional to someone else.

I think that's taking relativism to the point of paralysis. While judgments will differ between people, that clearly doesn't make them useless.

> I found the actual book not the best, but the story was amazing. I've since done a deep dive on the Comet Line and more about de Jongh.

IMHO, that curiosity and exploration is the most important thing.

> Would you have read something like this? What % best would this be to you? I don't think I would be able to find anything similar in a research paper or anything in academia. Its the most interesting thing I've consumed in the last few months.

Honestly, I'm tempted by the story, but because you said it wasn't the best, probably not. Also, I work hard to limit my history and biography to serious, scholarly sources: I want to understand and learn as much of the reality of things as possible; we never actually perceive reality, of course, even in front of our noses (or especially then), but I find a lot of popular histories/etc are sensationalized or more biased.

You'd be amazed what you can find in scholarly sources. It's incredibly rich, fertile, beautiful, exciting stuff - if you're the curious type, far more than the popular sources. People just use the tools they're habituated to, and those lead to the 'popular' stuff - that was my situation too. Fortunately, I knew I just needed new habits and it would be just as easy.

So here are some tips if you (or anyone else) are interested (written assuming no familiarity):

* For browsing books, look at what university presses publish - which includes the pinnacles of the most brilliant people's life works, and which covers all sorts of fantastic ground you hardly know about. You can usually find reviews.

* Also, to learn about something in particular, use Google Scholar to look up research. Start with literature reviews - the expert reviews all that is relevant and presents it to you. From the reader's point of view, it's incredible - they do your homework for you, and they are experts in the field. There are entire review articles (Google Scholar has a filter for them), and the beginning of any scholarly paper has a literature review - just pick a recent one. Then you will know the landscape and can proceed from there.

* To skim a book, etc.; join the Internet Archive's lending library (free, quick signup), Hathi Trust, and Libby (via your local library) - all offer immediate, free checkout of electronic versions of books.


I think limiting yourself to research papers alone will deprive you of ideas that are just developing, which may not yet have an official body of research or even any published papers about them. Ideas we as a society or culture might only be beginning to tackle.

I find the discussion around a thing to be sometimes more interesting than the thing itself - a research paper might spark way more interesting discussions than the paper itself.

Take the "Attention Is All You Need" paper. How many have actually read it vs. how much has been written about it, or about the things it led to?

There is a lot packed into those 10 pages, but I've found the thousands of discussions around transformers and GPT to be way more interesting.


I'm not sure where the absolute statements came from - only research papers, etc.

> find the discussion around a thing to be sometimes more interesting than the thing itself

Here we differ. The discussions are mostly BS, mostly misinformation, ignorance, jokes, etc. It takes a lot of reading to find a few gems. What an expert writes, in their own domain, about something they've specifically studied in detail, is far more valuable IMHO, filled with beautiful things.

I don't rule out all discussion (obviously!). But think of the papers as comments in forums - or as blog posts - but instead of a misinformed hot take, the commenter did a bunch of research in the existing literature, carefully constructed an experiment, and tried out their idea themself - and furthermore, the commenter is an expert themself, like some people who post to HN.

It would still be great to learn where you find good material, of any kind.



Individual subreddits have usually been my go-to place to discuss and understand the nuances of academic papers. Love LocalLLama for discussions on generative AI papers.


I’m playing around with something similar here https://thetrecs.com/ . But no experts yet. Even though I do science, I have a hard time understanding abstracts from adjacent fields, so I’m using LLMs to make the abstracts easier for me and my friends to understand.


I think researchgate tried to do that, but I don't know how successful they've been.


https://openreview.net/ ?

Open Review allows public comments. It's open, you can see revisions, authors can link repositories, datasets, and websites. You can even submit papers there outside of review. But it is expected that you are posting your own work. So kinda like arXiv but a bit more. Here's an example with Mamba.

https://openreview.net/forum?id=AL1fq05o7H


It would be quickly overwhelmed, especially for complicated math papers


Maybe arXiv style rules? Low bar, but is a form of vetting. You keep it as a place for researchers and leave places like Reddit for the general masses.


Why is that?


I think this is what https://pubpeer.com/ wants to be


There are too many papers.


Reddit tried.


Yes please. That's the peer review of my dreams. What an extraordinarily valuable resource that would be.


> As editor-in-chief of Science....Generally, I start with the corresponding editors' summaries, which are meant for someone like me: a science generalist who is interested in everything but dives deeply only into one field. Next, I check to see if someone wrote a News article on the paper. Third, I check to see if there is a Perspective by another scientist.

This says so much about the volume of absolute crap that gets published in Science. No competent editor-in-chief should be looking to the news media to give background on a research topic.


I guess you didn’t read the article very closely. The Science editor was talking about Science News (note the capitalization) which is a well respected information resource.


Alas, I believe you two are not communicating!

Marcia McNutt, the (now former) editor-in-chief of Science, was certainly talking about the News section of Science magazine, the magazine of the AAAS.

That is, McNutt is talking about: https://www.science.org/news

This is not the same as "Science News" -- https://www.sciencenews.org/sn-magazine

The reason I think this is that the next sentence refers to a "Perspective" (also with initial capital, just like she used earlier with "News"):

> Third, I check to see if there is a Perspective by another scientist.

If you read Science regularly, you know there is a News section that has some "hot topic" stuff, like "Curiosity rover discovered ancient lakebeds on Mars", or "Arecibo telescope collapsed." It's right up front.

And following the News section of Science magazine, there is a Perspectives section that has contextual overviews of less "front-page" technical stuff that is still important, but more niche. ("A major advance in modeling of superconductors that could yield new materials in upcoming years" type thing.)

Perspectives are tied to a technical article, and News is sometimes, but not always.

I read the print version of Science during these years. It was outstanding. My recollection is that they didn't usually do both a "News" and a "Perspective" on the subject of a technical article. It was typically one or the other.

*

My personal opinion is that this order (I look at News and then at Perspectives) is perfectly fine. In particular (@timr), Marcia McNutt is not talking about reading "the news", as in "the newspaper."

Slagging Marcia McNutt as some kind of lightweight is all kinds of wrong. She's the president of US National Academies of Science, and that's just the top line (https://www.nasonline.org/member-directory/members/52683.htm...).

This kind of callow dismissal is one of the least attractive aspects of HN.


It isn't a "callow dismissal". I'm talking about science.org/news, just like you. It's written by career journalists [1]. It's fine for toilet reading.

And as far as credentialism goes...I don't care if McNutt is the Nobel-winning reincarnation of Carl Sagan. She's literally saying that her go-to for reviewing any new paper submitted to Science is to look at the work of journalists, to see if the article is noteworthy enough to continue. That's just wrong, if for no other reason than Science journalists don't cover stuff that isn't in a top journal (which is typically...Science!)

I didn't misquote her, and I'm not misinterpreting the meaning of her words. It's quite plain. All of the other responses were variations on...actually reading the paper. Hers is about using proxies to read the paper for her.

[1] Here are the bios of the writers for science.org/news:

https://www.science.org/content/author/eric-hand

I grant that a few have technical backgrounds or undergrad degrees, but all have subsequently become professional writers, or "science communicators".


> It was outstanding.

Fond memories. I've wondered if it, and Nature's equivalent, might be used for llm training? Hoping to get correct stories, let alone insightful and integrated ones, from textbooks, reddit, and wikipedia... seems unlikely to end well. Perhaps also survey papers. And parts of doorstop-tome-on-topic's. Maybe the intro and related work sections of research papers? Pity research talk intros and Q&As are rarely captured.


OK, first...Science News is trash. Absolute trash. Maybe it's better than e.g. the current NY Times Science reporter (who is so incompetent that it's mind-blowing), but it's still just not worth reading, except maybe on the toilet. They can (and do) say nonsensical stuff all the time.

(edit: I just realized that you might be suggesting that "News" in the quote refers only to "Science News". I don't agree, but even if so, it doesn't change anything about my argument. If you are editor-in-chief of a scientific journal, you shouldn't be delegating your analysis to reporters, no matter where they work. If you aren't experienced/knowledgable enough to see why a paper is important from the work in front of you, give it to someone who can. And if you don't want to do that, then reject it.)


Well, I don’t agree with your opinion of Science News, but that’s OK. What I was pointing out is that she was indeed referring to Science News (and also Science Perspectives) as sources for her reading and understanding of a paper.


Maybe that was implied by the capitalization of "News", but regardless, it's the same argument.

Reporters are not scientists. The whole thing is like a hilarious public announcement that the editor-in-chief of Science has Gell-Mann Amnesia.


If I'm interpreting it right, no, it's not the same argument. It sounds like she said she basically checks whether it's been talked about in Science beyond the paper itself. Seems reasonable for a publication to consider its own standards of publishing as good enough.

Science reporters are probably, hopefully, scientifically literate.


Science News (the one in Science magazine) is written by journalists.


I wouldn't rule it out entirely - a good news article can add context (e.g. via interviews with authors) that is otherwise not generally part of the paper itself.


But... why is that? If scientific papers were written well (and I'm implicating the science publishing guidelines + accepted culture of scientific writing in that, not just individual authors), shouldn't they self-contain their context to the level that they are comprehensible? Note, I'm not talking about spin - that doesn't belong. But if basic comprehension of the paper requires reading secondary sources, isn't that a fundamental problem in how scientific papers are written? (I'm asking this as a person with a PhD and post-doc, working as a data scientist now, but I have read many, many scientific papers.) I'll note - older papers (1960s, 1970s, etc.) do seem to include more color and more explanation. Some modern papers do too. But many are dry and kind of incomprehensible.


I see a lot of people who basically skip to the figures, and take them at face value, and read nothing more about the paper.

When I read a paper I start with the abstract and go to the methods. Then I inspect the figures carefully for the common tricks scientists play - axis tricks, overfitting, etc. - looking to see whether I'm convinced the figures actually came from running an experiment, or just from copying and pasting other gels (this is extremely common).

I used to do this: read the whole paper in order, recursively visiting each citation (often 2-3 levels deep); within a few hours, I was convinced the entirety of science is built on somewhat shoddy foundations.


The methods section is where I make my major judgment call. That's where you can tell whether they have a clue what they're doing and care enough to ensure the work can be reproduced. Which, after all, should be the main reason to publish a scientific paper based on experiments.


Yep. That's pretty much exactly my process for the sections.

Read Abstract.

Look over Methods.

Skim Conclusions.

And finally take a closer look at the Supplementary data (which is usually where I end up very happy or dissatisfied with the researchers)


What do you mean copy pasting? Like straight up fraud and fabricating results of a blot?


Yes, apparently copying gel images is fairly common. And the folks who do it don't give much thought to how easy it is to detect, by skilled human eye or by computer. https://www.nytimes.com/interactive/2022/10/29/opinion/scien... https://www.nytimes.com/2024/01/22/health/dana-farber-cancer... https://www.nytimes.com/2023/07/19/us/stanford-president-res...

I should say, when I learned about Elisabeth Bik it was a revelation because I never thought other people saw copy patterns in papers, although I always assumed it just happened accidentally (I've run gels before and the whole process is terribly manual and prone to mistakes).


What do you generally look for in the Methods section?


Specific details about how the data collection and processing was done. For example in my field there were a lot of fussy details that if they were not handled correctly (and most people did not handle them correctly) would lead to false positives/false negatives.

If something is outside my area of expertise, then I have limited ability to judge a paper based on its methods.


1. Read the abstract

2. Read the last paragraph of the intro

3. Read the first paragraph of the conclusion

4. Read any diagrams or figures

5. Inspect the citations

6. If all else fails, read the paper itself


I pretty much do this, but I also read the "Methods" section after the conclusion. You'd be surprised how quickly you can filter out a paper based on how they actually collected the data.

It's not uncommon for papers to use crappy sampling, or an unreliable method for measuring the key data point. Or they use the right method, but you can see they made a mistake in the process.

At least in my field (biochemistry) I'd always look at the methods as a hard filter.


Related:

How to seriously read a scientific paper - https://news.ycombinator.com/item?id=24986727 - Nov 2020 (90 comments)


This is similar to how I used to do it when I was still an academic, 20 years ago. My mental process is answering questions: is this an interesting paper with something to tell? Do the authors know their field (references)? Are they able to write a coherent and well structured article? If that all checks out, you start reading the paper in more depth.

The 2024 version of this would of course be "hey ChatGPT, summarize this article for me, list the key points, and then criticize the work". I expect article quality should improve a lot, simply because authors are also using LLMs to help them polish their articles. There's no excuse for submitting unpolished work at this point.

A lot of what I did as an academic was reviewing papers for workshops, conferences and journals. That's a chore that is part of academic life and getting good at that is a core skill.

It's the really bad papers that are the most work. Because after you have a suspicion that something is garbage (this is the easy part) you actually have to read the damn thing in order to deconstruct it properly and provide detailed feedback to the authors that is both fair and actionable. A well written paper is much easier to review and more fun to read. Unfortunately, well written papers are quite rare. Mostly you are dealing with awkward grammar and style (inexperienced researchers, non native speakers, etc.), people who are unaware of key references they should have cited, a lot of bullshit conclusions and shoddy methodology, poorly structured arguments, etc.

You can learn a lot about writing by reading a lot of bad writing and having others criticize your work. I'm really grateful for all the people who took the time to deal with my early efforts. It wasn't great, I know that. I'm not a native speaker. I had to learn writing on the job.


You'll note that the people in this thread who claim higher levels of expertise read the Methods section pretty closely.

For the scientifically-interested public, this is where it can all go wrong: we lack the background to properly assess the methods, never having done the work ourselves.

Once we've gone through a paper and it all seems reasonable to us, we have to really seek out expert opinion on the methods and see what critiques are out there--we're not going to be able to judge the paper's methods in a relative vacuum of actual experience.


Complementary: How to Write a Paper (University of Cambridge) http://www-mech.eng.cam.ac.uk/mmd/ashby-paper-V6.pdf


The correct way to read a paper is to start at the methods section and determine if their methodology was any good. If not, find another paper.


I think it really depends on where in your career you are? Like if you are near the beginning:

* read the whole intro, make sure to look up any parts of the problem description you don’t understand

* at least skim the body of the work. If you are trying to implement a code, make sure to understand all the algorithm blocks and note how each one corresponds to the overall work so you don’t end up implementing a special case.

* pay extra attention to the conclusion and skim the results

If you are experienced, I guess you don’t need advice, but I figure it is something like:

* skim the first couple sentences (for the problem) and the conclusion of the intro

* skip to the results section to see if the person is screwing you around with gamed metrics

* go back if it looks good and skim their ideas


1. Read the conclusion

2. Does the conclusion agree with the beliefs and politics of both you and your benefactors? Then it's good science. Does it disagree? Then it's misinformation.



