It's also important that people understand what "control for" means in an observational study.
Nearly all observational studies that you read draw their conclusions from regression analysis (or something, such as an ANOVA, which can be trivially implemented as a regression analysis).
This just means that a linear model such as this is used:

outcome ~ b_coffee * drinks_coffee + b_income_level * income_level + ...
Then, unlike in machine learning, researchers look at the values of the coefficients and, most importantly, the confidence in those values. In a simplified example:

years_of_life ~ b_coffee * drinks_coffee + b_intercept
Here b_intercept represents life expectancy in general, and b_coffee (since drinks_coffee is binary here) represents how many years coffee adds to or subtracts from your life: if it's negative, drinking coffee reduces your life expectancy, and if it's positive, it increases it. In statistics we also look at how certain we are about b_coffee, in terms of standard error and p-values. For example, suppose b_coffee is 5, meaning coffee adds 5 years to your life, but the standard error of this estimate is 4 (i.e. roughly a 95% chance that the real impact is between -3 and 13 years). In this case the p-value for the coefficient will be too high to conclude "statistical significance".
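To make the coefficient-plus-standard-error idea concrete, here is a minimal sketch (numpy only, with made-up simulated data, not any real study) that fits years_of_life ~ drinks_coffee by ordinary least squares and reports b_coffee with its standard error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: 500 people, roughly half drink coffee.
# The true effect of coffee here is set to 0 years; noise has sd 10.
n = 500
drinks_coffee = rng.integers(0, 2, n)
years_of_life = 78 + 0.0 * drinks_coffee + rng.normal(0, 10, n)

# Ordinary least squares for: years_of_life ~ b_coffee * drinks_coffee + b_intercept
X = np.column_stack([drinks_coffee, np.ones(n)])
beta, *_ = np.linalg.lstsq(X, years_of_life, rcond=None)
b_coffee, b_intercept = beta

# Standard error of b_coffee from the usual OLS covariance formula
resid = years_of_life - X @ beta
sigma2 = resid @ resid / (n - 2)
cov = sigma2 * np.linalg.inv(X.T @ X)
se_coffee = np.sqrt(cov[0, 0])

print(f"b_coffee = {b_coffee:.2f} +/- {se_coffee:.2f}")
```

Since the simulated effect is zero, the fitted b_coffee should land within a couple of standard errors of zero, which is exactly the "not statistically significant" situation described above.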
But suppose the standard error is very small, like 1 year, so that we are virtually certain coffee improves life expectancy. To control for, say, college education, we just add a term for it to our model:

years_of_life ~ b_coffee * drinks_coffee + b_college * has_degree + b_intercept
The magic of regression analysis is that if, in fact, people who go to college live longer and people who go to college also drink more coffee, then our coefficient b_coffee will change in this new model to reflect that. If, for example, it were to become negative (with a low p-value), what we would conclude from this model is that coffee is in fact bad for you, college is good for you, and it just happens that a lot of people who go to college also drink coffee.
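The coefficient flip is easy to see in simulation. This sketch (numpy only, with hypothetical numbers: college adds 4 years and makes coffee drinking more likely, while coffee itself costs 1 year) fits both models and compares b_coffee:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Hypothetical scenario: a degree raises life expectancy (+4 years) and
# raises the chance of drinking coffee; coffee itself is mildly harmful (-1).
has_degree = rng.integers(0, 2, n)
p_coffee = np.where(has_degree == 1, 0.8, 0.3)
drinks_coffee = (rng.random(n) < p_coffee).astype(float)
years_of_life = 76 - 1.0 * drinks_coffee + 4.0 * has_degree + rng.normal(0, 5, n)

def ols(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Model 1: years_of_life ~ b_coffee * drinks_coffee + b_intercept
b1 = ols(np.column_stack([drinks_coffee, np.ones(n)]), years_of_life)
# Model 2: add the control term b_college * has_degree
b2 = ols(np.column_stack([drinks_coffee, has_degree, np.ones(n)]), years_of_life)

print(f"coffee coefficient without control: {b1[0]:+.2f}")
print(f"coffee coefficient with control:    {b2[0]:+.2f}")
```

Without the control, b_coffee absorbs the college effect and comes out positive; once has_degree enters the model, b_coffee turns negative, matching the simulated truth.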
Just as even the most advanced AI is often just a lot of matrix multiplication with non-linear transforms, the vast majority of observational studies draw their conclusions from linear models. In practice, as you add more variables to your model, you tend to get results that are tricky to interpret. Regression analysis is a remarkably powerful tool, but it is important to remember, when you see these publications, that this is what is really happening, and there is a lot of room for subtlety when interpreting these models.
Yet all it takes is one missed factor and you've ended up with mere correlation.
As a simple question, what about people who have weak stomachs or hearts? My mother doesn't drink coffee because it "makes her heart beat too hard". How, with no actual medical data for "coffee makes my heart beat hard", do you control for that? Is that something to control for?
This is the "caffeine and healthy pregnancy" problem. We know women who consume less than ~200mg of caffeine tend to have healthier pregnancies, but if you can drink 5+ cups of coffee while pregnant and not get overtaken with nausea, that might indicate something is already wrong.
An important limitation which is often overlooked is that when you "control for" something by entering it into a regression, as you describe, you are only controlling for the linear effect of that thing.
It seems to me that this problem is totally fatal for large-scale epidemiological studies with many factors, of which many are sure to have nonlinear effects.
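This failure mode is simple to demonstrate. In the sketch below (numpy only, simulated data), a confounder z acts on the outcome purely through z**2 and also drives the exposure; "controlling for z" linearly leaves a large spurious exposure effect, while controlling for z**2 removes it:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Hypothetical confounder with a purely nonlinear (quadratic) effect:
# it drives the outcome through z**2 and also drives the "exposure".
z = rng.normal(0, 1, n)
exposure = (np.abs(z) > 1).astype(float)   # correlated with z**2, but not with z
outcome = 2.0 * z**2 + 0.0 * exposure + rng.normal(0, 1, n)

def ols(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

ones = np.ones(n)
# "Controlling for z" with a linear term does not remove the confounding...
b_linear = ols(np.column_stack([exposure, z, ones]), outcome)
# ...but controlling for the correct functional form (z**2) does.
b_quad = ols(np.column_stack([exposure, z, z**2, ones]), outcome)

print(f"exposure coefficient, linear control:    {b_linear[0]:+.2f}")
print(f"exposure coefficient, quadratic control: {b_quad[0]:+.2f}")
```

The true exposure effect is zero, yet the linearly-controlled model reports a large positive coefficient; only the model with the right functional form recovers it, which is exactly the limitation described above.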
Well put. I ultimately view observational studies as hypothesis generators that can spur research into more targeted questions all the way down to the biochemical level.