
Is that statistics? I think of statistics as "given historical data, infer future data". But it seems like what you want is to know which decision is best, which involves many more things (like estimating the impact/utility of each outcome); that seems more like economics?



Arguably it's just quibbling over a trivial terminological difference, but I get the feeling that you're thinking more about "Decision Theory"[1] (or "Decision Science") as opposed to "just Statistics". Decision Theory, of course, uses Statistics, and I suppose one could argue about whether one is just a subfield of the other, or about exactly where the dividing line is.

[1]: https://en.wikipedia.org/wiki/Decision_theory


When I took statistics in college, we started with a rather basic definition, something to the effect of: "A statistic is a function performed on a set." Statistics studies what you can infer when you know something about a set, but not everything about it, namely its precise contents. Often, what you are told about a set is something about its probability distribution, thus linking probability and statistics together.

A useful parallel can be drawn with situations involving measurements and data, since data often have the same feature of telling us something but not everything. This is what I believe makes statistics so useful for science.


I've come around to the idea that statistics is a form of data compression. It isn't so much "infer future data" as it is "if the data we have is representative of all data, what is a number/equation that represents this data?", usually with a certain framing.


Economics is also statistics.


> Economics is also statistics.

Economics is not Statistics (and definitely not statistics).

Most of the discipline focuses on testing models and making inferences on observational data. The techniques for dealing with that sort of data, of course, build on Statistics, but their nature is different enough that there is Econometrics.

A large part of economics is not empirical at all -- despite the fact that people get Nobel prizes pretending otherwise.

Even in the context of experimental economics, since the behavior of the observed varies depending on the mode of observation, the most straightforward Statistical methods (the ones designed for engineering/chemistry/biology-type experiments) are not directly applicable (although it is great when they agree with the fancier methods).


>A large part of economics is not empirical at all -- despite the fact that people get Nobel prizes pretending otherwise.

I'm not sure which parts of the field or which prize winners you are talking about. To be clear: you think economics is _not actually empirical_, but people are awarded Nobel Prizes for _pretending that it is_? That's a little odd. Let me know if that's not what you meant.

When you look at this list:

https://en.wikipedia.org/wiki/List_of_Nobel_Memorial_Prize_l...

Who satisfies that condition, in your mind? Who is getting the prize on the basis of pretending that economics is empirical?


> To be clear: you think economics is _not actually empirical_

That is a misrepresentation of what I said.

To be clear, here is what I said:

>>A large part of economics is not empirical at all

As an example, Kahneman's Nobel is solely a product of taking an axiomatic theory and designing experiments in which regular people, who are not actually being paid according to their performance, are gently prodded into violating the axioms in weird settings. It is attractive to people who want to claim that clearly the plebes cannot be allowed to choose for themselves as they are not "rational".

The only meaning of "rational" in Economics is that individuals choose the best alternative according to their preferences among a constrained set of alternatives. Here an "alternative" or "bundle" is a point in the entire commodity space.

The only test of this is consistency with GARP (the Generalized Axiom of Revealed Preference): a choice is not rational if a feasible and more preferred alternative exists.
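
For concreteness, a GARP check over observed choice data is mechanical enough to sketch. This is only an illustration (the function name and data layout are mine, not anything from the thread): from prices and chosen bundles, build the revealed-preference relation, take its transitive closure, and look for a strict reversal.

    import numpy as np
    from itertools import product

    def garp_violations(prices, choices):
        # prices[t], choices[t]: price vector and chosen bundle at observation t.
        # spend[t, s] = cost of bundle s at the prices of observation t.
        n = len(choices)
        spend = np.array([[float(np.dot(prices[t], choices[s])) for s in range(n)]
                          for t in range(n)])
        # x_t is directly revealed preferred to x_s if x_s was affordable
        # when x_t was chosen: p_t . x_t >= p_t . x_s.
        R = spend.diagonal()[:, None] >= spend
        # Transitive closure of the relation (Floyd-Warshall style).
        for k, i, j in product(range(n), repeat=3):
            if R[i, k] and R[k, j]:
                R[i, j] = True
        # GARP violation: x_t is revealed preferred to x_s, yet x_s was chosen
        # when x_t was strictly cheaper (p_s . x_s > p_s . x_t).
        return [(t, s) for t in range(n) for s in range(n)
                if t != s and R[t, s] and spend[s, s] > spend[s, t]]

    # Made-up data that violates GARP: x_1 was affordable when x_0 was chosen,
    # yet x_0 was strictly cheaper at the prices where x_1 was chosen.
    prices  = [np.array([1.0, 1.0]), np.array([1.0, 3.0])]
    choices = [np.array([2.0, 0.0]), np.array([0.0, 2.0])]
    print(garp_violations(prices, choices))  # -> [(0, 1)]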


There are actually several economists on this list, like Victor Chernozhukov, Guido Imbens, and Susan Athey...


I think of it like this:

Suppose I want to make a decision about whether to hedge for a market crash right now. Statistics can tell me the likelihood of a crash, and how bad it might be. But if the market crashes, and very badly, how might that affect my life? To make a good decision I would need to think of all the things that come with a market crash (job loss, savings loss). This is not statistics.

I could again use statistics to say what is the chance I lose my job given a market crash (say 70%). But then I would need to estimate the impact on my life should I lose my job (stress, etc.). This is not statistics. But it should very well factor into my ability to do back-of-the-napkin math on whether I should hedge or not.
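
To make that back-of-the-napkin math concrete, here is a rough sketch of the calculation. Only the 70% figure comes from above; every other number is an invented placeholder. The point is just that the statistical estimates and the personal impact estimates end up in the same expected-value arithmetic.

    # Back-of-the-napkin sketch of the hedging decision described above.
    # The 70% job-loss-given-crash figure is from the comment; every other
    # number (crash probability, costs, hedge payoff) is a made-up placeholder.

    p_crash = 0.10             # assumed statistical estimate of a crash
    p_job_loss_given_crash = 0.70
    hedge_cost = 2_000         # assumed cost of the hedge (e.g. put options)
    hedge_payoff = 30_000      # assumed payout if the crash happens
    loss_if_crash = 50_000     # assumed savings hit in a crash
    loss_if_job_lost = 80_000  # assumed lost income, stress, etc., in dollars

    def expected_loss(hedged: bool) -> float:
        # Expected cost over the two scenarios (crash / no crash),
        # folding in the chance of losing the job if the market crashes.
        crash_loss = loss_if_crash + p_job_loss_given_crash * loss_if_job_lost
        if hedged:
            crash_loss -= hedge_payoff
        return p_crash * crash_loss + (hedge_cost if hedged else 0)

    print("expected loss, unhedged:", expected_loss(False))
    print("expected loss, hedged:  ", expected_loss(True))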


If your decision substantially involves or derives from making an estimate about a population based on a sample, it is statistics. "Making decisions under uncertainty" is well-studied in statistical literature, just like "quantifying uncertainty" is well-studied. It sounds like you think the latter is "actual statistics", but these things are both statistics.

In particular:

> But if the market crashes, and very badly, how might that affect my life? To make a good decision I would need to think of all the things that come with a market crash (job loss, savings loss). This is not statistics.

This is all statistics, not just the part where you're forecasting the likelihood of the market crashing. The reason is that making decisions about the future under the constraints of uncertainty implicitly involves a forecast. When you decide how to diversify your personal investment portfolio, how much to allocate to your Roth versus traditional IRA or 401k, etc., you are making forecasts about which allocation will provide you with a more favorable outcome.

Stated more concisely: there is no rational reason to use statistics for forecasting market events but not for deciding what to do in the event specific market events occur.


This is exactly statistics. This is an expectation of a utility function with respect to some distribution.
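
Spelled out in the usual notation (nothing specific to the market example, just the standard criterion): pick the action that maximizes expected utility under your distribution over outcomes,

    a^* = \arg\max_a \mathbb{E}_P[U(a, X)] = \arg\max_a \int U(a, x)\, dP(x)

where P is your (estimated) distribution over outcomes X and U encodes the consequences (job loss, savings loss, stress, ...) of taking action a when x occurs.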


> Statistics can tell me the likelihood of a crash

Statistics cannot tell you any such thing.


Do you mean to say that nothing can tell you such a thing?

What is a likelihood, but a statistic?

If there is any method to determine a statistic, it seems reasonable to me to say that that method involved statistics.

(Now, of course, except for possibly where quantum randomness is relevant, which might be quite often, I'm fairly confident that the only probabilities are subjective or relative to some set of assumptions, or something along those lines, because the future "already exists". But, given some fixed priors and some fixed evidence, there should in principle be a well defined probability of such a crash. So, insofar as people's priors match up, there should, in principle, be a common well defined probability given "the information which is publicly available", or also, given whatever other set of evidence.)

Of course, that doesn't mean it is computationally tractable to compute such a probability.
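
For what it's worth, the "in principle well defined" probability here is just the usual posterior from Bayes' theorem (standard notation, not a claim about any particular market model):

    P(\text{crash} \mid \text{evidence}) = \frac{P(\text{evidence} \mid \text{crash})\, P(\text{crash})}{P(\text{evidence})}

The priors fix P(crash) and the likelihoods; the evidence does the updating. The intractability is in actually computing those terms.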


> But, given some fixed priors and some fixed evidence, there should in principle be a well defined probability of such a crash.

:-)

How do you test this model?

It is easy to find things that fit one of the previous crashes.

Given that there is only one realization of history, the data we have is consistent with any model that puts a non-zero probability on a crash.


Well, what I gave isn't exactly a model of the market, so much as "a description of having a model of the world".

So, I'm not sure what you mean by "test this model".

You can refine your model-of/beliefs-about the world, by continuing to look at the world and make observations.

And obviously your beliefs should include a non-zero probability of a crash. That follows from non-dogmatism/Cromwell's rule.

And yeah, there is only one "realization of history" (or at least we can only observe one, which is practically the same thing). This doesn't produce any difficulty, because probability isn't defined by the proportion of trials in which the event occurred.

Probability is about degree of belief (or, belief and/or caring).

edit: I suppose you can also evaluate how calibrated your beliefs have been, which is kind of like testing a model.
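
A minimal sketch of what such a calibration check could look like (the data is invented; scoring a track record of probability judgments with a Brier score and bucketed hit rates is just one common way to do it):

    # Minimal calibration check for a list of (forecast probability, outcome) pairs.
    # Illustrative only; the data below is invented.
    forecasts = [(0.9, 1), (0.8, 1), (0.3, 0), (0.6, 1), (0.2, 0), (0.7, 0)]

    # Brier score: mean squared error of the probabilities (lower is better).
    brier = sum((p - y) ** 2 for p, y in forecasts) / len(forecasts)
    print("Brier score:", round(brier, 3))

    # Bucketed calibration: within each probability band, did events happen
    # about as often as predicted?
    for lo, hi in [(0.0, 0.5), (0.5, 1.0)]:
        bucket = [y for p, y in forecasts if lo <= p < hi or (hi == 1.0 and p == 1.0)]
        if bucket:
            print(f"forecasts in [{lo}, {hi}): observed rate {sum(bucket)/len(bucket):.2f}")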


> Probability is about degree of belief (or, belief and/or caring).

Not at all.

Probability is a countably additive, normalized measure over a sigma algebra of sets.
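
Written out, that's the Kolmogorov definition: for a sigma-algebra \mathcal{F} of subsets of a sample space \Omega,

    P : \mathcal{F} \to [0, 1], \qquad P(\Omega) = 1, \qquad
    P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i)
    \quad \text{for pairwise disjoint } A_i \in \mathcal{F}.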

> This doesn't produce any difficulty, because probability isn't defined by the proportion of trials in which the event occurred.

You misunderstand the point.

Let's say you provide me a distribution of crash probabilities for every trading day for the next three months.

We all ought to know that P(event) = 0 does not mean the event is impossible. Therefore, P(event) = 1 does not mean "not event" is impossible.

What would allow one to state that your model is consistent/not consistent with the one observed history of events over the three months, regardless of whether there is a crash or not?

You have to come up with this criterion before observing the history.


Ok yes, that’s the definition of a probability measure. But I was talking about the concept of probability in the world, contrasting it with “objectively defined via frequency in related trials”, which is something people sometimes claim. I misunderstood and thought that was the claim you were making.

Ok.

I would think that, if we have a continuous distribution, then the score should be the probability density of what is observed?

If you say beforehand “I think x will happen”, and I respond “I assign probability 1 that x will not happen”, and then x happens, then I’ve really messed up big time. I’ve messed up to a degree that should never happen.

(And, only countably many events can be described using finite descriptions, and a positive probability could, in principle, be assigned to each, while having the total probability still be 1, so that nothing that can possibly be specified happens while being assigned a probability of 0. Though this isn’t really computable.)
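
(In the simplest case, where the finitely-describable outcomes are enumerated as mutually exclusive possibilities e_1, e_2, ..., one concrete such assignment is

    P(e_n) = 2^{-n}, \qquad \sum_{n=1}^{\infty} 2^{-n} = 1,

so every describable outcome gets strictly positive probability.)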

As a more practical thing, if I assign probability 0 to an event which you could describe in a few sentences in under 5 lines (regardless of whether you actually have described it), and it happens, then I’ve really messed up quite terribly, and this should never happen (outside of cases where I just made an arithmetic error or something).


> As a more practical thing, if I assign probability 0 to an event which you could describe in a few sentences in under 5 lines (regardless of whether you actually have described it), and it happens, then I’ve really messed up quite terribly, and this should never happen (outside of cases where I just made an arithmetic error or something).

I think this conversation has reached an impasse.

https://en.wikipedia.org/wiki/Cantor_set


I'm familiar with the Cantor set, and I know it has measure 0. Just because you can succinctly describe the Cantor set, which has measure 0, doesn't mean I've messed up. If I assign a uniform distribution over [0,1] to some number outcome in the world, and an element of the Cantor set is the result, then I've messed up. But when we measure numbers in the world, we don't measure specific real numbers, as all our measurements have some amount of error. So that can't happen. We can measure that the result is in some interval, and that this interval contains some element of the Cantor set, but what we observed is not something that I assigned 0 probability to. Like, heck, every interval will have a rational number in it, and the rational numbers also have measure 0.

"the measured value is in the cantor set" isn't a thing that we can observe to have happened.

("the value, when rounded to the finite amount of precision that our measurement has, is in the cantor set" is something that would have positive probability, under the uniform distribution over the interval, and therefore something I shouldn't assign a probability of 0.)



