Kelly Criterion (2007) (r6.ca)
164 points by kwikiel on Nov 19, 2018 | 96 comments



The Kelly Criterion was the subject of an incomprehensibly bitter argument in the 1970s/1980s. Paul Samuelson, considered by many to be the greatest economist of the 20th century, believed the Kelly Criterion was wrong. And not just wrong but SO WRONG that anyone who believed it was an idiot.

The kind of idiot who could only understand single-syllable words. So he wrote a paper in the Journal of Banking and Finance in words of only a single syllable saying why no one should use the Kelly Criterion.

http://www-stat.wharton.upenn.edu/~steele/Courses/434/434Con...


Samuelson's argument basically comes down to the fact that the investments that maximize growth don't maximize your utility unless your utility is log(wealth), and to him it was completely obvious that people should act to maximize their own utilities.

However, there have been a couple recent developments that somewhat undercut his line of thinking. First of all, to the extent that people's behavior can be described by utility theory, we all roughly have isoelastic utilities (https://en.wikipedia.org/wiki/Isoelastic_utility) with a parameter not too much greater than one. As a result, the ideal investments for most of us are near Kelly.

Second, there's a body of theory that asks what the optimal strategy for investment is given a certain number of years, and it turns out that the ideal strategy is to follow the Kelly approach until near the deadline and then to switch to the investments that maximize your own utility as the time to act grows short. This work is known as turnpike theory for reasons that never really made sense to me.

There are still concerns about Kelly investments, but Samuelson's argument isn't really one of them.


> investments that maximize growth don't maximize your utility unless your utility is log(wealth)

This needs a slight adjustment to be exactly true:

> investments that maximize growth don't maximize your utility at a specific time horizon unless your utility is log(wealth)

The reason is that in situations of repeated investment, Kelly's policy actually does maximize many time-indefinite notions consistent with linear utility -- for example, if you have a fixed wealth goal (retirement) that is enough successful wagers away, Kelly will minimize the expected time to get there [1].

For me, the important limitation for Kelly is that it is designed around timescales that involve many many repeated bets. Like most asymptotic results, Kelly can win marathons and lose sprints -- so it's important to consider which situation you're in.

[1] https://www.stat.berkeley.edu/~aldous/157/Papers/Good_Bad_Ke...


> Second, there's a body of theory that asks what the optimal strategy for investment is given a certain number of years, and it turns out that the ideal strategy is to follow the Kelly approach until near the deadline and then to switch to the investments that maximize your own utility as the time to act grows short. This work is known as turnpike theory for reasons that never really made sense to me.

Could you give some examples? I searched for "turnpike" and "kelly criterion" together and I couldn't find any examples of situations for which people had proven that the Kelly criterion was optimal even for nonlogarithmic utility functions.


Thank you for this insight! Would you be so kind as to provide some pointers to the concerns you’ve mentioned? Any pointers on turnpike theory would also be appreciated!


The irony is that using single syllable words makes it less understandable.


Interesting. I don't know much about economics or the Kelly criterion, but my understanding of the argument is that maximising the growth rate is distinct from maximising actual profit arbitrarily far into the future.

Would be cool if there was a simple toy example demonstrating a situation where these differ. Perhaps it even differs for the example in OP.


My understanding of the argument is that you have to assume you keep playing after you lose, so your stake is dictated by the risk. For the example in the article, the potential upside is 110% profit and the potential downside is 100% loss, so the optimal stake is relatively conservative in order to prevent any loss from impacting your future ability to invest.

Imagine you get the expected result of one win and one loss and had staked 5%: you'll end up with 1.055×0.95 ≈ 1.002 wealth. If you had picked a 50% stake, you'd be sitting at 1.55×0.5 = 0.775 wealth instead, since the loss more than erases your winnings.

The unstated contrast is to models where e.g. there are only a limited number of investment opportunities, losses aren't total, or there are capital infusions.
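
To see how this compounds, here is a minimal Python sketch extending the arithmetic above over many rounds of the article's game. The deterministic alternating win/loss sequence is just a simplifying assumption for illustration:

  # Article's game: a win earns 110% profit on the stake, a loss
  # forfeits the whole stake. Alternate wins and losses deterministically.
  def wealth_after(rounds, fraction):
      w = 1.0
      for i in range(rounds):
          if i % 2 == 0:
              w *= 1 + 1.1 * fraction   # win
          else:
              w *= 1 - fraction         # loss
      return w

  for f in (1/22, 0.05, 0.5):           # 1/22 is the Kelly fraction here
      print(f, wealth_after(100, f))
  # Small stakes grind upward; the 50% stake decays, since each of its
  # win/loss pairs multiplies wealth by 1.55 * 0.5 = 0.775.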


Apologies for the confusion but I was talking about Samuelson's argument against the Kelly criterion.

If I was to make one gamble per day and wanted to maximise my profit after 50 years, I don't know if I'd use the Kelly criterion. I need to think about it.


If you just want to maximise the average amount of money that you end up with then you should bet your entire fortune every day. Of course you will be bankrupt if you ever lose your bet, which is very likely over the course of 50 years. But in the ridiculously unlikely event that you won every bet your fortune would be even more ridiculously large. So the average amount of money would be very large, even though you would very probably be bankrupt.


I see. What if I wanted to maximise the expected profit, given that the chance of bankruptcy should not exceed 1%? Does that sort of problem become difficult to solve?


Just always keep a cent in reserve. Then you can never go bankrupt and your average winnings are barely affected.


That's a good point.

Sorry for the late reply (don't know if you'll see this!) but I happened to be thinking about this again and the question popped into my head: what if I wanted to maximise a specific percentile (e.g. the 5th percentile) of my total winnings?


Fixing the percentile seems akin to the gambler's fallacy: since losses below a certain point don't matter, you just have to bet everything when you fall off, like you would if you thought after a big loss streak you were "owed" a win.


How is the growth rate determined? Since presumably it varies, is it averaged in some way?


I wrote a reply, but accidentally made it top-level:

It's the expected growth rate, so yes it would vary in a real-world instance. I wondered for a while why they've taken the logarithm, but I think it's just because growth models are normally defined exponentially (with the rate being a parameter inside the exponential) - it shouldn't make a difference to the result.

As for varying, that Samuelson article says:

> For N as large as one likes, your growth rate can well (and at times must) turn out to be less than mine - and turn out so much less that my tastes for risk will force me to shun your mode of play.


If a big growth rate is a composition of a bunch of small changes with their own growth rate, then the overall change is the product of the individual ones, so geometric mean is appropriate.
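
For instance, a tiny Python illustration of why the geometric mean is the natural average for compounded growth factors (the numbers are made up):

  # Compounding multiplies growth factors, so the representative
  # per-round factor is the geometric mean, not the arithmetic mean.
  factors = [1.5, 0.75, 1.2, 0.9]
  total = 1.0
  for g in factors:
      total *= g
  geo = total ** (1 / len(factors))
  print(total)                   # overall growth: 1.215
  print(geo ** len(factors))     # rebuilt from the geometric mean: 1.215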


This is a pretty weird straw man argument. The Kelly Criterion is not, like, a lifestyle. It's a way of exposing how opportunities to reinvest should impact your investment sizing... so how useful it is really depends on whether you are making investment decisions on a time horizon where reinvestment opportunity is a big part of the picture.

Does anyone know if Samuelson was responding to a particular strain of thought where people wanted to use Kelly for everything?


>> should make us change our views on gains and losses - should taint our tastes for risk.

Isn't "losses" two syllables?


Did Samuelson ever apologize for embarrassing himself with such a rude and irrational insult? Did the journal ever publish a retraction or apology for publishing a bad-economics article written in incomprehensible language?


I find it pretty distasteful that someone should be able to publish a paper like that. Big names should be just as answerable to review and quality control as everyone else.

Also, 'Losses' is two syllables. Thanks for playing Paul.


Scientists and academics are not above humor: http://iopscience.iop.org/article/10.1088/1751-8113/44/49/49...


Humor is fine. Mean-spirited, incoherent rants that add nothing to academic knowledge are not.


I read through it. Aside from the mention of 'flavors' I think I'm missing the humor.


The abstract simply reads "Probably not."


Oh. I guess I went straight to the blocks of text.


There is also the two-syllable "better" halfway down page 2...


Also, Q.E.D. technically references 2 polysyllabic words.


Should have just used the square. QED is a bit pretentious anyways


The main flaw of the Kelly criterion (along with a number of other results in investment theory, like Markowitz allocation) is that in practice it's extremely difficult to know the distribution of the result you're betting on.

The mean and variance of a prospective investment are not observable. But more to the point, if you try to use some sort of proxy like a sample mean or standard deviation, you'll get inconsistent results over time. We're a long way from the clean, simple, i.i.d world that theorists like to play in.


> in practice it's extremely difficult to know the distribution of the result you're betting on.

This of course depends on your use case. For example, the Kelly Criterion is a staple of advantage gambling. For most casino games, the probability distribution can be calculated with absolute accuracy. I'm sure there are other use cases where the probability distribution is also fairly stable.


Does Kelly work when "ruin" is not such a clear cut permanent barrier? I can be "ruined" today but have more money next payday.


This would be less a Kelly issue and more an issue of the bankroll variable you're using. Advantage gamblers generally find the concept of "sessions" meaningless - essentially, one's entire gambling career is just one long session. So if for example you have $1K per week of expendable income that you wish to use for the bankroll, and you're willing to risk that amount for a year, then you'd likely want to approach the Kelly calculation with a $52K bankroll. You would have wild swings within that year, and very short sessions when you lost (because you couldn't play again until your next paycheck), but you would still be betting the correct amount for the total risk you're willing to take.


This is also an argument for finding investments that allow you to better define the downside risk. I think this is why static investments or hedges have such value; they may not change the expected value, and might even reduce it (based on mean estimates), but they reduce or eliminate the estimation error in your downside risk, allowing you a much more certain calculation of leverage.


Insurance in general has a reduced expected value while also hedging against downside risk.


In such cases, what models are more effective?


I don't know of anything that's guaranteed to be effective out of the box. But as a rule, simpler is better. Any assumptions should be picked apart with a fine-tooth comb and tested for robustness. But in the end, there's not a whole lot that you can do: just cross your fingers and hope.


You could probably model something that uses sampled variance less aggressively


Something Bayesian


One thing I find really interesting about the Kelly Criterion is that it exposes a very stealthy and fundamental "rich get richer" phenomenon.

Most real-life risks have minimum and maximum investment amounts, meaning that you can't just size the bet exactly as Kelly says. So if your wealth is low, you cannot rationally participate in many risky but positive-expected-value investments.

Simply put, the poor can't take many worthwhile risks (think college!) without risking ruin (and sub-optimal growth). Conversely, the rich can come closer to maximizing EV in many risky markets at once, increasing income and growth while even decreasing variance.


Ex hedge fund guy here. You can extend this into continuous space and use that to tell you how much leverage you should have, given some Sharpe ratio.

Results may surprise you (it's a lot for even a modest Sharpe). But also, most practitioners aren't going to use the full number: if you've overestimated your edge, you are always worse off on the high side of the optimum than on the low side.
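
For the curious, a sketch of the standard continuous-time version of that calculation: for a strategy with annualized excess return mu and volatility sigma, the growth-optimal (Kelly) leverage is mu/sigma^2, i.e. Sharpe/sigma. The numbers below are made up for illustration:

  # Continuous-time Kelly: optimal leverage = mu / sigma^2 = sharpe / sigma.
  def kelly_leverage(sharpe, vol):
      return sharpe / vol

  print(kelly_leverage(1.0, 0.15))   # Sharpe 1.0 at 15% vol -> ~6.7x
  print(kelly_leverage(0.5, 0.15))   # even Sharpe 0.5 -> ~3.3x
  # Running above the true optimum hurts more than running below it,
  # which is why practitioners who doubt their Sharpe estimate
  # deliberately use a fraction of the full Kelly number.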


I've written a blog post about exactly that some time ago; I hope it's at least a somewhat enjoyable read: https://blog.millionintegrals.com/kelly-criterion-and-invest...


It's common to use some percentage of the Kelly-calculated stakes in betting; Kelly may give the optimal bet size, but it is extremely aggressive. IIRC, even if you are calculating your % edge correctly, Kelly staking means that at any point you have a 50% chance of losing 50% of your existing bank at some time in the future. Not many investors/gamblers can stomach that!
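
A Monte Carlo sketch of that drawdown claim, assuming a repeated even-money bet with a 55% win probability (full Kelly then stakes 2p - 1 = 10% per bet); the parameters are made up for illustration:

  # Estimate the chance that full-Kelly betting ever halves the bankroll.
  import random

  def ever_halved(rounds=10000, p=0.55, f=0.10):
      w = 1.0
      for _ in range(rounds):
          w *= (1 + f) if random.random() < p else (1 - f)
          if w <= 0.5:
              return True
      return False

  trials = 2000
  print(sum(ever_halved() for _ in range(trials)) / trials)
  # Comes out near 0.5, matching the claim above that full Kelly has
  # about a 50% chance of halving your bank at some point.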


I mean, it's not acceptable to use it if there are no stakes below a certain point and no way to borrow more money. Or if the stakes below a certain point have worse returns (higher house take)


This. The article concludes:

> So what is the answer to the question, how much stock should I buy? The answer is probably less than you think.

But in my experience, optimal log-wealth portfolios look like the craziest guy on your desk. Kelly accepts huge drawdowns that most people would get fired (or fire themselves) for. Half Kelly or quarter Kelly looks more like most professionals' instincts about acceptable risk.


And, given the need to allow for errors in estimation and the eventual deterioration of many alpha-generating strategies, adjusting the investment fraction downward (a sort of Bayesian prior, I suppose) is sensible on its own, sans vol sensitivity considerations.


The Kelly Criterion is the subject of an absolutely incredible book by William Poundstone called "Fortune's Formula".

In the course of discussing the formula, the book takes you through the birth of the MIT blackjack team, the genesis of statistical arbitrage, and mini biographies of people like Claude Shannon and Ed Thorp. I can't recommend it highly enough.


Don't forget the organized crime connection. It was an absolutely fascinating read.

My dad was a daytrader (read: armchair gambler) but this helped him curb his trading - he wasn't ready to do all the statistical analyses to keep rigorously investing.


The most interesting thing to me about the Kelly criterion is that it demonstrates that the martingale system[1] is a bad strategy even if the odds are in your favor!

While it's immediately obvious that the martingale is bad if the odds are in the house's favor, it's less obvious that you are likely to go bankrupt with the martingale even if the odds are slightly in your favor (assuming the house's bankroll is much greater than yours).

1: https://en.wikipedia.org/wiki/Martingale_(betting_system)
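
Following up on that claim with a simulation sketch, assuming even-money bets at a 52% win probability and a martingale that doubles a $1 base stake after each loss (all parameters made up for illustration):

  # Martingale (double the stake after each loss) with a slight edge,
  # played against a finite bankroll.
  import random

  def survives(bankroll=1000, base=1, rounds=100000, p=0.52):
      stake = base
      for _ in range(rounds):
          bet = min(stake, bankroll)    # can't stake more than we have
          if random.random() < p:
              bankroll += bet
              stake = base
          else:
              bankroll -= bet
              stake *= 2
          if bankroll <= 0:
              return False
      return True

  trials = 200
  print(sum(survives() for _ in range(trials)) / trials)
  # Despite the positive edge on every bet, long loss streaks eventually
  # exceed the bankroll, and the survival rate keeps falling as the
  # number of rounds grows.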


This massively oversells the usefulness of the Kelly Criterion. The opening lines are

> One should buy stock when it is undervalued. What I have always wondered about is how much stock one should buy. A few months ago I stumbled upon the answer which is given by the Kelly criterion.

But the rest of the post analyses a mathematical game which has nothing to do with buying stocks, and is in fact only useful in theoretical situations where you know the precise distribution of outcomes.


Heh... The most value I got out of applying the Kelly Criterion was getting the Exploration Manager of an oil and gas company to admit that he did not believe the estimates that came out of his own department (applying the Kelly Criterion to the numbers he provided would have led to the company carrying the exploration project internally, rather than looking for joint venture partners to share the risks).


>> But the rest of the post analyses a mathematical game which has nothing to do with buying stocks, and is in fact only useful in theoretical situations where you know the precise distribution of outcomes.

You might want to think about that. In the real world you don't know anything with precision, which means you probably can't do better. One of the unrealistic aspects of it was the idea that you can play as many times as you like, but that's not possible in the real world. But we can move toward playing many times through diversification. That's a good lesson in itself.


I don't think diversification is much like playing multiple games at all.

One expands on the axis of time, with the game constant. The other expands on the number of unique games, all played simultaneously.


Always a good topic to discuss. Many applications to high-frequency trading due to the probabilistic nature of outcomes.

Here are 2 previous HN discussions:

https://news.ycombinator.com/item?id=13143821

https://news.ycombinator.com/item?id=2504222


I've only barely looked into the Kelly Criterion, but can someone explain the intuition behind maximizing the expected value of the log of your wealth? Trying the same derivation mentioned in the article but without the logarithm: the expected value comes out to 0.5×(1 + 1.1×f) + 0.5×(1 − f) = 1 + 0.05f which would make it seem that betting the entire fraction always maximizes your expected value. But why does this reasoning break down in the long term, and why does maximizing the log seem to make it work?


Maximising the expected return of a single throw is different to maximising the growth rate over many throws. This is perhaps a bit unintuitive, for the reason that the expectation doesn't tell you a great amount about a distribution. You could imagine a similar game in which one choice of bet results in an enormous return with a tiny probability, or else ruin. Whilst your expected return method would tell you to repeatedly make this choice, doing so is clearly insane.


The reasoning doesn't break down in the long term. If you repeatedly bet your entire fortune then you will have a very high probability of losing all of it, but you will also have a very small probability of having a huge amount of money. If your utility really is linear then this is worth it.


Thought experiment: Suppose you bet 100% of your bankroll every round. If at any point you lose a round, your bankroll is now $0. Any money you made from former rounds is for naught. Whoops.

  bankroll_final = (bankroll_initial)(round_1)(round_2)(round_3)(...)(round_n) 
  $0             = (bankroll_initial)(210%   )(210%   )(210%   )(...)(0%     ) 
Taking the "non-log" Expected Value would be optimal if your bankroll were "renewed" to the same constant each round. Because the outcome of each individual round would be independent from other rounds.

  Bankroll_final = (bankroll)(round_1) + (bankroll)(round_2) + (...) + (bankroll)(round_n)
But since the outcome of each round depends on previous rounds, we want to optimize for the Expected Value of Growth, for which we'll use a geometric mean rather than an arithmetic mean. Also,

  EV[bankroll_final] = EV[(bankroll_initial)(round_1)(round_2)(round_3)(...)(round_n)]
is equivalent to

  EV[bankroll_final] = EV[(bankroll_initial) e^(ln(r_1) + ln(r_2) + ln(r_3) + (...) + ln(r_n))]
which allows us to discuss wagers in terms of e^(x) and in terms of growth rates.


Betting everything maximizes your mean wealth. Using the Kelly Criterion maximizes your median wealth.

Let's say there are 1024 (2^10) people with $1 each, and a bet which is 50 / 50 of returning either 3x or 0x (i.e. you bet $1 and either get back $3 or $0). Let's further specify that there will be 10 rounds of betting.

Using the "bet everything" strategy, you end up with the following distribution of outcomes:

10 wins, 0 losses -- 1 person: $59,049
1 or more losses -- 1023 people: $0

The average ending balance is $57.67, while the median ending balance is $0.

Using the Kelly Criterion, one will bet 25% of the bankroll each time. You will end up with the following distribution of outcomes.

10 wins, 0 losses -- 1 person: $57.66
9 wins, 1 loss -- 10 people: $28.83
8 wins, 2 losses -- 45 people: $14.41
7 wins, 3 losses -- 120 people: $7.20
6 wins, 4 losses -- 210 people: $3.60
5 wins, 5 losses -- 252 people: $1.80
4 wins, 6 losses -- 210 people: $0.90
3 wins, 7 losses -- 120 people: $0.45
2 wins, 8 losses -- 45 people: $0.22
1 win, 9 losses -- 10 people: $0.11
0 wins, 10 losses -- 1 person: $0.05

The average ending balance is $3.25, while the median ending balance is $1.80.

Incidentally, the choice of maximizing the median is somewhat arbitrary. You can also use a similar approach to maximize, say, the 25th percentile outcome, at the expense of average and median outcomes (You would bet 10% of the stake each time, yielding a 25th percentile of $1.08, a median of $1.25, and a mean return of $1.28).
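
A short script reproducing the mean and median figures above — 10 rounds of a 50/50 bet that returns 3x the stake or nothing, across all 1024 equally likely win/loss sequences:

  # Final-wealth distribution when betting a fixed fraction f on a
  # 50/50 triple-or-nothing bet: a win multiplies wealth by 1 + 2f,
  # a loss by 1 - f.
  from math import comb

  def distribution(f, rounds=10):
      return [(comb(rounds, k), (1 + 2*f)**k * (1 - f)**(rounds - k))
              for k in range(rounds + 1)]

  for f in (1.0, 0.25):                  # bet everything vs. Kelly
      dist = distribution(f)
      people = sum(n for n, _ in dist)               # 1024
      mean = sum(n * w for n, w in dist) / people
      acc, median = 0, 0.0
      for n, w in sorted(dist, key=lambda t: t[1]):  # walk outcomes upward
          acc += n
          if acc >= people / 2:
              median = w
              break
      print(f, round(mean, 2), round(median, 2))
  # f=1.0 -> mean ~57.67, median 0; f=0.25 -> mean ~3.25, median ~1.80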


If you're allowed to bet a different proportion of your money in each turn then you can push the median even higher than the Kelly criterion. For example consider the strategy that bets nothing if it has more than $2.98, whatever is needed to reach $2.98 on the next bet if it has less than $2.98, or bets everything if it can't reach $2.98 on the next bet. This strategy has a greater than 50% chance of ending up with $2.98 since it has a 50% chance of success on the first bet and a nonzero probability of success after that. So its median outcome is $2.98.

I don't know which strategy actually maximises the median, but I suspect it involves being conservative after you've been lucky and aggressive after you've been unlucky.


I guess you are right. A minor issue is that you have to beat OP's $1.80 value in ten rounds or fewer, because with infinite rounds the median can grow without bound (am I right?). It is not clearly visible in your solution that it really happens in fewer than 11 rounds.

But it is easy to extend the strategy with this: First get half of the people above 1.8 by betting 0.400000000...1. Now half of the people have 1.80...1. They are finished.

Half of the people have .5999. They can do all-in twice, then 25% of them will be above 1.8. That is 3 rounds in total.

edit: the maximum of this exact 3-round strategy is betting $8/11 ≈ $0.72 in the first round; then in round 3 we will have 62.5% of people at $2.45. edit 2: got the numbers wrong in the previous calculation, fixed

(Now someone should chime in, and present the actual optimal solution. ;)


The reason I chose $2.98 was that if you follow my strategy then that will lead to you betting $0.99 in the first round. If you win then you end up with exactly $2.98 and stop betting. If you lose then you end up with $0.01. If you were to bet all of it for the next 9 rounds then there is a 1/2^9 chance of you ending up with $0.01×3^9 = $196.83. Since this is more than $2.98 my strategy must have at least a 1/512 chance of achieving $2.98 even if it loses the first round. Since the chance of winning the first round is 1/2 the overall chance of ending up with $2.98 is greater than 50% and hence $2.98 is the median.

I think I can run a computer search to find the actual optimal strategy. I'll report back with the results.


Ok.

An additional thing: if you find the optimal strategy in the form "in n rounds the best achievable median is x, betting this and this way", you should check the strategy as n approaches infinity - I would not be surprised if it turned out to be the Kelly Criterion after all, and OP would be right in this sense (though they presented the fact somewhat badly by limiting the number of rounds).


Okay so I ran my program and I didn't manage to find the exact optimal median but I narrowed it down a lot. I'm assuming each bet is a whole number of cents.

I found that there was a strategy that achieved $6.84 with probability 1/2 + 1/2^10. So the median is at least $6.84 (which is 3.8 times the median achieved by the Kelly criterion!).

I found that there was no strategy that achieved $6.87 with probability 1/2 or greater, so the median is less than $6.87.

I found that $6.85 and $6.86 could both be achieved with probability 1/2 but no greater.

So the median is somewhere in the range [$6.84, $6.855].

The optimal starting bets of these strategies show some interesting behaviour. For example if you want to have a greater than 1/2 chance of achieving $6.84 then your starting bet can be $0.22, $0.26, $0.31, $0.36, $0.40 or $0.50, but no other value. Make of that what you will.
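
For anyone wanting to replicate this, here is a sketch of one way such a search can be set up — my own guess at the approach, not necessarily the program used above. It does dynamic programming over wealth in whole cents, computing the best achievable probability of finishing at or above a target:

  # Best probability of ending at or above TARGET cents after a given
  # number of 50/50 triple-or-nothing bets, betting whole cents.
  from functools import lru_cache

  TARGET = 298          # cents; the $2.98 goal from the discussion above

  @lru_cache(maxsize=None)
  def best_prob(wealth, rounds_left):
      if wealth >= TARGET:
          return 1.0
      if rounds_left == 0:
          return 0.0
      # a winning bet of b returns 3b, i.e. a net gain of 2b
      return max(0.5 * best_prob(wealth + 2 * b, rounds_left - 1) +
                 0.5 * best_prob(wealth - b, rounds_left - 1)
                 for b in range(wealth + 1))

  print(best_prob(100, 10))   # chance of reaching $2.98 from $1.00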


"the median is somewhat arbitrary"

I don't know if it matters, but: in the bet everything scenario you end up with 1023 people who will be poorer than they would be in the Kelly scenario. In the 25th percentile maximizing case, you end up with 852 people who will be poorer than in the Kelly case.

Imagine you are running a bank, then in both of your alternative cases a large majority of people would be upset:

"If you used Kelly Criterion we would have been richer"


You're right if you only have one coin flip. But, this assumes a series of flips.

The way I understand the Kelly Criterion, it's similar to a fancy Martingale. It's statistically inevitable that X number of coin flips will go against you. So, the smaller the % of your capital at risk, the lower your risk of ruin. And, if you have an edge in the coin flip, the more flips you get, the better you'll perform in the long run.

But, it's not that great of a proxy for real life as explained by other commenters in this thread.


Don't look at the expectation of a single throw. Look at the distribution of the outcome after many throws. Relate that to the product of each throw's relative return. Look at how the log expectation of the product of n random variables converges as n gets big, and you see what to optimize.


(too late for me to edit my post, so I'll reply to it)

The logic I posted above will show you that the log of your money X converges in probability to some value x that you can optimize. Optimizing x will optimize a monotonic utility f(x), which seems useful, but isn't. The problem is that just because X converges in probability to x, doesn't mean that f(X) converges in probability to f(x). Probability and convergence are weird.


It's the expected growth rate, so yes it would vary in a real-world instance. I wondered for a while why they've taken the logarithm, but I think it's just because growth models are normally defined exponentially (with the rate being a parameter inside the exponential) - it shouldn't make a difference to the result.

As for varying, that Samuelson article says:

> For N as large as one likes, your growth rate can well (and at times must) turn out to be less than mine - and turn out so much less that my tastes for risk will force me to shun your mode of play.


The real reason to take logarithms is that investment strategies repeatedly multiply our net worth by random factors. Taking the logarithm turns multiplication into addition, and we know a lot about the statistics of adding lots of random things together. (Thanks to the Weak Law of Large Numbers, the Strong Law of Large Numbers and the Central Limit Theorem.)
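
A small demonstration of that convergence, using the article's game (win 110% of the stake with probability 1/2) at the Kelly fraction 1/22:

  # The per-round average of log growth factors converges to its
  # expectation (law of large numbers), which is what Kelly maximizes.
  import math, random

  def avg_log_growth(f, rounds, p=0.5, b=1.1):
      total = 0.0
      for _ in range(rounds):
          if random.random() < p:
              total += math.log(1 + b * f)
          else:
              total += math.log(1 - f)
      return total / rounds

  expected = 0.5 * math.log(1 + 1.1/22) + 0.5 * math.log(1 - 1/22)
  for n in (100, 10000, 1000000):
      print(n, avg_log_growth(1/22, n), expected)
  # The sample average tightens around the expected value as n grows.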


I think the log is just the utility function. You could substitute any other non-linear utility function and it would've given you another answer that makes sense for that utility.

The main takeaway is that a linear expected utility doesn't make sense.

It would've told you to bet all your wealth every game, which does result in a higher linear expected value, where you win (1+1.1)^N with probability 1/2^N at time N, but 0 otherwise. But no real human would take the bet of extreme high payoff at extremely rare chances with ruin otherwise.

Also see St. Petersburg paradox for a similar "paradox" resolved with expected utility theory.


I'm sorry, but you are plain wrong. The log has nothing to do with utility. And there is no chance of really understanding the result if you're confusing yourself with that bad idea.

To start, EVERY utility function that is both increasing and sublinear will agree that Kelly is the best strategy. Whether square root, log, or bounded - it doesn't matter. The details of your utility function are unimportant.

What matters is that each iteration of an investment strategy multiplies your net worth by a random factor. But log turns multiplication into addition. And statistics has very strong results about sums of independent variables.

The result is that with 100% odds, a player following Kelly will eventually wind up ahead of any other static strategy that you could choose. It winds up ahead and eventually remains ahead. Which is why a wide variety of utility functions will conclude that Kelly is the optimal strategy.


> To start, EVERY utility function that is both increasing and sublinear will agree that Kelly is the best strategy. Whether square root, log, or bounded - it doesn't matter. The details of your utility function are unimportant.

This is simply false. This is easy to check for the sqrt utility case. You can calculate the optimal proportion for a single bet and note that it's different than for Kelly, and then you can calculate the utility-function-given-that-you're-about-to-make-a-bet and check that it's still proportional to sqrt. So by induction you are always going to bet the same proportion no matter how many bets you have to make, and this proportion is different from Kelly.

> The result is that with 100% odds, a player following Kelly will eventually wind up ahead of any other static strategy that you could choose.

This is true in the sense that the probability tends to 100% as the number of bets tends to infinity. But this doesn't make Kelly optimal, because in the event that the Kelly isn't ahead the expected utility of the other strategy could be much higher than Kelly.


For one iteration? Sure, you can get any answer. However attempting to apply induction to that is wrong because as the number of iterations increases, the range of likely rates of return for each strategy converges, and Kelly is the one that converges to the highest rate.

As for the 100% odds answer, what I said was true is true in the sense that it is actually true. No ands, ifs, or buts. With 100% odds, Kelly eventually wins over any other strategy. Period.

The question of whether this makes Kelly optimal is not the question that the theorem was trying to answer. And therefore is irrelevant. Now in fact this does make Kelly optimal for a wide range of utility functions. But far from all possible ones.

The point being that it is important to separate a mathematical point from our interpretation of what that point implies. When you confuse the two then you get yourself into an unnecessary muddle. Kelly is a statement about the probability of one strategy beating another. It isn't a statement about how you should bet.


Using ln as our utility, at time N we get:

  E[ln(X_1*...*X_N)] = sum_i E[ln(X_i)]
So Kelly happens to maximize this expected utility by construction, since it was derived by maximizing the expected log of one round of betting (E[ln(X)]).

But if we use square root as our utility instead:

  E[sqrt(X_1*...*X_N)] = prod_i E[sqrt(X_i)]
We would maximize this expected utility by maximizing E[sqrt(X)]. Going through the same calculus we can see that we don't arrive at Kelly.

Where did I go wrong?


The reasoning behind the Kelly Criterion was recently explored in a broader context, showing that logarithmic utility is not required: https://aip.scitation.org/doi/10.1063/1.4940236

Taleb has a good discussion here: https://medium.com/incerto/the-logic-of-risk-taking-107bf410...


There was a hedge fund manager named Mark Sellers who blew up his fund in 2008 by following the Kelly Formula exactly, which had told him to put 90% of his fund in a single stock, which was a small offshore oil/gas driller. It was real money too, over $200 million.


Nominative determinism? Although perhaps Mark Buyers would have been even more appropriate.


Most of the examples of Kelly criterion application are either concrete bets with discrete payoff/loss odds and values, or assumed to be normally distributed. This paper discusses how extremely skewed outcomes (eg, stock options) should affect the Kelly calculation: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2956161


I think the Kelly criterion doesn't apply as widely as people think it does. Its derivation is based on maximising the growth rate of your fortune. But this is equivalent to assuming that money has logarithmic utility for you. If you don't value money in a logarithmic way then you shouldn't use the Kelly criterion.

Personally I feel that my utility function is sublogarithmic. If I'm just spending on myself then beyond a certain point additional money makes me absolutely no happier. Note that the usual justification of progressive taxation also assumes sublogarithmic utility. So based on this we should be more conservative than Kelly.

On the other hand, if I plan to give money to charity then my utility function is almost linear. Big charities can absorb a lot of money without becoming less effective. So in this case you should be maximally aggressive, betting everything at every opportunity.

Sometimes people say that because the Kelly criterion maximises growth rate it will be the best "in the long run" even if your utility function isn't logarithmic. But I've never seen any evidence of this. Does anybody know of a toy model where you can prove the Kelly criterion is optimal even if your utility is linear?


You are misunderstanding. Optimizing log wealth isn't because you have logarithmic utility... It's because doing so maximizes growth, which is more important than immediate return in a game of repeated investment. Kelly maximizes your long-term absolute wealth in such games.

Edit: I see your last paragraph acknowledges but doubts this claim. Demonstrating an optimal strategy in a stochastic game is hard to make toy-like, because you have to somehow describe both the probability distributions and the entire range of strategies someone could take. In lieu of a toy model, the end of this article does a good job of listing simple explanations of what that (provably) optimal strategy property entails: https://www.stat.berkeley.edu/~aldous/157/Papers/Good_Bad_Ke...


>Sometimes people say that because the Kelly criterion maximises growth rate it will be the best "in the long run" even if your utility function isn't logarithmic. But I've never seen any evidence of this. Does anybody know of a toy model where you can prove the Kelly criterion is optimal even if your utility is linear?

Here's one https://greek0.net/blog/2018/04/18/kelly_criterion3/

Basically it says that if you are making bets where money_(i+1) = f_i(money_i, x_i), such that your money always remains above zero, then you can apply the product form of the law of large numbers http://www.jams.or.jp/scm/contents/e-2006-6/2006-60.pdf

That means that over a long enough period any betting strategy that maximizes the geometric mean of the rates will beat any other bet with probability approaching 1. p(money(optimal strategy) > money(other strategy)) --> 1.

If your utility is monotonic (x > y implies that u(x) > u(y)) then I think this also implies that p(utility(money(optimal strategy)) > utility(money(other strategy))) --> 1.

Basically, you are eventually almost sure to have more money with this strategy than any other. If more money implies more utility, then you are eventually almost sure to have more utility with this strategy than any other.
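
Here's a simulation sketch of that convergence for the article's game (a win pays 110% of the stake, 50/50 odds), comparing the Kelly fraction 1/22 against a more aggressive fixed fraction on the same sequence of outcomes:

  # Estimate p(Kelly wealth > other strategy's wealth) as the horizon grows.
  import random

  def p_kelly_ahead(rounds, f_kelly=1/22, f_other=0.25, trials=2000):
      ahead = 0
      for _ in range(trials):
          wk = wo = 1.0
          for _ in range(rounds):
              if random.random() < 0.5:     # same coin for both strategies
                  wk *= 1 + 1.1 * f_kelly
                  wo *= 1 + 1.1 * f_other
              else:
                  wk *= 1 - f_kelly
                  wo *= 1 - f_other
          ahead += wk > wo
      return ahead / trials

  for rounds in (10, 100, 1000):
      print(rounds, p_kelly_ahead(rounds))
  # The probability that Kelly is ahead climbs toward 1 with the horizon.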


> this also implies that p(utility(money(optimal strategy)) > utility(money(other strategy))) --> 1

That's true (at least as long as the other strategy also bets a constant proportion of wealth each turn). But it doesn't mean that Kelly is optimal. It could be that in the (increasingly unlikely) cases when the other strategy beats Kelly, the utility produced by the other strategy is much greater than that produced by Kelly. Then the other strategy could still be better overall.

In other words, even though we have

    p(utility(money(Kelly strategy)) > utility(money(other strategy))) --> 1
we also have

    Expectation[utility(money(Kelly strategy)) - utility(money(other strategy))] < 0
Here's a simplified example of what's going on: consider the bet where you get a penny with probability 1-1/n, and otherwise you lose $(2^n). I think this bet gets worse and worse the larger n gets, but the probability of having higher utility if you take the bet tends to 1.
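
Spelling out the arithmetic of that example as a quick check:

  # EV of the bet "win $0.01 with probability 1 - 1/n, else lose $2^n".
  # The EV diverges to -infinity even as the win probability tends to 1.
  for n in (5, 10, 20):
      print(n, 0.01 * (1 - 1/n) - 2**n / n)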


>That's true (at least as long as the other strategy also bets a constant proportion of wealth each turn).

Does it require that assumption? I don't think it even requires identically distributed returns on each component bet. It just needs money_(i+1) = f(money_i, x_i) to have strictly positive support, right? Then you can just push it into log space and apply the law of large numbers, telling you to maximize the expected log of f(money_i, x_i) with respect to x_i. Nothing about fractions or constant fractions shows up in that derivation.

The usual statement of the Kelly criterion is about fractions, but the more general question is whether maximizing expected log is (eventually) optimal, which seems to only require being able to apply the law of large numbers in log space.


Imagine we started with $100 and were betting on a fair coin with fair odds. There's no edge so Kelly says to bet $0, and hence the Kelly strategy stays at $100 forever. You can give yourself a very high chance of beating the Kelly strategy if you use a Martingale strategy. Bet $0.01. If you win then you have $100.01 and you should not bet again. If you lose then bet $0.02 next time, and then $0.04 and then $0.08 doubling each time until you win. When you win you will go up to $100.01, which beats the Kelly strategy. So Martingale beats Kelly unless you go bankrupt by losing too many times in a row before getting a win, which only happens with a very small probability. So there are strategies that have a high probability of beating the Kelly criterion.

For simplicity I gave the above example in the degenerate case where the edge is zero. But I think the analogous strategy works in all cases. If you aim for just a penny over the Kelly strategy then you have a high probability of success.


Fair point. And while it's too late to edit my post, I think I found the flaw in the math.

Just because X converges in probability to x, f(X) doesn't necessarily converge in probability to f(x). If that were so, then logarithmic utility would be sound, but it isn't.


The other thing is that in real life your investment options are basically never a series of biased coin flips.


Coin flipping was an example. One must estimate the probabilities for their actual bets and run the numbers for a specific opportunity. You'll very quickly realize that getting probabilities in real-world situations is difficult. But that doesn't negate the lessons from the concept.


Not sure I follow. Unless I'm misunderstanding, wouldn't Kelly apply especially for your case? As in, Kelly is concerned with avoiding absorbing barriers and increasing bets when they're on "house money" (given a known edge, of course).


Kelly isn't about either of those things. It's a consequence of Kelly that it never hits the absorbing barrier, but lots of other strategies also have that property. What I'm saying is that among all of those strategies, Kelly isn't optimal.


I misunderstood then, gotcha.


"If all the economists in the world were placed end to end they would not reach a conclusion" -Isaac Marcoson (attrib. 1933 by O.O. McIntyre)

http://www.systemicrisk.ac.uk/sites/default/files/downloads/...


A similar (but more useful) position sizing strategy is the Optimal F formula, described by Ralph Vince in his book Portfolio Management Formulas. But its real value is in showing you your 'cliff of death' curve: how close you can get to bankruptcy given your position sizing.

In my opinion, position sizing is way more important (and less understood) than market timing.


The correct number of individually picked stocks you should buy is most probably zero: http://edmarkovich.blogspot.com/2013/12/why-i-dont-trade-stocks-and-probably.html?m=1


If you plan on tax loss harvesting through direct indexing, you can buy 500 stocks directly and sell one stock to buy a highly correlated one to decrease your tax payments in taxable accounts.


Wow, this is the 2nd time in two weeks that something has appeared on my probability homework and then made the front page of HN almost immediately.

Coincidence?

Who else is taking Stat 110?



