Hacker News new | past | comments | ask | show | jobs | submit login
The expected value of the game is positive regardless of Ballmer’s strategy (gukov.dev)
193 points by gukoff 45 days ago | hide | past | favorite | 160 comments



Recent and related:

Steve Ballmer's incorrect binary search interview question - https://news.ycombinator.com/item?id=41434637 - Sept 2024 (240 comments)


This sort of misses the forest for the trees, although neat application.

Ballmer's argument is essentially about tail risk. Expected value is absolutely not a good way to make bets if you value survival, because you only get one shot. Same reason you wouldn't go all in every time you get a poker hand that's "expected" to win. Because you'll (very probably) be bankrupt in a few hands.

Sure the mean is +$0.07 or whatever, but the spread on that surely goes over the 0 line. So there may well be marginally more chance of winning than losing, on average, but you're only gonna get one outcome. So if the goal is to play to win, or else, then you probably shouldn't unless you like owing Ballmer money.

What would be more interesting is to monte carlo simulate this strategy and look at the win/loss distribution. Presumably the choice is then not so clear cut.

If you're allowed to play the game a few trillion times or so, then by all means bleed him dry :P


> Ballmer's argument is essentially about tail risk

Where are you getting that from? As far as I can tell, he makes no such arguments in the interview. The problem, and his explanation of the answer, are phrased purely in terms of expected value of a single iteration of the game. And the twist is the adversarial selection of the number, not risk of ruin.

It'd be an awful example of tail risk anyway. With the obvious strategy the tail is extremely fat.


> Expected value is absolutely not a good way to make bets if you value survival

Yes! The St. Petersburg "Paradox" shows that we intuitively know that. I put "paradox" in quotes because I don't think it's a paradox, it's just a sane reaction.

(Sam Bankman-Fried was a big fan of EV and famously declared that he would toss a coin where heads would double the "value" (?) of the world but tails would destroy it.)

In short, the St. Petersburg paradox goes as follows: a fair coin is tossed until heads come up, and the player wins $2^n, where n is the number of times the coin was flipped. So for example if heads come up on the first flip the player gets $2, if it comes up on the second they get $4, on the third, $8, on the tenth $1024 (2^10), etc. It's easy to show that the expected value of the game is infinite (approaches infinity).

Therefore, someone perfectly rational (?) should be willing to pay virtually any amount of money to play the game, because any finite amount of money is less than an infinite amount of money, and therefore the expected gain is always positive.

Yet you will probably not find many people (except SBF?) willing to pay millions of dollars to play that game.

It's only a paradox if we think it shows that people are not "rational". But I think it simply shows EV is not a good measure of risk, and everyone knows it.

Very complete and fascinating article about the St. Petersburg Paradox here:

https://plato.stanford.edu/entries/paradox-stpetersburg/


What idiot wouldn't put destruction of the world as '-infinity' of value?

Equating money with value is a simple trap as well. Who cares if you can win millions when a single loss wipes out all your savings? Since anything below a certain level of money leaves you trapped with no way out it could be argued that the value of being destitute is not 0 but -infinity which makes any risk of losing all money unacceptable. This is especially true in a world where people are willing to offer strange bets with arbitrarily high expected value as long as you have some money.


> What idiot wouldn't put destruction of the world as '-infinity' of value?

Literally anyone, you included. Every day, there is a chance that the roof collapse on you while you're sleeping; that's -Inf of value! Yet you don't put infinity resources into preventing that since you consider the probability low enough to not obsess over it.


No, I don't put infinity resources into it because that would make me die even earlier due to not having money or time to do what is necessary to stay alive.


> value of being destitute is not 0 but -infinity which makes any risk of losing all money unacceptable.

that's just loss aversion codified.


The context can also be very important. For instance: In case A, you have $50, and are offered to bet them against a fair coin flip, if you guess right you win another $50, if you guess wrong you lose your $50; in this case the most rational choice would be to refuse to play the game. In case B, you have $50, and are offered to bet them against a fair coin flip, if you guess right you win another $50, if you guess wrong you lose your $50; in this case the most rational choice would be to agree to play this game.

The missing context is that in case A, you need in 10 minutes to repay $50 debt to the Sicilian mafia, or else they'll kill you to make an example for others, and you have no other assets or other ways to make money in this short time. In case B, the situation is the same, but you owe $100 instead of $50.


The St. Petersburg "paradox" is not a paradox if we consider any real-world implementation of it. The EV accumulates at the rate of $1 per flip. So, if we want to make the EV at least $1,000,000, we must find a counterparty that is willing to pay at least $2^1000000 (or at least 2^1000000 units of "utility" if we're trying to avoid the depreciating utility effect). That's plainly unrealistic. As soon as the counterparty has any fixed upper limit to its ability to pay (or provide utility), the EV becomes finite.


It's interesting that in this context the assumption of infinite growth is obviously unrealistic - but in the context of trad econ, infinite growth is considered a bedrock assumption.

Putting them together suggests it's flagrantly irrational to apply naive toy models to the real world. Even if they do have a nice mathy sheen.

Engineers (mostly) know this, but for some reason gamblers and economists (mostly) act as if they don't.


When you're this far away from saturation, infinite growth is a good approximation.


I don't mean specifically the St. Petersburg paradox but in the context of most things that economists analyze that have an infinite growth assumption.


> It's only a paradox if we think it shows that people are not "rational". But I think it simply shows EV is not a good measure of risk, and everyone knows it.

There are standard arguments (e.g. the Von Neumann–Morgenstern utility theorem) that an agent with rational preferences, with remarkably weak definitions of the word “rational”, must have an utility function and a subjective probability function such that their behaviour is always governed by the EV of that utility with respect to that probability.


Yeah but that utility function will not be a linear function of money so you can't say EV(u($$)) == u(EV($$)). This is the mathematical formulation that tells you that it is irrational to risk all your money to make an extra dollar even if u(EV($$)) is positive EV(u($$)) can be negative.


This is interesting to me in the context of a post[1] yesterday about teaching logical thinking to children. One of the top comments was about how, yes, teach logical thinking but also teach other types too. SBF and those who think with a heavy bias towards EV show an extreme weakness towards reality. Of course SBF himself is a perfect example of this. I won’t pretend to know if his math was always right but one of his defenses on trial was essentially “just run the simulations a few more times and we’ll get all the money back” which clearly shows a lack of understanding of reality.

I’m glad to now know there’s a common example of that weakness.

[1]: https://news.ycombinator.com/item?id=41456472


Does he actually calculate expected values or does he just have a sort of winding way of justifying his gut takes?


I don't agree, I think he was just plain wrong.

Unlike most people here I actually think questions like this are a decent way to see how people think. I would expect people with math/stats/cs background to be able to at least start the conversation about this problem.

However when you hide hypotheses or add your own BS constraints as a gotcha without explicitly stating them is where you lose me.

If the question is "would you play this game" the reasonable mathematical translation is "determine if the expected value is greater than zero". If you're going to talk about tail risk you need to specify utility functions (possibly asymmetric for the two players!) and you need to explicitly say that's what you mean.


> If the question is "would you play this game" the reasonable mathematical translation is "determine if the expected value is greater than zero".

Not really! And that may be the point of the question. It's not testing if you can pattern match to plausible CS concepts.

If you get one play, and the goal is to win, do you take the chance? The whole question is about the difference in likelihood in the limit (expected value, infinite plays) and what is a likely outcome _of one round_.


> It's not testing if you can pattern match to plausible CS concepts.

But thats a problem then, isn’t it?

Because if you abandon that framework there are many other possible frameworks to think in.

You say the question is about the difference in likelihood in the limit (expected value, infinite plays) and what is a likely outcome _of one round_. (Which may I say is just an other plausible CS concept you pattern matched.)

One might also say that you should play the game because even if you are playing like a chump and has absolutely the worst luck you are at max out of pocket of a 100 dollar. A real hustler can turn a personal meeting with Balmer into much more lucrative deal and earn back that hundred dollar million-fold that way.

Or you could say that the answer is no, because Balmer stinks and it would be ruinous to your personal reputation to be seen with him.

Or you could say “yes” because you know you don’t have cash at all. So even if he wins good luck trying to squeeze his reward out of you.

Or you could say “yes”, because while you distract Balmer with this inane game your associate will lift his valet and car keys.

Or you could say yes because what is the worst which can happen? You will spend a bit of a money, and have an awesome story to tell later.

Or you can answer no, because clearly a rich businessman does not have your best interest in his mind, so there must be some trap.

So this is what happens when you abandon the framework to answer the question as a plausible CS concept. You open the door to all these other alternative frameworks and more.


If it were about winning or losing that round, there wouldn't be so many different monetary values associated with each outcome. Instead, each outcome would be win, lose, or draw. Because there are monetary values, we have to decide if it's a good bet. If somebody offers me a chance to wager $1 on the outcome of a die roll where the payoff for choosing correctly is $10, I will do it, even though it's more likely that I will lose than win.


I don't think this is true. Most people will not be bankrupt after losing a dollar. If this is true then Steve failed badly at communicating this context.

To be honest, I think Steve just didn't grasp the mathematical deepness of the problem.


While it does seem that Ballmer doesn't have an understanding of the deepness of the problem, in his defence, he outscored BillG on the math SAT with a perfect score of 800, and graduated Harvard with a degree in applied mathematics.

Which makes me wonder if it's related to another 'simple' game theory problem that came up in Matt Levine's money stuff:

"They made me do the math on 1000 coin flips. EV(heads) (easy), standard deviation (slightly harder), then they offered me a +EV bet on the outcome. I said “let’s go.”

They said “Wrong. If we’re offering it to you, you shouldn’t take it.”

I said “We just did the math.”

They said “We have a guy on the floor of the Amex who can flip 55% heads.”"

I like that anecdote and the takeaway, especially with regards to trading: if someone's offering you what seems obviously a +EV trade, why are they offering it to you and what are you missing? Whether that was Ballmer's intended lesson is another matter..

[0]https://www.bloomberg.com/opinion/articles/2024-05-14/amc-is...


Is the interview for an engineering position or for sales?

If you're hiring a software developer, I am going to assume all probabilities are about physical processes or data distributions or such, and there is no "if we're asking it means we have something up our sleeve". The data going to be sorted by merge sort is not going to have anything up its sleeve, or set any traps for me.


> Is the interview for an engineering position or for sales?

Either way. The coin-flip example and Ballmer's binary search game could apply with simple extensions to complicated processes like SLAs on cloud services.

> The data going to be sorted by merge sort is not going to have anything up its sleeve

That's a curious example, since one reason to use mergesort rather than quicksort is the latter's susceptibility to pessimal inputs.


Physical processes are very often long tailed on the wrong end and thus adversarial though.


You think they asked this on the SAT?


You know that even smart mathematicians got tripped up by the Monty Hall problem, right?

> Even when given explanations, simulations, and formal mathematical proofs, many people still did not accept that switching is the best strategy.[5] Paul Erdős, one of the most prolific mathematicians in history, remained unconvinced until he was shown a computer simulation demonstrating Savant's predicted result.


I think you've misread. The bankruptcy was in an analogy to poker. The point is if you get _one round_ to play - essentially all or nothing - should you play? No.


> The point is if you get _one round_ to play - essentially all or nothing - should you play? No.

On the contrary, in general yes you should, because life as a whole will give you a variety of these sorts of risks. Steve Balmer's offer is just one episode in a lifelong series of risks offered you by the universe at large.


It all depends too if the absorbing barrier is a true absorbing barrier like death.

Bankruptcy would suck but it is not like you have no possibility of another chance like death.

I suspect there is much muddled thinking in this area viewing the death of a LLC as equal to the death of a human.

A coin flip for 20 million dollars if tails, physical death if heads is much different than bankruptcy if heads.

One is a stupid gambler with their life and the other is basically a serial entrepreneur taking asymmetrical bets until the coin comes up tails.


OK, if you value survival, at what odds should you play?


I actually think parkour community and practice teach and handle this very well. Same for free solo climbing.

Imagine you're in your bathroom, choose two tiles and step from A to B.

No problem. What if those tiles were all you had to stand on and you were 30 storeys up suddenly it's a whole different equation and your body instinctively knows it. In fact some of parkour training is learning to master your own fear and confidently execute things you know you can already do in the face of higher stakes.

If I offered you a dice roll and said guess the number, bet 1 dollar I'll pay you 12 on the right guess you'd bet 1 dollar but you wouldn't bet your life on it even though the EV was high. Even if the payout was 1 million you still might not take it (some might)

Edit Thinking about it more, my example is more about factoring in all the risks - a positive EV bet with a large downside risk is not a great one to take even if the risks are small which is where the picking up pennies in front of a steamroller analogy comes from


Yeah, exactly - you wouldn't stake your life on hitting the average case in an n=1 scenario. Parkour community sounds really cool!


You are staking $1.


It's a good question, sort of related to optimal bet sizing. Check out this wiki which another commenter also mentioned: https://en.wikipedia.org/wiki/Kelly_criterion?useskin=vector


Kelly Criterion

Betting more than the Kelly fraction increases the risk of ruin, especially in the long run.

https://en.m.wikipedia.org/wiki/Kelly_criterion

Note: Not saying that this is applicable in the original post's situation. It's relevant to the parent comment though, and very useful in many situations, such as investing.


What if I wanted to maximise the bottom 5th or 10th percentile of wealth?


Fractional Kelly?


Yes, precisely. Although that's more about bet sizing for optimal return in the long run, not quite about binary choice of whether or not to play. But conceptually bang on, I had it in mind


> Ballmer's argument is essentially about tail risk. Expected value is absolutely not a good way to make bets if you value survival,

If he was trying to make that point, why set the bet at $1 - a loss that wouldn't imperil anyone?

The situation is entirely fictional, why not fictionally gamble with a five-figure sum?


You could? There's no reason you couldn't do $50,000 with bets of $10,000. The point is you get 5 guesses before you lose. The bet sizes don't really matter.


They matter. I could play a game with negative expected value just for fun, if this negative expected value is not too negative. Or not for a fun, but to let Ballmer win to make him feel himself winner, to make him feel superior, hopefully it will help me to get what I want from him. To lose a game gracefully is a manipulation device, especially if the winner doesn't suspect, that you lose on purpose. And, the stories I heard about MS suggest that Gates, Ballmer and all those gray beards of MS were very competitive, so they are probably much more susceptible for this particular bait. $1 is a very small price for such a tactic. But $10,000... it depends on what I'm hoping to win, and on my detailed understanding of my further steps and probabilities of their success.

Life is more difficult than math abstractions of life. No one yet managed to mathematically describe any social situation to such level of details, so you could blindly believe your equations, like you do with physics.


The bet sizes here do really matter, because they are selected to ensure that there is no risk of ruin. Ballmer asks about playing the game once at a bet amount that anyone doing the job interview can cover.


I don't view the original problem this way, but let's think about it!

> the spread on that surely goes over the 0 line.

Do you imagine starting with $1 or $1000? :)

Let's add a condition that Ballmer has infinite money, we start with a specific budget, and we can't continue playing if we exceed budget randomly changes after each game,

In the game where you start with $N, win $1 with probability p > 0.5 and lose $1 otherwise, the chance of eventually losing all your money is (p/(1-p))^N. [1]

So, the ruin chance actually becomes exponentially lower the more money you have at the start.

The steps in the random walk above belong to a simple, Bernoulli-like random distribution. Meanwhile the mixed strategy is a more complex discrete random variable because it can do more steps than just +1 and -1.

However, I believe that the same principle applies for the mixed strategy.

If you zoom out and consider "batches" of steps, you can apply the Central limit theorem and see that all these random walks work roughly the same. The caveat being that you need a large enough starting budget to "zoom out" :)

Granted, the standard deviation for the mixed strategy is ~$1. I would guesstimate that if you start with ~$1000, there's no way you will ever lose your money.

> What would be more interesting is to monte carlo simulate this strategy and look at the win/loss distribution. Presumably the choice is then not so clear cut.

Agree, this would be a nice demonstration! I will think about doing this next time I get a couple of hours of free time.

[1] https://math.stackexchange.com/a/153141/65143


hey, thanks for the blog post and your reply! I think I follow - a generalization of a coin-flip type game. I agree that if you have more starting money, you would never lose. From the binary search idea, even if chosen adversarially, the worst case is still log2(100) ~= 6.6. So if you get 1,000 guesses or just any number of guesses >= 7 you literally can't lose. Then you should definitely play.

Setting the limit at 5 brings you to the interesting point of there being a good mix of win/loss outcomes. 4 would be too few guesses and you'd very likely lose, and 7+ you'd definitely win. So the question is only interesting _because_ the limit is chosen so that the spread puts your odds on both sides of the 0 line. Otherwise it'd be clear cut.

The standard deviation being ~$1 is interesting. To me that suggests that with a mean of $0.07 and a deviation of +/- $1, it's essentially 50/50 odds. There's technically a slight edge in your favor, probably 53/47, but barely. So given a game with essentially no edge, would you play? Framing it that way - deciding to what degree the game is winnable - it's essentially not. You should not particularly expect to win, no matter your strategy.

I think part of the trick with the Ballmer question as well is the question is not necessarily about 'can you find an optimal strategy?' - it's 'do you play the game or not?'. The paths chosen within the round don't ultimately matter to that question. It's only intermediately necessary to model the intra-round decision paths in order to get to the overall win/loss distribution for a single round.

If you do end up getting the time, do make another blog post!


Here's another way to look at it. With a naive binary search strategy, which is an optimal algorithm for search for a sorted static array, we have:

Step # | % of reachable numbers in [1,100]

1 | 1% = 2^0

2 | 3% = 2^1 + 2^0

3 | 7% = 2^2 + 2^1 + 2^0

4 | 15% = 2^3 + ...

5 | 31% = 2^4 + ...

6 | 63% = 2^5 + ...

7 | 100% <-- 2^7 > 100

So out of all possible single-trial outcomes, only 31% of outcomes guess the number in <= 5 steps. In other words, in 1 trial you would expect a 69% chance of a non-positive outcome.

So an EV of +$0.2 or +$0.07 alone does not match the actual odds of winning in 1 trial. EV is most predictive in the infinite limit, and least predictive in a single trial - which this is. First off, $0.2 isn't even a possible outcome so there's a good reminder that the mean value does not always occur in the dataset.

This is also pretty straightforward from the classical interview-y 'Big O' perspective. If you translate the question to the natural CS-worded equivalent, something like "can you search a sorted array of length 100 to find an arbitrary value in 5 steps or less?", one readily sees that O(log2(n)) -> log2(100) = 6.6 > 5, so you definitely can't guarantee it. If we put odds on it as above, we can see that the odds are also not in your favor.

Now, if you want to look at it as saying after 5 steps you've removed ~96% of the search space, that's cool and all that you've reduced the candidates to only 3-4 remaining numbers, but those aren't the odds of winning. We know that 7 guesses is enough to guarantee finding the number, so after 5 guesses we ought to be 'close'. But the game is not horseshoes, so the fact that we're close is not relevant


> Same reason you wouldn't go all in every time you get a poker hand that's "expected" to win. Because you'll (very probably) be bankrupt in a few hands.

You're calling an all-in 100% of the time in a cash game if your expected value is positive. If you don't, you can't afford to play at that table.

You're not going all-in with any hand expected to win because that's not how you maximize profit. It has nothing to do with the risk of going bankrupt. Because again, if that's a concern you shouldn't be sitting at the table.

Tournament poker is a bit different because there are certain points where you have positive chip EV and negative $ EV and the math changes.


People are really zeroing in on the word 'bankrupt' here. The point was if you used that strategy, all in 100% of the time for positive EV, you will _probably_ go bankrupt even though the n=infinity limit of that strategy is positive.

The whole poker thing was merely an analogy in the first place.


Calling an all-in and going all-in are two totally different things, unless the all-in (that you'd be calling) is for an amount greater than you have. Otherwise, it's just "bet a lot, but you can keep trying if you lose". Going all in, on the other hand, is "bet it all, and if you lose you're done". The risk on the later is much greater. Any time there is no chance for recovery on failure, your risk analysis changes dramatically.


But in a cash game, if you're properly bankrolled (a common number people recommend is to have at least 40 buy-ins worth of cash set aside for the stakes you're playing), the one buy-in you lose if you lose an all-in is relatively small and you can just buy in again and keep playing. Going back to what OP said, if you shove every time you get aces you will profit over time as long as you have a properly managed bankroll, it's just not the most profitable way to play aces in most situations. It's different in tournaments because when you're late enough into a tournament, losing your stack actually does mean losing everything since you can't re-buy


What you are getting at is that for a single person, the chance of going bankrupt is high but for a population ensemble, the average wealth increases.

This is absolutely true. All it takes is to understand that for the single person, the model isn't ergodic but all expected value based models assume ergodicity.

See [section 4 here](https://www.jasoncollins.blog/posts/ergodicity-economics-a-p...) on losing wealth on a positive value bet.


It's kind of a given that you can't use the proposed mixed strategy for a single game, because it expects you to draw one of the patterns at the start of the game.

And some of the patterns are just obviously suboptimal if this is your only chance:

> With probability 0.9686%: Binary search, first guess is 1.

(I wonder what Ballmer would think that, when proposed to play this game, you first manually throw dice to draw a random number in the range 1 - 1,000,000 and if it's 9,686 or less, you start your binary search at 1. He might be impressed by your dedication to the mixed strategy.)


> This sort of misses the forest for the trees, although neat application.

I think people are missing the forest for the trees on it mattering if Balmer was "right" or "wrong".

It's his interview question. He's using it as a way to see your thought process more than the answer you arrive at at the end.

I imagine if interviewees had these thoughtful disagreements, he'd either guide them to the reason he had a different answer or value their input.


agree about the EV stuff. it only makes sense over many draws and the problem states that he is thinking of "a number", that sounds like one game.

that said, the problem screams binary search and you know your opponent is a computer person, so i guess the question is: if you make a bet that your opponent is making an adversarial choice that assumes you're going to do a vanilla binary search, can you improve your odds of coming out ahead by modifying your own binary search to always assume the target is an adversarial one?


That is an interesting question. I suppose it's equivalent to the 'worst case' performance for binary search, which would be a relevant topic. Finding the optimal strategy in the case where the opponent _may_ be adversarial could be interesting. I don't imagine the odds improve compared to the base situation, but not sure


what if you, say, decided to make the bet that your opponent is adversarial and just did binary search, but restricted high, low and mid to snap to the nearest adversarial numbers when adjusting them?


When Ballmer said 'adversarial', I considered this strategy: he's not actually required to pick a fixed number at the start at all. He can simply give the answer to each guess which leaves the largest number of possible numbers remaining, guaranteeing a loss regardless of strategy.


Right! I'm not sure if that's actually what he had in mind, but if he did, it's funny because it makes all this mathematical analysis completely pointless.

The OP has a complex randomized strategy that guarantees to average at least $0.07 against any adversary; meanwhile, just by delaying his "pick" and stringing you along, Ballmer makes you take seven guesses and owe him a dollar each time.

If you were expecting to win $0.07 on average, how many rounds would you play before you realise you're being scammed?


This should be higher.

The OP's article is interesting, but it assumes a very weak notion of "adversarial", in which Ballmer still commits to some initial choice.

Interestingly it's actually possible for a player to know this is the case if Ballmer uses a commitment scheme [1]. For example, at the start of the game Ballmer could generate 500 random bits, append his chosen number in the range 1-100 to this, hash the result and then send you that hash: At the conclusion of the game, he sends you the 500 random bits, and you can check that concatenating his chosen number (now revealed) to those bits and hashing the result produces the hash he originally sent. (If Ballmer lies and changes his number, he would need to somehow come up with 500 bits that when concatenated with this different number still produce the original hash. This is hard.)

[1]: https://en.wikipedia.org/wiki/Commitment_scheme


That's what I thought too, kind of like Absurdle, an adversarial variant of Wordle: https://qntm.org/files/absurdle/absurdle.html

It is by the author of HATERIS, a variant of Tetris that always gives you the worst piece.


His wording of the rules implies he chooses a number and sticks with it. He "has a number in mind". Of course some interviewers like to play mind games and twist things up to make themselves feel smart but I don't think that's his intent here.


>I don't think that's his intent here.

Well, rereading what he (was reported to have) said, I now think that probably was his intent, and he was just sloppy. At least, he can't have it both ways: Either he genuinely commits to a number at the outset, and uses the word "adversarial" to mean a very weak form of adversary (one that is defeated in expectation by TFA's mixed strategy), or he is using "adversarial" in the standard (strong) sense, in which case he must be lying about committing to a number, which is a shifty mind game as you say.


This is how it is done in the analysis of competitive ratios of online algorithms. The adversary can change its mind on a whim, it merely has to commit to the decisions it has already made in the past.


I mean, who know what he’s thinking, but based on the game description that strategy isn’t “adversarial”, it’s lying. Maybe the lesson is “don’t play games for money with people who will cheat”, but it would be a boring one.


Edit: Oops, nope, this comment is wrong, ty fgna for pointing that out!

I feel like there's an even simpler proof that you can beat adversarial-ballmer, with exactly the same expected positive outcome as binary search vs random ballmer.

I call my algorithm "randomly offset binary search". It goes like this:

1. Pick a random number between 0-100, call this 'offset'

2. Perform the binary search algorithm, except at each step add 'offset' to the value and mod by 100.

That's it. Now, even if Ballmer knows you're using this strategy, he can't make it perform any worse by selecting any specific number. Therefore your expected outcome is still $0.20 per game, beating the strategy proposed in this blog post.


Unfortunately the numbers are not circular :( By offsetting the initial number, the binary search does not work optimally right? Imagine the number is below 50, and you start by guessing 60, now you have to search for 30 numbers instead of 25, and thus the binary search is not optimal. reply


Ah, yup, you're right. Ballmer's answer of "high or low" isn't in the offset number system, but the normal one, so my strategy doesn't work.

That's what I get for not thinking it through properly, thank you for pointing that out!


Neat. A nice way to see this is to imagine that the numbers 1-100 are arranged around a clockface; you randomly spin the clock before doing a conventional binary search starting from the top.


This is brilliant!


Of all the things that Ballmer was wrong about... I guess this is one of them.


Ballmer was right about betting on Microsoft.


Agree, but Microsoft was wrong about betting on him.


I'd love to be wrong like Ballmer. The net balance of his decisions were billions of dollars.


The net balance of his decisions, his circumstances, his random events and who knows what else.

Please please stop with this "if he's rich he must be smart" argument. Please?


As long as we stop with "if he's rich it must have been luck alone". Even a lottery winner has to buy a lottery ticket.

Ballmer was a top salesman and a notorious workaholic. Of course he needed luck, but I doubt most in his position would have netted the same billions.


>The net balance of his decisions were billions of dollars.

Billions of dollars less than it could have been, not just for Microsoft, but all Windows users combined.

If he's rich he must be fortunate, have to find out if he's smart some other way.


Show us what you have been wrong about so we might judge you.



My favorite Ballmer practice is stack ranking. It completely screwed up the entire company.

I worked on Windows Mobile at the time the iPhone came out. We were all shitting ourselves.


Stack ranking existed at MSFT before Ballmer became CEO, i.e., when billg was CEO.

It was a practice that Jack Welch brought into the corporate world and Microsoft was just guilty of following what were thought to be best practices at the time.

Source: worked at Microsoft before Windows Mobile was a thing.

As an aside, Windows Phone was my favorite phone OS and Ballmer seemed like the one who actually cared about it a lot.


> best practices

Tangent: I love that phrase. Anytime I hear someone make that statement I think about the 9,000 things that have been considered "best practice" until we actually understood that they weren't even "good practice". It gets thrown around as if it's been studied and confirmed when in reality it means "a bunch of people are doing it"


Oof, that does not sound like a strategy that culminates a positive work atmosphere.


And this, friends, is the perfect example of why the modern tech interview process is pure insanity.


Is this a perfect example of broken modern tech interviews?

Balmer's question seems fair for the complexity of the answer he was expecting.

As the interviewee you would presumably provide the (mathematically) wrong answer, but you'd show your thinking along the way, including a small demonstration of CS principles.

Keep in mind that Balmer had a long career, so if he ever asked this question, it was probably back in the 80s when no one expected you to come up with the complex solution outlined in the post.

If you did outline the correct answer, that would be amazing, and you'd be an instant hire. But the question doesn't fundamentally seem broken to me because either answer (taking the bet or not) needs to be well justified.


The question seems like a mathematical one.

What you're saying sounds to me like "your answer doesn't need to be correct, it just needs to sound reasonable". What you're filtering on with this question is good bullshitters.

To me, the only reasonable to this question is "I don't know". I think even a mathematical genius like Terrence Tao would not be able to give you the answer to this on the spot. (Although I can also totally believe that he would instantly see this from some obscure theorem that only like 5 people on the planet know.)


> "I don't know"

Which is exactly the same as “no” in this situation. If you were capable of proving the opposite I assume he would have been willing to hear you out.

> not be able to give you the answer to this on the spot

Realizing the most obvious strategy would be suboptimal if the game is adversarial is the first step of the correct answer though. If you didn’t know what to do next obviously the correct answer is “no, because I don’t know”.

Someone who is trying to absolve himself from making any decision ls is presumably not the sort of a person they were looking for.

Also would Balmer have been hiring for engineers to begin with?


No, it's about understanding what an interview is for. They're not trying to get the answer. They're trying to find out what you know, how you think and how you communicate.

Do you spot that it's different with one Vs many plays? Do you spot the binary search? Do you spot that an adversarial opponent can push things? Can you clearly communicate these?

If you just say "I don't know" and that's it you are showing you don't know how to communicate important information and miss soft skills about understanding the context of an interview.

If you say "I don't know" and talk through your thoughts then great. The point is talking things through, even if you have gotten the wrong answer.

Maybe you'd be able to say "and here's how I'd code a simulation to check"


That seems reasonable, and in that context I would think it's actually a good interview question.

But it seems that Steve Ballmer used this question as if it has a single right answer, and that answer is "no". Unless it's more about the question "do you want to play this game, right here, right now?", then it becomes more about heuristics and quick reasoning.

It's all about context I guess.


In fairness, if someone proposes a complex and contrived game which cannot be easily analysed but is potentially adversarial... you shouldn't play them.


Are you talking about Balmer's question or the aforementioned job being interviewed for?


Seems so. The broken tech interview idea stems from the idea that interview questions that do not pertain to the job will lead to hiring the wrong people.

And, indeed, what Microsoft really needed in the 80s was people who truly understood memory management in C, not gamblers left to hack their way into something that kind of worked sometimes. Microsoft's need to correct that hiring mistake later set them back significantly. Had they asked about the intricacies of C as it directly pertained to the job instead of unrelated trivia, they would be in a much stronger position now.


> Had they asked about the intricacies of C

Presumably it wasn’t Ballmer who was asking questions like that?

If he was running the “business”, sales etc. part of the company.

All of the things you listed would have been less than worthless if they weren’t able to convince anyone to buy their products.


Thanks for playing along. This is a perfect example of why pointless trivia does well in an interview. It reveals the psuedo-intellectuals who will overanalyze the situation in an effort to try and sound intelligent. Exactly who you want to steer clear of.

The candidate you actually want to hire will respond with something to effect of "That's dumb. Let's instead talk about X, which will be a far more appealing topic to both of us."


Well.. most people generally don’t like working with annoying, self-entitled know-it-alls. So I guess the question serves its purpose if it filters such people out.

> effort to try and sound intelligent. Exactly who you want to steer clear of.

I wouldn’t be reading your comments if I wanted to stay clear of stuff like that, would I?


I think you're missing my point. The problem isn't the question - it's the fact that Balmer was objectively wrong about the answer - and he never changed that determination after having conversations about it however many times. ("I asked this question all the time.")

It doesn't matter that it was difficult to prove he was wrong. The issue is that it was impossible to prove he was right. And if anyone ever tried to bring that up to him, he never once heard them.

I believe an interviewer who is wrong and does not listen to you is a perfect example of the broken process. Especially given that he was an industry leader - in this interview he was providing a historical example of the process' merits - all while being entirely incorrect.


You are claiming things that are absent from and contradicted by the interview.


I'd love to address your point, but unless you further articulate your position I won't be able. I see nothing in my statements that isn't directly said or implied from the linked video.


You need to explain what you think those things are.


Yes because it is a question for a quant.


Well to be fair Steve Ballmer is a terrible leader and if he had to take the tech interviews he wouldn't have passed and Microsoft wouldn't have stagnated for 10 years, before Satya Nadella took over and brought the company back on its feet.


He’s probably a decent leader just not a very good CEO for a company that needs to develop new products and enter new markets to continue growing rather than to try and squeeze every remaining penny from their current customers.


Satya Nadella has been good for the stock price and is probably a better leader than Ballmer, but hasn't made MS into a respectable company.


Many Satyas successes started under Ballmer.


Could you give some examples?



is it? If I was forced to ask this question as an interviewer, and the candidate said, "actually, you're wrong, here's why" that's a very good signal. Do most people not do this?

(typically there's discussion with all the interviewers and it isn't just "did the candidate get the question right or not). I personally think a lot of big tech interview questions are dumb but I think the process isn't as broken as I thought, seeing it from both sides.


>If I was forced to ask this question as an interviewer, and the candidate said, "actually, you're wrong, here's why" that's a very good signal. Do most people not do this?

The question is whether Ballmers ego would allow him this flexibility if it's his own question.

Some people might be very emotionally attached to the questions they created, but not so much to those they've been given as an interviewer.

I've fared well pointing out issues with questions in the past and gotten the job. I'd try to be diplomatic about it though and not outright say they're wrong. Instead pointing out how with a classical binary search the expected value is negative, but there are strategies from game theory to deal with adversarial picks and here you could reach a positive expected value.

Kind of a "yes, and..." approach. You acknowledge their view, and then you add a new perspective. But don't say they're wrong.

Funny enough in situations where I suspect the interviewer was given the question it probably wouldn't have helped, not due to emotional attachment, but because the interviewer had a tenuous grasp of the topic themselves and couldn't stray from the script they were given.


I suppose that telling the interviewer that they're wrong is a good way for the interviewee to test culture fit.


> Do most people not do this?

I'd say it's impossible to answer this question conclusively within the time frame of an interview. That makes it, in my opinion, a bad interview question.

My answer to this question would be to show that I understand what it would take to answer this question correctly (you'd have to find a mixed strategy that has a positive expected value for every choice of number), I wouldn't be able to give a confident "yes" or "no" answer on the spot. I think that's the only correct answer.

In practice, I think this question is advantegeous for those who confidently blurt out an answer and then make up a heuristic argument for it. But a heuristic argument can be found for both "yes" and "no".


I think you're missing something.

Presumably Balmer did ask this question. At least a few times. And yet he never heard of the correct answer, and believed the incorrect answer to be correct.

That tells you that if anyone did say "actually, you're wrong" he never listened to them.


I don't work in the tech industry but I always assumed these questions were designed for you to demonstrate your problem-solving skills, regardless if you get the answer correct or not.

In this case, it would just be showing that you can reason about binary search and showing that the mean profit is 0.20 dollars


As long as it's used to figure out if the two parties would enjoy working with each other, I guess it's fine. But yes, increasingly often this turns into a quiz or worse.

At least we get some quality fiction like https://aphyr.com/posts/340-reversing-the-technical-intervie... and its sequels.


I think this is a good question, there are many veins of potential discussion which is what you want. It’s not likely a binary question.

Is it fair? Does he change his choice or pre-record it? Can you play multiple times?

Purely random distribution, totally fair? Sure play the game every time, the math pans out. It’s not that though.

It’s about showing your work


It is the perfect example, why you should pay attention in your math class too.


I don't think the average CS curriculum is going to cover advanced game theory. And you certainly aren't going to doing most linear programming on the spot in a interview.


This is actually extremely basic game theory, but I agree that most CS programs would probably not cover it.


A more extensive analysis of Nash equilibria including a numerical solution for the full game is presented in https://bowaggoner.com/blahg/2024/09-06-adversarial-binary-s...


Thank you, this is very interesting


I was looking for the comment that simply said "This looks right, good work!" and since I couldn't find one, let it be me:

This looks right. Good work!


Steve Ballmer's net worth is 120 billion dollars, so, assuming each game takes 30 seconds, it would take you 1.6 million years to win it all.


We let our computers play. My computer's AI vs Ballmer's AI. One trillion six hundred eighty-three billion thirty-six million fifty-one thousand nine hundred eighty-four computer games in 30 seconds.


Little Mathematics Library – Elements of Game Theory: https://mirtitles.org/2012/09/06/little-mathematics-library-...

This is a very nice book covering mixed strategy in game theory.

A very nice motivating example from the book: "There are two cards, an ace and a deuce. Player A draws either of the two at random; B does not see which card is drawn. If A has drawn the ace, he says "I've got the ace" and demands a dollar from his opponent. If A has drawn the deuce, then he may either (A1) say "I've got the ace" and demand a dollar from his opponent or (A2) confess that he has got the deuce and pay his opponent a dollar. The opponent, if he is paid the dollar voluntarily, can only accept it. If, however, a dollar is demanded from him, then he may either (B1) believe that player A has got the ace and give him the dollar or (B2) demand a check so as to see whether A's statement is true or not. If it is found that A does have the ace, B must pay A two dollars. If, however, it is found that A is bluffing B and has the deuce, player A pays B two dollars. Analyze the game and find the optimal strategy for each player and the expected payoff."


Can't wait for the paper that solves for the Nash equilibrium for this game.


Ballmer Peaking - an optional strategy for a number guessing game


Isn’t that what the Arthur O’Dwyer link gives?


Only for up to 5 numbers. Note that the Nash equilibrium for 100 numbers could require many pages to describe.


Right, I missed that one.


Nice! I tried to solve this the other day too, but came from the other angle--trying to find a probability vector for Ballmer that always won (finding the best response tree is n^3 complexity best I could find). I'm somewhat surprised since I figured for sure Ballmer had a big edge by picking numbers near the endpoints, making the player pay a large cost to check them.


> I’m thinking of a number between 1 and 100

People are unable to think randomly. They'll avoid the obvious "not random" numbers 2 and 99, for example. I read somewhere that most people, asked to pick a number between 0 and 10, will pick 7. And the next digit would probably be odd, and not 5, because 5 is not random. That leaves you with 71, 73, 77 and 79. 77 is not random, so 71, 73 or 79. I'd pick 73 as my first guess.

I'd say those were good odds!

(That's why when you're picking a "random" number, it's best to use an actual dice.)

This is how you win at hammer-paper-scissors, too.

Ballmer could also change the number he's thinking of as you make guesses, so part of the game would be guessing what he's thinking.


> Ballmer could also change the number he's thinking of as you make guesses, so part of the game would be guessing what he's thinking.

If I were to take the bet with him, I'd make him write down the number first hide it (turn the paper over / put it under a book, whatever).


> I’m thinking of a number between 1 and 100.

I guess this is part of the clarifications one normally asks when in an interview setting, but he has specified numbers and not integers. One could choose (pi*2)/2 and you will owe a lot of money.


Here is a chart for probabilities for starting value:

https://docs.google.com/spreadsheets/d/e/2PACX-1vThljkK2nUIL...

I find it interesting: it is definitely symmetrical, but I did not expect that in the final result 1/98 could be more important as a starting value, while 2-17/82-97 are not used at all.


This really depends on the pure strategies that you choose.

The initial set of strategies wasn't very diverse and compensated for the binary search "weaknesses" on the ends of the spectrum by sometimes guessing 1 and 98.

But after adding some more pure strategies to the set, we've got a far better mixed strategy that prefers the numbers between 28-70 as the first pick: https://github.com/gukoff/ballmer_puzzle#winning-strategy


O, wow, post got update!

  > Avg win if Ballmer chooses randomly: $0.16247848000093376
  > Win if Ballmer chooses adversarially: $0.14657033010415976
So the goal is to find a set of strategies where adversarial avg win == random avg win? Or these numbers will never be equal?


I did a very similar exercise after reading the original post.

You can get the EV a lot closer to the optimal +0.2 (Although I was unable to prove how close) by dropping the requirement "do not increase worst-case complexity for the binary search" as this is lost with initial guesses outside 36-64 anyway. Deviating at a higher depth makes punishing specific guesses in the tails a lot cheaper, only giving up 1-2 cents of EV.


Interesting! What about the worst case? And which kinds of strategies did you pick?


Using random strategies with small sub-optimal deviations, I get to about $0.189

Ref. https://pastebin.com/YcRhGpV6


It would only make sense if Ballmer writes the number he is originally guessing on a piece of paper and fold it before game begins. And win/loss is checked with what is written on the paper.

Otherwise it is a hidden mutable information game where Ballmer dynamically changes higher/lower for maximum tree depth and always make you lose.


What if he flips a coin? 50% chance to optimize for Binary Search and 50% chance to optimize for Ballmer Search?


I'd like an online demo where you play as Ballmer against an opponent using this strategy.


I wonder whether the search algorithm would need (and can?) to be adjusted to respond to the increased probability of playing numbers that are hard to find with standard binary search.


I dont get this. If this is true, then he found a more efficient algorithm than binary search. Why are we not using it in CS?


I'm no mathematician, but I think that if you know something about the probability distribution before searching, then you can be more efficient than blindly using binary search. And if you assume Ballmer is out to get you (i.e. the distribution is not random) then you can use that information to improve the search speed.


If a second party can submit adversarial values into a system, potentially causing a denial of service in a binary search (where comparisons are computationally expensive and data is unevenly distributed), there is a much simpler solution: avoid using sorted collections and binary search. Instead, use hashmaps. To address similar HashDoS attacks, many implementations (in Python, Rust, Java, etc.) use a randomized hash function, which vaguely resembles the idea of randomizing starting value for binary search.


The point is that Ballmer is an adversary, and may choose the worst cases for binary search. As I understood, the algorithm in TFA holds against any choice.

As others said, if you don't expect adversary behavior in your data, it should be good enough.


Binary search wins in the average case on random data. Ballmer is not required to choose randomly.


Honestly I think Ballmer would have appreciated this answer in an interview.


Only if he were hiring for game theorists game theorists game theorists game theorists


I 100% believe Balmer had an off by one error


From my own blog post (linked from TFA):

> If Ballmer is choosing his secret number uniformly at random, then the expected value of the game is [that you win $0.20]. But, as Ballmer points out in the linked video, if he knows you’re going to do a plain old binary search, then he certainly won’t ever choose 50 as his secret number. In fact, he has no reason to choose 25 or 75 either. Or 12, 13, 37, 38, 62, 63, 87, or 88. If Ballmer avoids those eleven numbers, and chooses uniformly at random from among the rest, that alone is enough to lower the expected value of the game from [$0.20] to about [−$0.0045].

So I think Ballmer was being perfectly honest in what he said: he does know a strategy that makes the expected value of binary search counterintuitively negative, and that strategy is (as he says explicitly) to avoid the first few numbers that you're going to guess. No further speculation about errors or deception on his part is needed.


> Out of the 100 numbers, there are 32 that would require you to ask 6 questions to make a guess.

Huh? I have 100 - (1+2+4+8+16+32) = 100 - 63 = 37, where 2^i numbers can be guessed after exactly i wrong guesses plus one correct guess.


Preliminary analysis: one in every three numbers has this undesirable property; the edges mess this up but shouldn't be able to add more than two extra undesirables; it should be impossible to have more than ceil(33.333) + 2 = 36 undesirables.

(Also, since six bits will serve to identify 64 different numbers, it should be impossible to have more than 36 numbers that can't be identified that way.)

I'll update with boring manual data later.

---- update ----

That was wrong; using a naive guessing method, I found 37 values that require 7 guesses.


Thanks for spotting it! Exactly right, I fixed the text.


moral of the story: you might be theoretically correct, but the other dude still has a net worth of 120bn and you don't. so who's the loser now?


I don't know. Play his game just shy of two trillion times and it will be you with the $120bn net worth!

It seems the real moral here is: The best time to plant a tree was 20 years ago.


not being a billionaire doesn't automatically make you a loser

a better moral of the story would be "a billion dollars does not guarantee that someone is right"


After watching the interview I can't imagine why anyone would spend time trying to solve for this or entertain this as a valid test of anything. In 7 guesses he TWICE amended if she was high/low after hearing her guess.


I watched the interview, and I see two problems:

- nowhere it says he has to choose whole number, he could choose fractions (55.25) or even irrational like PI. Number of questions can be infinitive.

- nowhere it says, he may not change his number while the game runs.

You pay upfront for each question, and you hope game is not somehow rigged. It is not just question of algorithms.

Also money you win is a taxable income, payments for hazard are not taxable expenses...


He also doesn't say he wants to play in this reality where the rules of maths hold. That his one and your one mean the same thing. That your accent doesn't have to match his. That you haven't got to be holding a certain pose when you say it. There are always implied rules, and Ballmer's implied rules are he'll use a whole number, not change his number and be fair. You could probably spend now until the end of time adding stipulations and he'd still be able to cheat.

I'd recommend never learning about philosophy as you'll disappear into nihilsm.

And lottery wins aren't taxable every where on the planet (e.g. the UK), so you made the same "mistake" as the author too!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: