Off Topic: Are You Smarter Than Google?

21 Dec 2018 • 7 minute read

It's the day before Cadence is shut down for the holidays. Breakfast Bytes will resume normal service on January 2nd. So today is something off-topic. I thought I'd talk about two very contentious problems in mathematics that manage to get PhDs saying that the solution is "obvious" while picking different solutions. If you've not seen them before, I can almost guarantee that you'll get the answer wrong, or get it right through luck and faulty reasoning.

The first is known as the Monty Hall Problem. It dates back to 1975 but was popularized by Marilyn vos Savant in 1990 in, of all places, Parade Magazine, a Sunday insert magazine back in the days when journalism was printed on dead trees.

The second problem was supposedly one of those infamous interview questions at Google, and was popularized by economist Steven Landsburg on his blog and in his recent book Can You Outsmart an Economist? We'll call it "Are You Smarter Than Google?" since that's what he calls it. It is also apparently very old, but keeps being revived.

Let's start with the Google problem since that is a lot newer (at least to me).

Are You Smarter Than Google?

Here's the problem:

There's a certain country where everybody wants to have a son. Therefore each couple keeps having children until they have a boy and then they stop. What fraction of the population is female?

Of course, you can't know for sure, so, like all probability problems, it has to be answered in expectation: the average you would get if the experiment were repeated over many countries.

Steven Landsburg first came across this in a puzzle book when he was a kid, and the answer in the book, which is apparently also the answer Google expected, is incorrect. That answer, which was my reasoning too when I first saw the problem on his blog, was that each birth has a 50 percent chance of being a girl, and nothing that the parents do can change that. Each child is equally likely to be a boy or a girl. So half of all the children are girls.

This answer is actually correct up to the very last sentence. The surprising fact is that although it is true that each individual child is equally likely to be a boy or a girl, it does not follow that (in expectation) half of all children are girls. Two quantities can be equal in expectation, but this tells you nothing about their expected ratio. Probability and statistics are full of traps for the unwary, such as Simpson's Paradox, which I discussed in my Labor Day off-topic post Almost Everyone Has More Than the Average Number of Legs. In that, a player can have a better baseball batting average than another player two years running, but still have a worse average if the two years are taken together.
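To see how the batting-average version of Simpson's Paradox can happen, here is a small Python sketch. The hits and at-bats are numbers I made up for illustration (they are not real players' statistics), and the helper names are just placeholders. The trick is that the two players have very different numbers of at-bats in each year.

```python
from fractions import Fraction

# Hypothetical (hits, at-bats) for two seasons; NOT real players' statistics.
player_a = [(4, 10), (25, 100)]
player_b = [(35, 100), (2, 10)]


def average(hits: int, at_bats: int) -> Fraction:
    """Batting average for a single season."""
    return Fraction(hits, at_bats)


def combined(seasons: list[tuple[int, int]]) -> Fraction:
    """Batting average over all seasons pooled together."""
    return Fraction(sum(h for h, _ in seasons), sum(ab for _, ab in seasons))


for year, (a, b) in enumerate(zip(player_a, player_b), start=1):
    print(f"year {year}: A = {float(average(*a)):.3f}, B = {float(average(*b)):.3f}")
print(f"combined: A = {float(combined(player_a)):.3f}, B = {float(combined(player_b)):.3f}")
```

With these made-up numbers, player A is ahead in each year taken separately, but player B is ahead once the two years are pooled, because most of B's at-bats came in the good year and most of A's came in the bad one.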

The correct answer is that it depends on the size of the country. As is often the case in math, insight is gained by pushing to very simple examples. Sometimes examples that are what mathematicians call "trivial" are useful, in this case, a country with no population. Unfortunately, there are no children, so we can make trivial statements like "all children are girls" but also "no children are girls". More usefully, we can look at a country with one family. With probability 1/2 they have a boy as their first child, and the fraction of girls is zero (no girls, one boy). With probability 1/4 they have GB and the fraction is 1/2 (one girl and one boy). With probability 1/8 they have GGB and the fraction is 2/3 (two girls and one boy). And so on. Of course, there is always exactly one boy; the question is how many girls there are.

Now to add it all up. With probability 1/2 there are 0 girls, with probability 1/4 there is 1 girl, with probability 1/8 there are 2 girls, and so on. Weighting each count of girls by its probability gives 0 × 1/2 + 1 × 1/4 + 2 × 1/8 + ..., a well-known infinite series that sums to 1 (just add up a few terms and it is already getting close). So the expected number of boys is one, the same as the expected number of girls.

But that is not what we were asked. We were asked for the fraction of girls. To get that, we add up the fractions, each weighted by its probability. There is a probability of 1/2 that it is 0 (no girls). There is a probability of 1/4 that it is 1/2 (one girl, one boy). There is a probability of 1/8 that it is 2/3 (two girls and a boy). And so on. These add up to about 0.307 (the exact value is 1 − ln 2), meaning that the expected fraction of girls is about 30.7% (so not far off 1/3 rather than 1/2).
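If you would rather check the two one-family sums than take them on trust, a few lines of Python will do it. This is only a sketch: the function name and the cutoff of 200 girls are my own choices (the terms beyond that are far too small to matter), and it assumes each birth is an independent 50:50 chance.

```python
import math


def single_family_sums(max_girls: int = 200) -> tuple[float, float]:
    """Return (expected number of girls, expected fraction of girls)
    for one family that stops at its first boy."""
    expected_girls = 0.0
    expected_fraction = 0.0
    for girls in range(max_girls + 1):
        # `girls` girls followed by one boy has probability 1/2**(girls + 1).
        probability = 0.5 ** (girls + 1)
        expected_girls += probability * girls
        # The family has girls + 1 children, of which `girls` are girls.
        expected_fraction += probability * girls / (girls + 1)
    return expected_girls, expected_fraction


girls, fraction = single_family_sums()
print(f"expected girls per family:  {girls:.6f}")
print(f"expected fraction of girls: {fraction:.6f}")
print(f"1 - ln 2:                   {1 - math.log(2):.6f}")
```

The first sum comes out at 1 and the second at roughly 0.3069, matching 1 − ln 2, so the expected counts are equal even though the expected fraction is nowhere near 1/2.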

We can do similar calculations for countries with any number of families. If there are 10 families it is 47.51% girls, for example. If the numbers get large, say 5,000 families, the fraction gets close to 1/2 at 49.995% but it never gets there, and even with millions of families it is still just a teeny amount below 50%. Only in the case of a country with an infinite number of families, which doesn't exist, is the 50% answer correct.
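The same calculation for a country with any number of families can be sketched in Python as well. This is my own way of setting it up, not something taken from Landsburg's book: with k families there are always exactly k boys, and the total number of children n follows a negative binomial distribution, so the chance of exactly n children is C(n−1, k−1)/2^n. The function name and the cutoff for the sum are assumptions for illustration, and the logarithms are only there to avoid floating-point underflow for large countries.

```python
import math


def expected_girl_fraction(families: int) -> float:
    """Expected fraction of girls when every family stops at its first boy.

    Assumes independent births, each a boy or a girl with probability 1/2.
    """
    k = families
    # Total children n starts at k (every family gets a boy first time).
    # Stop well past the mean of 2k, where the remaining probability is negligible.
    n_max = 2 * k + int(40 * math.sqrt(2 * k)) + 100
    total = 0.0
    for n in range(k, n_max + 1):
        # log of P(total children == n) = log( C(n-1, k-1) / 2**n )
        log_p = (math.lgamma(n) - math.lgamma(k) - math.lgamma(n - k + 1)
                 - n * math.log(2))
        # With n children and k boys, the fraction of girls is (n - k) / n.
        total += math.exp(log_p) * (n - k) / n
    return total


for families in (1, 10, 5000):
    print(f"{families:>5} families: {expected_girl_fraction(families):.5f}")
```

The results creep up toward 1/2 as the number of families grows but stay below it, roughly following 1/2 − 1/(4k) for k families.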

As Steven Landsburg points out, you can't just say that the tiny difference is irrelevant. If you do the same problem, but instead of asking the fraction of girls, you ask the ratio of boys to girls, then one of the terms that you have to include in the series is the tiny possibility that everyone gets a boy on the first try, with an infinite ratio of boys to girls. The probability might be low, but a tiny number times infinity is still infinity, so that answer is infinite, which is a long way from the 1 you might naively expect.

Advanced question: the solution as given assumes that everyone has finished their family. But it would be more realistic to take a snapshot in time, in which case some families haven't had any children yet, and some have so far had only girls. Does it make any difference?

The answer to this variation, along with lots of other mind-bending puzzles, is in the highly recommended (by me anyway) book Can You Outsmart an Economist?

The Monty Hall Problem

As originally asked in Parade in the Ask Marilyn column:

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

At first sight, it seems that switching can't change your odds from being 50:50, which means that the answer would be that it makes no difference. But actually, you increase your chances to 2/3 by switching, versus only 1/3 by sticking with the door you already picked. When I said that it seems you can't change your odds, I talked about the wrong odds. When you first picked a door, your chance of getting the car was 1/3 (because there are three doors). Nothing can change that probability. Nothing that Monty Hall does with doors and goats changes that. Given that, the door that Monty could have opened but didn't has a chance of 2/3 (since there are only two choices left and they have to sum to one).

To make it a bit more concrete, since the details matter, we need to assume that Monty Hall, the host, always opens one of the other two doors, and always (since he knows where everything is) picks a door with a goat behind it. If you have picked the car, he can open either of the other doors. But if not, he has only one choice, the one without the car.

I think the easiest way to think about it is that you pick one door. There is a probability of 1/3 you have the car. There is a probability of 2/3 that the car is behind one of the other two doors. Monty opens a door. There is still a probability of 2/3 that the car is behind one of those two doors; you just know an extra piece of information, that the probability is 0 behind the open door with the goat. The above is NumberWatch and Wikipedia's pictorial representation.
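If the argument still doesn't feel right, you can simply play the game a lot of times. The Python sketch below is my own illustration, with made-up function names, and it bakes in the assumptions above: the host always opens a door, never the one you picked, and never the one with the car. When you have picked the car, I have assumed he chooses between the two goat doors at random. The fixed seed is only there to make the run repeatable.

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a goat door that is not the contestant's pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the original pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch: bool, trials: int = 100_000, seed: int = 2018) -> float:
    """Fraction of games won over many simulated plays."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print(f"stay:   {win_rate(switch=False):.3f}")
print(f"switch: {win_rate(switch=True):.3f}")
```

Over a large number of games, sticking should win about a third of the time and switching about two thirds.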

For more information, the problem has its own Wikipedia page The Monty Hall Problem. I like to say that there's an XKCD for everything, and of course, there is for this one. Wishing you all the best for the holidays and see you in 2019.
