Tuesday, July 28, 2020

Coder's Block and the Birthday Problem


So, I've been going through something of a Coder's block, which is just a fancy way to say that I've been procrastinating too much instead of working. I get ideas, work on them with pen and paper, but when it comes to testing them, even very simplistic stuff is hard to write down. So, I thought maybe trying something very simplistic (like undergrad first year kind of simplistic) might help at least writing something. And maybe that would help with getting out of this fancy block.

When thinking of simple problems, the Birthday Problem came to mind, which asks-
What is the smallest n, such that in n randomly chosen people, the probability that a pair of them will have the same birthday is at least half?

Now, this problem can be solved analytically (and the solution's really neat as a counting problem). But I'm not really in a counting block, I'm in a coding block. So instead of solving this analytically, decided to do a Monte Carlo (MC) simulation for it. An MC simulation really comes down to generating different random numbers. So, for different values of n, we generate n random integers between 0 and 364 inclusive (why?) and see if any pair among the n are equal. We repeat this experiments a number of time for each n and count how many times, such an event has happened. And we treat the fraction of time this has happened as the prediction of the probability. (In actuality, to get a better prediction, we should repeat this experiment multiple times and take the means, but again, that's not the point of why I was doing this).
As it turns out, the number of people we need to get the probability above half --according to the results of our code, which is shown below-- is merely 23. This is the same as the analytically found result as well.
Now, this was a result that I used to find very intriguing back in high school. There are 365 days in a year after all. So, why can only 23 people give a probability of more than half that two of them share a birthday? Interestingly, this is easy to understand if we count the right thing. We are essentially looking at "pairs of people" than individuals for the Birthday Problem. And how many pairs of people are there with n peoples? n(n-1)/2. For 23 people, this is 253, a number much closer to 365. And suddenly the result does not seem so weird. This argument can naturally be extended to support the fact that soon the probability reaches almost 1. Any way, this 20-ish line code with a couple of for loops and ifs were easy enough. Let's just hope that doing simple exercises does help in getting back to actual coding that I "need" to do.

No comments:

Post a Comment