Sunday, May 23, 2010

McKinney's Generalization of the Birthday Problem: Part II

In Part I of the article, we considered the expression for computing the probability that at least r people will have the same birthday, given n people are selected at random.

In this article, we will consider a special case to illustrate the use of formula numerically: Suppose five people are selected at random, what is the probability that at least three people will share the same birthday?"

In this example, the Frobenius equation n1 + n2 = 5 has three sets of solutions and they can be tabulated as follows:

Following McKinney, we define the general form of the probability P (n; n1, n2) as

It follows that P(E) can be computed as follows:

The required probability that at least three people are sharing their birthdays is therefore:

Alternatively, we could also arrive at the same result if the problem is approached from another direction, but we would require a slight modification on McKinney's formula.

The required probability is thus the summation of the following fractions:

This is approximately 0.007 percent. From the probability table above, we noted that if we were to select five people at random, chances are 97.3% that they will all have unique birthday, this is consistent with our intuition.

Read more!

McKinney's Generalization of the Birthday Problem: Part I

In 1966, E. H. McKinney of Ball State University published a paper entitled "Generalized Birthday Problem" in the American Mathematical Monthly, in which he seek to answer two questions related to the famous birthday problem.

The problems considered by McKinney are (a) Suppose n people are selected at random. What is the probability that at least r people will have the same birthday? (b) What is the smallest value of n such that the probability is greater or equal to 1/2 that a least r people to have the same birthday? What follows is the solution by McKinney with some notational adjustment.

Let Xj (= 1, 2, ... , n) be the n random birthdays, let event E' be defined as "r or more birthdays are equal" and the event E be defined as "none of the r random birthdays are equal", then we obviously have P(E)+P(E') = 1, in which P(E) can be computed by summing the probabilities of all ways in which the n random birthdays can take on less than r equal values. Suppose we use the following notation:

Then it is obvious that nj are governed by the following Frobenius equation (also called the Diophantine equation of Frobenius):

In general, there are m sets of solutions to the Frobenius equation and we denote the i-th solution as

The general term of the summation representing P(E) is the probability that there are exactly n1 unique birthdays, n2 pairs of equal birthdays, n3 triples of equal birthdays, ... , nr-1 (r-1)-tuples of equal birthdays, which is given by

Therefore, the probability that r or more birthdays are equal is given by

McKinney was using an IBM 7090 computer to compute the probabilities and he stopped at r = 4. For r = 5, the computation for one case of n is estimated by McKinney to take more than 2 hours. Thus it would take him 26 days to arrive at the solution if the computation was attempted.

I tried to repeat McKinney's calculation for second problem in Mathematica 7 and detected that the third/fourth decimal place of the probabilities given in original paper was not correct, possibly due to roundoff errors in his IBM 7090.

It took only 22 seconds to compute the probabilities in Mathematica for the case of r = 5. Not surprisingly, I ran into memory problem when the case of r = 6 is attempted as the number of terms in the summation is expected to exceed 15 millions. In Part II of the article, a numerical example will be given to illustrate the use of the formula for a simple case.

Read more!

Amazon Contextual Product Ads