A set of numbers such that when you remove one number, the rest can be arranged into two equal groups with the same sum

Suppose you have a set of 17 numbers with the property that if you remove any one of the numbers, the remaining can be split into two groups of 8 which sum to the same total. We will show that the numbers must all be equal.

If we have a set of 17 numbers with this property, then we can create another set of 17 numbers by subtracting a constant. So if a_1 \cdots a_{17} are one group of numbers with the property, then so is b_i = a_i - K.

Similarly, if a_i have the property, then so do c_i = K a_i, where K is a constant.

In particular, this means that we can shift it so that one of the numbers is zero.

To begin, let us assume the numbers in question are all integers. Let us take out the number which is zero. Therefore we get two groups which both sum to the same (integral) value, call it S. Therefore the sum of all the numbers T is simply 2S + 0. Therefore T is even.

Let us now take out another one of the numbers, call it k. So we get two groups of numbers again which sum to the same value, call it S’. So T = 2S’ + k. This implies that k is even, as is.

Therefore each of the numbers is even. But if that is the case, we can divide all the numbers by two to get another set of integers with the property. This process can be continued indefinitely, which is impossible unless all the numbers were zero to start with. Thus we conclude all the numbers were equal to zero. That means in the original case, before a constant was subtracted, they must have all been equal. (c.f. Proof by Infinite descent)

The case for rational numbers is simple, as you may just multiply all the rational numbers be a constant in order to make them all integers (for instance, by multiplying by the product of the denominators). Thus they are all equal.

The case for real numbers requires a trick. Denote the 17 numbers by a_i. These numbers generate a finite dimensional vector space over the rational numbers (of dimension at most 17, call it N). Therefore we can choose a basis (without the Axiom of Choice). Call this basis r_j, 1 \le j \le N \le 17.

Therefore each of the a_i can be written uniquely as a sum of the r_j multiplied by a rational number. Write [a_i]_j for the coefficient of r_j in this representation.

Fix j. Then the set of 17 numbers [a_i]_j, 1 \le i \le 17 form a set of rational numbers with the property that taking out any one of them allows the remaining to be rearranged into two groups with the same sum. This is because the set of 17 real numbers have this property, and whatever the two sums are, the r_j coefficients must be equal by the uniqueness of basis representations.

However, this implies by the above that the value of [a_i]_j is constant for different values of i.

Therefore all the coordinates are individually constant, and so the 17 numbers are all equal.

This problem is from Problems and Theorems in Classical Set Theory. Obviously, no use is actually made of the fact there are 17 numbers. In fact, no use is made of the fact that the two groups of numbers are equal sizes.

Was Wronski Wrong?

I was reminded of a problem from a text book from school. The book was MEI Pure Mathematics 5, First Edition. It had the following question:

The Polish mathematician Hoëné Wronski (1778-1853) once wrote that:

\pi = \frac{2\infty}{\sqrt{-1}}\left\{ (1+\sqrt{-1})^{1/\infty} - (1-\sqrt{-1})^{1/\infty}\right\}

Was Wronski wrong?

At the time I found this problem very hard and I was very proud to have solved it. Now I am older, I don’t understand how I could have ever found it difficult, but I suppose that is a good thing.

Let us first turn it into modern notation. We wish to find q, where:

q = \lim_{x \rightarrow \infty} \frac{2x}{i}\left\{ (1+i)^{1/x} - (1-i)^{1/x}\right\}

Note that 1+i = \sqrt{2} e^{i \pi/4} and 1+i = \sqrt{2} e^{i \pi/4}. Using these representations,

q = \lim_{x \rightarrow \infty} -2^{1+\frac{1}{2x}} xi \left\{ e^{i\pi/4x} - e^{-i\pi/4x}\right\}

And so

q = \lim_{x \rightarrow \infty} -2ix2^{1/2x} (2i \sin (\pi/4x))

Now, if we use the fact that \sin {x} \approx x for small x, we have:

q = \lim_{x \rightarrow \infty} 2^{\frac{1}{2x}} \pi

and hence q =\pi as required.

So, yes, Wronski was right.

Arithmetic: then and now

The Conservatives recruited Carol Vorderman to head a ‘mathematics task force’ before the election, in 2009. After the election, in August 2011, the ‘task force’ reported (Telegraph article / Original Report). I find it a little bit puzzling that Vorderman was selected for this, as she is neither a teacher nor a mathematician. (She only managed a 3rd in each of her years at Cambridge studying Engineering.) Nevertheless, the bulk of the work was done by Roger Porkess, it appears, who is head of MEI. He is good.

The report has some good points in it. Nevertheless, page 52 is interesting. It shows three exam questions from previous papers, and the following remark:

Our conclusion from looking at the papers from these three years is that O level was harder than what came before or after it. However, a comparison is not entirely fair because GCSE is designed for a very wide range of students: O levels were not. Discussion of the standard of this, or any other, qualification needs to take place in the context of its fitness for purpose. However, there is compelling evidence that grade inflation has occurred.

The questions are as follows:

(In what follows I will answer the questions. This is probably fairly pointless, as if you read this blog you can almost definitely do them yourself.)

I think the 2011 GCSE question is actually pretty good as approximating an expression is a useful, practical skill which is not so mechanical. Presumably, this was a question in an exam without a calculator. (At least when I did the exam, there was a non-calculator exam and a calculator based one.)

If I were to answer the question, I would multiply the fraction 95.9/(0.81\times 0.62) ‘top and bottom’ by 100, and approximate 95.9 as 100, to give 100^2/(8.1\times 6.2). The product in the denominator is then approximately 8\times6 = 48 \approx 50 and thus you get 100\times100/50 = 200, as required.

I consider that a fairly good, practical problem. It’s much shorter than the other two questions, but this can be attributed to a change of style.

The 1971 O Level question is not as good in my opinion. The first part is, I suppose, a test of their understanding of adding fractions. That the whole thing is itself in a fraction is presumably just to make the question seem harder. It shouldn’t really make the question any harder, but I suppose under pressure in an exam it does. The obvious thing to do is to multiply the fraction ‘top and bottom’ by xy. This gives:

\frac{xy}{x+y} = \frac{3 \times 7}{3+7} = \frac{21}{10} = 2.1

The fact the 3 and 7 were chosen, which sums to 10, makes the question easier if it wanted an answer as a decimal expansion. Presumably an answer as a fraction would be acceptable too.

Let us skip the second part for a moment and head to the third. Clearly the values of and q do not matter here, as they are cancelled out. The answer is hence merely 3. If for whatever reason you do not notice this, you have a lot of calculating to do! Nevertheless, it is a simple factorisation and will certainly be tested in more modern exams.

Going back to the second part – it is a pain really. Log tables are the easiest solution within the rules of the exam. Rather than find any particular shortcut, it is easier to ‘just do it’. Find the log of p, double it, anti-log it. Do the same with q. Add the two numbers together. Log it. Half it. Anti-log it. Then we got the answer. It’s boring, and how is it any better than using a calculator? All you are doing is looking some numbers up in a table and occasionally adding or halving them. It’s just a procedure. Perhaps it was useful then, but it isn’t useful now.

The first question, from the 1931 School Certificate, I actually like quite a bit. Let us do the third part first, as it the easiest. It is well known that 1/3 = 0.333…, and that 1/100 = 0.01. Thus 1/200 = 0.005. It follows that:

1/3 - 1/100 - 1/200 = 0.318333\ldots \approx 0.31833

Now let us go back to the first part. We wish to find find the decimal expansion of 7/22. (Note that 22/7 is a popular approximation of pi, which comes from the continued fraction expansion of the number. 7/22 is also the approximation from the first three terms in the continued fraction expansion of 1/pi)

We could just use long division and not think about what we are doing, but I prefer to do these calculations more explicitly. To find the first decimal place of 7/22, we can multiply it by 10 and then take the integer part. This gives 70/22 = 3 + 4/22. Hence it starts 0.3. We then continue with 4/22 – multiply by 10 and take the integer part. This will clearly give the next decimal place. Now 40/22 = 1 + 18/22. Thus the expansion of 7/22 starts 0.31. Continue again: 180/22 = 8 + 4/22. Thus it continues 0.318. We have already seen 4/22, and hence it will yield a sequence continuing 1 then 8 forever. Hence the expansion is 0.318181818…, or 0.31818 to five decimal places.

The second part is unfortunately basically identical, and so is rather dull. We have:

  • 710/223 = 3 + 41/223
  • 410/223 = 1 + 187/223
  • 1870/223 = 8 + 86/223
  • 860/223 = 3 + 191/223
  • 1910/223 = 8 + 126/223
  • 1260/223 = 5 + 145/223

And hence the decimal expansion starts 0.318385, and so is 0.31839 to 5 decimal places.

Therefore our numbers are 0.31818, 0.31839 and 0.31833. The closest to the target number of 0.31831 is thus the third. (What were they playing at! 106/333 is much closer than all of them!)

Perhaps the third part is still useful today, but exactly evaluating the decimal expansions of fractions probably isn’t because of computers. So it is merely more modern.

It’s hard to say really which questions were harder. I didn’t like Part II of the second question as I couldn’t think of an easy way to do it. The other parts of the question were brainless however. The third question was interesting, but the first question was the most interesting, except for Part II which required you to do a load of arithmetic which was almost identical to what you had just done, which is boring. No wonder everybody hates maths.

I don’t have any solid conclusions myself. Indeed, when I was at school I looked through A-Level mathematics exams from the 1960s onwards. I actually thought the hardest papers were in the years 1990-1992. Previous years did not require you to understand anything, but just required you to ‘do’ things which could be learnt without understanding why. Later years were just easy. Happily, the exams I did myself were of the easy variety !

It is often said that people are worse at maths now than previous generations. Perhaps this is true, but I’m not entirely convinced previous generations were any good at it either. For instance, perhaps people who went to score in the 1970s can multiply and divide long integers, but in my experience from having seen people do it, they merely perform an algorithm they have learnt and never understand it. My calculator can do that, and it can do it better. When something goes wrong, they don’t know what they might have done wrong because they don’t know why it works. They might not remember what to do when they encounter a zero, etc.

Perhaps I will try and expand on this subject in future.

Expected number of different birthdays

Suppose there are n people in a room. It is the well known ‘birthday problem‘ that if n > 22 the probability of two people sharing the same birthday is greater than 50%, and so it is very likely that there aren’t n unique birthdays. However, what is the expected number of unique birthdays?

[Let us assume there are D days in a year, in order to be more general.]

Actually, the expected number of unique birthdays is very easy to calculate. Let e(n) denote the expected number of unique birthdays when there are n people.

Choose one of the n people. Then either they have a different birthday than everybody else, or they don’t. They have a different birthday than the other n-1 with probability \left(\frac{D-1}{D}\right)^{n-1}. Let us write \phi for (\frac{D-1}{D}), and so the probability that they have a different birthday to everybody else is \phi^{n-1}. In this case, the expected number of unique birthdays is 1+e(n-1). In the other case, which thus has probability  1 - \phi^{n-1}, the expected number is just e(n-1).

Hence,

 e(n) = \phi^{n-1}(1+e(n-1)) + (1-\phi^{n-1})e(n-1) = \phi^{n-1} + e(n-1)

As e(1) = 1, it follows that:

e(n) = \phi^{n-1} + \phi^{n-2} + ... + \phi + 1 = (\phi^n - 1)/(\phi-1) = D(1-\phi^n)

 (As one should expect, as n tends to infinity this tends to D, as it is almost certain that every birthday gets taken)

For instance, with D = 365, you should expect about 21 different birthdays when there are 22 people, or 46 different birthdays when there are 50 people. When there are 1000 people, there will be around 241 different birthdays. Perhaps it is quite surprising that you should expect there to be so many unclaimed birthdays with so many people!

Consider the following problem:

Suppose a factory is in a country with the following law: Every worker must work every day, unless it is one of the workers birthdays. In this case, everybody has the day off. How many people should you hire in order to maximize the expected number of man-days worked every year?

Naturally, if there is only a single person he will certainly work 364 days (D-1) a year. If there are two, they are likely to have different birthdays and so each will work 363 days a year (D-2), giving a total of 726 days.

The expected number of man days worked will be the number of people, n, multiplied by the expected number of non-birthdays. The number of non-birthdays is just D – e(n). Thus we are trying to maximise the following:

W(n) = n(D-e(n)) = nD\phi^n

Considering n as a real number instead of an integer, one can find that the function W has a maximum at n = -1/\log\phi = -1/\log(1-1/D) \approx D - 0.5. Thus the maximum must be around this, and so for all but the smallest D (where the approximation is not valid), the maximum will be either D-1 or D.

In fact, it is easily seen that W(D) = W(D-1). It follows that one should employee either D or D-1 people. Since you will probably have to pay these workers, it is better to go for the D-1 people.

A simple change of this problem is to consider that for each man-day of work in the factory, you earn £a. However, you must pay each worker £y per year. Then instead you are maximising the yearly income of:

V(n) =anD\phi^n - ny

The derivative of this is V'(n) = aD\phi^n + a D n \phi^n \log \phi - y. Hence if V'(n) = 0 we must have:

y/aD = \phi^n(1+n \log \phi) = e^{n \log \phi} ( 1 + n \log\phi )

Define w = 1 + n\log\phi so that:

y/aD = w e^{w-1}

And so we have that:

ey/aD = w e^{w}

Let us note that y/aD is the ratio between what you pay the worker per year, and what he earns for you per year if he were to work every day. Let us call this ratio P.

Note that when z is positive, the equation z = w e^w always has a unique solution. The function which takes z to w is called the ‘Lambert W Function‘. Hence we have that:

W(eP) = W(ey/aD) = w = 1 + n\log\phi

It follows that:

n = (W(eP)-1)/\log\phi

As P will be small, and in particular less than 1, an approximation of the Lambert W function may be useful. We will use the Taylor Series approximation W(x) \approx x-x^2+3x^3/2 - 8x^4/3. Along with the usual log approximation, this gives:

n \approx D-\frac{e y}{a}+\frac{e^2 y^2}{a^2 D}-\frac{3 e^3 y^3}{2 a^3 D^2}+\frac{8 e^4 y^4}{3 a^4 D^3}

Or, in terms of P,

n \approx \frac{1}{6} D \left(6-6 e P+6 e^2 P^2-9 e^3 P^3+16 e^4 P^4\right)

For instance, suppose that I pay my workers £18,000 per year. They generate about £350 in profit for me each day they work. They work every day, except for when one of them is a birthday. By evaluating the W function exactly, I find out that the maximum expected profit is when there are around 260 workers. The approximation yields 269 workers. If the profit they generate per day were to be decreased to £75then the approximation is completely inaccurate. Whilst I should only employee 72 people, the approximation gives over 7,000!

Generalising Sylvester’s Law of Inertia

Sylvester’s Law of Inertia is a statement about how the Eigenvalues of Hermitian matrices change under matrix *-congruence. Equivalently, it is a statement about possible forms of diagonal, normalised matrix representations of sesquilinear forms.

Using the former interpretation, it is possible to generalise the theorem a bit. This generalisation is in fact of practical importance. (This importance is not described here. Essentially it naturally arises because if the covariance matrix of a random vector X is R, then the covariance matrix of AX is ARA’, where a dash denotes conjugate-transpose).

The Axiom of Choice for Finite Sets

Define Cn as the statement that every set (irrespective of cardinality) with all elements of cardinality n has a choice function. One fact is that C2 implies C4.

This document explores these implications.

“I think that usually with 10 points all the teams go through always, 99 percent” – The Champions League Group Stage

Manchester City failed to qualify from their group in the Champions League despite getting ten points, prompting their manager, Mancini, to say what is quoted in the title of this post.

A group in the Champions League has 4 teams in it. Each team player each other twice. There are three points for a win, one for a draw, none for a loss. Since 1999-2000, the top 2 teams (ordered by points, then in the case of ties head-to-head goal difference, and then general goal difference) in each group quality to the next round.

Is it true that ten points has guaranteed qualification 99% of the time? Well, it’s not too far out at 2.5%. On 158 occasions has a team achieved 10 or more points, since the 1999-2000 season (but excluding this season). Only four of those teams failed to qualify, and each of those got exactly 10 points.

These were:

  • Werder Bremen in 2003-4 (Barcelona got second play with 11 points)
  •  Olympiacos and Dynamo Kyiv in 2005-6 (Liverpool and Real Madrid went through in second place with 10 and 11 points respectively)
  • PSV Eindhoven in 2006-7 (Deportivo had 10 points but went through instead of them).

Whilst in practice no team has ever failed to go through with 11 points, it is in fact possible to still miss out with 12 points. Imagine three teams A, B and C who win all their home games against each other (and so lose all their away games against each other), but all beat D both home and away. Then you get the following table, where C misses out on qualification:

13 points does ensure you qualification. As suppose you fail to qualify with 13 points. Then another two teams must get at least 13 points, and so there are at least 39 points on the board. But it is only possible to get 36 points on the board, as there are 12 games and each game can add up to three points on the board.

In reality, no team has qualified with less than 7 points. However, teams have qualified with exactly 7 points. Werder Bremen in 2005-6 and Lokomotiv Moscow in 2002-3 both came runners up to a high-scoring Barcelona. In theory, it is possible to get four points and still qualify. Imagine team A wins every game and teams B, C and D draw against each other. Then you get the following table:

The Determinate of the Anti-Diagonal Matrix

Let us define the “anti-diagonal” matrix of dimension k, Jk, as matrix with ones on the anti-diagonal and zeros everywhere else. For instance, the first few matrices are:

The question is to find the determinate of Jk. It’s a straight forward question, but I am writing a blog post on it because it is interesting in that it allows a few different approaches to easily be taken, and so it is interesting to see which you select. This post will document five solutions to the problem.

Solution 1: Calculation via minors

The usual way of actually calculating the determinate of matrices is to use ‘minors‘. Let us use the minors expansion in the first column. All the terms in the first column are zero except for the bottom one, and so you get that:

|J_k| = (-1)^{k+1} |J_{k-1}|

With the extra condition that J1 = 1, it follows that:

|J_k| = (-1)^{ (k+1) + k + (k-1) + \cdots + 2} = (-1)^{(k-1)(k+4)/2}

In order to bring it to the same form as the following solutions, we note that if k = 2r + s then $latex (-1)^{(k-1)(k+4)/2} = (-1)^{(r+3s+s^2)/2 = (-1)^r&bg=ffffff$ where s is either zero or one.

Solution 2: Using Eigenvalues and the trace

This method is particularly elegant.

Note that as J_k^2 = I and so all Eigenvalues \lambda of Jk have \lambda^2 = 1, and thus all the Eigenvalues are either 1 or -1.

Let Tr(A) be the trace of a matrix A. We have that Tr(Jk) = 0 when k is even, and Tr(Jk) = 1 when k is odd. The trace is the sum of the Eigenvalues, and so in either of the cases k = 2r or k = 2r + 1, we must have r pairs of -1 and 1 Eigenvalues. (In the odd case we have an addition 1 Eigenvalue.) As the determinate is the product of the Eigenvalues, it must be |J_k| = (-1)^r.

I like this method the best, but if it had a flaw it would be that it doesn’t easily generalise to the case when the anti-diagonal elements are arbitrary, instead of being ones. However, this doesn’t really matter, as the properties of a determinate just mean you can ‘pull out’ the values anyway, and so it is just the product of the diagonal elements multiplied by |Jk|.

Solution 3: Via Swaps

The determinate is such that if you swap two adjacent columns (or rows), you reverse the sign. We also know that the determinate of the identity matrix is 1. Therefore it should be possible to turn Jinto I using a series of swaps.

Therefore the problem reduces to finding how many swaps can turn the sequence [k,k-1,...,2,1] into the sequence [1,2,...,k-1,k]. To solve this, note it takes k-1 swaps to ‘push the k to the right’, and another k-2 to push the 1 back to the left. Therefore after 2k-3 swaps you get  [k-1,k-2,...,1,k] . After that, you can apply the same result for k’ = k-2. Therefore we have that |Jk| = -|Jk-2|.

Given the base cases for k=1 and k=2, the result follows.

Solution 4: Via Permutations

This method is essentially the same as the above, but carried out in a different way.  We use the following definition of a determinate:

Using this definition to calculate |Jk|, we note that only one permutation contributes anything to the sum: The one which takes i to k-i. In this case, the determinate is just the sign of the aforementioned permutation.

The permutation swaps the first and last, second and second-last, etc. Therefore, if k = 2r, there are r permutations. If k = 2r + 1, there are still r permutations, as the central element (r+1) remains fixed. Hence the determinate of k is (-1) to the power r.

Solution 5: Directly via the Eigenvalues

Whilst this is a difference approach to Solution 4, it gives fairly similar calculations. The determinate of a matrix is the product of the Eigenvalues. Thus we can just determine the Eigenvalues.

Jswaps coordinate 1 with coordinate k. Therefore [1,0,...,0,1] is an Eigenvector with Eigenvalue 1. Also, [1,0,...,0,-1] is an Eigenvector which goes to negative itself, and so -1 is an Eigenvalue.

As above, there are r coordinates which are swapped when k = 2r or k = 2r + 1. Therefore the determinate is (-1) to the power r.

The solutions do have similarities, but they also have differences, and I think it is interesting the number of ways there are to approach this simple problem.

“An Introduction to Primes & Proof” back online

An Introduction to Primes & Proof” is a ‘mini-book’ intended for readers who like mathematics, but want to learn more about it. In particular, it is for the transition from doing algebra into learning about proofs. As the title suggests, prime numbers are the vehicle for doing this. It introduces proof by contradiction (using Minesweeper) and proves the basic theorems about prime numbers.

Earlier versions carried on a bit, talking about the zeta function. These sections have now been removed in order to make the it easier.

View

Three Birthdays On The Same Day

At the company I work at, which has about 35 members of staff, it was discovered that three people have the same birthday. The so-called ‘Birthday Paradox‘ tells us that two people having a birthday on the same day is much more likely than one would imagine, so it is interesting to calculate the probability that three people have the same birthday.

Let us suppose that there are N members of staff. Let our probability space be the set of N-tuples of integers between 1 and 365 (inclusive). There are 365^N elements in this probability space.

Let us denote the event that there is at least one triplet B. Similarly, let us denote the event that there are exactly k pairs of people who are sharing a birthday (and no triplets or higher) as A_k.

Then we have that the probability that B happens is one minus the probability that A_0 or A_1 or A_2, \ldots happens. That is,

\mathbb{P}(B) = 1 - \mathbb{P}(A_0) - \mathbb{P}(A_1) - \mathbb{P}(A_2) - \ldots

Hence we must find an expression for A_k – the probability that k pairs of people have the same birthday and everybody else has a different one.

Recall that we are working in the space of ordered N-tuples. Therefore we need to select the positions of each of the pairs. For the first pair we do ”N choose 2″, for the second we do “(N-2) choose 2″, and so forth. So, selecting the positions of the birthdays we want to be duplicated, we can do:

\binom{N}{2} \binom{N-2}{2} \cdots \binom{N-2(k-1)}{2}

Note that we have an ordered list of k unordered pairs.

We can simplify the above expression to \frac{N!}{2^k (N-2k)!}.

Now, we need to choose the k birthdays for these pairs to have as the common day. Obviously, each pair must be different, else we’d have (at least) four people with the same birthday, and we don’t want more than two people with birthdays on the same day.

Suppose we pick the k days in an ordered fashion (i.e. 365\times364\times\cdots\times(365-k+1)). Then we will have over-counted, because the pairs are in an ordered list. For instance, suppose we picked the ordered list of unordered pairs “Person 1 & Person 2″, “Person 3 & Person 4″. Then we selected the ordered list of birthdays “1st Jan, 2nd Jan”. Then this would give us the same thing as if we’d selected the pairs “Person 3 & Person 4″, “Person 1 & Person 2″ with the list of birthdays “2nd Jan, 1st Jan”. In this case we over counted by a factor of 2. In general, we will overcount by the number of permutations of the list of pairs. This is k!.

Now, the remaining N-2k people can have any remaining birthday. So the first has a choice of 365-k, the next a choice of 365-k-1 and so forth. Therefore,

|A_k| = \frac{N!}{2^k (N-2k)!}\frac{( 365 \times \cdots (365-n+k+1))}{k!}

So, for instance, in Mathematica:

A[N_, k_] := N!/(2^k (N - 2 k)!) * 365!/((365 - (N - k))!*k!)

We can verify this by ensuring that 1 - \mathbb{P}(A_0) is the probability that at least two people will share a birthday in a group of N people. Remember that \mathbb{P}(A_k) = |A_k|/365^N.

We wish to find the sum of the A_i. There is no obvious closed form expression for it, so it will suffice to numerically calculate it for the desired values. Doing so yields the following graph:

The probability of three birthdays on the same day in a group of various sizes.

In particular, for my situation – that out of a company of 35 people, three had the same birthday – has a probability of about 4.5%. In fact, these members were at the company back when it was only 12 people strong, and then it had a probability of only 0.33%.

You need a group of 87 people before the chance of three people having a birthday on the same day is 50%. 75% happens after 110 people and 25% at 65 people.

The following graph shows the probability of having two people sharing a birthday in blue and three people sharing a birthday in red:

The red is the probability of three people (or more) sharing a birthday. The blue is the probability of two people (or more) sharing a birthday. The horizontal axis is the number of people; vertical is the probability.

This spreadsheet gives the full table the probabilities. (Apologies for the Excel format, WordPress didn’t seem to want plaintext files uploaded.)

Follow

Get every new post delivered to your Inbox.