Keypads with paint on the buttons

Near my home there is a door that is opened by a keypad. I know that it uses a four-digit code, where the order matters, but I don’t know what the code is.

doorlock

An example of a keypad

When I walked past it the other day I noticed that there were blue fingerprints over three of the numbers on the keypad, but I did not look very closely and therefore there may have been four. If there were only three numbers with fingerprints on them then I would infer that one of the numbers in the code was repeated.

I wondered this: If I wanted to guess the code as quickly as possible would I prefer that there were three digits or four digits with paint on them?

Intuitively, most people (at least in my informal survey) seemed to expect that it is more constrained when there are only three digits, and so there are fewer combinations. After all, if there was only one blob of paint, so all the digits were the same, then there would be only one possible combination, so you may expect that each digit adds more freedom and thus more combinations.

In fact, this is not the case, and you would be better placed to guess the code if you knew it had four different digits.

The number of combinations to try if you know there are four different digits is of course a.  If instead it is known there are only three different digits in the code then one of them is repeated. To calculate the number fo combinations, imagine choosing the two positions in the 4-digit code that are the same digit – there are 6 (i.e. 4C2) ways to do this – and then choose which digit is the one which is repeated – there are 3 to choose from. The remaining 2 digits can be arranged in 2 ways – therefore there are b ways of doing it – 50% more!

I thought this was somewhat surprising.


In general, suppose the painted key-pad buttons are c and the lock takes an M digit code. Of course, d.

I think it is easier at this point to reason if we forget about locks and talk about functions instead. A function e2 is equivalent to a possible code if it is surjective. The code it represents is f2. As f is surjective every painted digit appears at least once, as is required.

We can therefore restate the problem as counting the number of surjective functions from {1,2,…,M} to {1,2,…,N}.

It’s easy to find out the number of functions if we drop the ‘surjective’ requirement – it is NM. This number includes non-surjective functions – where the image of the function is some proper subset of {1,2,…,N} – as well. An inclusion-exclusion argument can be used to calculate the number of surjections.

In particular, define cito be the number of functions f from {1,2,…,M} to {1,2,…,N} such that at least i of elements of {1,2,…,N} are missing from the image of f (i.e. the image of f is at most N-i).Ci is then easy to calculate – choose the i elements to exclude (there are NCi of them) and then assign to each f(j) one of the possible remaining N-i values. Repeat that for all M function inputs and hence:

x1

Then, by inclusion-exclusion, the number of surjective functions is:

x2

And hence the number of surjections, and hence the number of possible lock codes of length M using N different digits, each at least once, is:

x3

This is related to Stirling Numbers of the Second Kind.


Not all keypads use four-digit codes. Looking online, most digital keypad locks can use between 1 and 12 digit codes. The following chart shows the approximate number of combinations that are possible if the length of the code is known (the number on the left-hand side) and the number of painted digits is known (the number on the top):

Exactly

(Here a million is 106 and a billion is 109)

In reality, you may not know the number of digits in the code. Here is the number of combinations that must be tried if you know an upper-bound to the number of digits in the code:

UpToNumerical


A few additional thoughts:

  • Many digital keypads also require a card to be presented and the code is associated with the particular card (i.e. is a PIN). Then you will need access to the card too.
  • Most mechanical keypads, disappointingly, don’t actually require the digits to be entered in any particular order.
  • There exist keypads that randomise the digits every time they are used. In addition, they can use special screens that can only be seen when looking straight at them.
  • You may be tempted to think that the repeated digit, if there was one, might have more paint on it. However, it actually turns out that the first button pressed deposited much more paint on it than the others. By way of an example, if you are looking for a four-digit code and there are three paint splodges, then by identifying the first digit you reduce the 36 combinations to 12 (not 6, as the first digit may have also been the repeated digit).

A Bit of Graph Theory

I’ve written a document which explains some basic graph theory. In particular, it poses a few puzzle questions related to planar graphs and then solves them using Euler’s Formula.

I tried to write it in a casual style. The idea was that it should be easy to read whilst not giving up any mathematical accuracy. Actually, I’m not sure the style works at all, but it was worth experimenting.

Thank you to Kris for some suggestions to improve it. A version suitable for A5 booklet printing is available on my website.

graph_preview

Click to view/download

Self-Referential Number Square #1 (Updated)

I’ve updated the first self-referential number square so that there isn’t any need for any additional clues at the bottom. It now looks as follows:

Suggested solution

A set of numbers such that when you remove one number, the rest can be arranged into two equal groups with the same sum

Suppose you have a set of 17 numbers with the property that if you remove any one of the numbers, the remaining can be split into two groups of 8 which sum to the same total. We will show that the numbers must all be equal.

If we have a set of 17 numbers with this property, then we can create another set of 17 numbers by subtracting a constant. So if a_1 \cdots a_{17} are one group of numbers with the property, then so is b_i = a_i - K.

Similarly, if a_i have the property, then so do c_i = K a_i, where K is a constant.

In particular, this means that we can shift it so that one of the numbers is zero.

To begin, let us assume the numbers in question are all integers. Let us take out the number which is zero. Therefore we get two groups which both sum to the same (integral) value, call it S. Therefore the sum of all the numbers T is simply 2S + 0. Therefore T is even.

Let us now take out another one of the numbers, call it k. So we get two groups of numbers again which sum to the same value, call it S’. So T = 2S’ + k. This implies that k is even, as is.

Therefore each of the numbers is even. But if that is the case, we can divide all the numbers by two to get another set of integers with the property. This process can be continued indefinitely, which is impossible unless all the numbers were zero to start with. Thus we conclude all the numbers were equal to zero. That means in the original case, before a constant was subtracted, they must have all been equal. (c.f. Proof by Infinite descent)

The case for rational numbers is simple, as you may just multiply all the rational numbers be a constant in order to make them all integers (for instance, by multiplying by the product of the denominators). Thus they are all equal.

The case for real numbers requires a trick. Denote the 17 numbers by a_i. These numbers generate a finite dimensional vector space over the rational numbers (of dimension at most 17, call it N). Therefore we can choose a basis (without the Axiom of Choice). Call this basis r_j, 1 \le j \le N \le 17.

Therefore each of the a_i can be written uniquely as a sum of the r_j multiplied by a rational number. Write [a_i]_j for the coefficient of r_j in this representation.

Fix j. Then the set of 17 numbers [a_i]_j, 1 \le i \le 17 form a set of rational numbers with the property that taking out any one of them allows the remaining to be rearranged into two groups with the same sum. This is because the set of 17 real numbers have this property, and whatever the two sums are, the r_j coefficients must be equal by the uniqueness of basis representations.

However, this implies by the above that the value of [a_i]_j is constant for different values of i.

Therefore all the coordinates are individually constant, and so the 17 numbers are all equal.

This problem is from Problems and Theorems in Classical Set Theory. Obviously, no use is actually made of the fact there are 17 numbers. In fact, no use is made of the fact that the two groups of numbers are equal sizes.

The Average of Average Speeds

I was giving a series of questions to complete from a friend. They were from a job application test. The questions were not very sophisticated, but the last one was as follows (verbatim):

A cyclist has to complete 2 laps of a track at an average of 60 Mph to beat the World record. If he completes the first lap at 30 Mph, what must his average speed be on the second lap for the record to be broken?

A naïve approach would be to say that if the first lap was at 30 mph, and the second lap at 90 mph, then the overall average speed is the average of these two: 60 mph.

However, this is wrong. The intutive reason is that when the lap is completed with a higher average speed, the lap took less time to complete and so contributes to the average less.

Let us be explicit. Suppose that each lap is of distance D. Let us say that the first lap was completed with an average speed of v1, which took time t1. Likewise, let us say the second lap was completed with an average speed of v2, which took time t2.

Straight away we have the following relationships:

v_1 = \frac{D}{t_1}\qquad v_2 = \frac{D}{t_2}

Let us call the overall average speed u. Then u is:

u = \frac{\mbox{Overall distance}}{\mbox{Overall time}} = \frac{2D}{t_1+t_2} = \frac{2D}{\left(\frac{D}{v_1} + \frac{D}{v_2}\right)} = \frac{2v_1 v_2}{v_1 + v_2}

Suppose, as in the question, that our first lap was at 30 mph. We therefore have that u = \frac{60 \times v_2}{30 + v_2}. Suppose we wish u = 60. Then we have that 1 = \frac{v_2}{30+v_2}. Note that there is no value of v2 that makes this true!

Therefore the cyclist has blown his chance! He cannot possibly beat the world record!

Another way to see the conclusion is as follows. The world record time is distance/avg. speed = 2D/60 = D/30. The first lap was took a time of distance/avg. speed = D/30. Therefore all the time is taken, and we reach the same conclusion as before.

Given the lack of sophistication in the previous questions, it seemed this was not the answer they were looking for. It is certainly a lot harder than the others, especially given how it is a trick question! I thought it safer just to give the naïve answer, especially considering that it might be considered suspicious if the person I was doing it on the behalf of got it right.

Expected number of different birthdays

Suppose there are n people in a room. It is the well known ‘birthday problem‘ that if n > 22 the probability of two people sharing the same birthday is greater than 50%, and so it is very likely that there aren’t n unique birthdays. However, what is the expected number of unique birthdays?

[Let us assume there are D days in a year, in order to be more general.]

Actually, the expected number of unique birthdays is very easy to calculate. Let e(n) denote the expected number of unique birthdays when there are n people.

Choose one of the n people. Then either they have a different birthday than everybody else, or they don’t. They have a different birthday than the other n-1 with probability \left(\frac{D-1}{D}\right)^{n-1}. Let us write \phi for (\frac{D-1}{D}), and so the probability that they have a different birthday to everybody else is \phi^{n-1}. In this case, the expected number of unique birthdays is 1+e(n-1). In the other case, which thus has probability  1 - \phi^{n-1}, the expected number is just e(n-1).

Hence,

 e(n) = \phi^{n-1}(1+e(n-1)) + (1-\phi^{n-1})e(n-1) = \phi^{n-1} + e(n-1)

As e(1) = 1, it follows that:

e(n) = \phi^{n-1} + \phi^{n-2} + ... + \phi + 1 = (\phi^n - 1)/(\phi-1) = D(1-\phi^n)

 (As one should expect, as n tends to infinity this tends to D, as it is almost certain that every birthday gets taken)

For instance, with D = 365, you should expect about 21 different birthdays when there are 22 people, or 46 different birthdays when there are 50 people. When there are 1000 people, there will be around 341 different birthdays. Perhaps it is quite surprising that you should expect there to be so many (24) unclaimed birthdays with so many people!

Consider the following problem:

Suppose a factory is in a country with the following law: Every worker must work every day, unless it is the birthday of one of the workers. In this case, everybody has the day off. How many people should you hire in order to maximize the expected number of man-days worked every year?

Naturally, if there is only a single person he will certainly work 364 days (D-1) a year. If there are two, they are likely to have different birthdays and so each will work 363 days a year (D-2), giving a total of 726 days.

The expected number of man days worked will be the number of people, n, multiplied by the expected number of non-birthdays. The number of non-birthdays is just D – e(n). Thus we are trying to maximise the following:

W(n) = n(D-e(n)) = nD\phi^n

Considering n as a real number instead of an integer, one can find that the function W has a maximum at n = -1/\log\phi = -1/\log(1-1/D) \approx D - 0.5. Thus the maximum must be around this, and so for all but the smallest D (where the approximation is not valid), the maximum will be either D-1 or D.

In fact, it is easily seen that W(D) = W(D-1). It follows that one should employee either D or D-1 people. Since you will probably have to pay these workers, it is better to go for the D-1 people.

A simple change of this problem is to consider that for each man-day of work in the factory, you earn £a. However, you must pay each worker £y per year. Then instead you are maximising the yearly income of:

V(n) =anD\phi^n - ny

The derivative of this is V'(n) = aD\phi^n + a D n \phi^n \log \phi - y. Hence if V'(n) = 0 we must have:

y/aD = \phi^n(1+n \log \phi) = e^{n \log \phi} ( 1 + n \log\phi )

Define w = 1 + n\log\phi so that:

y/aD = w e^{w-1}

And so we have that:

ey/aD = w e^{w}

Let us note that y/aD is the ratio between what you pay the worker per year, and what he earns for you per year if he were to work every day. Let us call this ratio P.

Note that when z is positive, the equation z = w e^w always has a unique solution. The function which takes z to w is called the ‘Lambert W Function‘. Hence we have that:

W(eP) = W(ey/aD) = w = 1 + n\log\phi

It follows that:

n = (W(eP)-1)/\log\phi

As P will be small, and in particular less than 1, an approximation of the Lambert W function may be useful. We will use the Taylor Series approximation W(x) \approx x-x^2+3x^3/2 - 8x^4/3. Along with the usual log approximation, this gives:

n \approx D-\frac{e y}{a}+\frac{e^2 y^2}{a^2 D}-\frac{3 e^3 y^3}{2 a^3 D^2}+\frac{8 e^4 y^4}{3 a^4 D^3}

Or, in terms of P,

n \approx \frac{1}{6} D \left(6-6 e P+6 e^2 P^2-9 e^3 P^3+16 e^4 P^4\right)

For instance, suppose that I pay my workers £18,000 per year. They generate about £350 in profit for me each day they work. They work every day, except for when one of them has a birthday. By evaluating the W function exactly, I find out that the maximum expected profit is when there are around 260 workers. The approximation yields 269 workers. If the profit they generate per day were to be decreased to £75 then the approximation is completely inaccurate. Whilst I should only employee 72 people, the approximation gives over 7,000!

Three Birthdays On The Same Day

At the company I work at, which has about 35 members of staff, it was discovered that three people have the same birthday. The so-called ‘Birthday Paradox‘ tells us that two people having a birthday on the same day is much more likely than one would imagine, so it is interesting to calculate the probability that three people have the same birthday.

Let us suppose that there are N members of staff. Let our probability space be the set of N-tuples of integers between 1 and 365 (inclusive). There are 365^N elements in this probability space.

Let us denote the event that there is at least one triplet B. Similarly, let us denote the event that there are exactly k pairs of people who are sharing a birthday (and no triplets or higher) as A_k.

Then we have that the probability that B happens is one minus the probability that A_0 or A_1 or A_2, \ldots happens. That is,

\mathbb{P}(B) = 1 - \mathbb{P}(A_0) - \mathbb{P}(A_1) - \mathbb{P}(A_2) - \ldots

Hence we must find an expression for A_k – the probability that k pairs of people have the same birthday and everybody else has a different one.

Recall that we are working in the space of ordered N-tuples. Therefore we need to select the positions of each of the pairs. For the first pair we do “N choose 2”, for the second we do “(N-2) choose 2”, and so forth. So, selecting the positions of the birthdays we want to be duplicated, we can do:

\binom{N}{2} \binom{N-2}{2} \cdots \binom{N-2(k-1)}{2}

Note that we have an ordered list of k unordered pairs.

We can simplify the above expression to \frac{N!}{2^k (N-2k)!}.

Now, we need to choose the k birthdays for these pairs to have as the common day. Obviously, each pair must be different, else we’d have (at least) four people with the same birthday, and we don’t want more than two people with birthdays on the same day.

Suppose we pick the k days in an ordered fashion (i.e. 365\times364\times\cdots\times(365-k+1)). Then we will have over-counted, because the pairs are in an ordered list. For instance, suppose we picked the ordered list of unordered pairs “Person 1 & Person 2”, “Person 3 & Person 4”. Then we selected the ordered list of birthdays “1st Jan, 2nd Jan”. Then this would give us the same thing as if we’d selected the pairs “Person 3 & Person 4”, “Person 1 & Person 2” with the list of birthdays “2nd Jan, 1st Jan”. In this case we over counted by a factor of 2. In general, we will overcount by the number of permutations of the list of pairs. This is k!.

Now, the remaining N-2k people can have any remaining birthday. So the first has a choice of 365-k, the next a choice of 365-k-1 and so forth. Therefore,

|A_k| = \frac{N!}{2^k (N-2k)!}\frac{( 365 \times \cdots (365-n+k+1))}{k!}

So, for instance, in Mathematica:

A[N_, k_] := N!/(2^k (N - 2 k)!) * 365!/((365 - (N - k))!*k!)

We can verify this by ensuring that 1 - \mathbb{P}(A_0) is the probability that at least two people will share a birthday in a group of N people. Remember that \mathbb{P}(A_k) = |A_k|/365^N.

We wish to find the sum of the A_i. There is no obvious closed form expression for it, so it will suffice to numerically calculate it for the desired values. Doing so yields the following graph:

The probability of three birthdays on the same day in a group of various sizes.

In particular, for my situation – that out of a company of 35 people, three had the same birthday – has a probability of about 4.5%. In fact, these members were at the company back when it was only 12 people strong, and then it had a probability of only 0.33%.

You need a group of 87 people before the chance of three people having a birthday on the same day is 50%. 75% happens after 110 people and 25% at 65 people.

The following graph shows the probability of having two people sharing a birthday in blue and three people sharing a birthday in red:

The red is the probability of three people (or more) sharing a birthday. The blue is the probability of two people (or more) sharing a birthday. The horizontal axis is the number of people; vertical is the probability.

This spreadsheet gives the full table the probabilities. (Apologies for the Excel format, WordPress didn’t seem to want plaintext files uploaded.)

The Coin Problem

Here is a problem: You have 10 coins. One of these coins is heavier than the others, but they are all otherwise identical. You have a balance scale which you can use to compare the weights of any two sets of coins. You must devise a strategy to find the heavier coin in the minimum number of weighs (in the worst case).

The solution is simple. Split into two groups of 5 and compare those two. Select the heavier of the two sets and compare two sets of two in there. If they are equal, it must’ve been the one you didn’t compare. If they are not, you compare the two on the heavier side to finally find the heaviest. So it takes only 3 weighs.

Now, what if the odd coin out is either heavier than the others or lighter than the others but you don’t know which? In fact, it is still possible to do it in three goes. The following diagram shows you how.

 

 

The ten coins are labelled as shown at the top. ‘A’ is short for A1 + A2 + A3, and similarly for ‘B’ and ‘C’. The underlines denote that the coin is a known reference (i.e. known to be neither heavier nor lighter). If A and B are not the same weight, we assume A is heavier (as otherwise you can just relabel them).

Note that in every path except for the first, which ends in D, you know not just which coin it is, but if the coin is heavier or lighter than the others.

99p Shop

Here is a problem from “SYMmetry Plus+” which is the magazine produced by the Mathematical Association for young mathematicians. This is from the Summer 2005 edition (#27).

The supermarket Tesbury’s prices all its items at so many pounds and 99 pence. If the total bill comes to £35.84 is it possible to say how many items have been bought? What if the total were £135.84 or £47.12?

To be clear, this means that every item is either 99p, £1.99, £2.99, etc. It’s a very simple to solve problem, but it is kind of still interesting I suppose. So let us consider the problem in general. Let us say the cost price overall pill is P pounds and p pence (i.e. 100P + p pence and 0 \le p < 100). That we have:

100P + p = \sum_{i=1}^N (100a_i + 99)

For some integers a_i. Reducing mod 100 we have that:

p = \sum_{i=1}^N 99 = 99N \pmod{100}

Of course, the inverse of 99 is again 99 (as it equals -1) and so we have that 99p = N \pmod{100}. It hence the follows that:

N = 99p + 100k

(where p is the number of pence spent (ignoring the pounds), N is the number of items bought and k is some integer.)

Take the original example, where p = 84. Then N = 8316 + 100k. By choice of k, this is equal to N = 16 + 100k'. It follows that k' = 0 as if there were 116 items the cost would be at least 99×116 = £114.84. Given that it is less than this at £35.84, we be sure that there are only 16 items. This also answers the second part – you cannot tell, there could be 16 or 116 items in your basket.

As for the final part, that means there is N = 88+ 100k items. However, if there 88 items that must cost at least £87.12, and so there is no way that this total could be achieved.

Digit Concatenation (A Little Puzzle Turned Interesting)

I was given the following little problem:

Find a number such that prepending the digit ‘1’ to the front of it gives a number one-third as big as appending the digit ‘1’ to the end.

To make things clearer, let us denote digit concatenation by a colon. So for instance, 12:34 = 1234. Then the problem is to find a natural number x such that 3(1:x) = x:1.

Let us generalise the problem to make it more interesting. First of all, instead of the digits being in base 10, let them be in base b. Next, let us change the factor of 3 to a factor of k, so we have k(1:x) = x:1.

Obviously x:1 = bx + 1 and 1:x = b^d + x, where x is a d digit number (in base b). Thus we are attemping to solve:

k(b^d+x) = bx + 1

Now let us first remark that this implies that k and b are coprime, and so we shall assume that in what follows. Rearranging the equation, we get:

x = \frac{kb^d - 1}{b-k}

Since this must be an integer, there can only possibly be solutions when kb^d - 1 = 0 \pmod{b-k}. However, as k = b \pmod{b-k} it follows that this is equivelent to:

b^{d+1} = 1 \pmod{b-k}

As b and k are coprime, so are b and b-k. Therefore b is a primitive root. Thus it is equal to one exactly when it is raised to the power of multiples of \phi(b-k). In particular, valid values of d must be of the form d = \phi(b-k)r - 1.

So that is when there can be solutions. (And furthermore, there can only be one solution for each length d)

Suppose that S and R are numbers with this property. That is, k(1:S) = S:1 and k(1:R) = R:1, where the digit concatenations are in base b. Let us also define S to be s digits long (in base b) and R to be r digit long (in base b).

From the above result, s = a_1q - 1 and r = a_2q - 1 (where in fact q = \phi(b-k)).

Now consider the number T = S:1:R:

T:1 = (S:1) + (R:1) b^{s+1} = k(1:S) + kb^{s+1} (1:R) = k[(1:s) + (1:R)b^{s+1}] = k(1:S:1:R) = k(1:T)

Then T has the property too, and T is of length r+s+1.  Notice how, as expected, this is still of the required form.

In particular, in the case S = R, it follows that if there is a solution of length s there is a solution of length k(s+1)-1, for each k.

Now we can turn our attention but on to the specific problem again. Here b = 10 and k = 3. Therefore all possible solutions must be of length d = 6k+5 for various values of k. For d = 5 we have that:

x = \frac{kb^d - 1}{b-k} = \frac{3 \times 10^5 - 1}{10-3} = 42,857

We must verify that this actually does work. And indeed, you can see that 142857×3 = 428,571, as required. It follows that all solutions are this repeated a number of times with 1 in the middle.

However, there is something else to be noticed here. The above could be written as:

x = 10^5 \times \frac{3}{7} - \frac{1}{7}

And so you can see the relationship between our solution and the fraction 1/7. In fact,

1/7 =  0.14285714285714285714…
2/7 =  0.28571428571428571429…
3/7 =  0.42857142857142857143…
4/7 =  0.57142857142857142857…
5/7 =  0.71428571428571428571…
6/7 =  0.85714285714285714286…

(In bold is the first occurrence of the found solution. Of course, it repeats after that with a ‘1’ between them.)

So the weird fact that the decimal expansions of sevenths have the same general pattern (except with some stuff at the start) is due to this little problem. I may expand on this more at a later point, but for now it is at least clear there is a connection.