Deduction to do with knowledge and belief

Knowledge and belief are fairly intractable subjects. Nevertheless, we can state some properties that we believe knowledge and belief have, and deduce consequences from them.

First, any statement about the world can be known or believed (or both, or neither). In addition, if A is something that can be known or believed, then so can the statements ‘I believe A’ and ‘I know A’. That is, it’s possible to have beliefs about your knowledge, or knowledge about your beliefs, etc.

Knowledge ought to be stronger than belief. In particular, if you know something you should also believe it. I cannot sensibly know that apples grow on trees but not believe it. If I don’t believe it, it must be because I don’t know it for sure.

Whilst ‘doublethink’ is perfectly possible in reality, it is not a property of a good belief system. It is fair to assume that if you believe something, you do not believe its opposite.

If somebody asks you whether you know a statement is true or not, you’re aware if you don’t know, and can tell the person that. You know your ignorance. That is, if you don’t know something, you know you don’t know it.

And finally, the tightest coupling between belief and knowledge comes from the idea that if you believe something, you believe you know it. Let us give an example. I know that apples grow on trees. I also believe that it would be odd for something to both grow on trees and be an animal. Thus, I conclude in my head that I believe apples are not animals. But I feel my beliefs are fairly secure, so although I don’t know apples are not animals, I believe I know it.

There are other rules one might want to add. For instance, it might be that knowledge is such a strong condition that it is impossible to know something that is false. Others are to do with you knowing what you believe, and others still to do with being able to deduce things from knowledge and belief of implications (as was done in the example to do with animals and apples). Nevertheless we do not need any more assumptions, so we stop here.

The four assumptions we have listed are enough to prove that if one believes something, then one knows it. As one of the assumptions was that knowledge implies belief, together they imply that you believe something if and only if you know it. They are just two words for the same thing. It is possible to conclude different things from this, but I’d just say it implies our axioms do not model the system well. I feel the problem is the last assumption – that if you believe something, then you believe you know it. That is far too strong. There are plenty of things I believe – for instance, the truth of certain mathematical conjectures – but I do not believe I know such things. (In fact, I know I do not know them.)

Let us write ¬ for ‘not’, K(A) for ‘I know A’, and B(A) for ‘I believe A’, where A is an arbitrary thing which can be believed and known. The axioms become, in the order they were introduced above:

  1. K(A) implies B(A)
  2. B(A) implies ¬B(¬A)
  3. ¬K(A) implies K(¬K(A))
  4. B(A) implies B(K(A))

Let us also assume the usual laws of logic, e.g. that A implies B is the same as ¬B implies ¬A.

Suppose you believe a statement S. That is, B(S). Then you believe you know it, so B(K(S)) – that is rule 4. But B(K(S)) implies ¬B(¬K(S)), which is an application of rule 2 with the parameter A = K(S). That is, it follows that one does not believe that one does not know S.

The contrapositive of rule 1 is that if you don’t believe something, then you don’t know it. Therefore ¬K(¬K(S)) follows. You don’t even know that you don’t know S. But the rule of knowing your own ignorance – rule 3 – implies, in its contrapositive form, that this means you know S. Therefore K(S).

Hence, if you believe something, then you know it.
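The derivation above can be checked mechanically. The following is a small brute-force sketch (my own, not from the original argument): it treats the five statements involved as independent Booleans and verifies that every assignment satisfying the four axiom instances the proof uses forces K(S) whenever B(S) holds.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def belief_forces_knowledge():
    # Atoms, in order: B(S), K(S), B(K(S)), B(not K(S)), K(not K(S))
    for bs, ks, bks, bnks, knks in product([False, True], repeat=5):
        axioms = [
            implies(knks, bnks),     # rule 1 with A = not K(S): K(A) -> B(A)
            implies(bks, not bnks),  # rule 2 with A = K(S): B(A) -> not B(not A)
            implies(not ks, knks),   # rule 3: not K(S) -> K(not K(S))
            implies(bs, bks),        # rule 4: B(S) -> B(K(S))
        ]
        if all(axioms) and bs and not ks:
            return False  # a model where belief does not force knowledge
    return True
```

Calling `belief_forces_knowledge()` returns True: no assignment satisfies all four instances with B(S) true and K(S) false.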

As mentioned above, I feel axiom 4 is the problem. Nevertheless, if one were to change ‘belief’ and ‘knowledge’ to ‘conjecture’ and ‘formal proof/strong evidence’, to model mathematical or scientific reasoning, one sees that axiom 3 is problematic too. Just because there is a lack of evidence for something does not mean there is evidence of that lack of evidence.


Tie-break Criteria and the Manchester Clubs

The 2011/12 Premier League season finished with Manchester United and Manchester City at the top of the table on the same number of points. Different leagues have different rules on what happens in this situation.

In the past the relevant tie-breaker was ‘goal average’: the ratio of goals scored to goals conceded. It has not been used in the English Football League since 1975-76. This seems like an odd measure, as intuitively it is the number of goals you win by which is important, not the ratio. For instance, surely 2-1 is just as good as 1-0?

This is no longer used, presumably for the reason that it makes conceding goals too costly. Goal difference is used instead. This is the difference between the number of goals scored and the number conceded.

In tournaments like La Liga, Serie A, the Champions League group stage and the European Cup group stages, it isn’t immediately used. If two or more teams are equal on points then a new league is considered which contains only these teams and only the games between them. This is then ordered first by points, and then by goal difference (and then it will typically continue to goals scored, and so forth). The ordering of this sub-league then orders the tied teams in the original league.
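The head-to-head procedure can be sketched in a few lines. The match format and team names below are made up for illustration; the ordering keys are the ones just described (points, then goal difference, then goals scored).

```python
# Build a table from only the games between the given teams, then order by
# points, goal difference and goals scored, in that priority.
def table(teams, matches):
    stats = {t: [0, 0, 0] for t in teams}  # [points, goal difference, scored]
    for home, away, hg, ag in matches:
        if home not in stats or away not in stats:
            continue  # ignore games involving teams outside the tie
        for team, scored, conceded in ((home, hg, ag), (away, ag, hg)):
            stats[team][0] += 3 if scored > conceded else (1 if scored == conceded else 0)
            stats[team][1] += scored - conceded
            stats[team][2] += scored
    return sorted(teams, key=lambda t: stats[t], reverse=True)
```

For instance, with matches `[("A", "B", 2, 0), ("B", "A", 1, 1), ("C", "A", 0, 5)]`, the game against C is discarded and `table(["A", "B"], matches)` orders A above B.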

[As an aside, I find it slightly odd that the rules are not applied recursively.]

In other tournaments, like the Bundesliga, the Premier League and the World Cup group stage, goal difference across all games is compared after points. This is what happened to Manchester United and Manchester City, with Manchester City becoming the first team to win the league on goal difference. (The only time there has been a tie on points at the top of the table since goal average was abolished was in 1988-89, when Arsenal and Liverpool were level on both points and goal difference, and Arsenal won it on goals scored.)

Both teams won, lost and drew the same number of games, giving them the same number of points. However, Manchester City had a goal difference of 64 and Manchester United had a goal difference of 56.

Goal difference is better than goal average in that it only cares what the difference in scores is. Nevertheless, it is perhaps too linear. Is winning four games 1-0 and then winning a single game 6-0 better than consistently winning 2-0 over five games? Goal difference treats them as the same. If they aren’t equal, it isn’t particularly obvious which is better. The latter team were consistently better, but the former team achieved a large victory. There seems to be an argument either way.

The following is a graph of the distribution of wins of the two Manchester teams. As it is already decided that it is only goal difference in a match which matters, this is all that is graphed. Obviously the total left of ‘0’ denotes the number of losses, and right of it the number of wins.

It should be noted that when the Manchester clubs played each other, United won 1-0 away and lost 6-1 at home. Therefore if a head-to-head system were to be used which went to goal difference, City would still win. Goal average would not help United either – they conceded more goals and scored fewer, so any sensible combination of these two numbers will see City win. Had the United-City game been 1-0 instead of 6-1, the goal difference gap would have changed by 8, but City still would have won on goals scored.

The graph shows that United did achieve more very large wins than City though. United won on five occasions by five or more goals. City did this ‘only’ twice. Perhaps there is some hope for United after all, if we change our values?

Is winning by two goals twice as good as winning by one? Perhaps not. Winning by one can be luck, but winning by two seems like there might have been some skill involved. Perhaps instead of adding up the games’ goal differences, one should add up the squares of the games’ goal differences. Obviously, one would have to subtract instead of add if you lost by that amount, otherwise losing by 2 goals would be just as good as winning by 2 goals!

If this is done, Manchester United get a score of 188, but Manchester City get 212. City still win.

Okay, let’s say winning by two goals is twice as good as winning by one. But then is winning by three goals only 1.5 times better than winning by two? Perhaps it is twice as good again. Continuing in this way, winning by 5 goals is 16 times better than winning by 1. If you lose, it should be the same as if you win except with the opposite sign, and if it was a draw the value is zero.

If we do this, then Manchester City get 112 points, but Manchester United get 121.

Nevertheless, this method is fairly silly, as it means the game Manchester United won 8-2 at the start of the season skews it all. Perhaps the truth is the opposite: the first goal is very important, but each goal after that becomes less and less important. After all, 5-0 and 6-0 are basically the same score. The last goal doesn’t matter very much at all. Perhaps it behaves more like a logarithm than a power.

Using \log{(1+|d|)} \times \mathrm{sign}(d) gives Manchester United 38.1 points, but Manchester City 32.3.
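All of these scoring rules fit one pattern: each match contributes the sign of its goal difference times some weight of its magnitude. Here is a sketch of that pattern; the match list is a made-up toy example, not the real 2011/12 results.

```python
import math

def season_score(diffs, weight):
    # each match contributes sign(d) * weight(|d|); draws contribute nothing
    return sum(math.copysign(weight(abs(d)), d) for d in diffs if d != 0)

linear = lambda d: d                    # ordinary goal difference
squared = lambda d: d * d               # winning by 2 counts four times winning by 1
doubling = lambda d: 2 ** (d - 1)       # each extra goal doubles the value
logweight = lambda d: math.log(1 + d)   # later goals matter less and less

toy = [1, 1, -2, 3, 0]  # hypothetical goal differences over five matches
```

With the toy data, `season_score(toy, linear)` is 3.0 (the plain goal difference), while the other weightings reward or punish the 3-0 win and 0-2 loss differently.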

In summary, there is no easy answer to a tie-breaking criterion that rewards the things you want it to. At least goal difference is simple, and so it seems a good choice.

A Bit of Graph Theory

I’ve written a document which explains some basic graph theory. In particular, it poses a few puzzle questions related to planar graphs and then solves them using Euler’s Formula.

I tried to write it in a casual style. The idea was that it should be easy to read whilst not giving up any mathematical accuracy. Actually, I’m not sure the style works at all, but it was worth experimenting.

Thank you to Kris for some suggestions to improve it. A version suitable for A5 booklet printing is available on my website.



Self-Referential Number Square #1 (Updated)

I’ve updated the first self-referential number square so that there isn’t any need for any additional clues at the bottom. It now looks as follows:

Suggested solution

A set of numbers such that when you remove one number, the rest can be arranged into two equal groups with the same sum

Suppose you have a set of 17 numbers with the property that if you remove any one of the numbers, the remaining can be split into two groups of 8 which sum to the same total. We will show that the numbers must all be equal.

If we have a set of 17 numbers with this property, then we can create another set of 17 numbers with the property by subtracting a constant: if a_1, \ldots, a_{17} have the property, then so do b_i = a_i - K, for any constant K.

Similarly, if the a_i have the property, then so do c_i = K a_i, for any constant K.

In particular, this means that we can shift it so that one of the numbers is zero.

To begin, let us assume the numbers in question are all integers. Let us take out the number which is zero. Then we get two groups which both sum to the same (integral) value; call it S. Therefore the sum of all the numbers, T, is simply 2S + 0. Therefore T is even.

Let us now take out another one of the numbers, call it k. So we get two groups of numbers again which sum to the same value, call it S’. So T = 2S’ + k. This implies that k is even, as T is.

Therefore each of the numbers is even. But if that is the case, we can divide all the numbers by two to get another set of integers with the property. This process can be continued indefinitely, which is impossible unless all the numbers were zero to start with. Thus we conclude all the numbers were equal to zero. That means in the original case, before a constant was subtracted, they must all have been equal. (cf. proof by infinite descent)

The case for rational numbers is simple, as you may just multiply all the rational numbers by a constant in order to make them all integers (for instance, by the product of the denominators). Thus they are all equal.

The case for real numbers requires a trick. Denote the 17 numbers by a_i. These numbers generate a finite dimensional vector space over the rational numbers (of dimension at most 17, call it N). Therefore we can choose a basis (without the Axiom of Choice). Call this basis r_j, 1 \le j \le N \le 17.

Therefore each of the a_i can be written uniquely as a sum of the r_j multiplied by a rational number. Write [a_i]_j for the coefficient of r_j in this representation.

Fix j. Then the set of 17 numbers [a_i]_j, 1 \le i \le 17 form a set of rational numbers with the property that taking out any one of them allows the remaining to be rearranged into two groups with the same sum. This is because the set of 17 real numbers have this property, and whatever the two sums are, the r_j coefficients must be equal by the uniqueness of basis representations.

However, this implies by the above that the value of [a_i]_j is constant for different values of i.

Therefore all the coordinates are individually constant, and so the 17 numbers are all equal.

This problem is from Problems and Theorems in Classical Set Theory. Obviously, no use is actually made of the fact that there are 17 numbers. In fact, no use is made of the fact that the two groups of numbers are of equal size.
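The defining property can be tested by brute force on small sets, which makes a nice sanity check of the theorem. The following is my own sketch (workable for small inputs only; the two groups are taken to be of equal size, as in the 17-number statement).

```python
from itertools import combinations

def has_property(nums):
    # True if, after removing any one element, the rest splits into two
    # equal-size groups with equal sums.
    n = len(nums)
    half = (n - 1) // 2
    for i in range(n):
        rest = nums[:i] + nums[i + 1:]
        total = sum(rest)
        # need a size-`half` subset summing to exactly half the total
        if not any(2 * sum(c) == total for c in combinations(rest, half)):
            return False
    return True
```

As the theorem predicts, `has_property([4] * 17)` holds, while an unequal set such as `[1, 2, 3, 4, 5]` (the 5-number analogue) fails.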

Was Wronski Wrong?

I was reminded of a problem from a text book from school. The book was MEI Pure Mathematics 5, First Edition. It had the following question:

The Polish mathematician Hoëné Wronski (1778-1853) once wrote that:

\pi = \frac{2\infty}{\sqrt{-1}}\left\{ (1+\sqrt{-1})^{1/\infty} - (1-\sqrt{-1})^{1/\infty}\right\}

Was Wronski wrong?

At the time I found this problem very hard and I was very proud to have solved it. Now I am older, I don’t understand how I could have ever found it difficult, but I suppose that is a good thing.

Let us first turn it into modern notation. We wish to find q, where:

q = \lim_{x \rightarrow \infty} \frac{2x}{i}\left\{ (1+i)^{1/x} - (1-i)^{1/x}\right\}

Note that 1+i = \sqrt{2} e^{i \pi/4} and 1-i = \sqrt{2} e^{-i \pi/4}. Using these representations,

q = \lim_{x \rightarrow \infty} -2^{1+\frac{1}{2x}} xi \left\{ e^{i\pi/(4x)} - e^{-i\pi/(4x)}\right\}

And so

q = \lim_{x \rightarrow \infty} -2ix \, 2^{1/(2x)} \left(2i \sin (\pi/(4x))\right)

Now, if we use the fact that \sin {x} \approx x for small x, we have:

q = \lim_{x \rightarrow \infty} 2^{\frac{1}{2x}} \pi

and hence q =\pi as required.

So, yes, Wronski was right.
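As a numerical sanity check (my own addition), evaluating the bracketed expression at a large finite x lands very close to pi, with a negligible imaginary part:

```python
import math

def wronski(x):
    # The expression inside the limit, in modern notation.
    return (2 * x / 1j) * ((1 + 1j) ** (1 / x) - (1 - 1j) ** (1 / x))

q = wronski(10 ** 6)
# q.real is within about 1e-6 of pi; q.imag is zero up to rounding error.
```

The residual error comes from the 2^{1/(2x)} factor, which is roughly 1 + ln(2)/(2x) for large x.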

Arithmetic: then and now

The Conservatives recruited Carol Vorderman to head a ‘mathematics task force’ before the election, in 2009. After the election, in August 2011, the ‘task force’ reported (Telegraph article / Original Report). I find it a little bit puzzling that Vorderman was selected for this, as she is neither a teacher nor a mathematician. (She only managed a 3rd in each of her years at Cambridge studying Engineering.) Nevertheless, it appears the bulk of the work was done by Roger Porkess, who is head of MEI. He is good.

The report has some good points in it. Nevertheless, page 52 is interesting. It shows three exam questions from previous papers, and the following remark:

Our conclusion from looking at the papers from these three years is that O level was harder than what came before or after it. However, a comparison is not entirely fair because GCSE is designed for a very wide range of students: O levels were not. Discussion of the standard of this, or any other, qualification needs to take place in the context of its fitness for purpose. However, there is compelling evidence that grade inflation has occurred.

The questions are as follows:

(In what follows I will answer the questions. This is probably fairly pointless, as if you read this blog you can almost definitely do them yourself.)

I think the 2011 GCSE question is actually pretty good, as approximating an expression is a useful, practical skill which is not so mechanical. Presumably, this was a question in an exam without a calculator. (At least when I did the exam, there was a non-calculator paper and a calculator-based one.)

If I were to answer the question, I would multiply the fraction 95.9/(0.81\times 0.62) ‘top and bottom’ by 100, and approximate 95.9 as 100, to give 100^2/(8.1\times 6.2). The product in the denominator is then approximately 8\times6 = 48 \approx 50 and thus you get 100\times100/50 = 200, as required.

I consider that a fairly good, practical problem. It’s much shorter than the other two questions, but this can be attributed to a change of style.
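As a quick check of the estimate (the 5% tolerance here is my own choice, not part of the question):

```python
# Exact value of the expression, versus the answer reached by the rounding
# steps described above ('top and bottom' by 100, 95.9 -> 100, 8.1*6.2 -> 50).
exact = 95.9 / (0.81 * 0.62)
approx = (100 * 100) / 50
# exact is about 191, so the estimate of 200 is within 5 percent.
```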

The 1971 O Level question is not as good in my opinion. The first part is, I suppose, a test of the candidate’s understanding of adding fractions. That the whole thing is itself in a fraction is presumably just to make the question seem harder. It shouldn’t really make the question any harder, but I suppose under pressure in an exam it does. The obvious thing to do is to multiply the fraction ‘top and bottom’ by xy. This gives:

\frac{xy}{x+y} = \frac{3 \times 7}{3+7} = \frac{21}{10} = 2.1

The fact that 3 and 7, which sum to 10, were chosen makes the question easier if it wanted an answer as a decimal expansion. Presumably an answer as a fraction would be acceptable too.

Let us skip the second part for a moment and head to the third. Clearly the values of p and q do not matter here, as they are cancelled out. The answer is hence merely 3. If for whatever reason you do not notice this, you have a lot of calculating to do! Nevertheless, it is a simple factorisation and would certainly be tested in more modern exams.

Going back to the second part – it is a pain really. Log tables are the easiest solution within the rules of the exam. Rather than find any particular shortcut, it is easier to ‘just do it’. Find the log of p, double it, anti-log it. Do the same with q. Add the two numbers together. Log it. Halve it. Anti-log it. Then we have the answer. It’s boring, and how is it any better than using a calculator? All you are doing is looking some numbers up in a table and occasionally adding or halving them. It’s just a procedure. Perhaps it was useful then, but it isn’t useful now.
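The procedure just described computes \sqrt{p^2+q^2}. Spelled out with the table lookups replaced by log10 and 10** (the exam question itself is not reproduced here, so treat this as a sketch of the method, not the question):

```python
import math

def via_log_tables(p, q):
    p_squared = 10 ** (2 * math.log10(p))   # log it, double it, anti-log it
    q_squared = 10 ** (2 * math.log10(q))
    total = p_squared + q_squared           # the one non-table step: addition
    return 10 ** (0.5 * math.log10(total))  # log it, halve it, anti-log it
```

With the values p = 3 and q = 7 from the first part, this reproduces \sqrt{58}, which rather supports the point: it is a calculator procedure performed by hand.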

The first question, from the 1931 School Certificate, I actually like quite a bit. Let us do the third part first, as it is the easiest. It is well known that 1/3 = 0.333…, and that 1/100 = 0.01. Thus 1/200 = 0.005. It follows that:

1/3 - 1/100 - 1/200 = 0.318333\ldots \approx 0.31833

Now let us go back to the first part. We wish to find the decimal expansion of 7/22. (Note that 22/7 is a popular approximation of pi, which comes from the continued fraction expansion of the number. 7/22 is likewise the approximation from the first three terms of the continued fraction expansion of 1/pi.)

We could just use long division and not think about what we are doing, but I prefer to do these calculations more explicitly. To find the first decimal place of 7/22, we can multiply it by 10 and then take the integer part. This gives 70/22 = 3 + 4/22. Hence it starts 0.3. We then continue with 4/22 – multiply by 10 and take the integer part. This will clearly give the next decimal place. Now 40/22 = 1 + 18/22. Thus the expansion of 7/22 starts 0.31. Continue again: 180/22 = 8 + 4/22. Thus it continues 0.318. We have already seen 4/22, and hence it will yield a sequence continuing 1 then 8 forever. Hence the expansion is 0.318181818…, or 0.31818 to five decimal places.

The second part is unfortunately basically identical, and so is rather dull. We have:

  • 710/223 = 3 + 41/223
  • 410/223 = 1 + 187/223
  • 1870/223 = 8 + 86/223
  • 860/223 = 3 + 191/223
  • 1910/223 = 8 + 126/223
  • 1260/223 = 5 + 145/223

And hence the decimal expansion starts 0.318385, and so is 0.31839 to 5 decimal places.
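The multiply-by-ten procedure used in both parts can be written as a short routine (my own sketch): at each step the integer part is the next digit, and the remainder carries on.

```python
def decimal_digits(p, q, n):
    # First n decimal digits of the fraction p/q, by repeated multiply-by-ten.
    digits = []
    r = p % q  # work with the fractional part of p/q
    for _ in range(n):
        r *= 10
        digits.append(r // q)
        r %= q
    return digits
```

It reproduces the calculations above: `decimal_digits(7, 22, 5)` gives 3, 1, 8, 1, 8 and `decimal_digits(71, 223, 6)` gives 3, 1, 8, 3, 8, 5.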

Therefore our numbers are 0.31818, 0.31839 and 0.31833. The closest to the target number of 0.31831 is thus the third. (What were they playing at! 106/333 is much closer than all of them!)

Perhaps the third part is still useful today, but exactly evaluating the decimal expansions of fractions probably isn’t, because of computers. So it is merely more modern.

It’s hard to say really which questions were harder. I didn’t like Part II of the second question, as I couldn’t think of an easy way to do it. The other parts of that question were brainless, however. The third question was interesting, but the first question was the most interesting, except for Part II, which required you to do a load of arithmetic almost identical to what you had just done, which is boring. No wonder everybody hates maths.

I don’t have any solid conclusions myself. Indeed, when I was at school I looked through A-Level mathematics exams from the 1960s onwards. I actually thought the hardest papers were in the years 1990-1992. Previous years did not require you to understand anything, but just required you to ‘do’ things which could be learnt without understanding why. Later years were just easy. Happily, the exams I did myself were of the easy variety!

It is often said that people are worse at maths now than previous generations. Perhaps this is true, but I’m not entirely convinced previous generations were any good at it either. For instance, perhaps people who went to school in the 1970s can multiply and divide long integers, but in my experience of having seen people do it, they merely perform an algorithm they have learnt and never understand it. My calculator can do that, and it can do it better. When something goes wrong, they don’t know what they might have done wrong, because they don’t know why it works. They might not remember what to do when they encounter a zero, etc.

Perhaps I will try and expand on this subject in future.
