For “P or (Q or R)” to be true, we need at least one of P and “Q or R” to be true. And for “Q or R” to be true we need at least one of Q or R to be true. Putting all that together, we find that for “P or (Q or R)” to be true we need at least one of P, Q and R to be true. And clearly the same argument shows that the same condition needs to hold for the statement “(P or Q) or R” to be true. So we just drop the brackets and write “P or Q or R” and think of this as saying that at least one of the three statements is true.

Similarly, we don’t bother with brackets for repeated ANDs, writing “P and Q and R” for the statement that all three of P, Q and R are true. And similar considerations hold for larger collections of statements as well.
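
If you would like to see the bracket-dropping argument checked mechanically, here is a minimal Python sketch. The function names are made up for illustration. It runs through all eight combinations of truth values for P, Q and R and confirms that both bracketings of OR agree with "at least one is true", and both bracketings of AND agree with "all are true".

```python
from itertools import product

TRUTH_VALUES = [False, True]


def or_brackets_do_not_matter() -> bool:
    # Compare P or (Q or R), (P or Q) or R, and "at least one of P, Q, R".
    return all(
        (p or (q or r)) == ((p or q) or r) == any([p, q, r])
        for p, q, r in product(TRUTH_VALUES, repeat=3)
    )


def and_brackets_do_not_matter() -> bool:
    # Compare P and (Q and R), (P and Q) and R, and "all of P, Q, R".
    return all(
        (p and (q and r)) == ((p and q) and r) == all([p, q, r])
        for p, q, r in product(TRUTH_VALUES, repeat=3)
    )


print(or_brackets_do_not_matter(), and_brackets_do_not_matter())  # True True
```

A check like this is not a substitute for the argument in words, but it is a quick way to convince yourself that nothing has been overlooked.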

I realized after writing the title of this post that it might look as though I was saying, “I’m going to discuss connectives … not!” Well, that’s not what I meant, since “not” is a connective and I’m about to discuss it.

If you don’t know how to negate a mathematical statement, then you won’t be able to do serious mathematics. It’s as simple as that. So how does the mathematical meaning of the word “not” differ from the ordinary meaning? To get an idea, let’s consider the following sentences.

- We are not amused.
- He is not a happy man.
- That was not a very clever thing to do.
- n is not a perfect square.
- x is not the largest element of the set A.
- A is not a subset of B.

When Queen Victoria said, “We are not amused,” it is clear that what she meant was she was distinctly *un*amused, and not merely that she had failed to laugh. Similarly, if I say, “He is not a happy man,” I will usually mean that he is positively unhappy rather than neutral on the contentment scale. And if I say, “That was not a very clever thing to do,” I am saying, in a polite British way, that it was a stupid thing to do. (I could perhaps avoid that interpretation by stressing the word “very”. For example, if someone had made a good but reasonably standard move in chess and I knew enough about chess to tell that — which I don’t — then I might say, “That was not a *very* clever thing to do — but it was pretty good, so well done.”)

In all the examples above, the word “not” takes us from one end of some scale to the other: from amused to unamused, from happy to unhappy, from clever to stupid. In mathematics, the word “not” does not have this sense. If P is a mathematical statement, then “not P” is the mathematical statement that is true precisely when P is false. That is, if P is true, then “not P” is false, and if P is false, then “not P” is true. So if you want to understand a statement of the form “not P” then you should think to yourself, “What are the exact circumstances that need to hold for P to be false?”

In the case of the statement “n is not a perfect square” this is completely straightforward. We don’t have a notion of an utterly imperfect square, so there is no possibility of misinterpretation. We just mean that it is not the case that n is a perfect square. But take the statement “x is not the largest element of the set A.” We have been told that x is an element of A. If we want to show that x is not the largest element of A, what do we have to establish? Do we need to show that x is the *smallest* element of A? No. All we need to do is establish that it is not the case that x is the largest element of A. The usual way to set out that objective would be to formulate it as the following statement.

- There is some element of A that is larger than x.

If that statement is true, then x is not the largest element of A. And if that statement is false, then there is no element of A that is larger than x, and since x is an element of A that tells us that x is the largest element of A.
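
Here is a small Python sketch of that equivalence, assuming a finite set of integers. The set and the function names are hypothetical, chosen only for illustration. It checks that "not the largest element" agrees with "some element is larger" for every element of the set, and shows that an element can fail to be the largest without being the smallest.

```python
def is_largest(x: int, elements: set[int]) -> bool:
    # x is the largest element if no element exceeds it.
    return all(a <= x for a in elements)


def some_element_is_larger(x: int, elements: set[int]) -> bool:
    return any(a > x for a in elements)


A = {3, 5, 8, 13}  # an arbitrary example set

# The negation of "x is the largest" is exactly "some element is larger".
negation_matches = all(
    (not is_largest(x, A)) == some_element_is_larger(x, A) for x in A
)

not_largest = sorted(x for x in A if not is_largest(x, A))
print(negation_matches)  # True
print(not_largest)       # [3, 5, 8]
print(min(A))            # 3, so 5 and 8 are neither largest nor smallest
```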

The third mathematical statement above was “A is not a subset of B.” Let me give two mistaken interpretations of that statement.

- First mistaken interpretation: B is a subset of A.

This is mistaken because it is perfectly possible for A to fail to be a subset of B without B being a subset of A. For example, A could be the set {1, 2} and B could be the set {2, 3}.

To work out what it means for A to fail to be a subset of B, let us write out more carefully what it means to say that A *is* a subset of B. The usual definition (written out in a slightly wordy way because I’m still trying to avoid symbols) is this.

- A is a subset of B if every element of A is also an element of B.

By the way, I should mention here a convention that you need to know about. When mathematicians give definitions, they tend to use the word “if” where “if and only if” might seem more appropriate. For example, I might write this: “an integer n is *even* if there exists an integer m such that $n = 2m$.” You might argue that this definition says nothing about what happens if no such integer exists. Might 13 be even too? No, is the answer, because I am defining something and the convention is that you should simply understand that the words “and not otherwise” are implicit in what I’ve said, or equivalently that the “if” is really “if and only if”.

OK, if “A is a subset of B” can be translated into “Every element of A is also an element of B” then how should we translate “A is not a subset of B”? This brings me to the second incorrect answer.

- Second mistaken interpretation: No element of A is an element of B.

This makes the going-to-the-opposite-extreme mistake. Don’t do that. Faced with a statement of the form “not P”, think to yourself, “What needs to be true for P to be false?” Applied to this case, we ask, what needs to be true for “Every element of A is an element of B” to be false? The answer is that *at least one* element of A should fail to be an element of B. Or as mathematicians might normally write it,

- There exists an element of A that is not an element of B.

Or more formally still,

- There exists x such that $x \in A$ and $x \notin B$.

As I hope you can guess if you didn’t know already, the symbol $\notin$ means “is not an element of”. As I also hope you can guess, putting a line through a symbol usually has the force of a not. You are probably already familiar with the symbol $\ne$, which means “does not equal”.
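
To make the three readings concrete, here is a minimal Python sketch using two small example sets. The sets and the helper names are assumptions made for illustration. It computes the correct negation (some element of the first set is missing from the second) and shows that both mistaken interpretations give the wrong answer for this pair.

```python
def is_subset(first: set[int], second: set[int]) -> bool:
    # Every element of `first` is also an element of `second`.
    return all(x in second for x in first)


def witnesses(first: set[int], second: set[int]) -> set[int]:
    # Elements of `first` that are not elements of `second`.
    return {x for x in first if x not in second}


A = {1, 2}
B = {2, 3}

correct_negation = len(witnesses(A, B)) > 0          # some x in A with x not in B
first_mistake = is_subset(B, A)                      # "B is a subset of A"
second_mistake = all(x not in B for x in A)          # "no element of A is in B"

print(is_subset(A, B))    # False
print(correct_negation)   # True, with witness 1
print(first_mistake)      # False
print(second_mistake)     # False, because 2 belongs to both sets
```

Only the "there exists" formulation agrees with "A is not a subset of B" here; the two mistaken interpretations are both false for these sets even though A really is not a subset of B.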

There is a pair of logical principles known as *de Morgan’s laws* that you may well just take for granted, but that it is probably good to be consciously aware of. They concern what happens when you negate a statement that involves an “and” or an “or”. Let me illustrate them with some depressing scenes from my everyday life.

Once a year I have to renew the tax disc on my car. If I didn’t do that I would have to pay a fine, but doing it without a fuss involves being a bit more organized than I normally have it in me to be, so I find myself facing a last-minute panic. What is difficult about it? Well, I have to bring along my insurance and MOT certificates, as well as a form I am sent and a means of payment. (For non-UK readers, MOT stands for “Ministry of Transport” but it also means a test of roadworthiness that you have to have carried out once a year if your car is over three years old, which mine very definitely is.) I take all those along to the post office and can buy the new tax disc there.

Now suppose I were to arrive back from an expedition to the post office, the aim of which had been to get a new tax disc, and say to my wife, “I can’t believe it. I thought I had everything I needed, but I didn’t, and now I’m going to have to make another trip. Damn.” What could she deduce? Well, in order for me to have everything I needed, the following statement would have had to be true.

- I had my insurance certificate and I had my MOT certificate and I had the form and I had the means to pay.

So from my failure to get a tax disc, she could deduce that the above statement was false. But what does it take for a statement like that to be false? It just takes one little slip-up. So she could deduce the following statement.

- I didn’t have my insurance certificate or I didn’t have my MOT certificate or I didn’t have the form or I didn’t have the means to pay.

What happens here is that NOT turns AND into OR. Or to be a bit clearer about it, the statement “not (P and Q)” is the same as the statement “(not P) or (not Q)”. In the tax-disc example we had four statements linked by “and”, so the rule was a generalization of the basic de Morgan law, which told us that “not (P and Q and R and S)” was the same as “(not P) or (not Q) or (not R) or (not S)”.

As you might guess, the other de Morgan law is that NOT changes OR into AND. Suppose we vary the scenario above slightly. This time I am trying to open a bank account and I need some ID. I end up having to go back home and I say to my wife, “Damn, I didn’t manage to open the account, because they said that the only ID they would accept was a passport or a driving licence with a photo on it.” What could she deduce this time? Well, in order to open the account, I needed the following statement to be true.

- Either I had my passport on me or I had a driving licence with photo on me.

What does it mean for that statement to be false? It means that I failed on both counts. In other words, this happened.

- I did not have my passport on me and I did not have a driving licence with photo on me.

The more abstract rule here is that “not (P or Q)” is the same as “(not P) and (not Q)”.
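
Both laws, and their generalization to more than two statements, can be checked by brute force over truth values. The following Python sketch does this for two, three and four statements (four being the tax-disc case). The function name is invented for illustration.

```python
from itertools import product


def de_morgan_holds(n: int) -> bool:
    """Check both de Morgan laws for every assignment of n truth values."""
    for values in product([False, True], repeat=n):
        # NOT turns AND into OR: not (P1 and ... and Pn)
        # versus (not P1) or ... or (not Pn).
        if (not all(values)) != any(not v for v in values):
            return False
        # NOT turns OR into AND: not (P1 or ... or Pn)
        # versus (not P1) and ... and (not Pn).
        if (not any(values)) != all(not v for v in values):
            return False
    return True


print([de_morgan_holds(n) for n in (2, 3, 4)])  # [True, True, True]
```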

A very general and possibly helpful rule of thumb applies to negation, including to several of the examples above. It’s this.

*Negating something strong results in something weak, and negating something weak results in something strong.*

For instance, suppose you are given two statements, which we’ll call P and Q. Then the statement “P and Q” is quite strong because it tells you that *both* the statements P and Q are true. By contrast, the statement “P or Q” is fairly weak because all it tells you is that one or other of P and Q is true and you don’t know which. Why do I use the words “strong” and “weak”? Well, it is easier for “P or Q” to be true than it is for “P and Q” to be true. That means that if “P and Q” is true, then I am getting a lot of information, whereas if “P or Q” is true I am getting less information. If you still don’t find that intuitively clear, then consider the statements

- n is prime and n is even
- n is prime or n is even

The first statement tells us that $n = 2$, which is an extremely strong piece of information: we get to know exactly what number n is. The second statement merely tells us that n is one of the numbers 2,3,4,5,6,7,8,10,11,12,13,14,16,17,18,19,20,22,… which is giving us much less information. So “strong” basically means “tells us a lot”.
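
As a quick illustration of how much more the “and” statement pins down, here is a short Python sketch that lists the positive integers up to 22 satisfying each statement. The bound of 22 and the helper names are choices made for this example only.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    # n is prime if no d with 2 <= d <= sqrt(n) divides it.
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_even(n: int) -> bool:
    return n % 2 == 0


LIMIT = 22
prime_and_even = [n for n in range(1, LIMIT + 1) if is_prime(n) and is_even(n)]
prime_or_even = [n for n in range(1, LIMIT + 1) if is_prime(n) or is_even(n)]

print(prime_and_even)  # [2]
print(prime_or_even)
# [2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 22]
```

The strong statement leaves one candidate; the weak one leaves eighteen of the first twenty-two positive integers.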

The rule of thumb tells us that negating something strong (that is, pretty informative) gives us something weak (that is, not very informative) and vice versa. Therefore, de Morgan’s laws

- “not (P and Q)” is the same as “(not P) or (not Q)”
- “not (P or Q)” is the same as “(not P) and (not Q)”

are exactly what you would expect. The first law negates the strong statement “P and Q” and gets a weak statement “(not P) or (not Q)” and the second law negates a weak statement and gets a strong statement.

For another example, consider the statement

- This room is INCREDIBLY INCREDIBLY HOT.

That, I hope you will agree, is strong information: it describes a most unusual state of affairs. So we would expect that negating it produces a very weak statement. And indeed, if I were to say,

- This room is not INCREDIBLY INCREDIBLY HOT.

you might well give me a funny look and ask, “Was there some reason that you expected it to be?” Note that the rule of thumb gives us a quick way of seeing that the negation of “This room is INCREDIBLY INCREDIBLY HOT” is not “This room is INCREDIBLY INCREDIBLY COLD.” After all, the second statement is also very strong, and we do not expect the negation of a strong statement to be strong.

**Double negatives.**

I haven’t mentioned all the basic logical laws that concern AND, OR and NOT. One important one is the rule that two NOTs cancel. If I say, “It is not the case that A is not a subset of B” then I mean that A is a subset of B. In general, “not (not P)” is the same as P.

We can actually use that to deduce the second de Morgan law from the first. Here they are again.

- “not (P and Q)” is the same as “(not P) or (not Q)”
- “not (P or Q)” is the same as “(not P) and (not Q)”

To deduce the second, let me begin by applying the first to “not P” and “not Q”. (What I’m doing here is just like what you are allowed to do with an identity: I am substituting in a value. In this case, the first statement holds for *any* statements P and Q, so I am substituting in “not P”, which is a perfectly good statement, for P and “not Q” for Q.) I get this.

- “not(not P and not Q)” is the same as “not not P or not not Q”

I’ve decided to dispense with some brackets here. Again just as with equations, there are conventions about what to do first when you don’t see brackets. And the convention is “do all your NOTs first”. So here “not P and not Q” means “(not P) and (not Q)”. It does not mean “not (P and not Q)”.

Using the rule that two NOTs cancel, we can simplify the above law to this.

- “not(not P and not Q)” is the same as “P or Q”

Now we can “apply NOT to both sides” (just as with equations, if two things are the same and you do the same to both then you end up with two things that are still the same). We get this.

- “not not(not P and not Q)” is the same as “not(P or Q)”

And finally, using once again the fact that two NOTs cancel, we get to the second de Morgan law.

- “not P and not Q” is the same as “not(P or Q)”
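
Each line of that deduction can be sanity-checked with a truth table. The Python sketch below is only a check, not a replacement for the derivation, and the helper `same` is a made-up name. Conveniently, Python follows the same “do all your NOTs first” convention: `not p and not q` is parsed as `(not p) and (not q)`.

```python
from itertools import product

ROWS = list(product([False, True], repeat=2))


def same(f, g) -> bool:
    # Two statements are "the same" if they agree for every P and Q.
    return all(f(p, q) == g(p, q) for p, q in ROWS)


# First law applied to "not P" and "not Q".
step1 = same(lambda p, q: not (not p and not q),
             lambda p, q: not not p or not not q)
# Cancel the double NOTs on the right-hand side.
step2 = same(lambda p, q: not (not p and not q),
             lambda p, q: p or q)
# Apply NOT to both sides.
step3 = same(lambda p, q: not not (not p and not q),
             lambda p, q: not (p or q))
# Cancel the double NOT on the left: the second de Morgan law.
step4 = same(lambda p, q: not p and not q,
             lambda p, q: not (p or q))

print(step1, step2, step3, step4)  # True True True True
```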

**A philosophical digression.**

It would annoy some people if I left the discussion here, because some mathematicians feel strongly, and to many other mathematicians puzzlingly, that two NOTs do *not* cancel. That is, they maintain that “not (not P)” is not the same statement as P. That is because these mathematicians do not believe in the law of the excluded middle. If you believe that every statement is either true or false, then you can define (as I did) “not P” to be the statement that is true precisely when P is false and false precisely when P is true. But what if a statement doesn’t have to be true or false?

I don’t recommend worrying about this, but let me try to explain why mathematicians who ask this kind of question are not mad (or not necessarily, at any rate). There are at least three reasons that one might decide, in certain contexts, that not every statement has to be true or false.

The first is when we are dealing with statements that are not completely precise. Let me illustrate this with a few ordinary English sentences.

- Tony Blair is happy.
- The weather is awful.
- He was out LBW.
- Abolishing the 50% tax rate would be unfair.
- Democracy is better than dictatorship.

Do you want to say of each of those sentences that there *must* be a fact of the matter as to whether they are true or not (even if we might not know which)? Sometimes we can be pretty sure that Tony Blair is happy, but what about when he had just got up this morning and was cleaning his teeth? Was it definitely the case that one of the two statements, “Tony Blair is happy” or “Tony Blair is not happy” was true and the other false? (Here I’m interpreting the second statement as “It is not the case that Tony Blair is happy”.) A more reasonable attitude would surely be to say that being happy is a rather complex and not entirely precisely defined state of mind, so there is a bit of a grey area.

If you concede that there is this grey area, then how should you interpret the following sentence?

- It is not the case that it is not the case that Tony Blair is happy.

Or to put it more concisely, “not not (Tony Blair is happy)”.

When P is a vague sentence like “Tony Blair is happy” then it seems to me that a reasonable interpretation of “not P” is that it is sufficiently clear that P is false for it to be possible to state that confidently. Under this interpretation, “It is not the case that Tony Blair is happy” means that he is sufficiently clearly not happy for it to be possible to say so with confidence. Then “It is not the case that it is not the case that Tony Blair is happy,” means something like “It is clear that we cannot be clear that Tony Blair is not happy,” which is not the same as saying “It is clear that Tony Blair is happy.” When we say “Tony Blair is happy” we are ruling out the grey area, but when we say “not not(Tony Blair is happy)” we are allowing it. (Why? Because if Tony Blair’s mood is not clear to us, then we clearly cannot say with confidence that he is *not* happy, and therefore not not(Tony Blair is happy).) Perhaps if you asked him whether he was happy he would give a small sigh and say, “Well, I’m not *un*happy.”
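
One way to make this reading of “not” concrete is a toy three-valued model. This is only a sketch under assumptions of my choosing: the three labels and the function `confident_not` are invented for illustration, and the model is not meant to represent any standard formal logic. Here “not P” counts as clearly true exactly when P is clearly false, and as clearly false otherwise.

```python
from enum import Enum


class Status(Enum):
    CLEARLY_TRUE = "clearly true"
    GREY = "grey area"
    CLEARLY_FALSE = "clearly false"


def confident_not(status: Status) -> Status:
    # "not P" can be stated with confidence only when P is clearly false.
    if status is Status.CLEARLY_FALSE:
        return Status.CLEARLY_TRUE
    return Status.CLEARLY_FALSE


double_negation = {s: confident_not(confident_not(s)) for s in Status}
for original, result in double_negation.items():
    print(f"P is {original.value}: not not P is {result.value}")
```

In this toy model, a P in the grey area has a double negation that is clearly true, so “not not P” and P come apart exactly in the grey area, which is the behaviour described above.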

Yes, you might say, but the great thing about mathematics is that it *eliminates* vagueness. So surely the above considerations are simply irrelevant to mathematics.

That is by and large true, so let us consider a second type of statement.

- There are infinitely many 7s in the decimal expansion of $\pi$.
- $e + \pi$ is irrational.

These are both famous unsolved problems. So we don’t know whether they are true or false. Worse still, Gödel has shown us that it is at least conceivable that one of these statements (or another like it) cannot be proved or disproved. (That’s a bit of an oversimplification of what Gödel’s theorem says, for which I apologize to anyone it irritates.) So if we don’t have a proof, or even any certainty that there *is* a proof, what gives us such a huge confidence that these statements *must* have a determinate truth value? What does it mean? The problem now is not vagueness, but rather the lack of any accepted way of deciding whether or not the statement is true.

Suppose, for example, that we try to argue that even if we don’t know whether $e + \pi$ is irrational, there must nevertheless be a fact of the matter one way or the other. We might say something like this: either there are two integers p and q sitting out there such that $e + \pi = p/q$ or there aren’t. In principle we could just look through all the pairs of integers and check whether their ratio equals $e + \pi$. Either we would find a pair that worked, or we wouldn’t.

Hmm … what is this “in principle” doing here? We live in a finite universe, so we can’t just look through infinitely many pairs of integers. So what happens in the actual universe? We find that at any one time the best we can hope for is to have looked through just finitely many pairs (p, q). What can we conclude if none of the corresponding fractions is equal to $e + \pi$? Precisely nothing about whether $e + \pi$ is irrational.

Note, incidentally, that another “infinite algorithm” for solving the problem would simply be to work out the entire decimal expansion of $e + \pi$ and then go back and see whether it is a recurring decimal or not. Again, we can’t do this algorithm in practice because we live in a finite universe.

As a result of considerations like these, some mathematicians do not agree that a statement like “$e + \pi$ is irrational” must have a determinate truth value. So again we have a grey area, but this time the reason is not vagueness but rather the lack of a proof.

I should also make clear that most mathematicians (I think) *do* believe that there must be a fact of the matter one way or the other, regardless of what we can prove. I myself don’t, but I am in the minority there.

A third reason for abandoning the idea that every statement must be either true or false is to insist on stricter standards for what counts as true. If you would like some idea of what I mean by this, I refer you to the excellent comment by Andrej Bauer below, and also to the Wikipedia article on intuitionism.

[This section was rewritten in response to criticisms from Andrej Bauer and Michael Hudson-Doyle below. There is no reason to set it in stone at any point, so further criticisms are welcome if you have them.]

**Generalizing de Morgan’s laws.**

If you feel like doing a little exercise, you could try using de Morgan’s laws together with the associativity of “and” and “or” to deduce that

- “not(P and Q and R)” is the same as “(not P) or (not Q) or (not R).”

If you manage it, then that is a good sign: you are probably at, or well on the way to, the level of understanding of and fluency with “and”, “or” and “not” that you need to do undergraduate mathematics.

SPOILER ALERT — I’M ABOUT TO GIVE A SMALL HINT, SO IF YOU DON’T WANT IT THEN SKIP THE NEXT SENTENCE, WHICH IS IN FACT THE LAST SENTENCE OF THIS POST.

Hint: You should begin by putting the brackets back in and writing “P and (Q and R)” instead of “P and Q and R”.

***********************************

## Basic logic — connectives — IMPLIES

By Gowers

I have discussed how the mathematical meanings of the words “and”, “or” and “not” are not quite identical to their ordinary meanings. This is also true of the word “implies”, but rather more so. In fact, unravelling precisely what mathematicians mean by this word is a sufficiently complicated task that I have just decided to jettison an entire post on the subject and start all over again. (Roughly speaking, what happened was that I wrote something, wasn’t happy with it for a number of reasons, made several fairly substantial changes, and ended up with something that simply wasn’t what I now feel like writing after having thought quite a bit more about what I want to say. The straw that broke the camel’s back was a comment by Daniel Hill in which he pointed out that “implies” wasn’t, strictly speaking, a connective at all.)

I’ll mention a number of fairly subtle distinctions in this post, and you may find that you can’t hold them all in your head. If so, don’t worry about it too much, because you can afford to blur most of the distinctions. There’s just one that is particularly important, which I’ll draw attention to when we get to it.

**“Implies” versus “therefore” versus “if … then”.**

The three words “implies”, “therefore”, and “if … then” (OK, the third one isn’t a *word* exactly, but it’s not a phrase either, so I don’t know what to call it) are all connected with the idea that one thing being true makes another thing true. You may have thought of them as all pretty much interchangeable. But are they exactly the same thing?

Some indication that they aren’t quite identical comes from the grammar of the words. Consider the following three sentences.

- If it’s 11 o’clock, then I’m supposed to be somewhere else.
- It’s 11 o’clock implies I’m supposed to be somewhere else.
- It’s 11 o’clock. Therefore, I’m supposed to be somewhere else.

The first one is the most natural of the three. The second doesn’t quite read like a proper English sentence (because it isn’t), and the third, though correct grammatically, somehow doesn’t quite mean the same as the first, which is partly reflected by the fact that it is two sentences rather than one. (I could have used a semicolon instead of the full stop, but a comma would not have been enough.)

Let’s deal with the difference between “Therefore” and “if … then” first. The third formulation starts with the sentence, “It’s 11 o’clock.” Therefore, it is telling us that it’s 11 o’clock. By contrast, the first formulation gives us no indication of whether or not it is 11 o’clock (except perhaps if there is a note of panic in the voice of the person saying the sentence). So we use “therefore” when we establish one fact and then want to say that another fact is a consequence of it, whereas we use “if … then” if we want to convey that the second fact is a consequence of the first without making any judgment about whether the first is true.

How about “implies”? Before I discuss that, let me talk about another distinction, between *mathematics* and *metamathematics*. The former consists of statements like “31 is a prime number” or “The angles of a triangle add up to 180°”. The latter consists of statements *about* mathematics rather than of mathematical statements themselves. For example, if I say, “The theorem that the angles of a triangle add up to 180° was known to the Greeks,” then I’m not talking about triangles (except indirectly) but about theorems to do with triangles.

The sort of metamathematics that concerns mathematicians is the sort that discusses properties of mathematical statements (notably whether they are true) and relationships between them (such as whether one implies another). Here are a few metamathematical statements.

- “There are infinitely many prime numbers” is true.
- The continuum hypothesis cannot be proved using the standard axioms of set theory.
- “There are infinitely many prime numbers” implies “There are infinitely many odd numbers”.
- The least upper bound axiom implies that every Cauchy sequence converges.

In each of these four sentences I didn’t *make* mathematical statements. Rather, I *referred* to mathematical statements. The grammatical reason for this is that the word “implies”, in the English language, is supposed to link two noun phrases. You say that one thing implies another.

A *noun phrase*, by the way, is, roughly speaking, anything that could function as the subject of a sentence. For instance, “the man I was telling you about yesterday” is a noun phrase, since it functions as the subject of the sentence,

- The man I was telling you about yesterday is just about to pass us on his bicycle for the third time.

Other noun phrases in that sentence are “his bicycle” and “the third time”.

Let me write something stupid:

- The man I was telling you about yesterday implies his bicycle.

I wrote that because there is an important difference between two kinds of nonsense. The above sentence doesn’t make much sense, because you can’t imply a bicycle. However, it is at least grammatical in a way that

- The man I was telling you about yesterday. Therefore, his bicycle.

is not.

All this means that when we use “implies” in ordinary English, we are not connecting statements (because statements are not noun phrases) but talking *about* statements (because we use noun phrases to refer to statements).

I can think of three ways of turning statements into noun phrases. The first is rather crude: you put inverted commas round it. For example, if I want to do something about the incorrect sentence

- It is 11 o’clock implies I am supposed to be somewhere else.

then I could change it to

- “It is 11 o’clock” implies “I am supposed to be somewhere else.”

The second method is to come up with some name for the statement. That doesn’t work well here, but let’s have a go.

- The mid-morning hypothesis implies the inappropriate personal location scenario.

It works better for mathematical statements with established names such as the Bolzano-Weierstrass theorem.

The third method is to stick “that” or something like “the claim that” in front.

- The fact that it is 11 o’clock implies that I am supposed to be somewhere else.

I mentioned above that “implies” is not, strictly speaking, a connective. Why is this? It’s because connectives are used *to turn mathematical statements into mathematical statements*. For example, we can use “and” to build the statement “n is prime and $n > 2$” out of the two statements “n is prime” and “$n > 2$”. When we do that, the new statement isn’t *referring* to the old statements, but rather it *contains* them.

Unfortunately, as so often with this kind of thing, common mathematical usage is more complicated than the above discussion would suggest. Most people read the “$\Rightarrow$” symbol as “implies”. And most people are quite happy to write something like

$$x > 2 \Rightarrow x^2 > 4,$$

which, according to what I said above, is ungrammatical because “implies” is not linking noun phrases. What I suggest you do here is not worry about this too much: confusion between mathematics and metamathematics is unlikely to be a problem when you are learning about Numbers and Sets and about Groups. If you *are* inclined to worry, then you could resolve to read a sentence like the above as “If $x > 2$ then $x^2 > 4$.” I would also say that the symbol “$\Rightarrow$” should in general be used fairly sparingly. In particular, don’t insert it into continuous prose. For instance, don’t write something like, “Therefore $x > 2$ and $y > 2$ $\Rightarrow$ $xy > 4$.” Instead, write, “Therefore $x > 2$ and $y > 2$, which implies that $xy > 4$.” (Note that in that last sentence the word “which” functioned as the subject of “implies” and referred back to the statement “$x > 2$ and $y > 2$”.)

**Quotation and quasi-quotation.**

If you like subtle distinctions that will not matter in your undergraduate mathematical studies, then read on. If you don’t, then feel free to skip this short section.

The distinction I want to draw attention to is between two uses of quotation marks. Just for good measure, let’s look at three different ways of doing something with the sentence, “There are infinitely many primes.”

- There are infinitely many primes, but only one of them is even.
- “There are infinitely many primes” is a famous theorem of mathematics.
- “There are infinitely many primes” is an expression made up of five words.

The first of these sentences is about numbers. As such, it doesn’t use quotation marks. The third sentence is about *a linguistic expression*. As such, it very definitely requires quotation marks, just as they are needed in the sentence

- “Dog” is a noun and “bark” is a verb.

As for the second sentence, it is somewhere in between. It isn’t about numbers, but it’s also not about a linguistic expression. It’s about a *mathematical fact*. This use of quotation marks is sometimes called quasi-quotation. I won’t say any more but will instead refer you to the relevant Wikipedia article if you are interested. [Thanks to Mohan Ganesalingam for drawing my attention to it.]

**Yes, but what do “if … then” and “implies” mean?**

I’ve just spent rather a long time discussing the grammar of “implies”, “therefore” and “if … then” and said almost nothing about what they actually mean. To avoid confusion, I’m mainly going to discuss “if … then” since there is no doubt that that really is a connective. But sometimes I’m going to want to do what I’ve done in previous posts and use the letters P and Q to stand for statements, and here, unfortunately, there is a danger of the confusion creeping back. In particular, if one is being careful about it then one needs to be clear what “standing for a statement” actually means.

Is it something like the relationship between “The Riemann hypothesis” and “Every non-trivial zero of the Riemann zeta function has real part 1/2”? That is, are P and Q *names* for some statements? Not exactly, because we want to be able to make sense of the expression $P \wedge Q$ (recall that $\wedge$ is a symbolic way of writing “and”) and the word “and” links statements rather than names. (You don’t, for example, say, “The Riemann hypothesis and Fermat’s Last Theorem” if you want to assert that the Riemann hypothesis and Fermat’s Last Theorem are both true.) So we should think of P and Q *as statements themselves*; it’s just that they are unknown statements.

But in that case we shouldn’t be allowed to write $P \Rightarrow Q$, or at least not if $\Rightarrow$ means “implies”. But that’s just too bad. I’m going to write it, and if you’re worried about it then read “$P \Rightarrow Q$” as “if P then Q”. But actually what I recommend is not worrying about it and just knowing in your heart of hearts that it would be easy to replace what you are saying by something that is strictly correct if there was ever any danger of confusion.

So let us pause, take a deep breath, allow everything I’ve written so far to slip comfortably into the back of our minds, and turn to the question of what “if … then” and “implies” actually mean. And the answer is rather peculiar. In everyday English, when we use one of these words, we are trying to explain that there is a *link* between the two statements we are relating (either directly or by referring to them). For example, if I say, “If we continue to emit carbon dioxide into the atmosphere at the current rate then sea levels will rise by two metres by 2100,” I am suggesting a causal link between the two.

Let me now give the standard account of what mathematicians mean by “if … then”. Later I shall qualify it considerably, not because I think it is incorrect but because I think it doesn’t give the whole picture and can be unnecessarily off-putting. The standard thing to say is that $P \Rightarrow Q$ is true unless P is true and Q is false. That is, if you want to establish that $P \Rightarrow Q$ then the only thing that can go wrong is P being true and Q being false.

A brief interruption: purists will note that I have been inconsistent. If *is* a statement rather than something that *refers to* a statement, then I can’t say “ is true”. I have to say, “”” is true.” Alternatively, I should have said, “ unless and .” Can we agree that I’ll be slightly sloppy here? (If you don’t understand why it’s sloppy, I don’t think it matters.)

Let me illustrate this with a few examples.

- If there were weapons of mass destruction in Iraq then pigs can fly.
- The Riemann hypothesis implies Fermat’s Last Theorem.
- If $n$ is both even and odd, then $n=17$.
- If $n$ is a prime not equal to 2, then $n$ is odd.

Of these four statements, the fourth one seems quite reasonable, while the other three are all a bit peculiar. For example, it’s quite obvious that (the recent Pink Floyd stunt notwithstanding) pigs cannot fly. Doesn’t that make the first sentence false? And how can one say that the Riemann hypothesis implies Fermat’s Last Theorem when nobody expects a proof of Fermat’s Last Theorem that uses the Riemann hypothesis? And surely if $n$ is both even and odd, it could just as well be 19. Can it be correct to say that it has to be 17? As for the fourth sentence, it seems fine: if $n$ is a prime not equal to 2, then it cannot have 2 as a factor (or it wouldn’t be prime), so it must indeed be odd.

Well, mathematicians would say that all four statements are true. That’s because the only way “If P then Q” can be false is if P is true and Q is false. You should understand this as a *definition* of “if … then”. Let’s check the four statements using this definition.

For the first one to be false, we would need there to have been weapons of mass destruction in Iraq and for pigs to be unable to fly. Well, we’ve got the earthbound pigs but there were no weapons of mass destruction in Iraq, so the first statement is true. (Again, this is not some metaphysical claim. It just follows from the way we have chosen to define “if … then”.)

For the second to be false, we would need the Riemann hypothesis to be true and Fermat’s Last Theorem to be false. Well, Andrew Wiles, with help from Richard Taylor, has proved Fermat’s Last Theorem, so it’s not false. So the second statement in the list is true.

As for the third, the only way for that to be false is if $n$ is both even and odd but is not equal to 17. But no number is both even and odd. Therefore, the third statement is true. The problem about $n$ equalling 19 doesn’t arise because there are no even and odd integers in the first place.
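The definition used in these checks can be tabulated mechanically. The Python sketch below is only an illustration: the function name `implies` and the variables `statement_1` and `statement_2` are invented for the example, and the Riemann hypothesis is tried with both truth values because its status is open.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth-value implication: false only when p is true and q is false."""
    return (not p) or q


# Print the full truth table for "if P then Q".
for p, q in product([True, False], repeat=2):
    print(f"P={p!s:<5}  Q={q!s:<5}  if P then Q: {implies(p, q)}")

# Statement 1: the hypothesis is false and the conclusion is false.
statement_1 = implies(False, False)

# Statement 2: Fermat's Last Theorem is proved, so the conclusion is true.
# The Riemann hypothesis is open, so both truth values are tried for it.
statement_2 = all(implies(rh, True) for rh in (True, False))

print(statement_1, statement_2)
```

Only one of the four rows of the table comes out false, which is exactly the “P true and Q false” case singled out above.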

**Truth values and “causes”.**

There’s something unsatisfactory about the truth-value definition of “if … then” and “implies”. It seems to leave out the idea that one thing can be true *because* another is true. It would be quite wrong to say, for instance, that Fermat’s Last Theorem is true because the Riemann hypothesis is true.

Fortunately, there is a very close link between the truth-value definition and what I’ll call the causal concept of “if … then”. I’m not going to attempt a precise definition of the causal concept — I’m just referring to the basic idea of one statement’s being a reason for another.

Let’s go back to the one statement that felt reasonable in the list above. It was this.

- If $n$ is a prime not equal to 2, then $n$ is odd.

Now comes another somewhat subtle distinction, and this is the one I really care about. What does that statement above actually mean? I think a very natural way of interpreting it is this.

- Whenever $n$ is a prime not equal to 2, it is also odd.

In other words, although it looks like a statement about some fixed number $n$, the fact that we have been told nothing whatsoever about $n$ makes us read it in a slightly different way. We say to ourselves, “Since we’ve been told nothing at all about $n$, this must be intended as a general statement about an arbitrary $n$. So what it’s really saying is that if a positive integer has one property (being a prime not equal to 2) then it has another (being odd).” If we’re thinking about things that way, then it’s rather tempting to say that the property “is a prime not equal to 2” implies the property “is odd”.

What I’ve just suggested is not standard mathematical practice, but in principle it could have been. However, it is incredibly important in mathematics to be completely sure at all times what kinds of objects one is dealing with. I said earlier that “if … then” connects statements and “implies” connects noun phrases that refer to statements. I did not say that either of them connects properties. So if I want to say that one property implies another, then I have to be absolutely clear that this is a *different* meaning of the word “implies” (even if it is related to the previous one).

OK, so let me be careful. First of all, what is a property? It’s what you get when you take a statement that concerns a variable and you remove that variable. For example, if I take the statement “$n$ is a perfect square” and remove the variable $n$ from it, I get the property “is a perfect square”. A property is a thing you say about something else. (It’s almost like an adjective, but not quite because of the extra “is”.) If you want to be more formal about it, if you are given a set like the set of all positive integers, a property associated with that set is a function from elements of the set to statements. For example, the property “is prime” takes the number $n$ to the statement “$n$ is prime”. (It is more conventional to say that all we actually care about is the truth values of these statements. So the property “is prime” takes the value TRUE at each prime number and FALSE at all other numbers. I’ll stick with my unconventional discussion here.)

Now suppose that we have two properties $A$ and $B$ associated with the positive integers. When do we say that $A$ implies $B$ (according to my unconventional definitions)? Well, for each positive integer $n$ we have a statement $A(n)$ and a statement $B(n)$. I’ll say that $A$ implies $B$ (in the property sense) if for every positive integer $n$, the statement $A(n)$ implies the statement $B(n)$ (in the truth-value sense). In other words, whenever $A(n)$ is true, so is $B(n)$, and otherwise anything can happen. In the example above, $A$ is the property “is a prime not equal to 2”, $B$ is the property “is odd”, and for each $n$, $A(n)$ is the statement “$n$ is a prime not equal to 2” and $B(n)$ is the statement “$n$ is odd”. Every time $A(n)$ is true, which it is when $n=3,5,7,11,13,\dots$, so is $B(n)$. This gives us the feeling that the property $A$ “causes” the property $B$.
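To see the property sense of “implies” built out of the truth-value sense, here is a small Python sketch. The names `implies`, `is_prime`, `A`, `B` and `property_implies` are choices made for this illustration, and a finite range stands in for the set of all positive integers, so the check is evidence rather than a proof.

```python
def implies(p: bool, q: bool) -> bool:
    """Truth-value implication: false only when p is true and q is false."""
    return (not p) or q


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def A(n: int) -> bool:
    """The statement 'n is a prime not equal to 2'."""
    return is_prime(n) and n != 2


def B(n: int) -> bool:
    """The statement 'n is odd'."""
    return n % 2 == 1


def property_implies(a, b, domain) -> bool:
    """Property a implies property b on domain: a(n) implies b(n) for every n."""
    return all(implies(a(n), b(n)) for n in domain)


# A finite stand-in for "every positive integer".
print(property_implies(A, B, range(1, 10_001)))
# The converse fails: 9 is odd but is not a prime.
print(property_implies(B, A, range(1, 20)))
```

Each individual test `implies(A(n), B(n))` is a pure truth-value check; it is only the `all(...)` over the whole domain that expresses the relationship between the two properties.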

Let me go back to the statement that seemed reasonable.

- If $n$ is a prime not equal to 2, then $n$ is odd.

It’s important to be careful about what this means. Is it a statement about some specific $n$? If so, then we must interpret the “if … then” in the strict truth-value sense. Or is it really a way of saying, “Every prime not equal to 2 is odd”? In that case, it has more of a causal feel to it.

The best way to keep everything clear at all times is not to write the above sentence when you’re really talking about all $n$. Instead, you can write

- For every positive integer $n$, if $n$ is a prime not equal to 2, then $n$ is odd.

Now, if you pick out just the part of this statement that says, “If $n$ is a prime not equal to 2, then $n$ is odd,” then you have something that must be interpreted in the truth-value sense. But when you apply those truth-value statements to all positive integers simultaneously, what you end up with is the nice “causal” statement that the property “is a prime not equal to 2” implies the property “is odd”.

**A silly deduction and a sensible deduction.**

Because there is a sort of causal notion of implication, and because it is in a way what we really care about when doing mathematics, I very much prefer to illustrate the meaning of “implies” or “if … then” with reference to examples that include variables. If I just take two fixed statements like “Margaret Thatcher used to be Prime Minister of the UK” and “there was recently a tsunami in Japan” and tell you that, despite the lack of any obvious relationship between them, the first statement implies the second statement because the second statement happens to be true, then it is clear that the notion of implication I am using has nothing to do with one thing being true *because* another thing is true: not even the most rabidly left-wing person is going to blame the Japanese tsunami on Thatcher’s premiership. But a statement like, “If $x\in A$ then $x\in B$,” is completely reasonable. Moreover, because $x$ is a general element of $A$, which might be an infinite set, we can’t establish a statement like this by running through all $x$ and checking the truth values of the statements $x\in A$ and $x\in B$. Rather, we have to give a *proof*, that is, an explanation of why $x$ *must* belong to $B$ if it belongs to $A$. Thus, once you start looking at statements with variables, the truth-value notion of implication forces you to look for “reasons” and “causes” so that you can establish lots of truth-value facts at once. (I’m leaving out the possibility here that a statement could in some sense “just happen to be true”. For example, many people take seriously the following possibility. Perhaps the property “is even and at least 4” implies the property “is a sum of two primes” in the sense that no number is even and at least 4 without being a sum of two primes, but perhaps also there isn’t a *reason* for this: perhaps it just happens to be the case.)

Here’s another illustration of the difference between statements that involve parameters and statements that don’t. Consider the following claim.

- If $\sqrt{2}$ is rational then there is an integer that is both even and odd.

I’m going to prove it in two different ways.

Proof 1. $\sqrt{2}$ is irrational, so the statement “$\sqrt{2}$ is rational” is false, and therefore implies all other statements. In particular, it implies that there is an integer that is both even and odd.

Proof 2. If $\sqrt{2}$ is rational, then we can find positive integers $p$ and $q$ such that $\sqrt{2}=p/q$, which implies that $p^2=2q^2$. Let $k$ be the largest integer such that $p^2$ is a multiple of $2^k$. Since $p^2$ is a perfect square, $k$ must be even. (To see this, just consider the prime factorization of $p$.) But $p^2=2q^2$, and the largest $k$ for which $2q^2$ is a multiple of $2^k$ is odd. (To see this, just consider the prime factorization of $q$.) Therefore, $k$ is both even and odd, which proves the result.
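The two parity facts that Proof 2 relies on are easy to test numerically. In the Python sketch below, the helper `two_adic_valuation` is a name made up for this illustration; it returns the largest $k$ such that $2^k$ divides its argument. Checking finitely many values is of course not a proof, but it shows what the prime-factorization remarks are claiming.

```python
def two_adic_valuation(m: int) -> int:
    """Return the largest k such that 2**k divides the positive integer m."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    k = 0
    while m % 2 == 0:
        m //= 2
        k += 1
    return k


# The exponent of 2 in a perfect square p**2 is twice the exponent in p.
squares_even = all(two_adic_valuation(p * p) % 2 == 0 for p in range(1, 1000))

# The exponent of 2 in 2*q**2 is one more than an even number.
doubles_odd = all(two_adic_valuation(2 * q * q) % 2 == 1 for q in range(1, 1000))

print(squares_even, doubles_odd)
```

So an equation of the form $p^2=2q^2$ would force a single number to have both an even and an odd exponent of 2, which is the contradiction the proof extracts.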

Which of these two arguments is more interesting? Undoubtedly the second, since it actually gives us a proof of the irrationality of $\sqrt{2}$. So is the first argument valid at all? You might object to it on the grounds that it uses without proof the fact that $\sqrt{2}$ is irrational. But we can make the question more interesting as follows. There is (it happens) a different proof of the irrationality of $\sqrt{2}$ that does not involve the statement that some positive integer is both even and odd. What if we used that argument, concluded that “$\sqrt{2}$ is rational” was false, and then went on to deduce “there exists an integer that is both even and odd” in the way that argument 1 does above? Would that be a valid deduction?

I think the answer has to be yes, but it is not an *interestingly* valid deduction. It is not showing that the irrationality of $\sqrt{2}$ is in any way caused by a contradiction that involves parity, since we deduced that from another, and unrelated, false statement.

If we think of implication as primarily something we apply to statements with parameters, and therefore indirectly and in a different sense to properties, then our starting point is not the statement “$\sqrt{2}$ is irrational” but rather the statement “$\sqrt{2}=p/q$”. And our conclusion, that there exists an integer that is both even and odd, is deduced from the more precise (and informative) statement, “the highest $k$ such that $p^2$ is a multiple of $2^k$ is both even and odd”.

As a final remark about the above example, which allows me to emphasize a point I have already made, suppose that I start a proof of the irrationality of $\sqrt{2}$ by writing,

$$\sqrt{2}=p/q\ \Rightarrow\ p^2=2q^2.$$

What I am really saying is that *whatever* $p$ and $q$ might be, if $\sqrt{2}=p/q$ then $p^2=2q^2$. In other words, although it looks as though I’m talking about a specific pair $p$ and $q$, in fact I’m making a general deduction.

**What’s good about the usual convention concerning “if … then” and “implies”?**

I think I have partially answered this question by pointing out that when we consider statements *with parameters* then the truth-value meaning of “implies” feels a lot closer to the more intuitive “causal” meaning of “implies”. However, the agreement isn’t total. One of the “silly” examples from early in this post was this.

- If $n$ is both even and odd then $n=17$.

This looks odd, because although we know that $n$ can’t be both even and odd, we also feel that *if* $n$ were even and odd, there would be nothing about that fact that steered $n$ towards the number 17 as opposed to any other number. I can’t deny the feeling of oddness. All I can say is that the hypothetical situation never arises because the hypothesis, that $n$ is even and odd, is impossible.

What I *can* do, however, is explain why I don’t want to try to find a different convention that would make this statement false. I don’t want to do that because it would force me to give up some general principles that I like. One of those I have already mentioned:

- Property $A$ implies property $B$ if and only if the set of all $n$ such that $A(n)$ is a subset of the set of all $n$ such that $B(n)$.

I hope you’ll agree that that looks highly reasonable, and we don’t want to start having ugly exceptions to it if we don’t have to.

Here’s another mathematical principle that I think you will also have to agree with.

- The empty set is a subset of every other set.

Now let’s apply these two principles. I’m going to let $A$ be the property “is both even and odd” and I’m going to let $B$ be the property “equals 17”. Then the set of $n$ such that $A(n)$ is the empty set (since no $n$ is both even and odd). The set of $n$ such that $B(n)$ is the set $\{17\}$. Since the empty set is a subset of the set $\{17\}$, the first principle tells us that $A$ implies $B$.
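The same argument can be replayed with Python sets. This is a sketch under an obvious assumption: a finite range (called `DOMAIN` here, a name chosen for the example) stands in for the positive integers, and the names `A`, `B`, `set_A` and `set_B` are likewise illustrative.

```python
# A finite stand-in for the positive integers.
DOMAIN = range(1, 1001)


def A(n: int) -> bool:
    """The statement 'n is both even and odd'."""
    return n % 2 == 0 and n % 2 == 1


def B(n: int) -> bool:
    """The statement 'n equals 17'."""
    return n == 17


set_A = {n for n in DOMAIN if A(n)}  # nothing satisfies A, so this is empty
set_B = {n for n in DOMAIN if B(n)}

# First principle: A implies B exactly when set_A is a subset of set_B.
subset_check = set_A <= set_B

# Truth-value version: A(n) implies B(n) for every n in the domain.
pointwise_check = all((not A(n)) or B(n) for n in DOMAIN)

print(set_A, set_B, subset_check, pointwise_check)
```

Both checks agree, and the second succeeds for a slightly boring reason: the hypothesis `A(n)` is never true, so there is never a case in which the implication could fail.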

To summarize this discussion, the formal mathematical notion of implication is a bit strange, but most of the strangeness disappears if you just look at statements with parameters, which tend to be the statements we care about. Each such statement corresponds to a property of those parameters, and implication of properties is closer to our intuitive notion of one thing “making” another true than implication of statements. Even then there are one or two oddnesses, but these are a small price to pay for the cleanness and precision of the definition and for the fact that it allows us to hold on to some cherished general principles.

**An exercise — not to be taken too seriously.**

(i) Prove that Borsuk’s conjecture implies the Riemann hypothesis.

(ii) Comment on your proof.

Hint: if you find part (i) difficult, then you are not applying one of the pieces of general study advice I gave in the first post of this series.

*****************************

## Basic logic — quantifiers

By Gowers

When I started writing about basic logic, I thought I was going to do the whole lot in one post. I’m quite taken aback by how long it has taken me just to deal with AND, OR, NOT and IMPLIES, because I thought that connectives were the easy part.

Anyway, I’ve finally got on to quantifiers, which are ubiquitous in advanced mathematics and which often cause difficulty to those beginning a university course. A linguist would say that there are many quantifiers, but in mathematics we normally make do with just two, namely “for all” and “there exists”, which are often written using the symbols $\forall$ and $\exists$. (If it offends you that the A of “all” is reflected in a horizontal axis and the E of “exists” is reflected in a vertical axis, then help is at hand: they are both obtained by means of a half turn.)

Let me begin this discussion with a list of mathematical definitions that involve quantifiers. Some will be familiar to you, and others less so.

1. A positive integer $n$ is *composite* if there exist positive integers $a$ and $b$, both greater than 1, such that $n=ab$.

2. An $n\times n$ matrix $A$ is *invertible* if there exists an $n\times n$ matrix $B$ such that $AB=BA=I_n$. (Here $I_n$ is the $n\times n$ identity matrix.)

3. A binary operation $\circ$ on a set A is *commutative* if for every $x\in A$ and for every $y\in A$, $x\circ y=y\circ x$.

4. A function $f$ from a set $A$ to a set $B$ is a *surjection* if for every $y\in B$ there exists $x\in A$ such that $f(x)=y$.

5. A set $A$ of real numbers is *dense* if for every real number $x$ and for every $\epsilon>0$ there exists a real number $y$ such that $y\in A$ and $|x-y|<\epsilon$.
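On finite sets, definitions like these translate almost word for word into code, which can be a useful way to get a feel for them. In the Python sketch below, “there exists” becomes `any` and “for every” becomes `all`; the function names are invented for this illustration, and the sets involved are assumed to be finite so that the searches terminate.

```python
def is_composite(n: int) -> bool:
    """There exist integers a and b, both greater than 1, such that n = a*b."""
    return any(a * b == n for a in range(2, n) for b in range(2, n))


def is_commutative(op, elements) -> bool:
    """For every x and for every y in the set, op(x, y) equals op(y, x)."""
    return all(op(x, y) == op(y, x) for x in elements for y in elements)


def is_surjection(f, domain, codomain) -> bool:
    """For every y in the codomain there exists x in the domain with f(x) = y."""
    return all(any(f(x) == y for x in domain) for y in codomain)


print(is_composite(12), is_composite(13))
print(is_commutative(lambda x, y: x + y, range(5)))
print(is_commutative(lambda x, y: x - y, range(5)))
print(is_surjection(lambda x: x % 3, range(6), {0, 1, 2}))
```

Note how the nesting in `is_surjection` mirrors the order of the quantifiers: the `all` over the codomain is on the outside and the `any` over the domain is on the inside.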

I have put those in approximately ascending order of difficulty. To see how such a definition comes about, let us take the last of them. It is a familiar and useful property of the rational numbers (that is, numbers that can be written as fractions) that they “appear everywhere”. This property can be expressed in a number of ways. One is to say that whenever $a$ and $b$ are real numbers and $a<b$, there must be at least one rational number that lies between them. Another way of saying it is that every real number can be arbitrarily well approximated by rationals.

Let’s try to turn those two thoughts into precise definitions. But before we do so, I would like to draw attention to a number of words that should alert you to the possible presence of $\forall$, or a *universal quantifier*, as it is sometimes known. A few examples are “whenever”, “always”, “every”, and “each”. For each one of these words I’ll give an example of a sentence that contains it. Then I’ll translate those sentences into a more mathematical style using a universal quantifier.

- Whenever it rains, the grass smells wonderful.
- I always believe what my doctor says.
- Every country in the EU is having economic difficulties at the moment.
- Each picture in the Fitzwilliam museum is worth getting to know.

Now the translations.

- For every time t, if it rains at time t then the grass smells wonderful at time t.
- For every statement S, if my doctor says S then I believe S.
- For every country C, if C belongs to the EU then C is having economic difficulties at the moment.
- For every picture P, if P is in the Fitzwilliam museum then P is worth getting to know.

There are other words and phrases that suggest the lurking presence of $\exists$, or an *existential quantifier*. They are things like “there is”, “for some”, “some”, “at least one”, “you can find”. I’ll content myself with just one example this time.

- Some cars run on electricity.

This could be translated as follows.

- There exists a car C such that C runs on electricity.

You might want to argue that the word “cars” in the first sentence implies that more than one car runs on electricity. If that bothers you, here’s another example. Suppose I receive an email and react by saying, “*Somebody* likes me.” The meaning there (if you are rather literal-minded and take my words at face value) is

- There exists a person P such that P likes me.

**Creating mathematical statements that involve quantifiers.**

Right, let’s see what we can do with this statement:

- Whenever $a$ and $b$ are real numbers with $a<b$, there is some rational number $r$ that lies between $a$ and $b$.

I’m going to use symbols this time. The word “whenever” alerts me to a universal quantifier. Indeed, the phrase “Whenever $a$ and $b$ are real numbers” can be translated into $\forall a,b\in\mathbb{R}$ before we even look at the rest of the sentence (the condition $a<b$ simply stays attached to it). “There is some” now looks suspiciously like an existential quantifier, and it is: we translate “there is some rational number $r$” into the symbolic form $\exists r\in\mathbb{Q}$. Finally, “that” is referring back to the number $r$, so we are saying that $r$ lies between $a$ and $b$, which we can put more mathematically by saying $a<r<b$. Putting that all together gives us this.

- $\forall a,b\in\mathbb{R}$ with $a<b$ $\exists r\in\mathbb{Q}$ s. t. $a<r<b$.

To read that sentence, read $\forall$ as “for every” or “for each” (or if you like, “for all” but that sounds somehow less idiomatic), read $\in$ as “in”, read $\mathbb{R}$ as “the reals”, read $\exists$ as “there exists”, read $\mathbb{Q}$ as “the rationals”, read “s. t.” as “such that” and read $<$ as “is less than”. So what you would actually *say* when reading those symbols is this.

- For every $a,b$ in the reals with $a$ less than $b$, there exists $r$ in the rationals such that $a$ is less than $r$ is less than $b$.

As you will have deduced from that, $\mathbb{R}$ is the conventional symbol for the set of real numbers and $\mathbb{Q}$ is the symbol for the set of rational numbers. We also have $\mathbb{N}$ for the set of natural numbers (or positive integers), $\mathbb{Z}$ for the set of all integers, and $\mathbb{C}$ for the set of complex numbers.
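The statement about a rational lying between two reals only asserts that a suitable rational exists, but one standard way of producing it can be sketched in Python. Everything here is an assumption made for illustration: the function name `rational_between` is invented, and floating-point numbers stand in for real numbers (floats are themselves rational, so this shows the shape of the argument rather than proving anything about the reals).

```python
from fractions import Fraction
import math


def rational_between(a: float, b: float) -> Fraction:
    """Return a rational r with a < r < b, assuming a < b."""
    if not a < b:
        raise ValueError("need a < b")
    lo, hi = Fraction(a), Fraction(b)  # exact values of the two floats
    gap = hi - lo
    # Choose a denominator n so large that 1/n is smaller than the gap.
    n = math.floor(1 / gap) + 1
    # m is the smallest integer with m/n > a; since 1/n < gap, m/n < b too.
    m = math.floor(lo * n) + 1
    return Fraction(m, n)


r = rational_between(math.sqrt(2), 1.5)
print(r, math.sqrt(2) < r < 1.5)
```

The order of the quantifiers is visible in the code: the inputs `a` and `b` come first, and the denominator `n` is allowed to depend on them.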

What about the definition of “dense” in terms of arbitrarily good approximation? The informal definition was this.

- Every real number can be arbitrarily well approximated by rationals.

We can make a start on this by turning the “every” into a proper quantifier. That gives us this.

- For every real number x, x can be arbitrarily well approximated by rationals.

So now our problem is reduced to finding a formal way of saying that x can be arbitrarily well approximated by rationals. What does that mean? It means this: however well you want me to approximate x by a rational number, I can do it. Now the word “however” contains “ever” within it. Could this be hinting at a universal quantifier? Yes it could. It is saying something like, “Give me any level of accuracy you want,” which contains the word “any”, a real giveaway. Having said that, the word “any” is a bit problematic because sometimes it replaces an existential quantifier, as it does for instance in the sentence, “If there is any reason to go, I’ll happily go.” The usual advice, with which I concur, is to keep “any”, “anything”, “anywhere”, etc. out of your mathematical writing.

Anyhow, we can avoid the word “any” by going further and saying, “For every level of accuracy that can be specified.” To clarify this, let us think what “level of accuracy” means. When I approximate a real number by a rational number, I am trying to pick a rational number that is *close* to the real number. And the accuracy of the approximation is naturally measured by the difference. So to specify a level of accuracy is to provide a small positive number and insist that the difference should be less than that number. For historical reasons, mathematicians like the Greek letters $\epsilon$ and $\delta$ for this purpose. So “for every level of accuracy” ends up as the rather more straightforward “for every $\epsilon>0$”.

Where have we got to now? We are here.

- For every real number $x$ and for every $\epsilon>0$, $x$ can be approximated to within $\epsilon$ by a rational number.

Now the word “can” is another one that sometimes conceals an existential quantifier. For example, “It can be cold in Cambridge,” means that amongst the possibilities for the weather in Cambridge there exists at least one cold one. So we could take the hint from that and rewrite the above sentence as follows.

- For every real number $x$ and for every $\epsilon>0$, there exists a rational number $r$ such that $r$ approximates $x$ to within $\epsilon$.

And now, to finish off, we just have to remember what we meant by “approximates to within $\epsilon$”. We end up with statement 5 from earlier on, applied to the set of rationals.

- For every real number $x$ and for every $\epsilon>0$, there exists a rational number $r$ such that $|x-r|<\epsilon$.
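Here is a hedged Python sketch of what this statement promises: given $x$ and $\epsilon$, it produces a rational within $\epsilon$ of $x$. The function name `rational_within` is made up for the example, and as before floats are only a stand-in for real numbers.

```python
from fractions import Fraction
import math


def rational_within(x: float, eps: float) -> Fraction:
    """Return a rational r with |x - r| < eps, assuming eps > 0."""
    if not eps > 0:
        raise ValueError("eps must be positive")
    # Choose a denominator n with 1/n < eps.
    n = math.floor(1 / Fraction(eps)) + 1
    # Rounding x*n to the nearest integer gives an error of at most 1/(2n).
    m = round(Fraction(x) * n)
    return Fraction(m, n)


# "For every epsilon" is modelled by trying several levels of accuracy.
for eps in (1e-1, 1e-3, 1e-6):
    r = rational_within(math.pi, eps)
    print(eps, r, abs(Fraction(math.pi) - r) < Fraction(eps))
```

Smaller values of `eps` force larger denominators, which is the practical meaning of “however well you want me to approximate x, I can do it”.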

In symbols, this would be as follows.

- $\forall x\in\mathbb{R}$ $\forall\epsilon>0$ $\exists r\in\mathbb{Q}$ s. t. $|x-r|<\epsilon$.

By the way, a quick piece of stylistic advice. Some people, when they first come across the symbols $\forall$ and $\exists$, get too keen on them and start using them in the middle of ordinary text, writing sentences like this: “And therefore we know that $\forall x\in A$ $f(x)$ is at most M.” That looks awful. You should either write something like, “And therefore for every $x$ in the set $A$ we know that $f(x)$ is at most M,” or you should write something more like this.

“And therefore,

$\forall x\in A\ \ f(x)\leq M$.”

In general, don’t overdo the symbols. And if you do use them (in order, say, to avoid an excessively wordy sentence), then don’t mix them up with words too much. A good rule of thumb there is to make sure that each symbol is part of a bunch of words that can stand reasonably well on their own. For example, the following mixture isn’t too bad:

- Therefore, whenever

But these are unspeakably awful.

- Therefore, the union of A and B.
- Therefore, elements of
- Therefore, is an element of A, which is an element of B.
- It follows that that does not belong to B.

I leave it to you to come up with nicer formulations of the above four sentences. One final exhortation: please don’t ever use the symbol $\forall$ to stand for the word “every”. If you’re now thinking “But isn’t that what it means?” then I’m glad I brought this up. It doesn’t mean “every”. It means “*for* every”. That kind of distinction really matters in mathematics.

**Understanding mathematical statements that contain quantifiers.**

I’ve discussed how you can take a slightly vague English statement and convert it into a precise formal mathematical one. It’s tempting to give many more examples, but I’d rather save that up for the *actual* definitions you will encounter. So if one example isn’t enough for you, be patient and there will be more.

But what about the reverse process? Suppose you are presented with a statement like this.

$$\forall\epsilon>0\ \ \exists N\in\mathbb{N}\ \ \forall n\geq N\ \ |a_n-a|<\epsilon$$

If you haven’t seen that before, you will probably find it pretty opaque. In fact, some people find it pretty opaque even if they *have* seen it before. So what can one do to make sense of it?

Well, most people find that the more quantifiers they have to cope with, the harder it gets. So a good technique for understanding a statement such as the above is to build up gradually. Let me illustrate how this can be done. (What I’m about to show is meant to be something you can do for yourself with other definitions. The hope would be that once you’ve gone through the process with a few of them you will get used to them and not need to go through the process any more.)

To make things *really* easy, let’s start with no quantifiers at all. That is, let’s start with the quantifier-free “heart” of the statement, which is

$$|a_n-a|<\epsilon$$

This isn’t hard to understand: it’s saying that the nth term of the sequence $(a_n)$ differs from $a$ by less than $\epsilon$.

OK, now let’s add a quantifier. The one thing to remember is that we’ll add the quantifier furthest to the *right*. In other words, we start at the end of the entire statement (this we’ve just done) and work backwards.

$$\forall n\geq N\ \ |a_n-a|<\epsilon$$

That’s got one quantifier, but it’s still not too bad. It’s simply saying that *every* term of the sequence differs by less than $\epsilon$ from $a$. Or rather, it would be if it weren’t for that little condition that $n\geq N$. So I lied. It isn’t quite saying that every term differs by less than $\epsilon$. It’s saying that every term differs by less than $\epsilon$ *as long as we’ve got to* $N$ *or beyond*.

Right, let’s add a second quantifier.

$$\exists N\in\mathbb{N}\ \ \forall n\geq N\ \ |a_n-a|<\epsilon$$

What is the effect of that “$\exists N$” on the previous sentence? Well, the previous sentence said that $a_n$ is within $\epsilon$ of $a$ as long as we’ve got past $N$. But it gave us no idea what $N$ was. And in fact $N$ isn’t really a fixed number. All we know is that there is *some* $N$ that makes that statement true. That is, there is *some* $N$ such that $a_n$ stays within $\epsilon$ of $a$ once $n$ gets to $N$ or beyond.

Note that I hid the “for all” quantifier inside the word “stays” there. I used the word “stays” to mean “is for evermore” and the “ever” in “evermore” is a very clear hint of a universal quantifier. This informal language isn’t part of mathematics and should be kept out of proofs, but it is a useful aid to thought.

Actually, there is another piece of informal language that I find useful for this specific situation where we have $\exists N\ \forall n\geq N$. I think of the $\exists N$ as saying “eventually” (which could also be “from some point onwards” if you want to make the $N$ stick out a bit more clearly) and the $\forall n\geq N$ as saying “always”. So this part of the statement is saying

- eventually $a_n$ is always within $\epsilon$ of $a$

I quite like the word “stays” too:

- eventually $a_n$ stays within $\epsilon$ of $a$

We’ve still got a quantifier to go. What is $\epsilon$? Again, it isn’t something fixed. Let’s have a look at the whole statement.

$$\forall\epsilon>0\ \ \exists N\in\mathbb{N}\ \ \forall n\geq N\ \ |a_n-a|<\epsilon$$

We now reach something that’s a bit less easy to put into informal language. Here’s an attempt.

- However close you want $a_n$ to be to $a$, eventually it will always be that close.
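To make the three layers concrete, here is a Python sketch for one particular sequence, $a_n=1/n$ with limit 0, which is an example chosen for this illustration rather than one taken from the discussion above. The names `a_n`, `LIMIT`, `witness_N` and `stays_within` are likewise invented, and because a program can only look at finitely many terms, the “for every $n\geq N$” is checked over a finite window (an assumption that makes this a sanity check, not a proof).

```python
from fractions import Fraction
import math


def a_n(n: int) -> Fraction:
    """The example sequence a_n = 1/n."""
    return Fraction(1, n)


LIMIT = Fraction(0)


def witness_N(eps: Fraction) -> int:
    """The 'exists N' step: for this sequence any N greater than 1/eps works."""
    return math.floor(1 / eps) + 1


def stays_within(eps: Fraction, N: int, window: int) -> bool:
    """Finite check of 'for every n >= N, |a_n - LIMIT| < eps'."""
    return all(abs(a_n(n) - LIMIT) < eps for n in range(N, N + window))


# The outermost "for every eps > 0" is modelled by a few sample values.
for eps in (Fraction(1, 10), Fraction(1, 1000)):
    N = witness_N(eps)
    print(eps, N, stays_within(eps, N, 500))

# Choosing N too early breaks the inner statement.
print(stays_within(Fraction(1, 10), 1, 5))
```

The important structural point is that `witness_N` takes `eps` as an argument: $N$ is allowed to depend on $\epsilon$, because $\exists N$ comes after $\forall\epsilon>0$.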

In general, if you ever see a statement that begins $\forall\epsilon>0$ and ends with something being less than $\epsilon$, the general idea is that however small you want that something to be … complicated stuff … you can get it to be that small. (Of course, the letter doesn’t have to be $\epsilon$. Another popular choice is $\delta$.)

It would be remiss of me not to mention that the definition we have just picked apart is the formal definition of the concept of convergence. You will find over the next few weeks that if you see the sentence

- $(a_n)$ converges to $a$

then to work with it you need to translate it into the formal statement we’ve just looked at with its three quantifiers. That’s an oversimplification because it applies only when we are reasoning from first principles. Once you have met the definition of convergence, you will prove simple facts about it such as that if $(a_n)$ converges to $a$ and $(b_n)$ converges to $b$ then $(a_n+b_n)$ converges to $a+b$. These facts can then be used to prove facts that involve convergence *without* writing out the definition in full. However, when you’re just starting, and sometimes later on too, you do need to write out the definition. So one thing you have to do is *learn* these definitions off by heart. If you don’t, you might just as well give up. But if you follow some of the tips above, you may find that you don’t have to learn the definitions as if they were a random jumble of symbols. Ideally, you will develop enough understanding to have a good intuitive picture of what the definition says, and the means to translate that intuitive picture into the formal definition with quantifiers. Another thing you can do is try writing out wrong versions of the definition and seeing why they are wrong. For example, suppose we interchange the first two quantifiers in the definition we’ve been discussing. Then we get the following statement.

$$\exists N\in\mathbb{N}\ \ \forall\epsilon>0\ \ \forall n\geq N\ \ |a_n-a|<\epsilon$$

That is an unnecessarily complicated way of saying that from some point on all terms of the sequence are equal to $a$. (If you don’t immediately see that it is saying that, then try carrying out the process I’ve just outlined above. At some point the meaning will jump out at you.)
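The effect of swapping the quantifiers can be seen in a short Python sketch. With $N$ chosen before $\epsilon$, a single $N$ has to cope with every $\epsilon$ at once, and any term beyond $N$ that differs from the limit defeats it. The example sequences `reciprocal` and `eventually_zero` and the helper `refuting_eps` are assumptions introduced for this illustration.

```python
from fractions import Fraction


def reciprocal(n: int) -> Fraction:
    """a_n = 1/n, which converges to 0 but never equals 0."""
    return Fraction(1, n)


def eventually_zero(n: int) -> Fraction:
    """A sequence that is exactly 0 from the fifth term onwards."""
    return Fraction(1, n) if n < 5 else Fraction(0)


def refuting_eps(seq, limit, N: int):
    """Return an eps > 0 with |seq(N) - limit| >= eps, or None if seq(N) equals limit."""
    distance = abs(seq(N) - limit)
    return distance / 2 if distance > 0 else None


# Whatever N is proposed for 1/n, some eps already fails at the term n = N.
print(all(refuting_eps(reciprocal, 0, N) is not None for N in range(1, 100)))

# For the eventually-zero sequence, N = 5 survives: no refuting eps at that term.
print(refuting_eps(eventually_zero, 0, 5))
```

So $1/n$ satisfies the genuine definition of convergence to 0 but not the version with the quantifiers interchanged, which only sequences that are eventually equal to their limit can satisfy.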

What I’ve just recommended may sound like hard work; that is because it is. But it isn’t impossibly hard, and time invested at this stage will pay huge dividends later.

I could say plenty more about quantifiers, but I think I’ll hold my fire for now, and discuss them more when they come up in the courses.