
Notes
Supplementary notes for some of the listed theorems are provided below. Any suggestions for additions or corrections can be emailed to me and will be most welcome.
Theorem no. 1: The Four Colour Theorem

 The original announcement (September 1976) by Appel and Haken of their proof is available on free access here courtesy of Project Euclid. The full publication followed a year later: Part I and Part II (there are also microfiche supplements).
 A 1998 update on proving the 4CT (still computerassisted) is given by Robin Thomas here.
 The obvious progression from the sophisticated computerassisted proofs of 4CT to formalised, computergenerated proofs, is discussed here.
 Brendan McKay "A note on the history of the fourcolour conjecture", Journal of Graph Theory,
Vol. 72, No. 3, 2013, 361–363, has given the earliest publication date for the Four Colour Conjecture as 1854 (it was discussed in correspondence in 1852). A preprint is here.
 There is a reference by Isabel Maddison to a "slightly different form" of the mapcolouring question due to Möbius and his amateur mathematician friend Adoplh Weiske, publicised by Möbius in 1840 ("Note on the history of the mapcoloring problem", Bull. Amer. Math. Soc., Volume 3, Number 7, 1897, page 257; online here). On p. 146 of Alexander Soifer, The Mathematical Coloring Book: Mathematics of Coloring and the Colorful Life of its Creators, Springer, 2009, you can find the details: a country is to be divided into 5 regions each bordering every other. Essentially, prove that \(K_5\) is nonplanar (cannot be the graph of a map drawn in the plane), so perhaps more a precursor to the deeper Hadwiger's Conjecture than the fourcolour conjecture.
Theorem no. 2: The Fundamental Theorem of the Calculus

 There is a very nice overview of the history of calculus by Thony Christie here.
 Making the accumulation function \(\int_0^x f(t)dt\) of Part II of the theorem the starting point for explaining the whole Fundamental Theorem makes good pedagogical and aesthetic sense, as argued by McQuillan, D. and Olsen, D. M., "A Truly Beautiful Theorem: Demonstrating the Magnificence of the Fundamental Theorem of Calculus," Journal of Humanistic Mathematics, Volume 6 Issue 2, pages 148160. Online here.
Theorem no. 4: Euclid's Infinity of Primes

 Our description of Euclid's theorem follows conventional practice in casting the proof as 'by contradiction'. One may take issue with this: see Michael Hardy and Catherine Woodgold, "Prime Simplicity",
The Mathematical Intelligencer,
December 2009, Volume 31, Issue 4, pp 44–52. A good overview of the issue is given here.
 More poofs of this theorem can be found, encapsulated in an elegant contextual discussion, here.
 Among the proofs to be found via (2), Fürstenberg’s 1955 topological proof has been given a more gentle exposition by TaiDanae Bradley.
 Among the proofs not found via (2) (it would have appeared too late, I think) is this lovely oneline proof by contradiction by Sam Northshield, with products taken over all primes \(p\), supposedly finite in number and with \(P\) denoting the product of all these primes:
$$0<\prod_p \sin(\tau/2p)=\prod_p \sin(\tau/2p+\tau P/p)=0.$$ (As usual, \(\tau\) denotes circumference of unit circle. More details are given by John D Cook here.
Theorem no. 5: The Chinese Remainder Theorem

 John D. Cook gives a neat description of how our congruence system \(x\equiv y_i\pmod{n_i}\) may be solved as
$$y_1(N/n_1)^{\varphi(n_1)}+y_2(N/n_2)^{\varphi(n_2)}+\ldots + y_r(N/n_r)^{\varphi(n_r)},$$
where \(N=n_1n_2\cdots n_r\) and \(\varphi\) is Euler's totiant function. Thus our system with \(y_1=3, y_2=4,n_1=4, n_2=5\) is solved by \(3\times 5^2+4\times 4^4=1099\pmod{20}=19.\)
 A good online CRT solver is this by MathCelebrity.com which gives all the working and accepts negative number inputs.
Theorem no. 6: The Fundamental Theorem of Algebra

 Daniel Litt gives a nice 'minimal' proof of the theorem.
 Daniel J. Velleman offers "The Fundamental Theorem of Algebra: A Visual Approach", The Mathematical Intelligencer, December 2015, Volume 37, Issue 4, pp 12–21, preprint online here.
 Paul Taylor has posted this English translation of "Gauss's second proof of the fundamental theorem of algebra" (thanks to John D. Cook for drawing my attention to this).
Theorem no. 7: The Fundamental Theorem of Arithmetic

 Symbolically, this theorem asserts a unique (up to order) representation of a positive integer \(n\) as a product of powers of primes:
$$n=p_1^{a_1}p_2^{a_2}\cdots p_r^{a_r},$$
(with 1 being by convention the value of the empty product). This representation is implicit or explicit in many proofs in elementary number theory. It is also explicit in various calculations, e.g. in counting certain magic squares (Theorem 129) or in calculating periods of modular Fibonacci sequences (Theorem 235, notes(4)).
 A lovely implementation of FTA for positive integers up to 99 has been knitted by Sondra Eklund!
Theorem no. 8: The Central Limit Theorem

 The requirement that the sampled distribution has nonzero variance
 John D. Cook gives an elegant summary of the historical origins of the normal distribution here.
 There is a formal proof of the theorem, announced here by Jeremy Avigad, Johannes Hölzl, Luke Serafin.
 A nice animated illustration of CLT is given by Victor Powell.
Theorem no. 9: Fermat's Last Theorem

 Another elegant account of the resolution of FLT, which gives a little more on subsequent developments, can be found in this list of lectures by Karl Rubin.
 The role of modular forms in the proof of FLT is made explicit in this presentation by Ken Ribet.
Theorem no. 10: Bayes' Theorem

 A nice retrospective on the history of use and abuse of Bayes' Theorem is provided by Bradley Efron in "Bayes’ Theorem in the TwentyFirst Century", Science 340, June 7, 2013. A preprint is available here. I also like very much this Scientific American blog entry.
 An investigation by Stephen M. Stigler, "Who Discovered Bayes's Theorem?", The American Statistician, 37 (4), 1983, 290–296 finds credible evidence that Bayes' Theorem was first discovered by Nicholas Saunderson. On online version is here and the article is reproduced in Stigler's book Statistics on the Table: The History of Statistical Concepts and Methods, Harvard University Press, paperback edition 2002.
 A beautiful animated illustration of conditional probability by Victor Powell is here, while Will Kurt's Bayesian blog Count Bayesie has a nice alternative to our cow counting illustration here.
Theorem no. 14: Cook's Theorem

 An English translation of Leonid Levin's paper, together with a thorough analysis, may be found here. Cook's paper has been TEXed into pdf by Tim Rohlfs, available here.
 I provide a little more (amateur) analysis of this theorem as an example of a 'simultaneity' in mathematics here.
 Regarding P=NP? Gerhard Woeginger provides a valuable 'clearing house' for proof/disproof attempts.
Theorem no. 16: Sperner's Lemma

 The proof of Sperner's Lemma in n dimensions is a charming induction. It is welldescribed here by Jonathan Huang.
Theorem no. 17: The WellOrdering Theorem

 As the remark on the Banach–Tarski Paradox suggests, it is infinity, rather than wellordering or the Axiom of Choice, that can defy intuition. This 'antiantiBanarch–Tarski' argument is very well made by Asaf Karagila at Boole's Rings (follow the link back from the wonderful cartoon).
Theorem no. 18: Brouwer's Fixed Point Theorem

 A very nice discussion by Phil Wilson of Brouwer and constructivist mathematics is given here by plus magazine.
Theorem no. 19: Dilworth's Theorem

 The current illustraion of this theorem replaces one based on snooker balls which was less informative but to which I may as well retain a link.
Theorem no. 21: Brun's Theorem

 Brun's \(cx/(\ln x)^2\) bound on the twin prime count is very well motivated by James Maynard in his PROMYS Europe 2015 lecture "Patterns in the Primes" which can be found here (last time I checked, Firefox complained about the security of the pdf download, so proceed with caution).
Theorem no. 23: Cantor's Theorem

 The popular story has it that thinking about infinity, and attacks by others (notably Kronecker) on his thinking about infinity, drove Cantor insane. A much less sensational view is given in this blog entry by Richard Zach.
Theorem no. 24: Kuratowski's Theorem

 The multiple discoveries and proofs of this theorem are wonderfully charted in John W Kennedy,
Louis V Quintas,
Maciej M Sysłois, "The theorem on planar graphs", Historia Mathematica,
Vol. 12, No. 4, 1985, 356–368, available here via Elsevier's Open Access Archive. The abstract of Frink and Smith's unpublished proof appears here (Bull. AMS, 36, 3) under 'Abstracts of papers'.
 Bill Tutte, under the Blache Descartes pseudonymn wrote a little poem about the nonplanarity of \(K_{33}\) which may be read here.
Theorem no. 27: The Pythagorean Theorem
 It would seem that even 'Pythagorean' is a misnomer for this theorem. Piers BursillHall
gave me the following valuable pointers:
 A nice summary of the theorem's history is provided by Manjul Bharagava here.
 The theorem is equivalent to Euclid's notorius 5th axiom, the Parallel Postulate, in the sense that each may be derived directly from the other, a fact which seems to date back to Legendre. More details here.
 There is a beautiful account by Steven Strogatz of a particularly elegant proof of the Pythagorean Theorem attributable to and, Strogatz argues, bearing all the hallmarks of, Albert Einstein.
 Jan Stevens provides an interesting commentary on Dijkstra's account of Pythagoras, online here.
 A wonderful java applet proof of the theorem has been created by Jim Morey. It is online here but nowadays the chances are you will struggle to make your browser run it.
Theorem no. 28: Ramsey's Theorem

 An account of Erdős and Ramsey theory is given by Ronald L. Graham and Joel Spencer in the centennial reflections here (from p. 132).
 A very interesting account of early precursors to Ramsey's theorem is offered by the Computational Complexity blog. This is also the focus of Alexander Soifer, The Mathematical Coloring Book: Mathematics of Coloring and the Colorful Life of its Creators, Springer, 2008.
 The latest bounds (as of March 2017) for \(R(5)\) are due to Vigleik Angeltveit and Brendan D. McKay.
 Moore on the famous Erdős quote by Evelyn Lamb.
Theorem no. 29: Gauss's Law of Quadratic Reciprocity

 Özlem Imamoğlu's review, Bull. Amer. Math. Soc.,
Vol. 44, No. 4, 2007, 647–652, online, of Franz Lemmermayer's Reciprocity Laws: From Euler to Eisenstein, Springer, 2000, is a superb miniessay on reciprocity beyond quadratic.
Theorem no. 30: The Law of Large Numbers

 The theorem is described here in elementary terms, as would have been understood by Laplace himself. An excellent modern account in terms of measure theory is given by Terence Tao here.
 I cannot resist linking to "The Law of Small Numbers", Jonathan Kujawa's elegant centenary homage to Richard Guy in 3quarksdaily.
Theorem no. 31: Benford's Law

 More detail on use of Benford by the IRS in the United States is given here. The Wikipedia entry has a good entry on forensic aspects of Benford
 An interesting occurrence of Benford is in the frequencies of leading digits in base 10 representations of powers, e.g. \(2^k,k=0,1,\ldots\). See Theorem 299 (notes(5)) and also "A simple answer to Gelfand’s Question" by
Jaap Eising, David Radcliffe and Jaap Top, The American Mathematical Monthly, March 2015, 234–245, online here. Tangentially, we can ask whether prime numbers obey Benford.
 Ted Hill has written a very nice blog entry on Benford for Princeton University Press who publish his book with Arno Berger.
Theorem no. 32: The Green–Tao Theorem on Primes in Arithmetic Progression

 There is a nice overview by Ben Green here (p. 10, 11MB pdf file). Tao has collected some surveytype presentations at various levels here.
Theorem no. 33: The Prime Number Theorem

 The classic account by Don Zagier of Newman's short proof of the Prime Number Theorem may be found here.
 A general account is given here by Apostol.
 There is a brief discussion of heuristic explanations for the Prime Number Theorem (notably the one by Greg Martin) here.
 If \((\bar{x},\bar{y})\) is the centre of mass of the arc of \(y=\log(x)\) in the interval \([1,x]\) then \(\frac12\pi(x)\) is asymptotic to \(\bar{x}/\bar{y}\) (see M500 magazine, issue 260, pp. 10–12).
 Regarding the famous 'elementary proof' of PNT see Levinson's awardwinning essay. The unfortunate associated priority dispute is meticulously documented by Dorien Goldfeld here. It includes the observation that Tchebychef had given an elementary proof in 1852 that \(x/\log x\) is the correct order of magnitude for \(\pi(x)\). Both the proof and dispute are given a nontechnical overview by Joel Spencer and Ronald Graham, "The Elementary proof of the prime number theorem", Mathematical Intelligencer, vol. 31 (3), June 2009, 18–23.
 There are formal proofs of PNT:

the Erdős–Selberg elementary proof: Jeremy Avigad, Kevin Donnelly, David Gray, and Paul Raffand here (see here for a nice overview presentation by Avigad) and
 the classical complex analysis proof: John Harrison here.
 Chance News have a series of lectures on probabilistic number theory, with part 1 offering an excellent exploration of the Prime Number Theorem.
Theorem no. 36: Euler's Identity

 The symbol used in the illustration of this theorem is on loan from Michael Hartl, with thanks.
Theorem no. 37: Girard's Theorem

 Attribution of this theorem to Harriot can be found in chapter 2 of Roger Penrose, The Road to Reality: A Complete Guide to the Laws of the Universe,
Vintage, 2005; and in chapter 10 of David S. Richeson, Euler's Gem: The Polyhedron Formula and the Birth of Topology, Princeton University Press, 2008. I have seen it given to Legendre (e.g.) but Legendre's result, published in 1798, much later than Girard, approximates the difference between angles in a spherical triangle and angles in a plane triangle having the same side lengths. A good account is here. (Legendre did not, in any case, claim the result as his.)
Theorem no. 38: Lucas' Theorem

 Romeo Meštrović has compiled a fine survey of applications and extensions of Lucas's theorem.
Theorem no. 39: Pascal's Rule

 There is one version of Pascal's triangle which is indeed triangular: where the entries are reduced modulo 2. In this case the pattern which emerges is a version of Sierpinski's gasket!
 Amazingly a simple relationship between Pascal's triangle and \(e=2.71828\ldots\) appears to have been noticed for the first time, in the twentyfirst century, by Harlan J. Brothers.
Theorem no. 40: Stirling's Approximation

 There is a very nice introduction to Stirling's approximation by Finbarr Holland in the
Irish Mathematical Society Bulletin, 77, Summer 2016.
 A good source of information on the central binomial coefficients is the corresponding entry at oeis.org, where the sequence is no. 984.
Theorem no. 42: Zeckendorf's Theorem

 It would appear that Zeckendorf's theorem was first published in C. G. Lekkerkerker, "Voorstelling van natuurlijke getallen door een som van getallen van Fibonacci", Simon Stevin, 29 (19511952), 190–195. I have not seen this paper but Daykin in a 1960 paper (see note 3 below) says "Zeckendorf's proof is given by C. G. Lekkerkerka (sic) in [1]," ([1] being the Simon Stevin paper). Zeckendorf himself published his theorem in 1972 in "Representation des nombres naturels par une somme de nombres
de Fibonacci ou de nombres de Lucas", Bull. Soc. Royale Sci. Liege 41 (1972) 179–182. I haven't seen this paper either (the Bulletin de la Société Royale des Sciences de Liège is online but not all issues appear to be digitised).
 A good presentation of Zeckendorf's and Lekkerkerker's theorem is given here by Steven J. Miller. Another good source on Lekkerkerker's theorem is Jukka Pihko, "On Fibonacci and Lucas representations and a theorem of Lekkerkerker", Fibonacci Quarterly, vol. 23, no. 3 (1988), 256–261, online here.
 The exact value of the average number of Zeckendorf summands, over the interval \([F_{n+1},F_{n+2})\), as approached asymptotically by Lekkerkerker's theorem, is \(L_n=1+\varepsilon(n)/F_n\), where \(\varepsilon(n)=\sum_{k=0}^{\lfloor(n1)/2\rfloor}k{n1k\choose k}\) (see Steven J. Miller's presentation). This apparently allows us to write, via the identity \(\varphi^2=1+\varphi\), the error in Lekkerkerker's ratio as the constant \(3/5\), thus: \(\lim_{n\rightarrow\infty}\left(L_n2n/(5+\sqrt{5}\,)\right)=3/5\). The function \(\varepsilon(n)\) can itself be written entiredly in terms of Fibonacci numbers:
$$\varepsilon(n)=\left\{\begin{array}{ccl}
\left(\frac{n}{2}\frac25\right)F_n\frac{n}{10}\left(F_{n1}+F_{n+1}\right) & &n \mbox{ even}\\
\frac15F_{n1}+\frac{n1}{5}(F_{n2}+F_n)&&n \mbox{ odd},
\end{array}\right.$$ alternatively,
$$\varepsilon(n)=\left\{\begin{array}{ccl}
\left(\frac{n}{2}\frac25\right)F_n\frac{n}{10}F_{2n}/F_{n} & &n \mbox{ even}\\
\frac15F_{n1}+\frac{n1}{5}F_{2n2}/F_{n1}&&n \mbox{ odd},\end{array}\right.$$
the ratio \(F_{2T}/F_T\) also being the \(T\)th Lucas number (A000032).
 David Daykin's paper is "Representation of Natural Numbers as Sums of Generalised Fibonacci Numbers", J. London Math. Soc., (1960) s135 (2): 143160. The first page is freeaccess here. A followup paper by Daykin appeared in Fibonacci Quarterly in 1969 and can be viewed here. A brief account of Daykin's uniqueness result is given by "Zeckendorf's theorem and some applications", Fibonacci Quarterly, vol. 2, no. 3, 1964, 163–168; online here.
 Garry J. Tee has contributed some interesting remarks on Zeckendorfbased arithmetic, which he investigated in "Russian Peasant Multiplication and Egyptian Division in Zeckendorf Arithmetic", Australian Mathematical Society Gazette, vol. 30, no. 5, 2003, 267–276:

"My algorithms for arithmetic in Zeckendorf arithmetic are much more efficient than any published previously, but they still cost very much more than binary arithmetic. I commented that, if more efficient algorithms for Zeckendorf addition and subtraction could be devised, then they could be used to give much more efficient algorithms for Zeckendorf multiplication and division.
The 2013 paper by Conner Ahlbach, Jeremy Usatine, Christiane Frougny and Nicholas Pippenger on Efficient Algorithms for Zeckendorf Arithmetic, Fibonacci Quarterly, 51(3):249–255 does give much more efficient algorithms for Zeckendorf addition and subtraction  but their main emphasis is on the depth of circuitry required." 

The paper can be viewed in pdf form (≈1.6MB) here (with kind permission of the Australian Mathematical Society).
 The universality of the Fibonacci sequence, restricted to just 1,1,2,3,5, is exploited in a clever clock by Philippe Chrétien, described here by Alex Bellos.
Theorem no. 43: The DPRM Theorem
 Martin D. Davis, "Hilbert's tenth problem is unsolvable", The American Mathematical Monthly, vol. 80, (1973), pp. 233–269; online; is an excellent account of the proof of DPRM.
 A particularly charming and accessible example of a Diophantine set is the set of Fibonacci numbers: James P. Jones, "Diophantine Representation of the Fibonacci Numbers", Fibonacci Quarterly, Feb. 1975, 84–88; online here (3rd from bottom). Jones is also one of the team who produced a particularly compelling prime polynomial, via the fact that the set pf primes is Diophantine: Douglas Wiens, James P. Jones, Daihachiro Sato, Hideo Wada, "Diophantine representation of the set of prime numbers", The American Mathematical Monthly, vol. 83, 1976, pp. 449464; online.
Theorem no. 45: Binet's Formula
 The 'square spiral' illustrating this formula is the basis for a lovely curve construction discovered by Edmund Harriss and described here.
Theorem no. 46: Cameron's Theorem on DistanceTransitive Graphs
 Cameron's proof of this theorem depends on his proof, with Preager, Saxl and Seitz, of the Sims Conjecture which in turn relies on the Classification of the Finite Simple Groups. A CFSGfree proof was suggested by Cameron and completed by Richard Weiss in 1985. Weiss's paper can be viewed online here. The paper proving the Sims Conjecture is cited in (Theorem 65, notes(1))
 Questions regarding infinite distancetransitive graphs are generally open, except in the case of locally finite case which was solved by H. Dugald Mcpherson in 1982. This work and progress on the nonlocally finite case, as of 1998, is described by Cameron in "A census of infinite distancetransitive graphs", Discrete Mathematics, Volume 192, Issues 1–3, 1998, pp 11–26; online here.
Theorem no. 47: The Binomial Theorem

 The definition of binomial coefficients to allow for arbitrary complex powers of the binomial can be generalised still further to allow both parameters to be complex, as explained here by John D. Cook. (This does not impact on the Binomial Theorem whose statement only features the 'top' parameter.)
 The original weblink for this theorem was an absorbing paper by Lawrence Neff Stout (1948–2012): "Aesthetic Analysis of Proofs of the Binomial Theorem" which was slated for but does not appear in, The Humanistic Mathematics Network Journal. It is available at academia.edu and I have uploaded a temporary copy here for quick reference.
Theorem no. 48: Beineke's Theorem on Line Graphs

 Beineke announced this result in 1968 (he referred to line graphs as 'derived graphs'). A proof appeared in 1970 in Journal of Combinatorial Theory; the article can be viewed online here via Elsevier's Open Access Archive.
 The attribution to N. (presumably Neil) Robertson is found on p. 74 of Frank Harary, Graph
Theory, Westview Press, new edition, 1994.
 The number of forbidden subgraphs for characterising line graphs can be lowered under slightly stronger conditions. Thus Ľubomír Šoltés proved in 1994 that, for a graph on at least 9 vertices, forbidden subgraphs V and IX can be ignored, bringing the number down to 7. Then in 2001, with HongJian Lai, he proved that just forbidden subgraphs I, VI and VIII are enough, provided that the graph being tested has minimum degree 7 and is not isomophic to two complete graphs sharing an edge (e.g. subgraph VII). See "Line Graphs and Forbidden Induced Subgraphs", Journal of Combinatorial Theory, Series B, 82, 38–55 (2001). An online copy is available on Lai's website.
Theorem no. 49: Netto's Conjecture (Dixon's Theorem)

 An excellent entry on Cameron's blog offers a sketch of the proof of Dixon's Theorem as well as describing some variants of it.
Theorem no. 50: The EulerHierholzer "Bridges of Königsberg"
Theorem

 An account of this theorem is to be found on page 33ff of the 2nd edition of Edouard Lucas's Récréations Mathématiques, vol. 1, published in 1891, the year of Lucas's tragically early death. The sufficiency of all degrees even is dealt with in a note on page 223 and is, according to this Wikipedia entry, essentially Hierholzer's 1873 argument. Lucas lists in his bibliography the article of Fleury giving an alternative method of construction. This article is a response to Lucas's Récréations Mathématiques entry and proposes a first solution to the construction problem in the version of drawing a figure in one continuous line. I have not seen Lucas's 1st edition to see if his entry differs in the light of Fleury's article. You can see Lucas's 2nd edition online here and Fleury's article is online here (p.257ff).
 There is a nice reallife application to snow ploughing described here (which is actually the Chinese Postman problem but Euler tours are the background).
 And the traditional puzzle is solved, with very nice graphics, for New York here.
Theorem no. 52: The RobertsonSeymour Graph Minors Theorem

 There is a very nice semitechnical overview of 'RobertsonSeymour' theory by Lovász here. Graph minors theory is also very well explained in a series of blog posts by Jim Belk.
Theorem no. 56: Morley's Miracle

 I used to link to the geometry app that created this theorem's illustration. But guaranteeing that a Java app will work for everyone's favourite browser has become too painful. And the app wasn't all that exciting anyway, so it is no longer maintained. But you can look at it here — last time I checked it worked in Internet Explorer.
 The John Mullan quote about Faber and Faber is online here (paragraph 6). The link to Morley's mathematics is tenuous but there is more on Morley junior at Faber and Faber here.
Theorem no. 58: Galois' Theorem on Finite Fields

 It would be ahistorical to say something like "Galois showed constructively that finite fields have prime power order; and Eliakim Moore showed that this construction was the only one possible". Moore, a pioneer of abstract algebra proved the structure theorem which says "this is what finite fields look like". But this answers a question which would not have occurred to Galois! A detailed study of the work which led up to and beyond Moore is given by Frederic Brechenmacher in "A history of Galois fields", 2012, <hal00786662>. The definitive analysis of the Galois archive is of course Peter M. Neumann's The Mathematical Writings of Evariste Galois, European Mathematical Society, 2011
 There is a wonderful knitted GF(16) by the woolythoughts blog.
Theorem no. 59: Germain's Theorem

 Although Germain's Theorem does not play a direct part in the eventual proof of Fermat's Last Theorem (Theorem 9) it has deep connections with other parts of number theory which are still of interest. For instance, it is connected with period lengths of modular Fibonacci sequences via 'Wall's Question' (see Theorem 235, notes(3)). There is also a connection to cryptography via the idea of 'strong' and 'safe' primes. See this, for example.
 The number and distribution of Germain primes are of interest in their own right, see here, for example.
 There is an interesting side story on the Germain–Gauss correspondence in this prizewinning essay by William C. Waterhouse. More on Gauss and Germain and on her work in number theory can be found in Raymond Flood's Gresham College lecture on the subject. Evelyn J. Lamb has this evocative piece which also has some useful onward links.
 There is a nice characterisation of Germain primes by Paolo Leonetti.
Theorem no. 60: The Strong Perfect Graph Theorem

 The qualifier 'strong' distinguishes this theorem from the Perfect Graph Theorem, also conjectured by Claude Berge and proved by László Lovász in 1972.
 Maria Chudnovsky, Neil Robertson, Paul Seymour, Robin Thomas, "The strong perfect graph theorem", Annals of Mathematics, Vol. 164, Issue 1, 2006, 51–229. Online version.
 The 2001 Strong Perfect Graph Theorem for squarefree graphs (no induced 4cycles) of Michele Conforti, Gérard Cornuéjols and Kristina Vušković is proved in "SquareFree Perfect Graphs", Journal of Combinatorial Theory B, 90 (2004) 257–307. A preprint can be found on Cornuéjols website here (Research Papers, section 2).
Theorem no. 61: Moufang's Theorem

 For a proof of Moufang's theorem see Aleš Drápal, "A simplified proof of Moufang's theorem", Proc. AMS, Vol. 139, No. 1, 2011, 93–98. The paper is paytoview online here (but when I googled the name of the paper I found a direct link to the pdf file).
 The smallest nonassociative Moufang loop has order 9 — see Theorem no. 114; the octonions form a loop of order 16 (the negated elements are omitted from our multiplication table for conciseness) and there are four other nonassociative Moufang loops of this order, see the paper by Orin Chein here. The sequence of numbers of nonassociative Moufang loops is A090750.
 The question of whether nonMoufang loops may still obey Moufang's Theorem is addressed here by Izabella Stuhl.
Theorem no. 63: Thales' Theorem

 The story of Thales sacrificing an ox is apocryphal. It has also been attached to Pythagoras. See here for more details.
Theorem no. 64: Neumann's Separation Lemma

 Peter Neumann published his lemma in "The structure of finitary permutation groups", Archiv der Mathematik, Vol. 27, Issue 1, 1976, 3–17, appearing in December. The extension to finite permutation groups beat it into print, appearing in October: B.J. Birch, R.G. Burns, Sheila Oates Macdonald and Peter M. Neumann, "On the orbitsizes of permutation groups containing elements separating finite subsets", Bull. Austral. Math. Soc., Vol. 14, Issue 1, 1976, 7–10; online here. I think Peter Neumann must have told me that both results were discovered in 1974.
Theorem no. 65: Sims' Conjecture

 This theorem was published as "On the Sims Conjecture and distance
transitive graphs", Bull. London Math. Soc, 15 (1983), 499–506. A pdf copy can be found here.
Theorem no. 66: An Erdős–Ko–Rado Theorem on Intersecting Permutations

 The woolythoughts blog offers an alternative view of \(S_4\). They have also done \(S_5\).
Theorem no. 67: Reidemeister's Theorem

 The independent discovery of this theorem by J.W. Alexander and G.B. Briggs ("On types of knotted curves", Annals of Mathematics, Vol. 28, No. 1/4, 1926–1927, 562–586) may be read here.
Theorem no. 68: The Stable Marriage Theorem

 Closely related is the economist Thomas Schelling's work on segregation and diversity. A compelling illustration of these ideas is provided in animated form by The Parable of the Polygons by Vi Hart and Nicky Case.
 An excellent account of Lloyd Shapley and his work is given here by Joseph Malkevitch.
Theorem no. 69: Arrow's Impossibility Theorem

 Arrow's theorem as a result in social science belongs to the general area of social choice theory. A good account from this perspective is given here (Snapshot no. 9/2015) by Victoria Power (also available in German).
 And something less serious!
Theorem no. 72: MacWilliams' Identity

 There is a onevariable version of this theorem corresponding to the nonhomoeneous polynomial obtained by setting \(y=1\) in the definition of \(W_C(x,y)\). The identity becomes
\(W_{C^{\perp}}(x)=C^{1}W_C(1x,1+(q1)x)\) which, using the singleargument version of \(W_C\) becomes \(W_{C^{\perp}}(x)=\dfrac{1}{C}(1+(q1)x)^nW_C\left(\dfrac{1x}{1+(q1)x}\right)\).
 There is a very nice application of MacWilliams, due to Assmus and Maher, to prove nonexistence of projective planes of order 6 modulo 8. An excellent presentation of this work is given by Matroid Union.
Theorem no. 74: Gödel's First Incompleteness Theorem

 The theorem can be stated as "a consistent mathematical system contains assertions for which neither the assertion nor its negation can be proved from the axioms of the system". The version which says "contains truths which cannot be proved" is equivalent by virtue of the fact that, given consistency, nonprovability of a negated assertion is synonymous with truth of the assertion. See Peter Cameron's article in
Gowers et al (eds.), The Princeton Companion to Mathematics, Princeton University Press, 2008, a preprint of which is here.
 Mark ChuCarroll's blog Good Math, Bad Math has posted a very nice 4part introduction to Gödel I: the final part, which indexes the other three, is here.
 A fascinating selfcontained exposition of Gödel's two incompleteness theorems is given by Stanisław Świerczkowski here (Dissertationes Math., 422, 2003, click on the 'POBIERZ ZA DARMO' button) in terms of the theory of 'hereditarily finite sets'. These were the proofs which were chosen as suitable for machine checking in 2015: Lawrence C. Paulson (2015). "A mechanised proof of Gödel’s incompleteness theorems using Nominal Isabelle", Journal of Automated Reasoning. 55, no. 1: 1–37
 A short sketch proof of Gödel's theorem by Raymond Smullyan (c.f. 5000 B.C. and Other Philosophical Fantasies, St. Martin's Press, 1983) is reproduced here.
Theorem no. 75: Gödel's Second Incompleteness Theorem

 There is a delightful 1page description of Gödel 2 in 'words of one syllable' by George Bollos here.
 See also note (3) to Gödel 1.
Theorem no. 76: Brahmagupta's Formula

 Coolidge attributes an equivalent to his quadrilateral area formula to Carl Anton Bretschneider and Friedrich (I think) Strehlke, both 1842. Wikipedia has an entry on Bretschneider's Formula which also credits (also 1842) von Staudt:
$$K=\sqrt{(sa)(sb)(sc)(sd)abcd\cos^2\theta},$$ where \(\theta\) is half the sum of either pair of opposite angles
(in a cyclic quadrilateral opposite angles sum to \(180^{\circ}\) with the half angle giving zero cosine).
 Using the notation of the theorem description, Ptolemy's Inequality says \(ad+bc\geq ef\) with equality if and only \(abcd\) is cyclic. So the subtracted term in Coolidge's square root is always nonnegative (this is immediate from Bretschneider's Formula).
 A clever trapezoid version of Heron's formula due to Miguel Ochoa Sanchez can be found at cuttheknot.org.
 Fascinating material on Indian mathematicians' investigations of cyclic quadrilaterals may be found in Radha Charan Gupta, "Parameśvara's rule for the circumradius of a cyclic quadrilateral", Historia Mathematica, Vol. 4, Issue 1, February 1977, 67–74, online here. Parameśvara's (also known as L'Huilier's) rule states that: circumradius of cyclic quadrilateral with sides \(a,b,c,d\) is obtained on dividing \(\sqrt{(ab+cd)(ac+bd)(ad+bc)}\) by \(4\times\mbox{area of quadrilateral}\).
Theorem no. 78: The ThreeDistance Theorem

 The theorem is often referred to as 'the Steinhaus Conjecture'. However, it doesn't seem to have survived long enough to warrant the title. Stanisław Świerczkowski sent me the following interesting background information on the history and his proof of this theorem:

"The Three Distances Theorem was conjectured by Hugo Steinhaus. When I gave him the proof, he checked it and asked me to write a report for the Academy of Sciences. He presented this report to the Academy (only members could do that) whereupon it was published in the BULLETIN DE L’ACADEMIE POLONAISE DES SCIENCES, Cl. III – Vol. IV, No.9, 1956. The date of his presentation to the Academy is: 29 June 1956. I have here an offprint. If you wish to look at it, and it is not in your library, I could scan it and send you an image by email.
About that time Vera Sós and her husband Prof. Turán visited our University, so they certainly heard about the Three Distances result directly from my colleagues or from me. Of course, Erdős was visiting us too many times. I remember Vera quite well.
In the announcement in the BULLETIN DE L’ACADEMIE POLONAISE mentioned above, there are four related theorems, and I postponed publishing the proofs of these to a later paper. By the time I wrote it, Vera Sos published her proof of the Three Distances Theorem, so I found it simpler to just refer to her paper. My proof (which I do not recall) almost certainly found its way into my Ph.D. thesis on the subject of cyclically ordered groups. This dissertation was submitted to the Polish Academy of Sciences. A recent inquiry disclosed that they cannot find it." 

 There are generalisations of the theorem, see for instance J.F. Geelen and R.J. Simpson, "A two dimensional Steinhaus theorem", Australasian J. Combinatorics, vol. 8, 1993, 169–197, available online here.
 The theorem is often visualised as points plotted around a circle of unit circumference, see this blog entry from Σidiot's blog, for example.
Theorem no. 79: The Fifteen Theorem

 Manjul Bhargava won a Fields Medal in 2014 in part for his work on quadratic forms. He is interviewed by plus magazine on the subject here.
Theorem no. 80: OneFactorisation of Regular Graphs

 This conjecture should probably be taken as originating with Amanda Chetwynd and Anthony Hilton's 1985 paper. In Stiebitz et al, Graph Edge Coloring: Vizing's Theorem and Goldberg's Conjecture, WileyBlackwell, 2012, on p. 259, we find the comment that: "it [the onefactorisation conjecture] 'was going around' in the early 1950s, Hilton [136] was told by G.A. Dirac." And then, "fifteen years before the onefactorization conjecture was published by Chetwynd and Hilton, NashWilliams [233] proposed a much stronger conjecture, which is also known as the hamiltonian factorization conjecture: Let \(G\) be a \(\Delta\)regular graph on \(2n\) vertices, where \(\Delta\geq n\). Then \(G\) has a hamiltonian factorization, i.e., \(G\) is the edgedisjoint union of \(\Delta/2\) Hamilton cycles if \(\Delta\) is even or \((\Delta1)/2\) Hamilton cycles and a linear factor if \(\Delta\) is odd." (The factorisation of \(K_2\times K_5\) is our theorem illustration is an example). However, Anthony Hilton has told me that he has never seen any mention of his conjecture published prior to 1985 and considers it unlikely that it would have been thought about in the 1950s. Certainly Vizing's Theorem was not known until the 1960s. Ref [233] is C. St. J. A. NashWilliams, "Hamiltonian lines in graphs whose vertices have suciently large valencies", in Combinatorial theory and its applications, III, NorthHolland 1970, 813–819. The recent work of Csaba et al (see main theorem description) also resolves NashWilliams' conjecture for large \(n\).
Theorem no. 82: Gruenberg's Theorem on Nilpotent Groups

 The Heisenberg group \(H(n)\) is a group of upper triangular matrices under multiplication; thus for \(n=3\): $$\left(\begin{array}{ccc} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1\end{array}\right)\times \left(\begin{array}{ccc} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1\end{array}\right)
=\left(\begin{array}{ccc} 1 & a+x & c+z+ay \\ 0 & 1 & b+y \\ 0 & 0 & 1\end{array}\right). $$ The multiplication looks a bit obscure when expressed in terms of triples, as in our theorem description. However it is a natural convenience in the analysis of this group, c.f. this interesting paper.
Theorem no. 85: Cayley's Theorem

 I wrote down Peter Neumann's assessment of Cayley's theorem at a talk he gave at Queen Mary University of London during a celebration to mark the 60th birthday of R.A. Bailey:
Theorem no. 86: Noether's Symmetry Theorem

 Another excellent account of Noether's theorem can be found here at the Physics Mill blog.
Theorem no. 87: Lamé's Theorem

 This theorem comes in many different guises, all essentially saying that Euclid's algorithm takes longest (relative to input size) when applied to consecutive Fibonacci numbers. An excellent survey is provided by cuttheknot.
Theorem no. 88: The Cauchy–Kovalevskaya Theorem

 Garry J. Tee kindly sent me a scan of his wellknown article about Kovalevskaya which appeared in the Mathematical Chronicle in 1977. The Mathematical Chronicle Committee in turn kindly gave me permission to host a copy here. I have made pdf (about 3.7MB) and Powerpoint (about 9MB) versions. (The Mathematical Chronicle Committee is arranging for digitisation of Mathematical Chronicle).
Theorem no. 90: Cardano's Cubic Formula

 A more explicit version of this formula (comparable in format to the quadratic formula) can be found here. This version is made fun of in this blog post.
 Thony Christie offers a very detailed discussion of the skirmishes around ownership of solutions to the cubic here. A posting here by Plus magazine is also illuminating.
 Kellie Gutman has made an English translation of Tartaglia's verseform solution of the cubic, brought to us by poetrywithmathematics.blogspot.fr.
Theorem no. 92: The Quadratic Formula

 A valuable list of derivations of the quadratic formula is provided by CutTheKnot.
 Rob J. Low makes a good case for standardising on the quadratic formula for \(ax^2+2bx+c\), viz \(x=(b\pm\sqrt{b^2ac}\,)/a\).
Theorem no. 94: Cayley's Formula

 The second proof of Cayley in Stanley's book, by André Joyal, has a nice application to automata theory by Peter Cameron, described here.
 Also in Cameron's blog: another lovely enumeration yielding \(n^{n2}\).
Theorem no. 95: The Convolution Theorem

 For Gauss and FFT see Michael T. Heideman, Don H. Johnson and c. Sidney Burrus, "Gauss and the History of the Fast Fourier Transform", IEEE ASSP Magazine, October 1984, 14–21. Online here.
 A lovely gentle introduction to Fourier transforms by betterexplained.com.
Theorem no. 96: The Rule of Sarrus

 There is apparently a 4×4 version of the Rule of Sarrus called the 'Rule of Villalobus' due to the Mexican mathematician Gustavo Villalobos Hernández. See comment (4) here which links to a Spanish wikipedia entry which unfortunately appears to have been deleted.
Theorem no. 99: The Happy Ending Problem

 The attribution to Turán of the solution to n = 5 for this problem I found in this article by Imre Bárány. More details are given by Szekeres and Peters in their article proving n = 6.
 There is a very nice description of the Happy Ending problem and Suk's asymptotic solution here, by Quanta magazine.
Theorem no. 101: Kepler's Conjecture

 This was a 'theorem under construction' pending Hales' Flyspeck project to reinforce his original proof with a machineautomated one. This completed in 2014 but the proof was already made bulletproof with his 2012 publication of Dense Sphere Packings: A Blueprint for Formal Proofs. The formal proof is also described by Hales and twentyone other authors in "A formal proof of the Kepler conjecture", Forum of Mathematics, Pi (2017), Vol. 5, e2; online.
 The story of Hales' original proof and publication of Kepler is well told here. Note that his main referee at Annals, namely Gábor Fejes Tóth, is the son of László Fejes Tóth who made the original breakthrough in the conjecture's resolution. By the way, a statement by the Editorial Board at Annals explicitly invites "computerassisted proofs of exceptionally important mathematical theorems", an invitation which I understand to predate Kepler.
 The Flyspeck project is now at GitHub. (The old google code website is still accessible here, as of June 2017, but carries the warning "Google is shutting down google code and we are being evicted.")
Theorem no. 103: The Parking Function Formula

 The bijection to labelled trees establishes the formula but the `book' proof is due to Henry O. Pollack. An account is given in this wonderful presentation by Richard Stanley
 Apparently parking functions and their enumeration originates in Ronald Pyke, "The supremum and infimum of the Poisson process", Ann. Math.
Statist. 30 (1959) 568–576; online preprint.
 The note by Peter Cameron which is the recommended weblink for this theorem was written for the Queen Mary Maths dept student newsletter. Its topic is very nicely explored in this presentation by Thomas Prellberg.
Theorem no. 104: The Goins–Maddox–Rusin Theorem for Heron Triangles

 The triangletocurve conversion rule given in the text (from the Goins–Maddox paper) guarantees to produce an elliptic curve and a rational point on that curve. The actual curve and point depend on the order in which sides \(a,b\) and \(c\) are chosen. For example, the triangle with sides \(3, \frac{10}{3},\frac{17}{3}\), with area \(n=4\), gives \(\rho=1/4\) or \(2\) or \(2/9\), depending on which sides are called \(a,b\) and \(c\). And for the curve depicted, \(\rho=1/4\), neither ordering of the sides recovers the point \(P=(8,24)\) on this curve (which gives the same triangle but with some negative sides)  instead points \((2,6)\) and \((18,102)\) are discovered.
 The version of Heron's formula I have used is one of several given in the wikipedia entry. It is useful for showing the invariance of the rule under sign changes.
 There is a nice account here of searching for congruent numbers (integer areas of rational right triangles).
The associated sequence at oeis is A003273.
Theorem no. 107: The Tverberg Partition Theorem

 The original recommended web link for this page was Stephen Hell's magnificent dissertation
"Tverbergtype Theorems and the Fractional Helly Property". This became unavailable as a direct download but is now located at depositonce.tuberlin.de/handle/11303/1761 and is highly recommended.
 Tverberg's Theorem is a special case of the socalled Topological Tverberg Conjecture. The conjecture is true when \(r\), the number of partition parts, is a prime power but was proved false in general in 2015. It is all verywell described by Gil Kalai: follow the links back from here (you will also find a proof of Tverberg's Theorem itself).
 Also by Gil Kalai, a good overview, marking the birthday of Imre Bárány, of various 'Tverberg' results and conjectures
Theorem no. 109: The BeardwoodHaltonHammersley Theorem

 An interesting counterexample of Alessandro Arlotto and J. Michael Steele puts a limit on how far the hypothesis on the variables \(X_1, X_2, \ldots\) forming the TSP tour may be relaxed
Theorem no. 110: The Robbins Problem

 John Baez has a nice explanation here of a singleaxiom definition of a lattice. The comments contain other interesting links to related topics, including singleaxioms for groups.
Theorem no. 111: De Morgan's Laws

 I found the following on Peter Cameron's mathematical quotations page (under logic):

" ... the contradictory opposite of a copulative proposition is a disjunctive proposition composed of the contradictory opposites of its parts.
... the contradictory opposite of a disjunctive proposition is a copulative proposition composed of the contradictories of the parts of the disjunctive proposition."
William of Ockham (Occam), Summa totius logicae, 14th century (transl. Philotheus Boehner 1955)
Note: This is Ockham's formulation of De Morgan's Laws, more than five hundred years before De Morgan. It is just as clear in his Latin text. 

Theorem no. 112: Bregman's Theorem

 A. Schrijver attributes to Brouwer (presumably L.E.J. Brouwer who was still alive when Minc originally conjectured what became Bregman's Theorem) the observation that an upper bound for permanents of arbitrary nonnegative matrices may be derived directly from the bound for (0,1)matrices. Thus, if \(v = (b_1, b_2, \ldots, b_n)\) is a vector in descending order, and letting \(b_{n+1}=0\), define
$$g(v)=\sum_{k=1}^n(b_kb_{k+1})(k!)^{1/k}.$$
Then the Minc bound is replaced by the product \(\prod_ig(v_i)\) over the rows \(v_i\) of the matrix. (A. Schijver, "A short proof of Minc's conjecture", J. Combin. Theory, A, vol. 25, no. 1, 1978, 80–31.)
 Although computing \(\mbox{per}(M)\), the permanent of an arbitrary matrix \(M\), is hard (specifically, #Phard) even if \(M\) is \(0\mbox{}1\), in the case where \(M\) has nonnegative entries Jerum, Sinclair and Vigoda have given a polynomial algorithm which gives a value arbitrarily close to \(\mbox{per}(M)\) with high probability (FPRAS).
Theorem no. 113: The van der Waerden Conjecture

 The probability calculations in this theorem description may be justified explicitly by observing that column sums add to unity, so that for, say, System 1,
$$\left(\frac12+\frac14+\frac14\right)\times\left(\frac13+\frac58+\frac{1}{24}\right)\times\left(\frac16+\frac18+\frac{17}{24}\right)=1,$$
which, expanding the brackets, equals the sum of the probabilities of every possible product of output failures over the three computers.
 Closely related are questions about permanents of matrices having all row and column sums equal, counting perfect matchings in regular bipartite graphs of equalsize parts. Lower bounds have recieved much attention, notably the Schrijver–Valient Conjecture. Some recent information is here.
Theorem no. 115: The Hardy–Ramanujan Asymptotic Partition Formula

 Hardy and Ramanujan published their asymptotic formula in 1918 in "Asymptotic Formulae in Combinatory Analysis", Proc. London Math. Soc., (2) 17, pp 75–115. But they had published preliminary versions as early as 1916 and the abstract to the 1918 paper appeared in the records of LMS Proceedings for March 1st, 1917 (Hardy himself, as vicepresident, taking the chair). The papers can be viewed online here.
Theorem no. 116: The Polynomial Coprimality Theorem

 The row and column labels in the illustration of this theorem have been supressed for \(n=3\) to save space. They are, in order, \(x^3, x^3+1, x^3+x, x^3+x+1, x^3+x^2, x^3+x^2+1, x^3+x^2+x, x^3+x^2+x+1\), the 4th and 6th being irreducible (irreducible polynomials up to degree 5 over GF(2) are listed here).
 The theorem first appears in S. Corteel, C. Savage, H. Wilf, D. Zeilberger, "A Pentagonal number sieve", Journal of Combinatorial Theory, Series A, vol. 82, no. 2, 1998, 186–192. There is a reprint online here (scroll down to 1998).
 Thomas Hagedorn and Jeffrey Hatley have generalised this result to consider polynomials over the ring \(\mathbb{Z}_{p^k},\ p\) prime: "The probability of relatively coprime polynomials in \(\mathbb{Z}_{p^k}[x]\)", Involve, vol. 3, no. 2, 2010, pages 223–232; online. Their paper reviews several other generalisations.
Theorem no. 120: The Lovász Local Lemma

 An earlier version of this theorem description may be found here. It uses an artificial number theory example to show nonindependent sets which are nevertheless pairwise independent. But the conclusion of the Local Lemma is still true even though its hypotheses are false. The example replacing it gives us a false conclusion and is a bit less artificial (in fact it is based on a scenario from cryptography, described to me by Michelle Kendall, which is not artificial at all).
 The current example concerns an event A_{ij} that two multisets of size 2, each chosen with repetition from the set {1,...,n}, have nonempty intersection. The number of ways that such a pair of intersecting multisets can be chosen is
$${n+1\choose 2}^2\left({n\choose 2}{n1\choose 2}+n{n\choose 2}\right)=\frac12n\left(2n^2n+1\right).$$
This is A081436 in oeis.org. The probability of \(A_{ij}\) is thus
\(\frac12n(2n^2n+1)/{n+1\choose 2}^2\).
The probability of the triple event \(A_{ij}\cap A_{ik}\cap A_{jk}\) may be calculated as \(\frac12n(2n^3+2n^27n+5)/{n+1\choose 2}^3\). When \(n = 3\), this has value \(7/18\), much greater than \(8/27\), the cube of the probability of the single event.
 The original Local Lemma appears in this paper.
 The paper of Carsten Thomassen is here: the result on hypergraph colouring appears as Theorem 5.1.
 Anwer Khurshid and Haredo Sahai, "On mutual and pairwise independence: some counterexamples", Pi Mu Epsilon Journal, Vol. 9, No. 9, 1993, 563–570, looks promising as a source of further false applications of the Local Lemma. Online here (10.3MB pdf).
Theorem no. 121: Lambert's Formula

 How this formula arose out of Lambert's attempts to prove Euclid's 5th postulate is beautifully described here by Evelyn J. Lamb...
 ...who also describes and links to a beguiling interactive tool for tiling the Poincaré disk, part of a fine nonEuclidean geometry resource by Malin Christersson.
Theorem no. 123: The Wedderburn–Artin Theorem

 An interesting article on the significance of WedderburnArtin is provided by Quora (with companion entries on significance of several other theorems)
Theorem no. 124: The Lagrange Interpolation Formula

 The link between Lagrange interpolation and the Chinese Remainder Theorem is discussed here.
Theorem no. 126: The Asymptotic (Half) Liar Formula

 The exponential nature of the Dumitriu–Spencer asymptotic can be appreciated by noting that, while \(U_1(7)\geq 15\) (the example illustrating the theorem), the value of \(U_1(25)\) exceeds \(10^6\), i.e. asking 25 questions is guaranteed to find a value between 1 and a million, in the face of at most one lie. A nice account of this result by
Deryk Osthus and Rachel Watkinson is here.
 The exact value of \(U_1(7)\) is \(16\); \(U_1(25)=1290554\). With such rapidly growing values it is more convenient to ask the liar question the other way round: given a value for \(n\) what is the least number, \(q_k(n)\), of questions which will guarantee a number in the range \(1,\ldots,n\) is determined, in the face of \(k\) lies? This is the version discussed by Osthus and Watkinson. Thus \(U_1(7)=16\) because \(q_1(16)=7\) but \(q_1(17)=8\).
 A comprehensive and very readable survey, "Searching games with errors—fifty years of coping with liars" by Andrzej Pelc, Theoretical Computer Science, 270, 71–109 , is available here via Elsevier's Open Access Archive.
 A tangential but intriguing link regarding the Fano plane, which generates Peter Cameron's trick illustrating this theorem, is this on using the Fano plane to generate poetry.
Theorem no. 127: The Lucas–Lehmer Test

 I like this short blog entry, written by a Mersenne prime hunter, on using the Lucas–Lehmer test.
 There is a good entry here by 3010tangents about how to prove Lucas–Lehmer.
Theorem no. 128: The Euclid–Euler Theorem

 The MacTutor archive offers a good page on the history of perfect numbers.
 I found this wonderfully openended exploration of perfect numbers by Oliver Knill (one of many he has posted).
 There is a sister site to mersenne.org looking for odd perfect numbers. It may or may not be active at the time of reading...
 Article no. 1 at John Voight's site is a nice exploration of odd perfect number properties.
 The current lower bound on odd perfect numbers appears to be Pascal Ochem and Michaël Rao,"Odd perfect numbers are greater than \(10^{1500}\)", Mathematics of Computation, Vol. 81, Issue 279, 2012, 1869–1877.; online.
Theorem no. 129: The Ollerenshaw–Brée Formula

 Evelyn Lamb wrote this very nice tribute to Ollerenshaw on the occasion of her 100th birthday. The entry for her at Agnes Scott is also good on technical details. Another tribute to Ollerenshaw's work on magic squares was carried by the Royal Society Publishing Blog
 A depiction of a mostperfect magic square dating from the 10th century is found in the Jain temple Parshvanatha at Khajuraho in Madhya Pradesh.
 A good source of information on magic squares generally is the website of Francis Gaspalou.
Theorem no. 130: A Theorem on Apollonian Circle Packings

 The papers that inspired this theorem description can be located on Ron Graham's website dated 2003–2006. The theorem as given is from R. L. Graham, J.C. Lagarias, C.L. Mallows, A.R. Wilks, and C.H. Yan, "Apollonian circle packings: number theory", J. Number Theory, 100 (2003), no. 1, 1–45.
Theorem no. 132: Theaetetus' Theorem on the Platonic Solids

 A nice account of the proof of this theorem using Euler's Polyhedral Formula is given by John D. Cook here.
 I recommend this intriguing discussion by Pat Ballew of which of the Platonic solids is 'most spherical'.
Theorem no. 133: The Total Probability Theorem

 In The Doctrine of Chances: Probabilistic Aspects of Gambling, SpringerVerlag, 2010, Stewart N. Ethier writes (p. 68) "The conditioning law (... often called the law of total probability) was used without comment by Montmort and De Moivre. It was formalized in the derivation of Bayes's law."
 An excellent and much more thorough application of probability theory to the deuce rule of tennis is provided here by Chalkdust magazine.
Theorem no. 137: Wallis's Product

 The original web link for this theorem was rsnr.royalsocietypublishing.org/content/54/3.toc where Jacqueline Stedall has two wonderful articles on Wallis's work. The first one is the source of the quotation "perhaps the one real stroke of genius in Wallis’s long mathematical career." These articles were freetoview when I first posted the description, then they became restricted; when I last checked (Jan 2017) they were accessible again. Highly recommended!
Theorem no. 138: Vaughan Pratt's Theorem

 Since polynomialtime solvable problems are automatically in NP this theorem is, since 2002, a corollary of the wellknown algorithm of Agrawal, Kayal and Saxena (see the weblink for this theorem).
Theorem no. 139: Strassen's Matrix Theorem

 Virginia Williams' account of her reduction in ω can be found in overview and technical versions at her website. There is a very good account of her work at Gödel's Lost Letter by Richard Lipton; the latest account by Williams I can find is this, which is rivalled in this article by François Le Galll; there is a short discussion about the problem at cstheory.stackexchange.
 An interesting discussion by Bill Gasarch on limitations of current approaches to getting ω = 2 + ε can be found at Lance Fortnow' & Bill Gasarch's comutational complexity blog.
 The complexity of matrix multiplication is inimately connected with some conjectures in extremal set theory as discussed by Gil Kalai.
Theorem no. 140: A Theorem on Rectangular Tensegrities

 This post by Dave Richardson gives a nice introduction to the subject of graph theory and rigidity.
Theorem no. 144: Lieb's Square Ice Theorem

 It has been discovered that water can form into square ice at room temperature by confining it using layers of graphene.
Theorem no. 145: The Contraction Mapping Theorem

 mathcounterexamples.net has a valuable collection of counterexamples showing that all the hypotheses of this theorem are needed.
Theorem no. 146: The Panarboreal Formula

 The asymptotic formula of Chung and Graham is first mentioned in their 1979 paper "On universal graphs",
Annals of the New York Academy of Sciences, 319 (1979), 136–140; online.
 Some more work on panarboreal graphs and related issues is given in section 6.8 of "Spanning trees – A survey" by Kenta Ozeki and Tomoki Yamashita, 2010.
 The sequence of sizes of panarboreal graphs starts 0, 1, 2, 4, 6, 8, 11, 13, and is OEIS sequence A004401.
Theorem no. 147: The Sophomore's Dream

 T he decimal expansion of \(\int_0^1 x^x dx\) and related material can be found at oeis.org/A083648; \(\int_0^1 x^{x} dx\) is sequence A073009.
Theorem no. 151: The Small Prime Gaps Theorem

 For the sake of contrasting this theorem with subsequent proofs that \(p_{n+1}p_n\leq c\) infinitely often for a constant \(c\), its conclusion may be replaced by \(p_{n+1}p_n\leq (\log p_n)^{1/2+\epsilon}\). The progress towards current knowledge is beautifully described by Terence Tao in this youtube clip.
 The background to Zhang's proof of \(p_{n+1}p_n\leq c\) and subsequent improvements by Maynard and Tao are described by John Friedlander in "Prime Numbers: A Much Needed Gap Is Finally Found", Notices of the AMS, Vol. 62, No. 6, June/July 2015, 660–664, online here. A more technical overview is given by Andrew Granville, "Primes in intervals of bounded length", Bull. Amer. Math. Soc., 52 (2015), 171–222, online here.
Theorem no. 152: De Moivre's Theorem

 Here is a cute demonstration, using De Moivre, that the series \(\cos(1),\cos(2),\cos(3),\ldots\) does not converge.
Theorem no. 160: The Classification of Archimedean 4Polytopes

 Tony Phillips' review (a 2.4MB download from here) of Tony Robbin's Shadows of Reality, Yale University Press, 2006, contains much fascinating material on 4dimensional solids.
 The Historia Mathematica article by Irene PoloBlanco is chapter 5 of her excellent 2007 University of Groningen thesis Theory and History of Geometric Models which may be found online here.
 If only Alicia Boole Stott could have tried the virtual reality game Hypernom!
Theorem no. 161: Quadratic Nonresidue is ZeroKnowledge Provable

 There is at least one reallife illustration of zeroknowledge proofs in the field of nuclear disarmament. A follow up.
 Jeremy Kun has a fine series of blog posts on zeroknowledge proofs. Follow from here.
Theorem no. 163: The Friendship Theorem

 A useful little entry at The Futility Closet links to an elementary (purely graphtheoretic) proof of this theorem by Judith Longyear and Torrence Parsons.
Theorem no. 164: The Diaconis–Holmes–Montgomery CoinTossing Theorem

 A full account of this theorem, including a wonderful account of experimental confirmation, is given here (5MB pdf file). My illustration of the theorem is partly based on figures 3 and 4 of this paper.
Theorem no. 166: Haken's Unknot Theorem

 Further developments on the complexity of determining unknottedness are described by Gil Kalai here.
 Knot theory deals with embeddings of the 1sphere \(S^1\)^{} (a closed curve) into 1 + 2 = 3 dimensions. More generally we can embed the nsphere \(S^n\)into n + 2 dimensions. Thus \(S^2\), a hollow sphere, is embedded into 4 dimensional space. And we can ask about the decision problem: is our embedding the 'unknot'? For \(n\geq 3\) the answer is that the question is undecidable: this was proved in 1996 by Alexander Nabutovsky and Shmuel Weinberger. However, decidability in the case \(n=2\) remains an open problem. A wonderful discussion of these issues is given here by Bjorn Poonen.
Theorem no. 167: The Lindemann–Weierstrass Theorem

 The proof of the transcendence of \(e\), Charles Hermite's breakthrough 1873 result, is very clearly described here.
Theorem no. 170: Machin's Formula

 Machin's Formula has an alternative statement viz \(\tau/8=4\cot^{1}5\cot^{1}239\), thanks to the relationship between \(\cot^{1} x\) and \(\tan^{1}(1/x)\). This version is somewhat neater (although the inverse cotangent function is not directly available on calculators or in spreadsheets) and is preferred by some writers c.f. Pat's Blog.
 Another entry at Pat's Blog offers an interesting synopsis of the history of Machin's series.
Theorem no. 175: The Bungers–Lehmer Theorem on Cyclotomic Coefficients

 The famous proof of Wedderburn's Little Theorem by Ernst Witt is based on cyclotomic polynomials and is a great tangent to follow. See the recommended weblink on that page.
 An exciting new development in the study of cyclotomic coefficients is the discovery by Gregg Musiker and Victor Reiner of a topolocial interpretation. See this, for example.
Theorem no. 177: The Heine–Borel Theorem

 Nicole R. Andre, Susannah M. Engdahl and Adam E. Parker give a wonderful early history of this result in "An Analysis of the First Proofs of the HeineBorel Theorem", Convergence, Vol. 9, 2012, online here.
 The notion of compactness is given a very good intuitive treatment by Evelyn Lamb here.
Theorem no. 178: Sendov's Conjecture

 There is a nice snapshot of Dégot's 'large n' proof of Sendov at about p. 62 of this cornucopia by Pamela Gorkin
Theorem no. 180: The Greibach Normal Form Theorem

 The algorithm converting \(G\) to Greibach Normal Form on \(O(G^4)\) symbols is given in Norbert Blum and Robert Koch, "Greibach Normal Form transformation revisited", Information and Computation,
Vol. 150, Issue 1,1999, pages 112–118; online.
 Prof. Greibach was kind enough to send me a few comments on this theorem which I quote below:

Although my original definition of GNF is as you describe, in my
class notes I now permit S > emptystring if S does not appear on
the right hand side of any production and similarly for CNF (Chomsky
Normal Form) as in the notes you attach, so that all contextfree
languages are covered.
As far as I know, GNF is not used in any grammarpda transformations
directly; it is essential to conversion to a pda without epsilonrules,
i.e. to a nondeterministic pda which must read a new input each unit
of time (I usually call this quasirealtime). Indeed, the fact that
GNF suffices for contextfree languages is equivalent to the fact
that nondeterministic pda are equivalent in power to quasirealtime
pda.
My normal form was proven in 1962 and appears in my 1963 thesis but,
as you note, the first full publication was in 1965. 

Theorem no. 181: Archimedes’ Equiareal Map Theorem

 Bradley Carroll has a very nice series of pages on Archimedes' achievements, where we find (see this page) that Archimedes himself regarded his results on areas and volumes of curved bodies to be his finest work. The spherecylinder surface area ratio is quoted there as 2/3 whereas our version says the surface areas are equal; this is merely because our cylinder has no top or bottom. To be specific, the ratio for a sphere of radius \(r\) and a cyclinder of radius \(r\) and height \(h\), is \(2\tau r^2/(\tau r h+\tau r^2)\); we have \(h=2r=2\) and omit the second term in the denominator.
 I could not resist invoking E.T. Bell's celebrated triumvirate in connection with this theorem. That is shameless popularism, though, since I entirely subscribe to Thony Christie's injunction "Context is everything".
 The attribution to GaussviaNewton of differential geometry is even more shameless. In Dirk Jan Struik's classic twopart "Outline of a History of Differential Geometry", the whole of part 1 is preGauss (Isis, vol. 19, no. 1, 1933, pp. 92–120) with the key modern players being Clairaut, Euler and Monge. However, Gauss's work in the 1820's pervades a large proportion of part 2 (Isis,
vol. 20, no. 1, 1933, pp. 161–191), and the first and second fundamental forms belong to intrinsic differential geometry which was intrinsically Gauss.
Theorem no. 182: The Girard–Newton Identities

 A short survey of elementary proofs of these identities, together with an elegant new proof using matrix algebra, is given in Dan Kalman, "A Matrix Proof of Newton's Identities", Mathematics Magazine, Volume 73, Number 4, October 2000, pp 313  315 (online here, dated 8/16/99)
Theorem no. 186: The Insolvability of the Entscheidungsproblem

 Some contextual information is given in a presentation I gave at Rewley House on 23 June 2012.
 An good source of undecidable problems is this 2012 survey by Bjorn Poonen. Related links on Diophantine undecidability are: James P. Jones paper "Diophantine Representation of the Fibonacci Numbers",
and this paper by Yuri Matiasevich (in which some of the characters seems to print strangely but not unreadably).
 Related theorems are: The DPRM Theorem, Haken's Unknot Theorem, Reidemeister's Theorem, and Gödel's First and Second Incompleteness Theorems.
 Details of Jack Copeland's Essential Turing are given here; Charles Petzold's reading guide to Turing's 1936 paper is listed here. Biographies of Turing are listed here.
 N.B. A much more extensive list of Turingrelated material came out of an OUSSA summer school I gave at Oxford in July 2012.
 Hilbert's 1930 "Wir müssen wissen, Wir werden wissen" radio address is online here with a transcription and an accompanying English translation.
 There is a poetry version of the proof of nondecidability of the Halting problem here by Geoffrey K. Pullum, as I learnt from Pat'sBlog.
 At the heart of Turing's result is the demonstration that not all functions from the natural numbers of the natural numbers can be computed. Joel David Hamkins has the intriguing result that any function is computable if the right model of arithmetic is chosen. This means arithmetics which satisfy the axioms of the natural numbers but which contain additional 'nonstandard' numbers. A good introduction is provided by John Baez.
 This blog post from Gödel's Lost Letter is an excellent source on proving the unsolvability of the Halting Problem.
Theorem no. 188: alKāshi's Law of Cosines

 Garry J. Tee kindly provided the following amplification on the history of trigonometric functions: "Hipparchus in (c130) invented the chord, the first trigonometric function, and he constructed a short table of values of the chord function. (On a sphere of radius R, the chord of angle x is the distance between 2 points on the sphere subtending angle x at the centre). By the 5th century, Hindu astronomers had replaced the chord by the more convenient sine function, with \(2R\sin x = \mbox{chord}(2x)\). In 499, Aryabhata commenced his renowned astronomical treatise “Aryabhatiya” with a short table of sines."
Theorem no. 189: The Handshaking Lemma

 Architectural historian Dr Lynn Pearson kindly sent me the following comments in response to a query regarding the origins and attribution of the tiling pattern illustrating this theorem:

"While investigating the 6/7 murals query, I chanced upon the QM Physics archives website; the brochure about the new Physics Building (1962) is available at
ph.qmul.ac.uk/sites/default/files/brochure1963.pdf;
this details the six panels. I too made it six as there are six architectural 'bays'. But I see what you mean about the different section you have used for your theorem. Looking at the six panels, the middle 4 have the main pattern in white on a blue ground, with various small sections of coloured tiling around. But the two end panels have the blue section, and another smaller vertical section on a gold background. The one furthest from the road has the precessing orbit, plus what looks to me like ellipses/catenary curves?? – see archive pic:
ph.qmul.ac.uk/sites/default/files/alumni/donat_01.jpg
and
ph.qmul.ac.uk/sites/default/files/alumni/donat_20.jpg.
Are these two elements of the panel connected mathswise?
And the same goes for your panel, the one nearest the road – that's 'spreading of dislocations from a FrankRead source' plus the diagram you have in your theorem. Are these things linked in some way? It seems that the designers chose to 'finish off' the series of murals with slightly more ornate ones at each end.
Anyway, all these panels were by Carter's, but my notes from the Carter's photographic archive at Poole Museum show that the firm worked closely with the building's architects, Playne & Lacey, on the project. A man called R. Khosla, who worked for the architects, helped in the design of the 6 panels, but left before they were complete. As the head of design at Carter's, A. B. Read did much of the mural work; I'd think he completed the job. The tile painting would have been done by the firm's (lady) artists. I think that is as good an attribution as you will get." 

Dr Pearson also provided a link to a relevant article of hers, although for copyright reasons its images cannot be displayed. And there is a little more in the Tile Gazetteer.
 From plus magazine: applying the double counting argument proof of the Handshaking Lemma to the complete graph on n + 1 vertices is a neat way of proving that $$1+2+\ldots + n = \frac12n(n+1).$$
 Other impressive applications of the Handshaking Lemma: the proof of Sperner's Lemma in 2D; and this proof that the socalled 'LightsOut' game is solvable for any graph in which all vertices are initially 'turned on'.
Theorem no. 191: L'Hospital's Rule

 A nice 'double' example of L'Hospital in action is the proof that \(\ln(x)\tan(x) \rightarrow 0\) as \(x\rightarrow 0\), starting in the \(\infty/\infty\) form as \(\ln(x)/\cot(x)\) and then transferring to the \(0/0\) form (twice).
 The expression \(0^0\) gets a thorough investigation by Michael Huber and V. Frederick Rickey in "What is \(0^0\)", Convergence, Vol. 5, 2008. Online here. Other good treatments are this blog post by David A. Tanzer (with over 50 very informative reader responses) and this from askamathematician (which has over 1000 responses, which I haven't read but I suppose there must be some interesting stuff there as well!)
Theorem no. 194: Wilson's Theorem

 The `if' converse to the theorem follows because if \(n\) is composite then some \(d\),\(1<d<n\), divides \(n\). Then \(d\) appears as a factor of \((n1)!\) and therefore cannot divide \((n1)!+1\), in which case, neither can \(n\).
 Fredrik Johansson describes a neat trick for reducing the computation required for testing primality via Wilson's Theorem.
 A combinatorial argument is described here which seems in the same spirit as P.G. Anderson et al's combinatorial lemma.
 It seems convenient to record here Gauss's generalisation of Wilson's Theorem: if \(n>2\) then
$$\prod_{\stackrel{k=1}{\gcd(k,n)=1}}^n\hspace{.2in}k = \left\{\begin{array}{rcl}
1\mbox{ mod }n & & n=4, p^m, 2p^m,\\
1\mbox{ mod } n &&\mbox{otherwise.}\end{array}\right.$$
A very nice proof is given here (under "Expository Papers").
Theorem no. 196: The Cantor–Bernstein–Schröder Theorem

 There is a more here the history of this theorem.
 A graphtheoretic proof of this theorem generalises to one about paths, as described in Reinhard Diestel & Carsten Thomassen, "A
CantorBernstein theorem for paths in graphs", American Mathematical Monthly, February 2006, online
here.
Theorem no. 197: The Robin–Lagarias Theorem

 A wonderful compilation of proofs of the nonconvergence of \(\zeta(1)\) is: Steven J. Kifowit and Terra A. Stamps, "The Harmonic Series Diverges Again and Again", The AMATYC Review, Vol. 27, No.2, Spring 2006. A preprint can be found here.
Theorem no. 199: Fermat's TwoSquares Theorem

 The representation of a prime \(p=4n+1\) as a sum of two squares is unique (upto order of summands). A nice proof using Gaussian primes, is given here.
 Strictly speaking, Lagrange's Lemma only goes one way: if \(p\) is congruent to 1 mod 4 then \(1\) is a quadratic residue mod \(p\). Lagrange used his lemma in 1773 to give a simpler proof than Euler's of Fermat's theorem. There is a short discussion in chapter 6 of Stillwell's Elements of Number Theory, Springer 2003 (see sections 6.5 and 6.8 and chapter 9 for the converse). The lemma is an instance of the Law of Quadratic Reciprocity (see this, for example).
Theorem no. 201: Jensen's Inequality

 This theorem's illustration originally featured soup cans based on Andy Warhol's Campbell's Soup pop art. However, Campbell Soup Company declined to give me permission for this use "in part because the image you have used is not a reproduction of our famous trademarks, but rather what we consider a 'mutilation' of our marks." If you would like to view this mutilation for yourself let me know and I will smuggle you a copy!
 A good reallife application from the world of finance is given here, taken from Sam L. Savage, The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty, John Wiley & Sons, paperback edition, 2012.
Theorem no. 202: The Freidlander–Iwaniec Theorem

 The question of whether there are infinitely many primes of the form \(n^2+1\) is the first of Landau's Problems, all unsolved as of April 2017.
Theorem no. 203: Euler's Continued Fraction Correspondence

 Tony Foster gives a continued fraction in terms of cubes for \(\pi=\tau/2\) via a nice exploitation of Nilakantha's series in the same vein as the derivation by Douglas Bowman. Suggests a nice exercise to give the corresponding result for \(\tau\).
Theorem no. 205: The Classification of the Semiregular Tilings

 There is more on Kepler's investigation into plane tilings in this lovely article by plus magazine. In fact they have a whole collection of tiling articles!
 The general subject of plane tilings is deep and wide: for example, the decidability of whether a given single tile will tile the plane is an open question (very elegantly introduced by Chaim GoodmanStrauss in "Can't Decide? Undecide!", Notices of the AMS, Vol. 57, No. 3, 2010, 343–356, online here).
Theorem no. 206: Euler's Formula

 A sweet, and sweetly presented, proof of Euler's Formula, given by Math With Bad Drawings, makes a very nice example of solving differential equations by separation of variables.
 The appearance of a protractor (angle measurer) in the illustration of this theorem gives me an excuse to link to this delightful little History of Protractors from Life Through a Mathematician's Eyes.
Theorem no. 207: The EratosthenesLegendre Sieve

 A very nice motivation for sieving in number theory is given by Terence Tao here.
Theorem no. 208: Torricelli's Trumpet

 In response to my query, Paolo Mancosu kindly gave me following comments on the origins of Torricelli's result and his methods:

"I am quite sure [that] Oresme and Fermat and Roberval certainly did not anticipate Torricelli's discovery. Fermat wrote about similar solids (de infinitis hyperbolis) after Torricelli. As for Roberval, I cite that report of Mersenne (given by Torricelli) where it is reported that, according to Mersenne, Roberval had written some kind of speech claiming that Torricelli's result was impossible! Had he anticipated him, he would certainly have claimed priority rather than trying to prove that the result was impossible. You are right that Torricelli does not prove that the lateral surface is infinite. I do not know who first did that." 

 In his review (Notre Dame J. Formal Logic, vol. 40, no. 3, 1999, 447–454) of Mancosu's Philosophy of Mathematics & Mathematical Practice in the Seventeenth Century, Craig Fraser says (p. 448) "Torricelli discovered the remarkable fact that the solid of revolution obtained by rotating the hyperbola \(y=1/x\) about the \(x\)axis has finite volume and infinite surface area." I have found no other evidence that Torricelli calculated the surface area of his solid, however.
 Dave Richeson's wellknown Division by Zero blog has provided a template for constructing a paper version ofTorricelli's Trumpet (under the pseudonym Gabriel's Horn). (Possibly the arxiv version of Richeson's paper is more of a permalink than the one given in the blog entry.)
 A very nice animated illustration of indivisibles, applied to circular area, is given by Matt Henderson here.
Theorem no. 209: The Erdős Discrepancy Problem

 This was originally posted as a 'theorem under construction': September 2015 brought news of Terence Tao's complete resolution of the conjecture. This made it the 2nd 'declassification', after Kepler's Conjecture. The role of the Polymath in Tao's proof has been commented on helpfully by Gowers here.
 There is much more to the background to the conjecture, and approaches to it, at the official Polymath 5 page.
 The sequence of length 1160 appearing in the table in this theorem description, reproduced from Konev and Lisitsa's paper arxiv.org/abs/1402.2184, is available in Excel 2003 here. The first 11 terms:
− + + − + − − + + − +
happen to constitute a maximumlength sequence with discrepancy C=1. The terms sum to +1 so continuing with a +1 would give a summation to 2; the evenindex terms sum to −1, so continuing with a − would give a summation to −2. The sequence 0,11,1160, ... is oeis.org/A237695; it is known (Konev and Lisitsa) that the next term exceeds 130000.
 There is a very good description of Konev and Lisitsa's proof of \(C=2\) by Richard Lipton and Ken Regan here.
 Erdős mentions Nikolai Chudakov (spelt 'Tchudakoff') in connection with this conjecture here (in Problem 49).
 Although Mathias' paper was published in a 1997 tribute volume for Erdős, this was actually the proceedings of Erdős' 80th birthday celebration, held in March 1993.
 Terence Tao has a valuable blog entry on 'nearcounterexamples' to the conjecture and its unexpected relationship to the 'Elliott Conjecture'. (But see note 1!)
Theorem no. 210: The Basel Problem

 The proof given in the description of this theorem is called the 'Lewin argument' by Kalman and McKinzie who cite its first known appearance as Leonard Lewin's Polylogarithms and Associated Functions, Elsevier Science, 1981 (although they stress that Lewin did not claim credit). The book is an update of Lewin's earlier Dilogarithms and Associated Functions, Macdonald, 1958, in which the same material may be found (chapter 1, section 3.1). Both books are out of print, sadly. There is a valuable review of the former by Richard Askey in Bull. AMS, vol. 6, no. 2.
 Regarding Euler's solution to the Basel Problem, the accepted sequence of events appears to be: discovered in 1734, presented in 1735, published in 1740.
The Euler Archive gives the date of presentation of Euler's paper as December 5,1735, but indexes it with the 1734 presentations. It might appear that 'December 5, 1734' is the correct date. In "Euler and the Zeta Function" Raymond Ayoub gives 1934 as the year of "Euler's first triumph" and says "Euler communicated his result to Daniel Bernoulli and, while unfortunately this letter has been lost, the reply does exist". However, the reply is dated in the Euler Archive as 12th September 1736. Like many of Bernoulli's letters to Euler it deals with several matters giving no clue as to when they arose but it would seem more consistent with a 1735 letter from Euler than a 1734 one.
Theorem no. 211: Willans' Formula

 The reduction, for composite \(k=a\times b\), of \(\sin^2(1+(k1)!)\tau/4k)\) to \(\sin^2(\tau/4k)\) requires justification. Indeed \(1+(k1)!\equiv 1\!\!\!\mod k\) because both \(a\) and \(b\) will divide \((k1)!\). And the quotient of \((k1)!)/k\) will be a multiple of 4 when there are sufficient even factors in \((k1)!\), which is when \((k1)/2>2+\log_2 k\). This occurs for \( k >12\) (but \( k = 6, 9,10,12\) all reduce to \(\tau/4k\) since the greatest power of 2 in their factorisations is low).
 See note 1 for Theorem 194 regarding the computation required to implement Willans' formula.
 The famous 26variable polynomial of Jones–Sato–Wada–Wiens whose positive values, over the positive integers, are precisely the prime numbers, is also based on Wilson's Theorem. There is a nice explanation here.
Theorem no. 212: Vizing's Theorem

 Details regarding Gupta's discovery of this theorem are supplied in the preface of
Michael Stiebitz, Diego Scheide, Bjarne Toft and Lene M. Favrholdt, Graph Edge Coloring: Vizing's Theorem and Goldberg's Conjecture,
WileyBlackwell, 2012:

"Vizing's bound was discovered independently by Ram Prakash Gupta during his Ph.D. studies, mostly at the Tata Institute of Fundamental Research in Bombay, 1965–1967, supervised by Sharadchandra Shankar Shrikhande, and stimulated by Claude Berge. Also Gupta's proof was based on a variation of the fan idea (discovered independently by Gupta), and it was extended to locally bounded infinite graphs i.e. infinite graphs with a finite maximum degree." 

(Curiously, Gupta's rather nominal Wikipedia entry says his supervisor was C.R. Rao, at the Indian Statistical Institute, Calcutta, and their Math Geneology entries support this. However, Rao is a statistician while Shrikhande is a combinatorialist. I suspect there are two R.P. Guptas.)
 The 'fan idea' is the basis of textbook proofs of Vizing's theorem (but not the proof from Schrijver which I have chosen to link to) and extends to prove the generalisation to graphs with multiple edges: \(X'(G)\leq \Delta +m\), where \(m\) is the maximum edge multiplicity of \(G\). A good account is here.
Theorem no. 213: the 6Circles Theorem

 This theorem is sometimes referred to as the MoneyCoutts Theorem although it is not clear why MoneyCoutts deserves more credit than Evelyn or Tyrrell. The name 'Six Circles' is a bit ambiguous: cuttheknot, for instance, has three quite distinct theorems which qualify for the name (located here, alphabetically).
 Although Tyrrell has been described as a 'professional' and Evelyn and MoneyCoutts as 'amateurs', an obituary of Evelyn (by Tyrrell) appears in Bull. London Math. Soc., vol. 9, no. 3, 1977. He published professionallevel work during the 1930s and then again in the 1960s.
 A generalisation by Serge Tabachnikov in a different direction from that discussed in this theorem description, namely from triangles to ngons, is given in "Going in Circles: Variations on the MoneyCoutts Theorem", Geometriae Dedicata,
80, 2000,
201–209, online
here.
Theorem no. 215: Wedderburn's Little Theorem

 Multiplication in the quaternions is described in the description of Moufang's Theorem; you can check that the given multiplication table for the Dickson nearfield of order 9 is identical to quaternion multiplication under the isomorphism:
$$\left(\begin{array}{rrrrrrrr}
1 & a & b & c & d & e & f & g \\
1 & 1 & i & j & k & i & k & j
\end{array}\right)$$
whereby the multiplication is seen to be almost commutative in the sense that the table is skew symmetric.
 Zinovy Reichstein has drawn my attention to an uncomfortable but unavoidable footnote: an elegant, onepage, grouptheoretic proof of Wedderburn's Little Theorem was published by the socalled Unabomber, Ted Kaczynski, while a PhD student at the University of Michigan. A reference can be found here.
 Onepage proofs continue to appear. E.g. John Schue, "The Wedderburn Theorem of Finite Division Rings", Amer. Math. Monthly, Vol. 95, No. 5 (May, 1988), pp. 436437 (using properties of field extensions); Nicolas Lichiardopol,"A New Proof of Wedderburn's Theorem", Amer. Math. Monthly, Vol. 110, No. 8 (Oct., 2003), pp. 736737 (ring theory, exploiting, like Kaczynski's proof, an initial lemma from number theory).
Theorem no. 216: Irrationality of Circumference of Unit Circle

 A more detailed account of Lambert's irrationality proof is given at here at math.stackexchange.
Theorem no. 217: Taylor's Theorem

 An alternative justification for Hugh Worthington's Rule is given by Colin Beveridge here. The explanation illustrating theorem no. 217 is by Tony Forbes, M500 magazine, issue 260, p. 17. He observes that using degrees instead of radians allows an even better approximation: \(\displaystyle \tan^{1}\frac{a}{b}\approx \frac{172a}{b+2c}\) where \(c=\sqrt{a^2+b^2}\).
 A stepbystep proof of the Lagrange remainder form of Taylor's theorem is given by Gowers here.
Theorem no. 218: The Riemann Rearrangement Theorem

 Riemann's habilitation thesis "Ueber die Darstellbarkeit einer Function durch eine trigonometrische Reihe" was published posthumously in 1967. A facsimile can be found here and the text is transcribed here. An English translation is available although it may be out of print. Riemann's habilitation work is discussed in detail in Detlef Laugwitz (transl. Abe Shenitzer), Bernhard Riemann 1826–1866: Turning Points in the Conception of Mathematics, Birkhauser, 2nd printing, 2008. A French translation is here (§1–§8) and here (§9–§13) (presumably the one by Darboux and Houel, c.f. these notes, although no credit is given).
 It seems worthwhile to give an English translation of Riemann's proof of his rearrangement theorem (from §3 of his thesis):

"In Crelle’s Journal in January 1829 a memoir by Dirichlet appeared in which rigorous conditions were established for representing, by trigonometric series, functions which are integrable and which do not possess infinitely many maxima or minima.
"He discovered the correct path to follow to solve this problem by consideration of the fact that infinite series fall into two classes according to whether or not they remain convergent when all their terms are made positive. In the first class, the terms may be permuted in an arbitrary manner; whereas in the second class, the value of the series depends on the ordering of the terms. Indeed, if one denotes, in a series of the second class, the positive terms by
$$a_1,a_2,a_3,\ldots,$$
and the negative terms by
$$b_1,b_2,b_3,\ldots,$$
it is clear that \(\sum a\), and similarly \(\sum b\), must be infinite; for if both sums were finite then the series would still be convergent on giving all terms the same sign; if just one of the sums where infinite, then the series would diverge. It is now clear that the series, if its terms are placed in a suitable order, may take an arbitrary given value \(C\); for if one takes alternately the positive terms of the series until its value exceeds \(C\), and then the negative terms until the value falls below \(C\), the difference between this value and \(C\) will never exceed the value of the term immediately preceeding the most recent change of sign. Now the \(a\) values, and similarly the \(b\) values must eventually become infinitesimally small as their indices increase, and thus the differences between the series sum and \(C\) must also become infinitesimally small, as one extends the series sufficiently long, which is to say that the series converges to \(C\).
"It is only series of the first class which are amenable to the laws governing finite sums; only they may be considered as the collection of their terms; those of the second class may not be so considered: a circumstance which was missed by mathematicians of the last century, in the main because series which extend according to ascending powers of a variable belong, generally speaking (which is to say, with the exception of certain exceptional values of that variable), to the first class.
" 

 Regarding the rearrangements of Leibniz's series given in figure A of the theorem description, it is remarkable that closed forms may be given to their sums (allowing for special functions). Thus for the highest valued rearrangement shown (approx. 0.95868) we have (thanks to Maple):
$$\sum_{k=0}^{\infty}\left(\frac{1}{8k+1}+\frac{1}{8k+5}\frac{1}{4k+3}\right)=\frac14\gamma\frac34\ln 2+\frac{1}{16}\tau\frac18\Psi\left(\frac18\right)\frac18\Psi\left(\frac58\right),$$
where \(\gamma\) is the EulerMascheroni constant, \(\tau\) is circumference of unit circle, and \(\Psi\) is the digamma function (the slope of the log of the gamma function).
 The alternating harmonic series provides an even more fascinating example of Riemann's theorem in the hands of Larry Riddle in this article which originally appeared in Kenyon Mathematics Quarterly, vol. 1, no. 2 (1990), 6–21.
 In another version, Riemann's theorem tells us that a series is absolutely convergent if and only if every rearrangement converges. This becomes a test for absolute convergence, in principle, if it can be shown that a finite number of convergent rearrangements is enough. This proposed 'rearragement number' is the object of study by Andreas Blass, Will Brian, Joel David Hamkins, Michael Hardy and Paul Larson.
Theorem no. 219: Integration by Parts

 A very attractive discussion about "striking applications of integration by parts" is ongoing at stackexchange.
 Ian Bruce wrote to me of his experience with his valuable project 17centurymaths: "Most of the elementary calculus material can be found in Euler's Differential and Integral Calculus books, and in fact he starts Book I on Integration with integration by parts; Ch.1 of this book is Top of the Pops in my line of business, and gets first place consistently in downloads, followed by Newton's definitions & Axioms ...
Euler's work is still highly readable, and more so than others of that age and before; in fact he seems to have set the standard for generations of mathematicians to come."
 Ernst Hairer has provided an elegant geometrical interpretation of integration by parts, which may be viewed here.
 There is a nice description here by Murray Bourne of an alternative to integration by parts called the Tanzalin Method which is apparently commonly used in Indonesia. As you will see, it too can lead to infinite series!
Theorem no. 220: The Pappus–Guldin Theorems

 According to Andrew Leahy's article, no proof by Pappus of his theorems has been discovered and Paul Guldin gave no proof, the first known proof being supplied by Giannantonio Rocca in 1644.
 There is a very nice discussion of the 17th century precalculus debate in Chapter 5 of Amir Alexander's Infinitesimal: How a Dangerous Mathematical Theory Shaped the Modern World, Oneworld Publications, 2014, with a generous extract here.
 Peter Harremoës has drawn my attention to a little irony: the surface area, \(\tau^2rR\), of the torus is given incorrectly about 2.5 mins into this film debate on the π vs τ question as an argument in favour of the former!
Theorem no. 221: The InclusionExclusion Principle

 Attributions of InclusionExclusion often include the name of Poincaré (e.g. 'formule du crible de Poincaré') and this seems a bit obscure. In Encyclopaedia of Mathematics, Supplement III: 3 (ed. Michiel Hazewinkel, Kluwer, 2002) this attribution carries a reference to Poincaré's book Calcul des probabilités, GauthierVillars, 1896, and this book may have been influential in making the principle widely known in France.
 InclusionExclusion may be generalised in several ways. A good example is given here by Stewart Weiss; probably the most famous is due to GianCarlo Rota and is described by Peter Cameron in Lecture 9 of this course. Rota's original article can (and should!) be read here.
 Stewart N. Ethier has contributed the following: "the expected number of boxes needed for a full set of coupons has the nice formula \(\displaystyle n\!\left(1+\frac12+\frac13+\ldots+\frac{1}{n}\right)\), which either can be derived from [the inclusionexclusion formula] (via Theorem 1.4.2 of my book [The Doctrine of Chances: Probabilistic Aspects of Gambling]), or can be derived directly by writing the random variable of interest as the sum of \(n\) independent geometric random variables with success probabilities \(1, (n1)/n, (n2)/n, \ldots, 1/n\) (and using the fact that a geometric(\(p\)) random variable has mean \(1/p\))."
 A nice application of InclusionExclusion is to vary the standard combinatorial question "How many ways to put n indistinguishable balls into k distinguished boxes" by adding "so that no box gets more than C balls". An excellent explanation of the answer is given here by Brian M. Scott.
Theorem no. 222: Faulhaber's Formula

 For a general survey of properties of Bernoulli numbers, Pascal Sabah and Xavier Gourdon's article here is excellent.
 Knuth has a fascinating reflection here on what Faulhaber achieved and how he achieved it.
 Seki's discovery of the Bernoulli numbers is described in Silke WimmerZagier and Don Zagier's chapter in Eberhard Knobloch, Hikosaburo Komatsu and Dun Liu (eds.), Seki, Founder of Modern Mathematics in Japan: A Commemoration on His Tercentenary, Springer, 2013. It may be found online here.
Theorem no. 223: Tutte's Golden Identity

 There is a description of Tutte's work on graph polynomials in the excellent obituary by Arthur Hobbs and James Oxley which appeared in Notices Amer. Math. Soc. 51 (2004), 320330.
Theorem no. 224: Green's Theorem

 Paul Nahin, whose Inside Interesting Integrals is the recommended further reading for this theorem, also writes interestingly about its background in Chapter 7 of An Imaginary Tale: The Story of \(\small\underline{\sqrt{1}}\), Princeton University Press, 1998.
 An important exhibition commemorating Green was held at the University of Nottingham in the autumn of 2014 and the curator's blog post is very interesting.
Theorem no. 225: The Spherical Law of Cosines

 You can find latitudes and longitudes of cities, and compute greatcircle distances between them here.
 Pat Ballew has drawn my attention to a dual version of this theorem, relating three angles A, B, C and one side, say c:
$$\cos(C)=\cos(A)\cos(B)+\sin(A)\sin(B)\cos(c),$$
(this is referred to by Van Brumellen in Heavenly Mathematics as the "Law of Cosines for Angles").
 A nmemonic of Napier for spherical trigonometry (also from Van Brumellen's book, I think) has been nicely summarised by John D. Cook here.
 plus magazine have provided a very nice introduction to longitude and latitude.
Theorem no. 226: Wolstenholme's Theorem

 The converse of Wolstenholme's Theorem, that \({2n1\choose n1}\not\equiv 1 \hspace{0.05in}\mod n^3\) for all composite values of \(n\), is a famous open question. It is known to be true for even \(n\) and for all \(n<10^9\). See for example, Vilmar Trevisan and Kenneth Weber, "Testing the converse of Wolstenholme's theorem", Matemática Contemporânea, 21 (2001), 275–286; online.
 A generalisation of Wolstenholme due to James Whitbread Lee Glaisher in 1900, says that \({kp1\choose p1}\equiv 1 \hspace{0.05in}\mod n^3\) for any prime \(p\geq 5\) and any positive integer \(k\). In this case the converse does not hold. Small counterexamples exist for \(p=4,9,25\), for example (thus \(p=4,n=33\) gives \({131\choose 3}\equiv 1 \hspace{0.05in}\mod 64\).
 A proof of Babbage's \(p^2\) prototype of the theorem is given here.
 More on the 'harmonic numbers' context for Wolstensholme can be found in ZhiWei Sun, "Arithmetic theory of harmonic numbers", Proc. Amer. Math. Soc., 140, no. 2, 2012, 415–428, online here.
Theorem no. 227: Cauchy's Theorem in Group Theory
 A thorough analysis of the origin of Cachy's 1845 'Mémoire sur les arrangements ...', in which his theorem is asserted, has been given by Peter M. Neumann, 'On the date of Cauchy's contributions to the founding of the theory of groups', Bulletin of the Australian Mathematical Society, vol. 40, 1989, 293–302; online.
 Incidentally to the choice of \(D_{10}\) to illustrate this theorem, is a corollary to Cauchy's theorem that any group of order twice an odd prime is either cyclic or dihedral. This is Prop. 3.34 in our recommended book, Smith and Tabachnikova's Topics in Group Theory, Springer, London, 2000.
 Michael Meo claims, in his article "The mathematical life of Cauchy's Group Theorem", Historia Mathematica, vol. 31, issue 2, pp. 196–221, that Cauchy's proof of his theorem contains an 'egregious error' and that a subsequent attempt by Dedekind in the 1850 is also incomplete. This would suggest that the first complete proof of the theorem comes as a corollary to its own generalisation (Sylow's theorems of 1872). However it seems fairer to assume that Cauchy was at least completely in command of his material. Meo's article is available here via Elsevier's open access archive.
Theorem no. 228: Fisher's Inequality

 Bose's paper containing his short proof of the inequality can be read here.
Theorem no. 229: Poncelet's Porism

 An elegant and (relatively) simple proof is given by Lorenz Halbeisen and Norbert Hungerbühler in "A Simple Proof of Poncelet’s Theorem
(on the occasion of its bicentennial)", American Mathematical Monthly, in press, (preprint). For a modern proof, from algebraic geometry, see this by David Speyer.
 A bicentennial survey of past and current research into Poncelet's theorem is given by Vladimir Dragović and Milena Radnović in "Bicentennial of the Great Poncelet Theorem (1813–2013): Current advances",
Bull. Amer. Math. Soc., Vol. 51, No. 3, 2014, 373–445.
 The example given of a quadrilateral inscribed in and circumscribing two eillipses is a cheat! The parameters were chosen by trial and error to give an adequate illustration: outer ellipse is centered on the origin and inclined at \(\tau/8\) to \(x\) axis; major radius \(a = 9.3\), minor radius \(b = 4.1\); inner ellipse is centered at \((1.05046,1.3)\) and has no inclination; major radius \(c = 4.0448\), minor radius \(d = 3.22\).
 Some very good notes by Tony Forbes are available here (about 1.5MB), including detailed instructions for creating genuine examples for all combinations of conics (not just ellipses) and also giving a brief description of the link to elliptic curves.
 Poncelet's Porism is also known as his Closure Theorem for reasons made beautifully clear in Jonathan King, "Three Problems in Search of a Measure", The American Mathematical Monthly, Vol. 101 (1994), pp. 609–628, online here. We find in the same article that the theorem is intimately related to Gelfand's Question!
Theorem no. 230: Ore's Theorem in Graph Theory

 Bondy's short proof appears in "Short proofs of classical theorems", J. Graph Theory, Vol. 44, No. 3, 2003, 159–165. The algorithmic interpretation given here is similar in spirit to an adaptation of Ore's original proof by E.M. Palmer, "The hidden algorithm of Ore's theorem on Hamiltonian cycles", Computers & Mathematics with Applications, Vol. 34, No. 11, 1997, 113–119
 Apart from the leftmost, the graphs illustrating this theorem were generated in Maple. However I manually replaced the vertices in order to get the permuted numberings (I was too lazy to work out how to get Maple to do this).
Theorem no. 231: Sophie Germain's Identity

 The correspondence of Sophie Germain is online at the Bibliothèque nationale de France via gallica. The letter reproduced here is located by searching for '9118' under 'Manuscripts'.
 Leonard Dickson's History of the Theory of Numbers, Volume I: Divisibility and Primality can be read online here courtesy of archive.org. The references to Euler and Germain are on pages 381 and 382, respectively.
 The letter from Euler to Goldbach cited by Dickson can be read at the Euler Archive (August 28 in the 1742 correspondence). Not every letter from Euler to Goldbach of that year is online but it seems clear that this is the one which Dickson intends.
Theorem no. 232: The Riemann Explicit Formula

 An excellent account of the explicit formula, starting from scratch, is this at medium.com by Jørgen Veisdal.
 The relationship between the distribution of primes and (logarithmic spirals) has a rich history. A good example is given by Matthew Watkins here; the idea of spotting patterns in prime spirals goes back to (at least) Ulam in 1963. A nice variant by Edmund Harriss can be found here, and there is a very elegant 3D conical spiral by Dan Bach here.
 The weblink for this theorem by Matthew Watkins offers a very clear account of Riemann's formula in the Chebyshev \(\psi\) function version (as preferred in the Wikipedia entry for example). He has much more on the Riemann Hypothesis here; and indeed, the whole prime distribution story is the subject of his trilogy of books (with illustrator Mark Tweed) Secrets of Creation.
 There is a famous Bonn University inagural lecture by Don Zagier on the subject of Riemann's prime counting function which can be found in English translation here.
 Much intriquing recent commentary on the Riemann Hypothesis, including an extended essary by Alaine Connes, can be found starting at this post from Not Even Wrong.
Theorem no. 233: The Circle Area Theorem

 The Archimedes proof of this theorem still qualifies as a textbook one, e.g. here, although calculus variants of the \(\int_0^{r\tau}\frac12r\mbox{dt}\) variety are presumably more respectable from a modern perspective.
Theorem no. 234: A Generalised Hlawka Inequality

 The original source for Hlawka's Inequality is Hans Hornich, "Eine Ungleichung für Vektorlängen",
Mathematische Zeitschrift,
Volume 48, Issue 1, (1942/43), 268–274. It may be viewed online here thanks to the Göttinger Digitalisierungszentrums. (Hornich says merely "For the special case m = 1, n = 2, Herr Hlawka has given me a purely algebraic proof..." so that the name Hornich–Hlawka as preferred by de.wikipedia.org seems more appropriate. However 'Hlawka' seems to be the generally adopted nomenclature.)
 The original result of Dragomir Djoković appeared in "Generalizations of Hlawka's inequality", Glasnik MatematickoFiziki i Astronomski, Ser. II, vol. 18, (1963), issue 3, 169–175. D.M. Smiley & M.F. Smiley's paper is "The polygonal inequalities", Amer. Math. Monthly, Vol. 71, No. 7 (1964), 755–760. In both papers something more general is proved for the sequence of \(n\) vectors: that for \(2\leq k<n\) we have $$d_k\leq {n2\choose k2}d_n+{n2\choose k1}d_1,$$ using the notation in the statement of the theorem. The inequality as stated is found by summing over \(k\). Djoković and Smiley & Smiley also gave conditions for equality.
 A nice derivation of Hlawka's Inequality from the Ptolomeic Inequality is given by Alice Simon and Peter Volkmann in Annales Mathematicae Silesianae, Vol. 9, 1995, 137140. The article is online here.
Theorem no. 235: A Theorem on Modular Fibonacci Periodicity

 The period lengths of the moduloreduced Fibonacci sequences continue to be the subject of intensive research. A good recent (2012) example is here. They also go by the name of Pisano periods. (after Leonardo Pisano aka Fibonacci). They are sequence A001175 at oeis.org.
 Of particular interest is the socalled 'Wall's Question': for a prime \(p\), is it possible that the period mod \(p\) and mod \(p^2\) should be equal? The question has links to the Fermat's Last Theorem via Germain's Theorem. See, Klaška, J., "Criteria for testing Wall's question", Czechoslovak Mathematical Journal, vol. 58 (2008), issue 4, pp. 12411246, online.
 D.D. Wall's paper is "Fibonacci series modulo m", The American Mathematical Monthly, Vol. 67, No. 6, 1960, 525–532. It does not appear to be freeaccess online. Covering rather the same material is a roughly contemporary paper, "The Fibonacci matrix modulo m" by the Caltech physicist David W. Robinson was published in the 2nd ever issue of Fibonacci Quarterly and this is free online here.
 The papers of Morgan Ward on linear recurrences are a good source of information on modular periodicity. They appears in Transactions of the American Mathematical Society and are free online here (1931) and here (1933). The main result from 1931 is that if \(m\) has prime decomposition \(p_1^{a_1}p_2^{a_2}\cdots p_n^{a_r}\) then period length mod \(m\) is equal to the LCM of period lengths mod \(p_i^{a_i}, i=1,\ldots, r\).
Theorem no. 236: Kemeny's Constant

 The directed graph modelling Alice's casino is a finite automaton which finds the remainder of an input binary number (with \(H=1\) and \(T=0\)) mod 8. Doubling appends a zero to a binary number; adding 1 thereafter appends instead a 1, so the action of the automaton is the same as step (3) in the casino game. Another example of such an automaton illustrates The Pumping Lemma.
 There is an interesting contrast between \(K\), the expected time to reach the stationary distribution, and the probability of reaching the distribution in fewer than \(K\) steps. The latter will be greater than \(1/2\) (to compensate for the occasional long runs). So Bob will often find himself losing money to Alice but he will be seduced by the prospect of a long run, just as in any lottery you hardly ever win anything but play for the prospect of a jackpot.
 An interesting question from Piers Myers is: can other averages for time to stationary distribution, e.g. median, also have constant values for Markov chains? For the 8state chain used here the answer for median values appears to be, roughly, yes, according to simulations: 2500 runs from each starting state to a target state selected u.a.r. gave median times 5,4,5,4,4,4,4,5. But Piers points out that this cannot hold in general: the chain \(\left(\begin{array}{cc} 0 & 1 \\ 1/100 & 99/100\end{array}\right)\) has stationary distribution \((1/101, 100/101)\); Kemeny's constant is \(100/101\); but median time to reach stationary distribution is 1 from state 1 and 0 from state 2.
 A very nice Markov chain animation provided by setosa.io might be of use for 'visualising' Kemeny's constant for small chains
Theorem no. 237: Sylvester's Catalecticant

 The last section of "On the AlexanderHirschowitz Theorem" by Maria Chiara Brambilla and Giorgio Ottaviani is very good on the history of Waring's problem for forms. Also very good are the opening pages of Power Sums, Gorenstein Algebras, and Determinantal Loci, Springer 1999, by Anthony Iarrobino and Vassil Kanev. This paper by Zach Teitler and Alexander Woo gives a good picture of the current state of play.
 Zach Teitler provided me with much help in getting to grips with the subtleties of Sylvester's work to the point where I felt it worth quoting his comments verbatim, in the form of some additional notes.
 A classic paper in the theory of binary forms is Joseph P. S. Kung and GianCarlo Rota, "The invariant theory of binary forms", Bull. Amer. Math. Soc. (N.S.), Vol. 10, No. 1, (1984), 27–85, online here.
 Bruce Reznick draws my attention to the charming description of Sylvester on his work on binary forms:

"I discovered and developed the whole theory of canonical binary forms
for odd degrees, and, as far as yet made out, for even degrees too, at one
evening sitting, with a decanter of port wine to sustain nature's flagging
energies, in a back office in Lincoln's Inn Fields. The work was done, and
well done, but at the usual cost of racking thought—a brain on fire,
and feet feeling, or feelingless, as if plunged in an icepail. That night
we slept no more." 

(which can be found on p. xxiv of The Collected Mathematical Papers of James Joseph Sylvester: Volume 4, 18821897). Bruce observes, aptly I think, "If he had been known as a writer, rather than as a mathematician, this would be a famous
quote!" (Of course Sylvester was proud of, although not remembered for, his poetry, see Chapter 8 of Karen Hunger Parshall's, James
Joseph Sylvester: Jewish Mathematician in a Victorian World, The Johns
Hopkins University Press, 2006.) By the way, an excellent slideshow by Bruce Reznick on representations of forms can be found here.
Theorem no. 238: Euler's Even Zeta Formula

 Euler discovered his formula in 1739 and it appeared in De seriebus quibusdam considerationes which can be read in the original Latin and in German or English translation as entry E130 at the Euler Archive. The role of the Bernoulli numbers was made explicit by Euler in his 1755 classic textbook Institutiones calculi differentialis cum eius usu in analysi finitorum ac doctrina serierum, volume 1 which is entry E212.
 Max Woon (publishing as See Chin Woon) gave his binary tree generation of the sequence of Bernoulli numbers in "A Tree for Generating Bernoulli Numbers", Mathematics Magazine,
Vol. 70, No. 1, 1997, 51–56. A generalisation to arbitrary complex sequences using elementary methods has been given by Petr Fuchs: "Bernoulli numbers and binary trees", Tatra Mountain Mathematical Publications, 20 (2000), 111–117, online (postscript) here.
 Although there may be no direct calculations of \(\zeta(2n+1)\) in terms of Bernoulli numbers, there are infinite series formulae. The most famous approach is probably Ramanujan's, see Section 3 of this nice overview of Ramanujan's notebooks by Bruce C. Berndt: the approach is given a thorough workout by Marc Chamberland and Patrick Lopatto in "Formulas for Odd Zeta Values and Powers of \(\pi\)", Journal of Integer Sequences, Vol. 14 (2011), Article 11.2.5, online here. The bestknown formula is a special case of Ramanujan's first discovered by Mathias Lerch in 1901: if \(n\) is odd then $$\zeta(2n+1)=\frac12\tau^{2n+1}\sum_{k=0}^{n+1}(1)^{k+1}\frac{B_{2k}}{(2k)!}\frac{B_{2n+22k}}{(2n+22k)!}2\sum_{k=1}^{\infty}\frac{k^{2n1}}{e^{k\tau}1},$$ whereby \(\zeta(2n+1)\), for large, odd \(n\), is very close to a rational multiple of \(\tau^{2n+1}\).
Theorem no. 239: Kuratowski's 14Set Theorem

 This theorem was apparently made famous when it featured as an exercise in John L. Kelley's General Topology (first published Van Nostrand, 1955). It again appears as an exercise in James Munkres' Topology (chapter 2, section 17, exercise 21) where it is also required to find a set which generates the maximum of 14. The answers to Munkres' exercises are helpfully provided by Jesper Michael Møller here.
 The particular 14set chosen for my illustration was generated with the help of Mark Bowron's fun interactive diagram.
 There is actually no need to state this as a theorem about topological spaces. P. C. Hammer, "Kuratowski’s closure theorem", Nieuw Archief voor Wiskunde, 7 (1960),
74–80, has shown that Kuratowski's theorem remains true for a more abstract closure operator defined settheoretically. There is a nice discussion by Jeffrey Shallit, Ross Willard here.
 For French speakers, this blog account of the theorem by Blogdemaths is very fine.
Theorem no. 240: The Jones Knot Polynomial Theorem

 Strictly speaking, our presentation of this theorem uses the 'normalised bracket polynomial'. The substitution \(x=q^{1/4}\) is used in the Jones polynomial proper (as recorded in the Knot Atlas, for example). I asked Louis Kauffman about this and he explained it as " a historical accident having to do with the fact that Jones defined the invariant in a different way (via a representation of the braid group to a Temperley—Lieb algebra) than I did by using the bracket state sum. The state sum is close to physics via ideas in statistical mechanics. The Temperley—Lieb algebra is close to physics also."
 The definitive published resource for relationships between knot theory and physics must be Louis Kauffman's Knots and Physics, World Scientific, 4th revised edition, 2013. Kauffman's webpage is also an essential visit, with such gems as his "New Invariants in the Theory of Knots" (an Amer. Math. Monthly writeup of his 1987 breakthrough).
 There is apparently no convention for orienting links when calculating the writhe of a multicomponent link. However, if a link has more than one component then the orientations of individual components only changes the value of the Jones polynomial by a power of its variable. See, for example, Sandy Ganzell, "Jones polynomials of unoriented links", online here.
 One of the biggest questions in knot theory is whether the Jones polynomial distinguishes the unknot, that is, can any knot \(K\) other than the unknot have \(J(K)=1\)? In the case of links with more than one component the Jones polynomial cannot distinguish the unlink, as shown for example by Shalom Eliahoua, Louis H. Kauffman and Morwen B. Thistlethwaite in "Infinite families of links with trivial Jones polynomial", Topology, vol. 42, no. 1, 2003, 155–169, online via Elsevier Open access. It is known, Haken's Unknot Theorem, that distinguishing the unknot is decidable, but by an algorithm that is vastly more complex than evaluating the Jones polynomial.
Theorem no. 241: The Large Prime Gaps Theorem

 This is a 'theorem under construction': I hope to chart exciting developments here towards an eventual final version, which may or may not mean something approaching Cramér's 1936 conjecture.
 The composite sequence in our example is \(m+k\) where \(m=293357\) and \(k=1,\ldots,25\). The fact that \(Y(17)=25\) does not mean that larger \(k\) values will give primes: in fact \(293357+k\) is composite until \(k=42\). By the way, online Chinese Remainder Theorem solvers generally don't appear to accept congruences of the form \(m=a_p\!\!\mod p\) (this by MathCelebrity.com is an exception) but solving with positive \(a_p\) and then negating the answer is fine, as can be seen immediately at (Theorem 5, notes(1)).
 Work on long prime gaps has historically used the Jacobsthal function: \(j(n)\), for positive integer \(n\), is the smallest positive integer \(m\), such that every sequence of \(m\) consecutive integers contains an integer coprime to \(n\) (alternatively, \(j(n)\) is the maximal gap between integers coprime to \(n\)).The first thirty values A048669 are \(1, 2, 2, 2, 2, 4, 2, 2, 2, 4, 2, 4, 2, 4, 3, 2, 2, 4, 2, 4, 3, 4, 2, 4, 2, 4, 2, 4, 2, 6,\ldots\). Ford et al's paper observes that \(Y(x)=j(P(x))1\) (with \(P(x)\) the product of primes not exceeding \(x\)).
 An excellent article about Cramer's model of the prime numbers and his conjecture is this by Andrew Granville, hosted at Chance News, who also have a whole series of lectures on probabilistic number theory, with part 2 focussing on Cramer's work.
Theorem no. 242: The Pólya–Redfield Enumeration Theorem

 The description of this theorem makes a simplification by going straight from a set of labels \(L\) to the formal power sums \(\sum x_i^k, x_i\in L,\) substituted into the cycle index. More properly, we should associate a weight with each label and it is power sums of weights which are substituted. Thus, for example, our redblue edge colourings of the tetrahedron each 'choose' a subset of the edges (say, the blue edges). We can think of this as having a label 'absent' and a label 'present' with weights 1 and \(x\), respectively. And we can enumerate, say, the number of different 3sets up to tetrahedral symmetry by substituting into the cycle index the polynomials \(1+x^k, k=1,\ldots,3\). More formally still, we can define a 'figurecounting series' \(A(t)=\sum a_it^i\) in which \(a_i\) is the number of 'figures' (labels) having weight \(i\). Then what is substituted into the cycle index are the polynomials \(A(t^k)\). This allows \(A(t)\) to be an infinite sum (with constant coefficients!). In Peter Cameron, Permutation Groups, Cambridge University Press, 1999, section 5.13, this approach gives the enumeration of \(n\)sets up to symmetry via the figure counting series \(A(t)=t^0+t^1=1+t,\) the weights being 0 and 1.
 There is a lovely body of theory called 'orbital combinatorics' which combines enumeration up to both symmetry and structure. It originates in Peter J. Cameron, Bill Jackson and Jason D. Rudd, "Orbitcounting polynomials for graphs and codes", Discrete Mathematics, Vol. 308, Issues 5–6, 2008, 920–930, online here. There is a more uptodate overview on Cameron's blog and see also this presentation.
Theorem no. 243: A Theorem of Anderson, Cameron and Preece on Groups of Units

 A good presentation of this work in context by Peter Cameron, as well as much else of relevance and interest, are linked from his blog entry on the Donald Preece Memorial Day.
 The fact that \(3\) is a quadratic residue mod \(p\) for an odd prime \(p\) if and only if \(p\equiv 1 \pmod 6\) is a textbook exercise, c.f. Problems 9.3, no. 5(a) in David M. Burton, Elementary Number Theory 7th edition, McGrawHill, 2010.
Theorem no. 244: The LYM Inequality

 There is a classic probabilitistic proof of the LYM inequality due to Peter Frankl, "A probabilistic proof for the lyminequality", Discrete Math., Vol. 43, Issues 2–3, 1983, p. 325; online. A slightly more accessible account is here "Notes and Exercises 4".
Theorem no. 245: The Alternating Series Test

 A good source of information on the origins of the concept of series convergence is Giovanni Ferraro, "Convergence and formal manipulation in the theory of series from 1730 to 1815", Historia Mathematica, Vol. 34, Issue 1, 2007, 62–88, online here.
 There are some good discussions about alternating series at mathstackexchange.com: for example this, and this.
 The first proof of divergence of the harmonic series is attributed to Nicolas Oresme in the 14th century. See note(1) to Theorem 197 for a collection of proofs.
 A wellknown occurrence of the harmonic series is in creating a large overhang when stacking overlapping blocks: the stack remains stable when the overlaps are, successively, \(1/2,1/4,1/6,\ldots\), giving a total overhang of \(\frac12 H_n\), \(H_n\) being the \(n\)th partial sum of the harmonic series. Thus an unbounded overhang is possible. But in fact much more can be achieved, as demonstrated by Paterson and Zwick in 2007. See this prepint, for example.
 Mathwithbaddrawings has a lovely entry about how slowly the harmonic series diverges and the curious fact that, omitting terms containing the digit 9 restores convergence (the Kempner series).
Theorem no. 246: Euler's Product Formula for ζ(s)

 The convergence of Euler's formula for real \(s>1\) is sometimes attributed to Kronecker in the 1870s. However, it seems likely that convergence issues would have been resolved before that, even by the time of Riemann's investigations in the 1850s. Some good background is given here.
 Euler's formula is the wellspring from which emerged group representations and harmonic analysis in the masterful account by Anthony Knapp in the April 1996 issue of the AMS Notices.
Theorem no. 247: Euler's Product Formula for Sine

 My sketch derivation of Euler's cotangent series follows Konrad Knopp (transl. Young), Theory And Application Of Infinite Series, Dover edition, 1990. The book is online here via archive.org and the relevant text can be found on page 205ff. I should reiterate that all that appears in my description is referred to by Knopp as an "in general faulty mode of passage to the limit" (his emphasis), which he spends two pages making rigorous (even confined to the reals). The derivation of Euler's cotangent formula from the halfangle formula is attributed by Knopp to Heinrich Schröter, 1868. There is more on its history in Chapter 11 (§ 2) of Reinhold Remmert's, Theory of Complex Functions, Springer, 1991, (transl. Robert Burckel), which is online here via archive.org.
 A nice blog entry about Euler's formula by Jim Belk discuss an alternative rigorous derivation via nonstandard analysis.
 Also by Jim Belk, an explanation of why convergence of infinite products is subsumed within convergence of infinite series.
 Euler's formula's LHS may be replaced with the more elegant (from this website's perspective!) \(\sin\tau x\) but at the expense of elegance in the RHS, where the factors in the product become \(14x^2/k^2.\) The halfangle formula which is our starting point now relates to a function \(f(x)=\cot \tau x\) and is given as \(f(x)=\frac12(f(x/2)+f(x/2\pm 1/4))\).
 The halfangle formula used to derive the cotangent expansion, \(f(x)=\frac12(f(x/2)+f((x\pm 1)/2))\), suggests the question what does the difference of the two terms yield. In fact the answer is the cosecant: \(\frac12(f(x/2)f((x\pm 1)/2))=\mbox{cosec}(\tau x/2)\).
Theorem no. 248: A Theorem about Gaussian Moats

 This is a 'theorem under construction': I hope to chart exciting developments here towards an eventual final version, which may or may not mean a proof that Gaussian moats can have unbounded width.
 The widest Gaussian moats found thus far have width \(D=6\), found in 2004 by Nobuyuki Tsuchimura: "Computational Results for Gaussian Moat Problem", IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Volume E88A Issue 5, May 2005, 1267–1273, online here.
 Ellen Gethner has kindly supplied information on the origins of Gordon's Gaussian Moat problem:

"One of the most difficult aspects of the problem was in finding out who actually posed it; the paper by Jordan and Rabung attributes that to Erdős. I had the opportunity to ask
Erdős about the problem at an analytic number theory conference at the University of Illinois in 1995. I remember going to a conference party at someone’s home, and there was Erdős on a chaise lounge in the middle of an enormous back yard with no other chairs in sight and 100+ mathematicians milling around him. I had a fairly lengthy conversation with him about the Gaussian Moat problem; I learned right away that he hadn’t posed the problem and he wasn’t sure who had. I asked him if he thought the conjecture was true (i.e., that there are indeed arbitrarily wide Gaussian Moats) and his response was a pause followed by “what do YOU think?” I answered that I thought the conjecture was correct, to which he responded “so do I!”
In the meantime, later on at the same conference, I happened to be in a car with several other mathematicians on the way to a session. One of the mathematicians was Basil Gordon; I had heard from one of his PhD students that Gordon might be able to help me find who had posed the problem. During that car ride, I asked my question, and oddly enough, Gordon turned out to be the original poser! I think (I’m a little fuzzy here) that he said that he had posed the problem during a session of one of the ICMs. In any case, all of the encounters were purely serendipitous and in looking back on the whole thing, I’m surprised that I succeeded in solving some of the mysteries.
" 

(the "paper by Jordan and Rabung" is J. H. Jordan and J. R. Rabung, "A conjecture of Paul Erdős concerning Gaussian primes", Math. Comp., vol. 24 (1970) 221–223. They construct a 4 moat, i.e. they show that steps of size at least \(D=4\) are required to reach infinity.)
 I found these notes (pdf) on Gaussian integers by Christian Wuthrich of great value, as also these by Noah Snyder.
Theorem no. 249: Bézout's Identity

 It should be noted that the discrete logarithm problem is solved by quantum computers: Peter W. Shor, "Algorithms for quantum computation: discrete logarithms and factoring", in Proc. 35nd Annual Symposium on Foundations of Computer Science (Shafi Goldwasser, ed.), IEEE Computer Society Press, 1994, 124–134 (last time I looked there was an online copy here, Shor's arxiv version is also available but this is expanded from the FOCS publication and is a bit less accessible, I think.)
Theorem no. 250: the Power of a Point Theorem

 Michael N. Fried, the author of "Mathematics as the science of patterns", the recommended weblink from this theorem, gave me the following insight, which I find charming and valuable:

\(h\), as you defined it, can be thought of in a slightly different way. Consider the function \(f(x,y)=(xa)^2+(yb)^2r^2\). Its zeros are the points on a circle with center \((a,b)\) and radius \(r\); but its value at an arbitrary point \(P(x,y)\) is the power of \(P\) with respect to that circle. Students typically learn to solve equations \(f(x,y)=0\), that is, to find the curve they represent, but then ignore the values of \(f\) at other points. For example, the curve given by \(f(x,y)=ax+by+c=0\) (normalized so that \(a^2+b^2=1\)) is of course a straight line \(L\), while the value of \(f\) at other points is the distance of between those points and \(L\) (including of course those points whose distance from \(L\) is zero!). 

Michael Fried has, by the way, a fascinating presentation comparing the relevant bits of Euclid with Steiner's geometry (it is online here but appears to be audio only).
 CutTheKnot offers a very nice application of the Intersecting Chords Theorem and there is this memorable take on the theorem by Solve My Maths.
Theorem no. 251: The Hanani–Tutte Theorem

 Hanani's original 1934 paper, published under his Polish birth name of Chaim Chojnacki as "Über wesentlich unplättbare Kurven im dreidimensionalen Raume", Fundamenta Mathematicae, Vol. 23, Issue 1, 1934, pages 135–142, can be read online here. Bill Tutte's 1970 paper "Toward a theory of crossing numbers", Journal of Combinatorial Theory, Vol. 8, Issue 1, 1970, pages 45–53, can be read online here.
 The algebraic specification of planarity is independently credited to WenJun Wu, "On the planar imbedding of linear graphs", Journal of Systems Science and Mathematical Sciences, 1985 Issue 4, pages 290–302, with independent work of some others preceding it. More details at the Wikipedia entry for the theorem.
 Although we can solve equations to turn a graph drawing into one which has only evenly crossing independent edge pairs it is not obvious how to turn the result into a drawing with no crossings at all. For this we need a direct algorithmic proof of Hanani–Tutte and this was first provided by Michael J. Pelsmajer, Marcus Schaefer and Daniel Štefankovic in "Removing even crossings", Journal of Combinatorial Theory B, Vol. 97, Issue 4, 2007, pages 489–500, online here.
 For small graphs, testing solvability of the Tutte–Wu equation system allows planarity testing without recourse to graph algorithms or data structures. However, for large graphs this does not compete realistically with known linear algorithms since the worstcase running time is \(O(n^6)\) where \(n\) is the number of vertices of the graph (strictly, the number of edges is involved in the running time but a nonlinear number of edges guarantees nonplanarity by the \(3n6\) bound, see Kuratowski's Theorem).
Theorem no. 252: Bertrand's Ballot Theorem

 As so often, the theorem is not accurately named, since it was apparently first stated by William Allen Whitworth in 1878. The general form I have given is due to neither, having evolved from the original case \(k=1\). Details may be found in Marc Renault, "Four Proofs of the Ballot Theorem", Mathematics Magazine, vol. 80, no. 5, 2007, pp 345–352; online via Renault's Ballot Problem page (which is the recommended weblink from the theorem page).
 Just to fill in the details, there is a claim on the theorem page that removing a sequence of \(k\) \(a\)'s and a \(b\) from a cycle with \(n\) \(b\)'s and \(m=n(k1)+S\) \(a\)'s, "gives a new cycle in which the surplus \(S\) is reduced by exactly 1." Indeed, with \(M=mk\) \(a\)'s and \(N=n1\) \(b\)'s remaining, we have \(M=n(k1)k+S=N(k1)+k1k+S=N(k1)+(S1).\) (The proofs of the Cycle Lemma I have seen invoke the pigeon hole principle to repeatedly remove the \(a\)\(b\) sequences. But I find this obscure — what are the boxes? Saying 'a counting argument shows...' would seem more appropriate. And I have gone so far as to actually spell out the counting argument.)
