Theorem of the Day
is maintained by Robin Whitty. Comments or suggestions are welcomed by me.
"Theorem of the Day" is registered as a UK Trademark, no. 00003123351. All text and images and
associated .pdf files © Robin Whitty, 2005–2021, except where otherwise
acknowledged. See FAQ
for more.


Notes
Supplementary notes for some of the listed theorems are provided below. Any suggestions for additions or corrections can be emailed to me and will be most welcome.
Theorem no. 1: The Four Colour Theorem

 The original announcement (September 1976) by Appel and Haken of their proof is available on free access here courtesy of Project Euclid. The full publication followed a year later: Part I and Part II (there are also microfiche supplements).
 A 1998 update on proving the 4CT (still computer-assisted) is given by Robin Thomas here (direct pdf download, 270K).
 The obvious progression from the sophisticated computer-assisted proofs of 4CT to formalised, computer-generated proofs is discussed here (direct pdf download, 2.6MB).
 Brendan McKay, "A note on the history of the four-colour conjecture", Journal of Graph Theory,
Vol. 72, No. 3, 2013, 361–363, has given the earliest publication date for the Four Colour Conjecture as 1854 (it was discussed in correspondence in 1852). A preprint is here.
 There is a reference by Isabel Maddison to a "slightly different form" of the map-colouring question due to Möbius and his amateur mathematician friend Adolph Weiske, publicised by Möbius in 1840 ("Note on the history of the map-coloring problem", Bull. Amer. Math. Soc., Volume 3, Number 7, 1897, page 257; online here). On p. 146 of Alexander Soifer, The Mathematical Coloring Book: Mathematics of Coloring and the Colorful Life of its Creators, Springer, 2009, you can find the details: a country is to be divided into 5 regions each bordering every other. Essentially, prove that \(K_5\) is nonplanar (cannot be the graph of a map drawn in the plane), so perhaps more a precursor to the deeper Hadwiger's Conjecture than the four-colour conjecture.
 In 1976, concurrently with the proof of 4CT, Richard Steinberg, while a PhD student of Bill Tutte at Waterloo, conjectured that any graph having no cycles of length 4 or 5 should be 3-colourable. A counterexample was found in 2016.
 Chris Budd tests the hypotheses of the 4CT (in its original map-colouring formulation).
Theorem no. 2: The Fundamental Theorem of the Calculus

 Making the accumulation function \(\int_0^x f(t)dt\) of Part II of the theorem the starting point for explaining the whole Fundamental Theorem makes good pedagogical and aesthetic sense, as argued by McQuillan, D. and Olsen, D. M., "A Truly Beautiful Theorem: Demonstrating the Magnificence of the Fundamental Theorem of Calculus," Journal of Humanistic Mathematics, Volume 6 Issue 2, pages 148–160. Online here.
 Parts 1 and 2 of this theorem are not converses. Indeed, a counterexample disproving the converse of Part 2 is provided by the always-insightful mathcounterexamples.net.
 This theorem is the choice of Amie Wilkinson in Episode 1 and Aris Winger in Episode 64 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
Theorem no. 3: The Bruck–Ryser–Chowla Theorem on Finite Projective Planes

 Original sources for this theorem:
 Bruck, R. H. and Ryser, H. J., "The nonexistence of certain finite projective planes", Canadian Journal of Mathematics, Vol. 1, Issue 1, 1949, pp. 88–92; online.
 Chowla, S. and Ryser, H.J., "Combinatorial problems", Canadian Journal of Mathematics, 2: 1950, pp.93–99; online.
 Lam, C.W.H., Thiel, L. and Swiercz, S., "The nonexistence of finite projective planes of order 10", Canadian Journal of Mathematics, Vol. 41, Issue 6, 1989, pp. 1117–1123; online.
 How to draw finite projective planes in the Euclidean plane is somewhat a matter of taste or convenience. I have chosen in the theorem description to illustrate the order 2 plane with a 'broken' middle circle; it is often drawn with this circle completed. Neither is correct if you require that intersections of lines of the projective plane correspond to intersections of lines drawn in the Euclidean plane. The issue (thanks to Dr. Pravas K for drawing my attention to it) is discussed here.
Theorem no. 4: Euclid's Infinity of Primes

 Our description of Euclid's theorem follows conventional practice in casting the proof as 'by contradiction'. One may take issue with this: see Michael Hardy and Catherine Woodgold, "Prime Simplicity",
The Mathematical Intelligencer,
December 2009, Volume 31, Issue 4, pp 44–52; online (paywalled). A good overview of the issue is given here.
 A magisterial survey of proofs of this theorem, with extensive cross indexing, is given by Romeo Meštrović in "Euclid's theorem on the infinitude of primes: a historical survey of its proofs (300 B.C.–2017) and another new proof"; online.
 Another account of the "infinitude" of proofs of this theorem can be found, encapsulated in an elegant contextual discussion, here.
 Among the proofs to be found via (3), Furstenberg’s 1955 topological proof has been given a more gentle exposition by Tai-Danae Bradley.
 Among the proofs not found via (3) (it would have appeared too late, I think) is this lovely one-line proof by contradiction by Sam Northshield, with products taken over all primes \(p\), supposedly finite in number and with \(P\) denoting the product of all these primes:
$$0<\prod_p \sin(\tau/2p)=\prod_p \sin(\tau/2p+\tau P/p)=0.$$ (As usual, \(\tau\) denotes the circumference of the unit circle.) More details are given by John D Cook here and CutTheKnot here, and by Northshield himself here.
 Another intriguing approach is taken by Christian Elsholtz in "Fermat's Last Theorem Implies Euclid's Infinitude of Primes"; online, in which FLT and many other classic theorems, not all number theoretic, are shown to be false in a world where there are only finitely many primes.
 The arxiv preprint by Chris Caldwell and Yeng Xiong cited on this theorem page was published as "What is the Smallest Prime?" Journal of Integer Sequences, Vol. 15 (2012), Article 12.9.7; online. (I've kept to the arxiv citation on the theorem page because it's a much shorter URL. It also has facsimiles of historical documents which are typeset in the published article: although the resulting pdf is much smaller this seems a pity). Another discussion of the role of 1 as a nonprime is here by Evelyn Lamb.
 This theorem is the choice of Ken Ribet in Episode 22 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
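For readers who want the steps of Northshield's one-line proof spelled out, here is a brief sketch in the notation used above; it is a paraphrase, not a quotation of Northshield's text. Suppose there are only finitely many primes and let \(P\) be their product. Each prime \(p\geq 2\) gives \(0<\tau/2p\leq\tau/4\), so every factor \(\sin(\tau/2p)\) is positive, which is the left-hand inequality. The middle equality holds because \(P/p\) is an integer and sine has period \(\tau\). For the final equality, write
$$\frac{\tau}{2p}+\frac{\tau P}{p}=\frac{\tau}{2}\cdot\frac{1+2P}{p}.$$
The integer \(1+2P>1\) has some prime divisor \(q\), which by assumption is one of the primes in the product. For \(p=q\) the argument is an integer multiple of \(\tau/2\), so that factor is \(0\) and the whole product vanishes, giving the contradiction \(0<0\).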
Theorem no. 5: The Chinese Remainder Theorem

 John D. Cook gives a neat description of how our congruence system \(x\equiv y_i\pmod{n_i}\) may be solved as
$$y_1(N/n_1)^{\varphi(n_1)}+y_2(N/n_2)^{\varphi(n_2)}+\ldots + y_r(N/n_r)^{\varphi(n_r)},$$
where \(N=n_1n_2\cdots n_r\) and \(\varphi\) is Euler's totient function. Thus our system with \(y_1=3, y_2=4,n_1=4, n_2=5\) is solved by \(3\times 5^2+4\times 4^4=1099\equiv 19\pmod{20}.\)
 A good online CRT solver is this by MathCelebrity.com which gives all the working and accepts negative number inputs.
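The closed-form solution quoted above can be checked numerically. What follows is a minimal Python sketch, not code from John D. Cook's post: the function names `totient` and `crt_euler` are illustrative, the totient is computed by naive counting (adequate only for small moduli), and the moduli are assumed to be pairwise coprime.

```python
from math import gcd, prod

def totient(n):
    """Euler's totient by direct counting (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def crt_euler(residues, moduli):
    """Solve x = y_i (mod n_i) as the sum of y_i * (N/n_i)^phi(n_i), reduced mod N.

    Assumes the moduli are pairwise coprime.
    """
    N = prod(moduli)
    total = sum(y * (N // n) ** totient(n) for y, n in zip(residues, moduli))
    return total % N

# The system from the note: x = 3 (mod 4), x = 4 (mod 5)
print(crt_euler([3, 4], [4, 5]))  # 19
```

The formula works because \((N/n_i)^{\varphi(n_i)}\) is congruent to 1 modulo \(n_i\) (by Euler's theorem) and to 0 modulo every other \(n_j\).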
Theorem no. 6: The Fundamental Theorem of Algebra

 The current page describing this theorem replaces an older version using a quadratic polynomial, easier to assimilate perhaps but it seemed more revealing to have a cubic with both real and imaginary roots. The old version is archived here.
 Daniel Litt gives a nice 'minimal' proof of the theorem.
 Daniel J. Velleman offers "The Fundamental Theorem of Algebra: A Visual Approach", The Mathematical Intelligencer, December 2015, Volume 37, Issue 4, pp 12–21, preprint found online here (March, 2021).
 Paul Taylor has posted this English translation of "Gauss's second proof of the fundamental theorem of algebra" (thanks to John D. Cook for drawing my attention to this).
 Featured in Math Scholar's thread Simple proofs of great theorems.
Theorem no. 7: The Fundamental Theorem of Arithmetic

 This theorem description replaces an older version which had a more explicit illustration of walks defined by prime factorisations. The new version has a more sophisticated plot and the accompanying text sketches the proof of the theorem and has more on Goldbach. For those who prefer something more simple-minded (less crowded!) I have left the old version here.
 Symbolically, this theorem asserts a unique (up to order) representation of a positive integer \(n\) as a product of powers of primes:
$$n=p_1^{a_1}p_2^{a_2}\cdots p_r^{a_r},$$
(with 1 being by convention the value of the empty product). This representation is implicit or explicit in many proofs in elementary number theory. It is also explicit in various calculations, e.g. in counting certain magic squares (Theorem 129) or in calculating periods of modular Fibonacci sequences (Theorem 235, notes(4)).
 Although Goldbach does not imply that every point \((2k,2)\) will eventually appear on the walks plot illustrating this theorem, Andrzej Schinzel has shown ("Sur une conséquence de l'hypothèse de Goldbach", Bulgar. Akad. Nauk. Izv. Mat. Inst., 4 (1959) 35–38) that Goldbach does imply that every odd integer greater than 17 is a sum of three different primes, which would mean every point \((2k+1,3),\,k>8\), is plotted. Sierpinski on p. 124 of Elementary Theory of Numbers, Elsevier, 1988 adds "It follows from the results of Vinogradov that each sufficiently large odd number is such a sum". Of course H.A. Helfgott's proof of the Ternary Goldbach Conjecture confirms that every odd integer greater than 5 is a sum of three primes. It is not mentioned explicitly in Helfgott's preprint but I asked him and a result for three distinct primes is indeed implied.
 A lovely implementation of FTA for positive integers up to 99 has been knitted by Sondra Eklund!
 Evelyn Lamb has created a math-poetry project called Prime Factorization as Verse which is ongoing on her Twitter account.
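The representation \(n=p_1^{a_1}p_2^{a_2}\cdots p_r^{a_r}\) discussed in the notes above can be computed for small \(n\) by trial division. The following Python sketch is purely illustrative (the name `prime_factorisation` is an assumption of this example, and trial division is far too slow for large integers); it returns the exponents \(a_i\) keyed by the primes \(p_i\), with the empty dictionary standing for the empty product 1.

```python
def prime_factorisation(n):
    """Return {prime: exponent} for a positive integer n, by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = {}
    p = 2
    while p * p <= n:
        # strip out every factor of p before moving on
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # whatever remains is a single prime larger than the square root
        factors[n] = factors.get(n, 0) + 1
    return factors

print(prime_factorisation(360))  # {2: 3, 3: 2, 5: 1}
print(prime_factorisation(1))    # {}
```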
Theorem no. 8: The Central Limit Theorem
 Original source for the Turing CLT story is S. L. Zabell, "Alan Turing and the Central Limit Theorem", The American Mathematical Monthly, Vol. 102, No. 6, 1995, pp. 483–494; online (paywall), which is authoritative on the theorem's origins. The source for Lindeberg's CLT is Lindeberg, J. W., "Eine neue Herleitung des Exponentialgesetzes in der Wahrscheinlichkeitsrechnung", Mathematische Zeitschrift, 15, 1922, pp. 211–225; online (paywall, euDML).
 The recommended weblink from this theorem was previously this by Thayer Watkins, which is still good, but the applets are a bit problematical now.
 Good modern animations of the CLT can be found, however: for example, this by Michael Freeman.
 Another version of CLT is explained very clearly at the Math Citadel and its hypotheses justified with a convincing counterexample.
 John D. Cook gives an elegant summary of the historical origins of the normal distribution here.
 There is a formal proof of the theorem, announced here by Jeremy Avigad, Johannes Hölzl, Luke Serafin.
Theorem no. 9: Fermat's Last Theorem

 Wiles' proof of FLT occupied a complete issue of Annals of Mathematics: Wiles' "Modular elliptic curves and Fermat's Last Theorem", vol. 141, no. 3 (1995), pp. 443–551, accompanied by Richard Taylor and Andrew Wiles, "Ring-theoretic properties of certain Hecke algebras", pp. 553–572. The papers don't seem to be readily available online, unfortunately.
 The current page on FLT replaces an earlier one which had a more primitive graphic (without the benefit of Benoît Leturcq's fun Fermat sketch) and omitted the Fermat near-miss example. I have kept the old version here in case anyone prefers a less 'busy' page.
 Fermat near-misses are based on lots of clever algebraic number theory: see this by Noam Elkies.
 The reference in the illustration to probability theory is a nod to the fact that Fermat co-invented it. See this, for example, by Peter Lee.
 A reference to "Molina's Urns" can be found, for example, in Frederick Mosteller, Fifty Challenging Problems in Probability with Solutions, Dover reprint, 2000 (problem 56 on p. 88).
 Another elegant account (7.1MB pdf) of the resolution of FLT, which gives a little more on subsequent developments, can be found in this list of lectures by Karl Rubin.
 The role of modular forms in the proof of FLT is made explicit in this presentation (7MB pdf) by Ken Ribet.
Theorem no. 10: Bayes' Theorem

 You can view a Bayes' theorem prior as allowing the inclusion of numerical odds for subjective assumptions. I think a Bayesian would argue that not including these odds is to make an equally subjective assumption that prior knowledge is irrelevant. This is very well argued by Mike Lee and Benedict King in this Conversation article.
 A nice retrospective on the history of use and abuse of Bayes' Theorem is provided by Bradley Efron in "Bayes’ Theorem in the Twenty-First Century", Science 340, June 7, 2013. A preprint is available here (81KB pdf, April 2021). I also liked (before it became paywalled) this Scientific American blog entry by John Horgan.
 An investigation by Stephen M. Stigler, "Who Discovered Bayes's Theorem?", The American Statistician, 37 (4), 1983, 290–296 finds credible evidence that Bayes' Theorem was first discovered by Nicholas Saunderson. An online version is here (April 2021) and the article is reproduced in Stigler's book Statistics on the Table: The History of Statistical Concepts and Methods, Harvard University Press, paperback edition 2002.
 A beautiful animated illustration of conditional probability by Victor Powell is here, while Will Kurt's Bayesian blog Count Bayesie has a nice alternative to our cow counting illustration here.
 Bayes applied by Chris Budd to the Monty Hall problem.
 Robin Evans contributes this entry (with a fine A3 poster version) on Bayes to the Oxford Mathematics Alphabet.
Theorem no. 11: Lagrange's Four-Squares Theorem

 Original source for this theorem: Lagrange, J.L., "Démonstration d’un théorème d’arithmétique", Nouveaux mémoires de l’Académie Royale des Sciences et Belles-lettres de Berlin, Année 1770, 1772, pp. 123–133; online (as reproduced in Lagrange's collected works). Definitive on the history of Lagrange's contributions is Jenny Boucard, "Lagrange and the four-square theorem", Lettera Matematica, Vol. 2, 2014, pp. 59–66; online.
 Very good on the evolution of the four-squares problem is Mark B. Beintema and Azar N. Khosravani, "Universal forms: the four-square theorem and its generalizations", Missouri J. Math. Sci., 15(3), 2003, pp. 153–161; online. There is an attractive popular article on the history of the theorem by Anuradha S. Garge, "Lagrange's Four Squares theorem: from conjecture to proof", At Right Angles, Vol. 1, No. 2, 2012, pp. 5–9; online (complete issue, 13MB pdf; the article is extracted here, 160KB pdf, April 2021).
 The problem of finding a four-squares representation of a given integer is discussed in Paul Pollack and Enrique Treviño, "Finding the four squares in Lagrange's theorem", Integers, Vol. 18A (2018), paper A15; online. There is an online app by Dario Alpern here.
Theorem no. 12: The Matrix Tree Theorem

 Original source for this theorem: Kirchhoff, G., "Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Verteilung galvanischer Ströme geführt wird", Ann. Phys. Chem., 72, 1847, pp. 497–508; online (paywall, a pdf download is available via semanticscholar).
 The reliability calculation here, in general, asks for the probability that deleting \(e-n+1\) edges uniformly at random will result in a spanning tree. For a plane graph with \(e\) edges, \(n\) vertices and \(f\) faces, and having \(t\) spanning trees, the calculation becomes \(t(f-1)!(n-1)!/e!\) (using \(f-1=e-n+1\), from Euler's formula), which neatly shows that the probability is identical for the dual graph.
 The weblink for this page proves MTT using the Binet–Cauchy theorem from matrix theory. A standard combinatorial approach uses induction based on deletion-contraction as in these notes by David P. Williamson. A direct combinatorial proof by Doron Zeilberger is given in section 4 of "A combinatorial approach to matrix algebra", Discrete Mathematics, Vol. 56, Issue 1, 1985, pages 61–72; online.
The proof of Seth Chaiken and Daniel J. Kleitman given in "Matrix Tree Theorems", Journal of Combinatorial Theory, Series A,
Vol. 24, Issue 3, May 1978, pp 377–381 is also of interest; online.
A nice random walk proof is given by Michael J. Kozdron here, invoking the algorithm of David Wilson for selecting a spanning tree uniformly at random. Gil Kalai has a nice overview here.
 See Garry Tee's article in vol. 30 (3.9MB pdf) of Image for more on the history and applications of determinants.
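The reliability calculation in the notes above can be tried out on small graphs. The Python sketch below is illustrative only and makes several assumptions: a graph is given as a list of edges on vertices 0 to n-1 (parallel edges allowed), the spanning tree count \(t\) is obtained as the determinant of the Laplacian with its last row and column removed, computed by exact Gaussian elimination over fractions, and the names `spanning_tree_count` and `tree_probability` are hypothetical. The probability is \(t\) divided by the number of ways to choose the \(n-1\) edges that survive deletion.

```python
from fractions import Fraction
from math import comb

def spanning_tree_count(n, edges):
    """Count spanning trees via the Matrix Tree Theorem.

    Builds the Laplacian of a graph on vertices 0..n-1, deletes the last
    row and column, and returns the determinant of what remains.
    """
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    M = [row[:-1] for row in L[:-1]]  # reduced Laplacian
    size = n - 1
    det = Fraction(1)
    for col in range(size):
        # find a row at or below the diagonal with a non-zero entry
        pivot = next((r for r in range(col, size) if M[r][col] != 0), None)
        if pivot is None:
            return 0  # singular matrix: the graph is disconnected
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= M[col][col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for c in range(col, size):
                M[r][c] -= factor * M[col][c]
    return int(det)

def tree_probability(n, edges):
    """Probability that deleting e-n+1 random edges leaves a spanning tree."""
    e = len(edges)
    return Fraction(spanning_tree_count(n, edges), comb(e, n - 1))

# Complete graph K4 (an example chosen here, not taken from the theorem page)
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(spanning_tree_count(4, k4))  # 16
print(tree_probability(4, k4))     # 4/5
```

For K4, drawn in the plane with \(f=4\) faces, the formula \(t(f-1)!(n-1)!/e!\) gives \(16\cdot 6\cdot 6/720=4/5\), in agreement with the output.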
Theorem no. 13: Fermat's Little Theorem

 Regarding the origins of this theorem an authoritative source is Colin R. Fletcher, "A reconstruction of the Frenicle–Fermat correspondence of 1640", Historia Mathematica,
Vol. 18, Issue 4, 1991, pp. 344–351; online. On the possible origins of the theorem in ancient Chinese mathematics, see this fascinating investigation by Qi Han and Man-Keung Siu, "On the myth of an ancient Chinese theorem about primality", Taiwanese J. Math.,
Vol. 12, Number 4, 2008, pp. 941–949; online.
 A good source on Giuga's conjecture is D. Borwein, J.M. Borwein, P.B. Borwein, R. Girgensohn, "Giuga's conjecture on primality", American Mathematical Monthly, vol. 103, 1996, pp 40–50; online (paywall), preprint. The lower bounds for counterexamples (>4771 prime factors, > 19908 digits) are from this presentation (4.4MB pdf file).
 An alternative strengthening of its hypothesis that makes Fermat's test necessary and sufficient is Lucas's test, see Vaughan Pratt's Theorem.
 The proof of Fermat's Little Theorem given in the description here is due to James Ivory, "Demonstration of a theorem respecting prime numbers", New series of The Mathematical Depository, 1 (II), 1806, pp 6–8. You appear to be able to access all of this volume free online from google books. The proof is given in slightly more detail by cut-the-knot, which is where I took my version from.
 Euler's important generalisation of Fermat's theorem should be recorded here. Euler's totient function \(\phi(n)\), for \(n\) a positive integer, is the number of positive integers less than \(n\) and coprime to \(n\). Now for \(m\) a positive integer and \(a\) any integer coprime to \(m\) we have \(a^{\phi(m)}\equiv 1 \pmod{m}\). Art of Problem Solving give a proof. For example, \(10^{\phi(9)}=10^6\) which of course has remainder \(1\) on division by \(9\). For prime \(p\) we have \(\phi(p)=p-1\) so that Fermat's theorem is an immediate corollary. Euler's first proof of Fermat's theorem was published in 1736. He published several other proofs, culminating in 1763 with this generalisation, published in his
"Theoremata arithmetica nova methodo demonstrata"; online.
 This theorem is the choice of Jordan Ellenberg in Episode 4 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
 Tangential but the cube images on this theorem page come from the webpage of Jessica Fridrich who is the subject of a valuable blog post at Gödel's Lost Letter.
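Euler's generalisation recorded in the notes above is easy to test numerically. The following is a small Python sketch under stated assumptions: the helper names `totient` and `euler_holds` are illustrative, and the totient is found by naive counting, so this is suitable only for small moduli.

```python
from math import gcd

def totient(n):
    """Euler's totient by direct counting (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def euler_holds(a, m):
    """Check that a^phi(m) leaves remainder 1 on division by m, for a coprime to m."""
    if gcd(a, m) != 1:
        raise ValueError("a must be coprime to m")
    # compare with 1 % m so that the trivial modulus m = 1 is handled too
    return pow(a, totient(m), m) == 1 % m

print(totient(9))          # 6
print(euler_holds(10, 9))  # True

# Fermat's theorem as a corollary: phi(7) = 6, and a^6 = 1 (mod 7) for a = 1..6
print(totient(7))                                   # 6
print(all(euler_holds(a, 7) for a in range(1, 7)))  # True
```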
Theorem no. 14: Cook's Theorem

 Original source for this theorem is: Cook, Stephen, "The complexity of theorem proving procedures", Proceedings of the Third Annual ACM Symposium on Theory of Computing, 1971, pp. 151–158; online (paywall, the paper has been TEXed into pdf by Tim Rohlfs, available here.) There is a nice 40th anniversary blog post on Lance Fortnow's blog. Another entry there explains how the name 'NP-complete' was invented in response to a poll run by Knuth in 1973.
 An English translation of Leonid Levin's paper, together with a thorough analysis, may be found here. The background to Levin's work in the USSR is described by B.A. Trakhtenbrot, "A Survey of Russian Approaches to Perebor (Brute-Force Searches) Algorithms", IEEE Annals of the History of Computing, Vol. 6, Issue 4, 1984, pp. 384–400; online (paywall, a facsimile was here, May 2021, 24MB pdf).
 It may be asserted that Kurt Gödel was the first to ask the P vs NP question, in a letter to von Neumann in 1956. See page 250 of John W. Dawson Jr, Logical Dilemmas: The Life and Work of Kurt Gödel, A K Peters, 1997. Ash Jogalekar @curiouswavefn has tweeted a picture of the relevant passage.
 I provide a little more (amateur) analysis of this theorem as an example of a 'simultaneity' in mathematics here.
 Regarding P=NP? Gerhard Woeginger provides a valuable 'clearing house' for proof/disproof attempts.
 Although it is widely believed that P≠NP there are a few prominent sceptics, for example Don Knuth (see Q. 17 here) and R.J. Lipton (see this). Bill Gasarch has conducted three surveys over nearly 20 years regarding people's beliefs about P=NP. The latest (and links to previous) is described here.
Theorem no. 15: The Cauchy–Frobenius Lemma

 The current version of this page replaces an earlier version which concentrated on depicting permutations and the idea of fixing a point. This new version offers a basic illustration of how the lemma applies in counting, an illustration which is continued in the page for the Pólya–Redfield Enumeration Theorem. The old version of this page is retained here.
 Furthermore, I have abandoned the original name "The Orbit Counting Lemma" of this page since the attribution to Cauchy and Frobenius seems appropriate, and because it made for an easier correspondence with the French translation of the page. A probably definitive account of the lemma is given by Peter M. Neumann in "A lemma that is not Burnside's", The Mathematical Scientist, Vol. 4, Issue 2, July 1979, pp. 133–141; online.
 The lemma applies to arbitrary group actions; I preferred to limit myself to permutation groups to avoid having to define what is meant by a group action.
 This theorem is the choice of Mohamed Omar in Episode 10 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
Theorem no. 16: Sperner's Lemma

 Original source for this theorem is: E. Sperner, "Neuer Beweis für die Invarianz der Dimensionszahl und des Gebietes", Abh. Math. Sem. Univ. Hamburg, 6, 1928, pp. 265–272; online. The relationship between Sperner's result and that of Knaster–Kuratowski–Mazurkiewicz, which is more explicitly a lemma for Brouwer's Fixed Point Theorem, is discussed in the Wiki and Springer Encyclopedia entries for the former.
 My original weblink from this theorem page was to the Sperner entry at cut-the-knot. I prefer, because of java applet browser issues, to relocate this link to my notes page. I hope that at some future date I can reinstate it because some clever and altruistic people have honoured Alexander Bogomolny by giving cut-the-knot a new lease of life!
 Sperner's Lemma provides an elementary addition to Maryam Mirzakhani's legacy.
 This theorem is the choice of James Tanton in Episode 27 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
Theorem no. 17: The Well-Ordering Theorem

 The origins of this theorem are well summarised by Michael Hallett in the abstract of an entry in the Springer collection of Zermelo's works. The first official publication was E. Zermelo, "Beweis, daß jede Menge wohlgeordnet werden kann. (Aus einem an Herrn Hilbert gerichteten Briefe)",
Mathematische Annalen, Vol. 59, 1904, pp. 514–516; online (paywall, facsimile).
 As the remark on the Banach–Tarski Paradox suggests, it is infinity, rather than well-ordering or the Axiom of Choice, that can defy intuition. This 'anti-anti-Banach–Tarski' argument is very well made by Asaf Karagila (follow the link back from the wonderful cartoon; I found this originally at Boole's Rings).
Theorem no. 18: Brouwer's Fixed Point Theorem

 Original sources for this theorem:
 Bohl, P., "Über die Bewegung eines mechanischen Systems in der Nähe einer Gleichgewichtslage", J. Reine Angew. Math., 127 (3/4), 1904, pp. 179–276; online (paywall).
 Jacques Hadamard, "Note sur quelques applications de l’indice de Kronecker", in Jules Tannery, Introduction à la théorie des fonctions d’une variable (Volume 2), 2nd edition, A. Hermann & Fils, Paris 1910, pp. 437–477; online (facsimile)
 Brouwer, L. E. J., "Über Abbildung von Mannigfaltigkeiten", Mathematische Annalen, Vol. 71, 1912, pp. 97–115; online (paywall, facsimile).
 There is a comparison of Hadamard's and Brouwer's work on fixed point topology in chapter 13 of Vladimir Maz'ya and Tatyana Shaposhnikova, Jacques Hadamard, A Universal Mathematician, American Mathematical Society, 2000. They quote Donald M. Johnson: "Hadamard's Note is markedly similar to Brouwer's classic paper defining the degree of a mapping ... Yet there is hardly any doubt that Brouwer's is the superior work. Whereas Hadamard's Note stands at the end of a great line of mathematical development, Brouwer's great paper looks forward to new avenues of topological thinking".
 The contention that there is no constructive proof of 'BFPT' goes very deep. See this at mathoverflow, for example.
 A very nice discussion by Phil Wilson of Brouwer and constructivist mathematics is given here by plus magazine.
 The wikipedia page for this theorem has good coverage of the necessity of its hypotheses.
 This theorem is the choice of Francis Su in Episode 20 and of Holly Krieger in Episode 25 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
Theorem no. 19: Dilworth's Theorem

 Original source for this theorem is: Dilworth, Robert P., "A decomposition theorem for partially ordered sets", Annals of Mathematics, 51 (1), 1950, pp. 161–166; online (paywall). Cameron's attribution to Gallai and Milgram also appears on the former's wiki page with the citation P. Erdős: "In memory of Tibor Gallai", Combinatorica, 12 (1992), pp. 373–374; online (paywall). The earlier result of Erdős–Szekeres is given in Erdős, P. and Szekeres, G., "A combinatorial problem in geometry", Compositio Mathematica, Tome 2 (1935), pp. 463–470; online; and boasts its own wikipedia page.
 The current illustration of this theorem replaces one based on snooker balls which was less informative but to which I may as well retain a link.
Theorem no. 20: The Merton College Theorem

 Francesca Lovell-Read offers a good historical account of this theorem.
Theorem no. 21: Brun's Theorem
 The original source of this theorem is Brun, Viggo (1919). "La série 1/5+1/7+1/11+1/13+1/17+1/19+1/29+1/31+1/41+1/43+1/59+1/61+..., où les dénominateurs sont nombres premiers jumeaux est convergente ou finie", Bulletin des Sciences Mathématiques, 43: avril, pp. 100–104, mai, pp. 124–128; online.
 Brun's \(cx/(\ln x)^2\) bound on the twin prime count is very well motivated by James Maynard in his PROMYS Europe 2015 lecture "Patterns in the Primes" which can be found here (last time I checked, Firefox complained about the security of the pdf download, so proceed with caution).
 Brun's sieve received less attention in the early 1900s than it perhaps deserved. In the introduction to George Greaves, Sieves in Number Theory, we read that "The mathematical community did not immediately give Brun's results the recognition they later received. E. Landau left Brun's 1920 paper unread for a decade, apparently because he was not predisposed to believe that elementary methods as used by Brun could penetrate problems such as Goldbach's to the asserted extent". And in the introduction to Heini Halberstam and Hans-Egon Richert's classic, Sieve Methods, we read "its complicated structure and Brun's own early accounts tended to discourage closer study", and that Landau, writing an account of the method, commented "Myself I have never bothered to thoroughly work through Mr. Brun's original work" (google's translation).
 Euler proved the divergence of the prime reciprocals in 1737, the climax of the paper which introduced the product formula for the zeta function (see theorem 246).
 The first 100000 twin primes are listed here.
Theorem no. 22: Cantor's Uncountability Theorem

 There is an interesting discussion about the earliest appearance of the diagonal argument in Cantor's work here at mathoverflow.net.
 This theorem is the choice of Skip Garibaldi in Episode 34 and of Yoon Ha Lee in Episode 61 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
Theorem no. 23: Cantor's Theorem

 The popular story has it that thinking about infinity, and attacks by others (notably Kronecker) on his thinking about infinity, drove Cantor insane. A much less sensational view is given in this blog entry by Richard Zach and this by Thony Christie.
Theorem no. 24: Kuratowski's Theorem

 Original source for this theorem: it was announced in K Kuratowski, "Sur les courbes gauches", Annales Polonici Mathematici, 8, 1929, p. 324. The publication, with proof, followed in "Sur le problème des courbes gauches en topologie", Fundamenta Mathematicae, 15, 1930, pp. 271–283; online. The abstract of Frink and Smith's unpublished proof appears here (Bull. AMS, 36, 3, p. 214) under 'Abstracts of papers', where it is no. 179 (but merely says "One of the results of this paper is a simple necessary and sufficient condition that an arbitrary linear graph be mappable on a plane.")
 The multiple discoveries and proofs of this theorem are wonderfully charted in John W Kennedy, Louis V Quintas, Maciej M Sysło, "The theorem on planar graphs", Historia Mathematica, Vol. 12, No. 4, 1985, 356–368; online.
 Bill Tutte, under the Blanche Descartes pseudonym, wrote a little poem about the nonplanarity of \(K_{3,3}\) which may be read here.
Theorem no. 25: Wagner's Theorem

 Original source for this theorem is Wagner, K., "Über eine Eigenschaft der ebenen Komplexe", Math. Ann., 1937, 114: pp. 570–590; online (paywall), at Göttinger Digitalisierungszentrum.
Theorem no. 26: Euler's Polyhedral Formula

 Plus magazine has this lovely account of Euler's formula by Abigail Kirk. Joe Malkevitch's account for the AMS's Feature Column is also excellent and has extensive references. And he has a Part II which describes many applications and ramifications of the formula.
 It has been argued that Descartes discovered the polyhedral formula before Euler but the balance of opinion seems to be that his version, which does not recognise the significance of the polyhedral 'edge', cannot be considered equivalent. Tony Philips has more.
 A classic of the philosophy of mathematics is derived from Euler's formula: Imre Lakatos, Proofs
and Refutations: The Logic of Mathematical Discovery, edited by Worrall
and Zahar, Cambridge University Press, 1976. A good account may be found here at the Stanford Encyclopedia of Philosophy.
Theorem no. 27: The Pythagorean Theorem
 It would seem that even 'Pythagorean' is a misnomer for this theorem. Piers Bursill-Hall gave me the following valuable pointers:
 A nice summary of the theorem's history is provided by Manjul Bhargava here.
 The theorem is equivalent to Euclid's notorious 5th axiom, the Parallel Postulate, in the sense that each may be derived directly from the other, a fact which seems to date back to Legendre. More details here. It is also equivalent to Heron's formula (see Theorem no. 76) as revealed in a beautiful article by Vaughan Pratt, "Factoring Heron," The College Mathematics Journal, Vol. 40, no. 1, January 2009, pp. 15–16; online.
 The fact that the theorem is Prop. 47 of Book 1 of Euclid makes this diagram by Thomson Nguyen of dependencies in Book 1 of interest!
 There is a beautiful account by Steven Strogatz of a particularly elegant proof of the Pythagorean Theorem attributable to and, Strogatz argues, bearing all the hallmarks of, Albert Einstein.
 Krzysztof Apt's elegant essay on Dijkstra's work "Edsger Dijkstra, the man who carried computer science on his shoulders", Inference, vol. 5, issue 3, 2020; online, gives some background to Dijkstra's discovery of his generalisation of the Pythagorean theorem. Further interesting commentary by Jan Stevens can be found here.
 A wonderful java applet proof of the theorem has been created by Jim Morey. It is online here but nowadays the chances are you will struggle to make your browser run it.
 This theorem is the choice of Henry Fowler in Episode 7 and of Fawn Nguyen in Episode 39 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
Theorem no. 28: Ramsey's Theorem

 Original sources for this theorem:
 Ramsey, F., "On a problem of formal logic", Proc. London Math. Soc., Vol. s2-30, Issue 1, 1930, pp. 264–286; online (paywall,
facsimile 1.4MB pdf, March 2021).
 Paul Erdős and George Szekeres, "A combinatorial problem in geometry", Compositio Mathematica, 2, 1935, pp. 463–470; online.
 Thoralf Skolem also gives an early proof of Ramsey's theorem in "Ein kombinatorischer Satz mit Anwendung auf ein logisches Entscheidungsproblem", Fundamenta Mathematicae, 20, 1933, pp. 254–261; online. This is not an independent discovery though—Skolem opens by citing Ramsey's paper.
 Improvements on the original Erdős–Szekeres upper bound (stated, as in our page, for \(R(s+1,t+1)\), to simplify the binomial coefficient) are reviewed in this latest (May 2020) advance by Ashwin Sah. The latest bounds (March 2017) for \(R(5)\) are due to Vigleik Angeltveit and Brendan D. McKay.
 For lower bounds see Gil Kalai's blog entry on a breakthrough (September 2020) by Asaf Ferber and David Conlon.
 An account of Erdős and Ramsey theory is given by Ronald L. Graham and Joel Spencer in the centennial reflections here (from p. 132).
 A very interesting account of early precursors to Ramsey's theorem is offered by the Computational Complexity blog. This is also the focus of Alexander Soifer, The Mathematical Coloring Book: Mathematics of Coloring and the Colorful Life of its Creators, Springer, 2008.
 More on the famous Erdős quote by Evelyn Lamb.
 This theorem is the choice of Yen Duong in Episode 31 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
Theorem no. 29: Gauss's Law of Quadratic Reciprocity

 Özlem Imamoğlu's review, Bull. Amer. Math. Soc.,
Vol. 44, No. 4, 2007, 647–652, online, of Franz Lemmermeyer's Reciprocity Laws: From Euler to Eisenstein, Springer, 2000, is a superb mini-essay on reciprocity beyond quadratic.
Theorem no. 30: The Law of Large Numbers

 The theorem is described here in elementary terms, as would have been understood by Laplace himself. An excellent modern account in terms of measure theory is given by Terence Tao here.
 I cannot resist linking to "The Law of Small Numbers", Jonathan Kujawa's elegant centenary homage to Richard Guy in 3quarksdaily.
Theorem no. 31: Benford's Law

 More detail on use of Benford by the IRS in the United States is given here. The Wikipedia entry has a good section on forensic aspects of Benford. Another compelling forensic application is Daniel Gamermann and Felipe Leite Antunes, "Evidence of Fraud in Brazil's Electoral Campaigns Via the Benford's Law",
online.
 An interesting occurrence of Benford is in the frequencies of leading digits in base 10 representations of powers, e.g. \(2^k,k=0,1,\ldots\). See Theorem 299 (notes(5)) and also "A simple answer to Gelfand’s Question" by
Jaap Eising, David Radcliffe and Jaap Top, The American Mathematical Monthly, March 2015, 234–245, online here. Tangentially, we can ask whether prime numbers obey Benford.
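 The occurrence of Benford in the leading digits of \(2^k\) can be checked numerically. The following is a minimal sketch, not taken from the cited article: the function names `leading_digit_counts` and `benford` are hypothetical, and the comparison assumes the usual statement of Benford's Law, that digit \(d\) leads with frequency \(\log_{10}(1+1/d)\).

```python
import math
from collections import Counter


def leading_digit_counts(n_terms: int) -> Counter:
    """Count the leading decimal digits of 2^0, 2^1, ..., 2^(n_terms - 1)."""
    counts = Counter()
    power = 1
    for _ in range(n_terms):
        counts[int(str(power)[0])] += 1  # first character of the decimal expansion
        power *= 2
    return counts


def benford(d: int) -> float:
    """Benford frequency for leading digit d (assumed form: log10(1 + 1/d))."""
    return math.log10(1 + 1 / d)


if __name__ == "__main__":
    n = 1000
    counts = leading_digit_counts(n)
    for d in range(1, 10):
        # observed frequency next to the Benford prediction
        print(d, counts[d] / n, round(benford(d), 4))
```

 Exact integer arithmetic is used so that no rounding error can affect the leading digit.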
Theorem no. 32: The Green–Tao Theorem on Primes in Arithmetic Progression

 Original source for this theorem is Green, Ben and Tao, Terence, "The primes contain arbitrarily long arithmetic progressions", Annals of Mathematics, Vol. 167, no. 2, 2008, pp. 481–547; online.
 Green and Tao's achievement is described by Bryna Kra as "an amazing fusion of methods from analytic number theory and ergodic theory" in her technical overview of their proof "The Green-Tao Theorem on arithmetic progressions in the primes: an ergodic point of view", Bull. Amer. Math. Soc. 43 (2006), 3–23; online. There is a nice overview by Ben Green here (p. 10, 11MB pdf file). Tao has collected some survey-type presentations at various levels here.
Theorem no. 33: The Prime Number Theorem

 See also Benjamin Fine and Gerhard Rosenberger, "An Epic Drama: The Development of the Prime Number Theorem", Scientia Series A: Mathematical Sciences, Vol. 20 (2010), 1–26; online.
 A fine general historical account of the prime number theorem by Tom M. Apostol, "A centennial history of the prime number theorem", Engineering and Science, No. 4, 1996, pp. 19–28, is online here (3.4MB pdf). The classic account by Don Zagier of "Newman's short proof of the prime number theorem", American Mathematical Monthly, vol. 104 (1997), pp. 705–708, is online here.
 There is a brief discussion of heuristic explanations for the Prime Number Theorem (notably the one by Greg Martin) here.
 If \((\bar{x},\bar{y})\) is the centre of mass of the arc of \(y=\log(x)\) in the interval \([1,x]\) then \(\frac12\pi(x)\) is asymptotic to \(\bar{x}/\bar{y}\) (see M500 magazine, issue 260, pp. 10–12).
 Regarding the famous 'elementary proof' of PNT see Norman Levinson's "A motivated account of an elementary proof of the prime number theorem", American Mathematical Monthly, vol. 76, 1969, pp. 225–245; online. The unfortunate associated priority dispute is meticulously documented by Dorian Goldfeld in "The elementary proof of the prime number theorem: an historical perspective", in David Chudnovsky, Gregory Chudnovsky and Melvyn Nathanson (eds.), Number Theory: New York Seminar 2003, Springer, 2004; online via Goldfeld's webpage (under preprints). It includes the observation that Tchebychef had given an elementary proof in 1852 that \(x/\log x\) is the correct order of magnitude for \(\pi(x)\). Both the proof and the dispute are given a non-technical overview by Joel Spencer and Ronald Graham, "The Elementary proof of the prime number theorem", Mathematical Intelligencer, vol. 31 (3), June 2009, 18–23.
 A number of other elementary proofs of PNT have been found. The latest (June 2020) is by Florian Richter and his paper gives a short history of PNT and elementary proofs of it.
 There are formal proofs of PNT:

the Erdős–Selberg elementary proof: Jeremy Avigad, Kevin Donnelly, David Gray, and Paul Raff, "A formally verified proof of the prime number theorem", ACM Transactions on Computational Logic, Vol. 9 Issue 1, December 2007; online preprint (and see here for a nice overview presentation by Avigad) and
 the classical complex analysis proof: John Harrison, "Formalizing an analytic proof of the prime number theorem", Journal of Automated Reasoning, vol. 43, pp. 243–261, 2009; online.
 Posted on Twitter by Tamás Görbe, this attractive corollary of the PNT: \(\lim_{n\rightarrow\infty} (p_1\times \ldots \times p_n)^{1/p_n}=e.\)
\begin{align*}\textrm{Exponentiate both sides of: } \frac{1}{p_n}\sum_{k=1}^{n}\log p_k &\sim \frac{1}{n\log n}\sum (\log k+\log\log k) \hspace{.3in}\textrm{ (PNT)}\\
&\sim \frac{1}{n\log n}(n\log n - n) \hspace{.3in}\textrm{ (Stirling)}\\
&\sim 1.
\end{align*}
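 The limit \((p_1\times \ldots \times p_n)^{1/p_n}\rightarrow e\) can be watched converging numerically. This is only a sketch under stated assumptions: `primes_up_to` and `primorial_root` are hypothetical helper names, the primes come from a plain sieve of Eratosthenes, and the product is handled through the sum of logarithms to avoid enormous integers.

```python
import math


def primes_up_to(limit: int) -> list[int]:
    """All primes <= limit, by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # strike out the multiples of i, starting at i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def primorial_root(limit: int) -> float:
    """(p_1 * ... * p_n)^(1/p_n), where p_n is the largest prime <= limit."""
    primes = primes_up_to(limit)
    log_product = sum(math.log(p) for p in primes)  # log of p_1 * ... * p_n
    return math.exp(log_product / primes[-1])


if __name__ == "__main__":
    for limit in (10**3, 10**4, 10**5, 10**6):
        print(limit, primorial_root(limit), math.e)
```

 Convergence is slow, as one would expect from the error terms in the PNT.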
Theorem no. 34: The First Isomorphism Theorem

 The wiki entry for the isomorphism theorems gives as source Emmy Noether's paper "Abstrakter Aufbau der Idealtheorie in algebraischen Zahl- und Funktionenkörpern", Mathematische Annalen, vol. 96 (1927) pp. 26–61; online (paywall); at Göttinger Digitalisierungszentrum.
Theorem no. 35: The Second Isomorphism Theorem

 This page replaces an earlier version combining the 2nd and 3rd isomorphism theorems with an illustration based on their superficial similarity to rules for arithmetic with fractions. I've left a copy here (opens in new tab). The 3rd isomorphism theorem is now presented separately as Theorem no. 253.
 The example given is a special case of the application given by P.M. Cohn in Algebra, Volume 1 (which I assume is carried over to Classic Algebra although I haven't checked): if a subgroup \(H\) of \(\mbox{Sym}_n\) has any odd permutations then its even permutations form a normal subgroup of index 2 in \(H\).
 Another depiction of the Cayley table of Frobenius 20, together with much other valuable information is given here by the beguiling website escarbille.free.fr.
Theorem no. 36: Euler's Identity

 The MacTutor Archive entry for Benjamin Peirce records his charming comment on Euler's identity: "Gentlemen, that is surely true, it is absolutely paradoxical; we cannot understand it, and we don't know what it means. But we have proved it, and therefore we know it is the truth."
 The symbol used in the illustration of this theorem is on loan from Michael Hartl, with thanks.
Theorem no. 37: Girard's Theorem

 Fix the area \(T\) of a spherical triangle and invert Girard's formula to give \(A+B+C=T/r^2+\tau/2\). Now let radius \(r\) tend to infinity: we recover a triangle in the Euclidean plane whose angles sum to \(\tau/2\), as expected.
 Attribution of this theorem to Harriot can be found in chapter 2 of Roger Penrose, The Road to Reality: A Complete Guide to the Laws of the Universe,
Vintage, 2005; and in chapter 10 of David S. Richeson, Euler's Gem: The Polyhedron Formula and the Birth of Topology, Princeton University Press, 2008. I have seen it given to Legendre (e.g.) but Legendre's result, published in 1798, much later than Girard, approximates the difference between angles in a spherical triangle and angles in a plane triangle having the same side lengths. A good account is here. (Legendre did not, in any case, claim the result as his.)
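 The limiting argument in the first note above, with \(A+B+C=T/r^2+\tau/2\) and \(r\rightarrow\infty\), is easy to tabulate. A minimal sketch follows; the function name `angle_sum` is hypothetical and the formula is simply the inverted form of Girard's formula quoted in that note.

```python
import math

TAU = 2 * math.pi


def angle_sum(area: float, radius: float) -> float:
    """Angle sum A + B + C of a spherical triangle of the given area
    on a sphere of the given radius: T / r^2 + tau / 2."""
    return area / radius**2 + TAU / 2


if __name__ == "__main__":
    area = 1.0
    for radius in (1.0, 10.0, 100.0, 1000.0):
        # the excess over tau/2 shrinks like 1 / r^2
        print(radius, angle_sum(area, radius) - TAU / 2)
```

 As a sanity check, one octant of the unit sphere has area \(\tau/4\) and three right angles, and the formula returns \(3\tau/4\) for it.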
Theorem no. 38: Lucas' Theorem

 Romeo Meštrović has compiled a fine survey of applications and extensions of Lucas's theorem.
 A generalisation of Lucas's theorem is given by Andrew Granville in his dynamic e-survey Arithmetic properties of Binomial Coefficients.
Theorem no. 39: Pascal's Rule

 There is one version of Pascal's triangle which is indeed triangular: where the entries are reduced modulo 2. In this case the pattern which emerges is a version of Sierpinski's gasket, see Wolfram, S., "Geometry of binomial coefficients", American Mathematical Monthly, vol. 91, no. 9, 1984, pp. 566–571; online.
 Pascal's triangle as it is usually displayed has sides which are parabolic, that is, quadratic in \(n\). The easiest way to confirm this is perhaps to estimate the total number of digits in the \(n\)th row using the normal curve approximation. This gives \({n\choose k}\approx\dfrac{2^{n+1}}{\sqrt{\tau n}}e^{-(2k-n)^2/2n}\). Taking logs and summing over \(k\) gives highest terms of order \(n^2\).
 Amazingly a simple relationship between Pascal's triangle and \(e=2.71828\ldots\) appears to have been noticed for the first time, by Harlan J. Brothers, only in the twenty-first century.
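 The Sierpinski pattern of Pascal's triangle modulo 2, mentioned in the first note above, can be printed in a few lines. This is a sketch only; `pascal_mod2_rows` is a hypothetical name, and odd entries are drawn as `#` and even entries as `.` purely as a display choice.

```python
from math import comb


def pascal_mod2_rows(n_rows: int) -> list[str]:
    """Rows 0 .. n_rows-1 of Pascal's triangle reduced modulo 2,
    with '#' for odd entries and '.' for even entries."""
    return [
        "".join("#" if comb(n, k) % 2 else "." for k in range(n + 1))
        for n in range(n_rows)
    ]


if __name__ == "__main__":
    n_rows = 16
    for row in pascal_mod2_rows(n_rows):
        # spacing and centring give the familiar triangular layout
        print(" ".join(row).center(2 * n_rows))
```

 Choosing the number of rows to be a power of 2 gives a complete gasket.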
Theorem no. 40: Stirling's Approximation

 A very nice introduction to Stirling's approximation is Finbarr Holland, "A leisurely elementary treatment of Stirling’s formula", Irish Mathematical Society Bulletin, 77, Summer 2016, pp. 35–44; online.
 A good source of information on the central binomial coefficients is the corresponding entry at oeis.org, where the sequence is no. 984.
 A nice application of Stirling in number theory may be found at (Theorem 33, notes(8))
Theorem no. 41: Lagrange's Theorem

 There is a very fine presentation "Some prehistory of Lagrange’s Theorem in group theory: 'The number of values of a function'" by Peter M. Neumann for the Mathematical Association (whose sometime president he was). I didn't find a link on the MA's website but a pdf download is here (.8MB)
Theorem no. 42: Zeckendorf's Theorem

 It would appear that Zeckendorf's theorem was first published in C. G. Lekkerkerker, "Voorstelling van natuurlijke getallen door een som van getallen van Fibonacci", Simon Stevin, 29 (1951–1952), 190–195. I have not seen this paper but Daykin in a 1960 paper (see note 3 below) says "Zeckendorf's proof is given by C. G. Lekkerkerka (sic) in [1]," ([1] being the Simon Stevin paper). Zeckendorf himself published his theorem in 1972 in "Representation des nombres naturels par une somme de nombres
de Fibonacci ou de nombres de Lucas", Bull. Soc. Royale Sci. Liege 41 (1972) 179–182. I haven't seen this paper either (the Bulletin de la Société Royale des Sciences de Liège is online but not all issues appear to be digitised).
 A good presentation of Zeckendorf's and Lekkerkerker's theorem is given here by Steven J. Miller. Another good source on Lekkerkerker's theorem is Jukka Pihko, "On Fibonacci and Lucas representations and a theorem of Lekkerkerker", Fibonacci Quarterly, vol. 23, no. 3 (1988), 256–261, online here.
 The exact value of the average number of Zeckendorf summands, over the interval \([F_{n+1},F_{n+2})\), as approached asymptotically by Lekkerkerker's theorem, is \(L_n=1+\varepsilon(n)/F_n\), where \(\varepsilon(n)=\sum_{k=0}^{\lfloor(n-1)/2\rfloor}k{n-1-k\choose k}\) (see Steven J. Miller's presentation). This apparently allows us to write, via the identity \(\varphi^2=1+\varphi\), the error in Lekkerkerker's ratio as the constant \(3/5\), thus: \(\lim_{n\rightarrow\infty}\left(L_n-2n/(5+\sqrt{5}\,)\right)=3/5\). The function \(\varepsilon(n)\) can itself be written entirely in terms of Fibonacci numbers:
$$\varepsilon(n)=\left\{\begin{array}{ccl}
\left(\frac{n}{2}-\frac25\right)F_n-\frac{n}{10}\left(F_{n-1}+F_{n+1}\right) & &n \mbox{ even}\\
-\frac15F_{n-1}+\frac{n-1}{5}(F_{n-2}+F_n)&&n \mbox{ odd},
\end{array}\right.$$ alternatively,
$$\varepsilon(n)=\left\{\begin{array}{ccl}
\left(\frac{n}{2}-\frac25\right)F_n-\frac{n}{10}F_{2n}/F_{n} & &n \mbox{ even}\\
-\frac15F_{n-1}+\frac{n-1}{5}F_{2n-2}/F_{n-1}&&n \mbox{ odd},\end{array}\right.$$
the ratio \(F_{2T}/F_T\) also being the \(T\)th Lucas number (A000032).
 David Daykin's paper is "Representation of Natural Numbers as Sums of Generalised Fibonacci Numbers", J. London Math. Soc., (1960) s1-35 (2): 143–160. The first page is free-access here. A follow-up paper by Daykin appeared in Fibonacci Quarterly in 1969 and can be viewed here. A brief account of Daykin's uniqueness result is given in J. L. Brown, Jr., "Zeckendorf's theorem and some applications", Fibonacci Quarterly, vol. 2, no. 3, 1964, 163–168; online here.
 Garry J. Tee has contributed some interesting remarks on Zeckendorf-based arithmetic, which he investigated in "Russian Peasant Multiplication and Egyptian Division in Zeckendorf Arithmetic", Australian Mathematical Society Gazette, vol. 30, no. 5, 2003, 267–276:

"My algorithms for arithmetic in Zeckendorf arithmetic are much more efficient than any published previously, but they still cost very much more than binary arithmetic. I commented that, if more efficient algorithms for Zeckendorf addition and subtraction could be devised, then they could be used to give much more efficient algorithms for Zeckendorf multiplication and division.
The 2013 paper by Conner Ahlbach, Jeremy Usatine, Christiane Frougny and Nicholas Pippenger on Efficient Algorithms for Zeckendorf Arithmetic, Fibonacci Quarterly, 51(3):249–255 does give much more efficient algorithms for Zeckendorf addition and subtraction - but their main emphasis is on the depth of circuitry required."

The paper can be viewed in pdf form (≈1.6MB) here (with kind permission of the Australian Mathematical Society).
 The universality of the Fibonacci sequence, restricted to just 1,1,2,3,5, is exploited in a clever clock by Philippe Chrétien, described here by Alex Bellos.
 There is a less sophisticated version of Colm Mulcahy's card trick here by Kiran Ananthpur Bacche.
 This theorem is the choice of Pamela Harris in Episode 64 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast. She talks about generalising Zeckendorf, work which can be read about here, for example.
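 The note above on the average number of Zeckendorf summands can be checked by brute force. The sketch below is an illustration under assumptions, not the method of any cited paper: all function names are hypothetical, Fibonacci numbers are indexed with \(F_1=F_2=1\), representations are found greedily, and the closed form tested is \(\varepsilon(n)=\left(\frac{n}{2}-\frac25\right)F_n-\frac{n}{10}(F_{n-1}+F_{n+1})\).

```python
from fractions import Fraction
from math import comb, sqrt


def fib(n: int) -> int:
    """Fibonacci numbers with F_0 = 0 and F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def zeckendorf(m: int) -> list[int]:
    """Greedy Zeckendorf representation of m >= 1, largest summand first."""
    fibs = [1, 2]
    while fibs[-1] <= m:
        fibs.append(fibs[-1] + fibs[-2])
    parts = []
    for f in reversed(fibs):
        if f <= m:
            parts.append(f)
            m -= f
    return parts


def epsilon(n: int) -> int:
    """epsilon(n) as the binomial sum over k = 0 .. floor((n-1)/2)."""
    return sum(k * comb(n - 1 - k, k) for k in range((n - 1) // 2 + 1))


def epsilon_closed(n: int) -> Fraction:
    """Candidate closed form for epsilon(n) in terms of Fibonacci numbers."""
    return (Fraction(n, 2) - Fraction(2, 5)) * fib(n) - Fraction(n, 10) * (
        fib(n - 1) + fib(n + 1)
    )


def average_summands(n: int) -> Fraction:
    """Exact average number of summands over the interval [F_{n+1}, F_{n+2})."""
    lo, hi = fib(n + 1), fib(n + 2)
    return Fraction(sum(len(zeckendorf(m)) for m in range(lo, hi)), hi - lo)


if __name__ == "__main__":
    for n in range(1, 16):
        avg = average_summands(n)
        # the last column should settle near 3/5
        print(n, avg, float(avg) - 2 * n / (5 + sqrt(5)))
```

 Exact rational arithmetic is used so that the identities are compared without rounding.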
Theorem no. 43: The DPRM Theorem
 Scholarpedia's article on the Matiyasevich theorem lists four articles which together comprise the original proof of this theorem, amounting in total to fewer than 40 pages.
 Yuri Matiyasevich, Hilbert's 10th Problem, MIT Press, 1993, gives a complete self-contained exposition of the proof of DPRM.
 Martin D. Davis, "Hilbert's tenth problem is unsolvable", The American Mathematical Monthly, vol. 80, (1973), pp. 233–269; online; is an excellent account of the proof of DPRM.
 A particularly charming and accessible example of a Diophantine set is the set of Fibonacci numbers: James P. Jones, "Diophantine Representation of the Fibonacci Numbers", Fibonacci Quarterly, Feb. 1975, 84–88; online here (3rd from bottom). Jones is also one of the team who produced a particularly compelling prime polynomial, via the fact that the set of primes is Diophantine: Douglas Wiens, James P. Jones, Daihachiro Sato, Hideo Wada, "Diophantine representation of the set of prime numbers", The American Mathematical Monthly, vol. 83, 1976, pp. 449–464; online.
 There is a machine proof of this theorem: Dominique Larchey-Wendling, Yannick Forster, "Hilbert's Tenth Problem in Coq"; arxiv.
 Jonathan Pila contributes this entry (with a fine A3 poster version) on Diophantine Equations to the Oxford Mathematics Alphabet.
Theorem no. 44: Pappus's Theorem
 Same comment as for Theorem 55 regarding Java; the app for this theorem, if you want to try, is here.
 See note (5) for Theorem 215 regarding Pappus's theorem becoming involved in twentieth-century mathematics.
Theorem no. 45: Binet's Formula
 The role of the Fibonacci sequence in the resolution of Hilbert's tenth problem is given a characteristically beguiling treatment by Evelyn Lamb here.
 It may be observed that computing \(F_n\) for negative values of \(n\) continues to 'work', extending the Fibonacci sequence to the left: \(\ldots,5,-3,2,-1,1,0,1,1,2,\ldots\).
 The 'square spiral' illustrating this formula is the basis for a lovely curve construction discovered by Edmund Harriss and described here.
 Audrey G. Bennett traces the Fibonacci sequence and spiral back to African architects and weavers in this Conversation article.
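 The remark above about negative values of \(n\) can be tried out directly. The sketch below assumes the standard form of Binet's formula, \(F_n=(\varphi^n-\psi^n)/\sqrt5\) with \(\psi=1-\varphi\); the function name `binet` is hypothetical, and floating point limits it to moderate \(|n|\).

```python
import math

SQRT5 = math.sqrt(5)
PHI = (1 + SQRT5) / 2  # golden ratio
PSI = (1 - SQRT5) / 2  # its conjugate, equal to 1 - PHI


def binet(n: int) -> int:
    """F_n from Binet's formula, rounded to the nearest integer.
    Works for negative n as well, since both bases are non-zero."""
    return round((PHI**n - PSI**n) / SQRT5)


if __name__ == "__main__":
    print([binet(n) for n in range(-5, 4)])
    # → [5, -3, 2, -1, 1, 0, 1, 1, 2]
```

 The values to the left of zero alternate in sign and still satisfy \(F_{n+1}=F_n+F_{n-1}\).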
Theorem no. 46: Cameron's Theorem on Distance-Transitive Graphs
 Original sources for this theorem:
 N. L. Biggs and D. H. Smith, "On trivalent graphs", Bull. London Math. Soc., Vol. 3, Issue 2, 1971, pp. 155–158; online (paywall).
 P.J. Cameron, "There are only finitely many finite distance-transitive graphs of given valency greater than two", Combinatorica, Vol. 2, No. 1, 1982, pp. 9–13; online (paywall).
 Cameron's proof of this theorem depends on his proof, with Praeger, Saxl and Seitz, of the Sims Conjecture which in turn relies on the Classification of the Finite Simple Groups. A CFSG-free proof was suggested by Cameron and completed in 1985 by Richard Weiss, "On distance-transitive graphs", Bull. London Math. Soc., Vol. 17, Issue 3, 1985, pp. 253–256; online (paywall). The paper proving the Sims Conjecture is cited in (Theorem 65, notes(1)).
 Questions regarding infinite distance-transitive graphs are generally open, except in the locally finite case, which was solved by H. Dugald Macpherson in 1982. This work and progress on the non-locally-finite case, as of 1998, is described by Cameron in "A census of infinite distance-transitive graphs", Discrete Mathematics, Volume 192, Issues 1–3, 1998, pp 11–26; online here.
 Our illustration of this theorem shows eight of the twelve 3-regular distance-transitive graphs. The full list is given in the relevant Wikipedia entry, with pictures of each graph.
Theorem no. 47: The Binomial Theorem

 The definition of binomial coefficients to allow for arbitrary complex powers of the binomial can be generalised still further to allow both parameters to be complex, as explained here by John D. Cook. (This does not impact on the Binomial Theorem whose statement only features the 'top' parameter.)
 The original weblink for this theorem was an absorbing paper by Lawrence Neff Stout (1948–2012): "Aesthetic Analysis of Proofs of the Binomial Theorem", which was slated for, but does not appear in, The Humanistic Mathematics Network Journal. It is available at academia.edu and I have uploaded a temporary copy here for quick reference.
Theorem no. 48: Beineke's Theorem on Line Graphs

 Original source for this theorem: Lowell W. Beineke, "Characterizations of derived graphs", J. Combinatorial Theory,
Vol. 9, Issue 2, 1970, pp. 129–135; online.
Beineke announced the result in 1968, according to this Wiki entry.
 The attribution to N. (presumably Neil) Robertson is found on p. 74 of Frank Harary, Graph
Theory, Westview Press, new edition, 1994 and is confirmed by Hong-Jian Lai and Ľubomír Šoltés in their 2001 paper (see note (3)).
 The number of forbidden subgraphs for characterising line graphs can be lowered under slightly stronger conditions. Thus Ľubomír Šoltés proved in 1994 that, for a graph on at least 9 vertices, forbidden subgraphs V and IX can be ignored, bringing the number down to 7. Then in 2001, with Hong-Jian Lai, he proved that just forbidden subgraphs I, VI and VIII are enough, provided that the graph being tested has minimum degree 7 and is not isomorphic to two complete graphs sharing an edge (e.g. subgraph VII): Hong-Jian Lai and Ľubomír Šoltés, "Line Graphs and Forbidden Induced Subgraphs", Journal of Combinatorial Theory, Series B, Vol. 82, Issue 1, 2001, pp. 38–55; online.
Theorem no. 49: Netto's Conjecture (Dixon's Theorem)

 Original sources for this theorem:
 Eugen Netto's conjecture appears in his 1882 book Substitutionentheorie und ihre Anwendung auf die Algebra, for which there is a page in the MacTutor Archive. More details can be found in section 2.1 of the excellent survey Timothy C. Burness, "Simple groups, generation and probabilistic methods", in C. M. Campbell, C. W. Parker, M. R. Quick, E. F. Robertson and C. M. Roney-Dougal (eds.), Groups St Andrews 2017 in Birmingham, LMS Lecture Notes Series, Volume 455, CUP, 2019.
 John D. Dixon, "The probability of generating the symmetric group", Mathematische Zeitschrift, Vol. 110, 1969, pp. 199–205; online (paywall).
 An earlier version of this page which explained permutation multiplication, rather than depicting the distribution of pairs generating \(S_n\) and \(A_n\), has been preserved here.
 A good overview of probabilistic group theory by John Dixon is given in this 2004 preprint.
 Sharper bounds on the probability of generating \(S_n\) and \(A_n\) are given in Attila Maróti and M. Chiara Tamburini, "Bounds for the probability of generating the symmetric and alternating groups", Archiv der Mathematik, Vol. 96, Issue 2, 2011, pp 115–121; online (paywalled); free-access preprint. These have been improved in Luke Morgan and Colva M. Roney-Dougal, "A note on the probability of generating alternating or symmetric groups", Archiv der Mathematik, Vol. 105, Issue 3, 2015, pp 201–204; online (paywalled); free-access preprint.
 The plots on this page use the elements of \(S_n\) ordered by parity (even permutations before odd) then by number of moved points, then by value of first moved point. For \(S_4\), for example, this gives the listing \((),(1 2 3),(1 2 4),(1 3 2),(1 3 4),(1 4 2),(1 4 3),(2 3 4),(2 4 3),(1 2)(3 4),(1 3)(2 4),(1 4)(2 3),(1 2),(1 3),(1 4),(2 3),(2 4),(3 4),(1 2 3 4),(1 2 4 3),(1 3 4 2),
(1 3 2 4),(1 4 3 2),(1 4 2 3) \).
 The number of permutation pairs generating \(S_n\) is sequence A071605 at oeis.org.
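 For small \(n\) the number of pairs of permutations generating \(S_n\) can be found by exhaustive search. The following is a sketch under assumptions: it counts ordered pairs (whether the OEIS entry counts ordered or unordered pairs is not stated in the note above), permutations are tuples acting on \(0,\ldots,n-1\), and all function names are hypothetical.

```python
from itertools import permutations, product


def compose(p: tuple, q: tuple) -> tuple:
    """The permutation i -> p[q[i]] (apply q first, then p)."""
    return tuple(p[i] for i in q)


def generated_subgroup_size(gens: tuple, n: int) -> int:
    """Order of the subgroup generated by gens. In a finite group,
    closing the identity under left multiplication by the generators
    already yields the whole generated subgroup."""
    identity = tuple(range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return len(seen)


def count_generating_pairs(n: int) -> int:
    """Number of ordered pairs (a, b) of permutations that generate S_n."""
    elements = list(permutations(range(n)))
    order = len(elements)  # n!
    return sum(
        1
        for a, b in product(elements, repeat=2)
        if generated_subgroup_size((a, b), n) == order
    )


if __name__ == "__main__":
    for n in range(1, 5):
        total = len(list(permutations(range(n)))) ** 2
        print(n, count_generating_pairs(n), total)
```

 The search examines all \((n!)^2\) pairs, so it is practical only for very small \(n\).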
Theorem no. 50: The Euler–Hierholzer "Bridges of Königsberg" Theorem

 An account of this theorem is to be found on page 33ff of the 2nd edition of Edouard Lucas's Récréations Mathématiques, vol. 1, published in 1891, the year of Lucas's tragically early death. The sufficiency of all degrees even is dealt with in a note on page 223 and is, according to this Wikipedia entry, essentially Hierholzer's 1873 argument. Lucas lists in his bibliography the article of Fleury giving an alternative method of construction. This article is a response to Lucas's Récréations Mathématiques entry and proposes a first solution to the construction problem in the version of drawing a figure in one continuous line. I have not seen Lucas's 1st edition to see if his entry differs in the light of Fleury's article. You can see Lucas's 2nd edition online here and Fleury's article is online here (p.257ff).
 There is a nice real-life application to snow ploughing described here (which is actually the Chinese Postman problem but Euler tours are the background).
 And the traditional puzzle is solved, with very nice graphics, for New York here.
Theorem no. 51: A Theorem of Melody Chan on Group Actions

 Original source for this theorem (which is the weblink from the theorem page) is Melody Chan, "The maximum distinguishing number of a group", Electronic J. Comb., vol. 13 (2006), paper R70; online. The theorem in question is Theorem 3.1.
 Chan's theorem concerns group actions with distinguishing number 2, a subject taken further in, for example, Marston Conder and Thomas Tucker, "Motion and distinguishing number two", Ars Mathematica Contemporanea, vol. 4, no. 1 (2011), pp. 63–72; online.
Theorem no. 52: The Robertson–Seymour Graph Minors Theorem

 The original sources for this theorem are detailed in its Wiki page. The article directly addressing the theorem presented here is Neil Robertson and P.D. Seymour, "Graph Minors. XX. Wagner's conjecture", J. Comb. Theory, Series B, Vol. 92, Issue 2, 2004, pp. 325–357; online.
 There is a very nice semi-technical overview of 'Robertson–Seymour' theory by Lovász here. Graph minors theory is also very well explained in a series of blog posts by Jim Belk.
Theorem no. 53: Bailey's Theorem on Latin Squares

 The original source for this theorem is R.A. Bailey, "Quasi-Complete Latin Squares: Construction and Randomization", Journal of the Royal Statistical Society. Series B (Methodological) Vol. 46, No. 2 (1984), pp. 323–334; online (paywalled).
 Bailey's conjecture on terraces has been shown by Matt Ollis to hold for groups of all orders up to 511 with the possible exception of 256 and 384. Ollis also has this on the conjecture for abelian groups: "On terraces for abelian groups", Discrete Mathematics, Vol. 305, Issues 1–3, 6 December 2005, pp 250–263; online.
 The original web link for this theorem description was this entry at the Encyclopedia of Design Theory, which remains a valuable resource but is now hosted, perhaps ephemerally, by Leonard Soicher at Queen Mary University of London, the domain designtheory.org having ceased to exist.
Theorem no. 54: The Bose Equivalence Theorem in Design Theory

 The independent publications of this theorem are
 E.H. Moore, Tactical memoranda I–III. Amer. J. Math. vol. 18, no. 3, 1896, pp. 264–303; online;
 Raj Chandra Bose, "On the application of the properties of Galois fields to the problem of construction of hyper-Graeco-Latin squares", Sankhyā, vol. 3, pt. 4, 1938, 323–38; online (paywalled);
 W.L. Stevens, "The completely orthogonalized Latin Square", Annals of Eugenics, vol. 9, issue 1, 1939, pp. 82–93; online.
Theorem no. 55: Miquel's Triangle Theorem

 Original source for this theorem: Miquel, A., "Théorèmes de géométrie", Journal de Mathématiques Pures et Appliquées, Tome 3, 1838, pp. 485–487; online (facsimile by Gallica). The theorem is the 2nd of two corollaries (réciproques) to Theorem 1. The figures appear at the end of the volume, with figure 1 (pertaining to Theorem 1) appearing here.
 I used to link to the geometry app that created this theorem's illustration. But guaranteeing that a Java app will work for everyone's favourite browser has become too painful. And the app wasn't all that exciting anyway, so it is no longer maintained. But you can look at it here — last time I checked it worked in Internet Explorer.
 Some brief biographical details are given by Jean-Louis Ayme here.
Theorem no. 56: Morley's Miracle

 Original source for this theorem is F. Morley, "On the metric geometry of the plane n-line", Trans. Amer. Math. Soc., Vol. 1, No. 2, 1900, pp. 97–115; online. However, there is famously no mention in this paper of equilateral triangles, or triangles of any sort. A fine discussion of how and why, and when, the triangle theorem emerged is given beginning here, at cut-the-knot (the main article is the weblink from the theorem page). Another important source is Cletus O. Oakley and Justine C. Baker, "The Morley trisector theorem", The American Mathematical Monthly, Vol. 85, No. 9, 1978, pp. 737–745; online (paywall). Clark Kimberling has a valuable page on Morley which is also a good source for this theorem.
 This theorem page replaces a previous one which linked to an animation created using David Joyce’s Geometry Applet package. Such (Java) animations have become problematical for many browsers and much better animations are easy to find on the web. The app for this theorem, if you want to try to see if it works, is here, and the original theorem page has been retained here. In passing, the addition of angle bisectors in the new illustration, the depicted triangle having vertices \((4,2), (1,9), (10,1)\), suggests some common bisector-trisector intersections but I suppose these are coincidental. The vertices \((4, 2), (7, 8),(12, 1)\), for example, show no such intersections.
 One of Morley's sons, Frank V. Morley, was also a mathematician but returned to England and became a director, with Geoffrey Faber and T.S. Eliot, of Faber & Gwyer, later Faber & Faber ("Morley, Faber and Eliot would sometimes communicate in exchanges of light verse." John Mullan writing in The Guardian, September 25, 2004, here (paragraph 6)). The link to Morley's mathematics is tenuous but there is more on Morley junior at Faber and Faber here.
Theorem no. 57: The Birkhoff–von Neumann Theorem

 Original sources for this theorem:
 Birkhoff, Garrett, "Tres observaciones sobre el algebra lineal", Univ. Nac. Tucumán. Revista A., 5, 1946, pp. 147–151; not online, I think.
 J. von Neumann, "A certain zero-sum two-person game equivalent to an optimal assignment problem", Ann. Math. Studies, Vol. 28, Contributions to the Theory of Games (AM-28), Volume II, 1953, pp. 5–12; online (paywall).
 The convex polytope exhibited in this theorem is usually referred to as the Birkhoff polytope, despite the fact that it appears to have been known about at least fifty years earlier. It has a wikipedia page where more on the historical background can be found.
Theorem no. 58: Galois' Theorem on Finite Fields

 The recommended weblink for this page was previously some notes by Peter Cameron on finite fields, posted at designtheory.org. This website currently (August 2018) does not exist and its content is being hosted by Leonard Soicher at Queen Mary University of London (which is the link given in the previous sentence). The notes on finite fields are thus still available but I have preferred to use a more permanent link on the theorem page itself.
 It would be ahistorical to say something like "Galois showed constructively that finite fields have prime power order; and Eliakim Moore showed that this construction was the only one possible". Moore, a pioneer of abstract algebra, proved the structure theorem which says "this is what finite fields look like". But this answers a question which would not have occurred to Galois! A detailed study of the work which led up to and beyond Moore is given by Frederic Brechenmacher in "A history of Galois fields", 2012, <hal-00786662>. The definitive analysis of the Galois archive is of course Peter M. Neumann's The Mathematical Writings of Evariste Galois, European Mathematical Society, 2011.
 There is a wonderful knitted GF(16) by the woolythoughts blog.
Theorem no. 59: Germain's Theorem

 A detailed account of Germain's work on Fermat's Last Theorem has been given by Roger Thompson in the October and December 2016 issues of M500 magazine (pdf 600KB downloads).
 Although Germain's Theorem does not play a direct part in the eventual proof of FLT (Theorem 9) it has deep connections with other parts of number theory which are still of interest. For instance, it is connected with period lengths of modular Fibonacci sequences via 'Wall's Question' (see Theorem 235, notes(3)). There is also a connection to cryptography via the idea of 'strong' and 'safe' primes. See this, for example.
 The number and distribution of Germain primes are of interest in their own right, see here, for example.
 There is an interesting side story on the Germain–Gauss correspondence in this prize-winning essay by William C. Waterhouse. More on Gauss and Germain and on her work in number theory can be found in Raymond Flood's Gresham College lecture on the subject. Evelyn J. Lamb has this evocative piece which also has some useful onward links.
 There is a nice characterisation of Germain primes by Paolo Leonetti.
Theorem no. 60: The Strong Perfect Graph Theorem

 The qualifier 'strong' distinguishes this theorem from the Perfect Graph Theorem, also conjectured by Claude Berge and proved by László Lovász in 1972.
 Original source for this theorem: Maria Chudnovsky, Neil Robertson, Paul Seymour, Robin Thomas, "The strong perfect graph theorem", Annals of Mathematics, Vol. 164, Issue 1, 2006, 51–229. Online version.
 The 2001 Strong Perfect Graph Theorem for square-free graphs (no induced 4-cycles) of Michele Conforti, Gérard Cornuéjols and Kristina Vušković is proved in "Square-Free Perfect Graphs", Journal of Combinatorial Theory B, 90 (2004) 257–307; online.
Theorem no. 61: Moufang's Theorem

 Original source for this theorem: Moufang, R., "Zur Struktur von Alternativkörpern", Math. Ann., 110, 1935, pp. 416–430; online (paywall, facsimile).
 For a proof of Moufang's theorem see Aleš Drápal, "A simplified proof of Moufang's theorem", Proc. AMS, Vol. 139, No. 1, 2011, 93–98; online.
 The smallest nonassociative Moufang loop has order 12 (see Theorem no. 114); the octonions form a loop of order 16 (the negated elements are omitted from our multiplication table for conciseness) and there are four other nonassociative Moufang loops of this order, see the paper by Orin Chein here. The sequence of numbers of nonassociative Moufang loops is A090750.
 The question of whether non-Moufang loops may still obey Moufang's Theorem is addressed here by Izabella Stuhl.
Theorem no. 62: The Marriage Theorem and The Frobenius–Kőnig Theorem

 Original source for Hall's marriage theorem is P. Hall, "On representatives of subsets", J. London Math. Soc., Vol. s1-10, Issue 1, January 1935, pp. 26–30; online (paywall). Hall distinguishes his result from Kőnig's (graph-theoretic) version because he is choosing representatives from sets which are not necessarily of the same size. So our presentation is, superficially, a simplification.
 Original source for Kőnig is D. Kőnig, "Über Graphen und ihre Anwendungen auf Determinantentheorie und Mengenlehre", Math. Annalen, Vol. 77, Issue 4, December 1916, pp. 453–465; online (paywall); at Göttinger Digitalisierungszentrum.
 A valuable source on Kőnig's work in graph theory is the doctoral dissertation (Université Paris-Diderot - Paris VII, 2010) of Mitsuko Wate Mizuno, "The works of KONIG Dénes (1884–1944) in the domain of mathematical recreations and his treatment of recreational problems in his works of graph theory"; online. I read there that Kőnig's paper (note 2) is given an English translation in Norman L. Biggs, E. Keith Lloyd and Robin J. Wilson, Graph Theory, 1736–1936, Clarendon Press, 1986. I have not seen this.
 Peter Cameron explains Philip Hall's interest in this theorem as a group theorist.
Theorem no. 63: Thales' Theorem

 The story of Thales sacrificing an ox is apocryphal. It has also been attached to Pythagoras. See here for more details.
Theorem no. 64: Neumann's Separation Lemma

 Peter Neumann published his lemma in "The structure of finitary permutation groups", Archiv der Mathematik, Vol. 27, Issue 1, 1976, 3–17; online. It appeared in December; the extension to finite permutation groups beat it into print, appearing in October: B.J. Birch, R.G. Burns, Sheila Oates Macdonald and Peter M. Neumann, "On the orbit-sizes of permutation groups containing elements separating finite subsets", Bull. Austral. Math. Soc., Vol. 14, Issue 1, 1976, 7–10; online. I think Peter Neumann must have told me that both results were discovered in 1974.
Theorem no. 65: Sims' Conjecture

 Original source for this theorem: P. J. Cameron, C. E. Praeger, J. Saxl and G. M. Seitz, "On the Sims Conjecture and distance transitive graphs", Bull. London Math. Soc., Vol. 15, Issue 5, 1983, pp. 499–506; online (paywall, 900KB pdf download, April 2021).
 The theorem is very well placed in context by this short article by Cheryl Praeger in Asia Pacific Mathematics Magazine.
 See notes to Theorem 46 for an important application of Sims' conjecture.
Theorem no. 66: An Erdős–Ko–Rado Theorem on Intersecting Permutations

 Original sources for this theorem:
 Péter Frankl and Mikhail Deza, "On the maximum number of permutations with given maximal or minimal distance", J. Combin. Theory Ser. A, Vol. 22, Issue 3, 1977, pp. 352–360; online.
 P. J. Cameron and C. Y. Ku, "Intersecting families of permutations", European J. Combin., Vol. 24, Issue 7, 2003, pp. 881–890; online.
 B. Larose and C. Malvenuto, "Stable sets of maximal size in Kneser-type graphs", European J. Combin., Vol. 25, Issue 5, 2004, pp. 657–673; online.
 The woolythoughts blog offers an alternative view of \(S_4\). They have also done \(S_5\).
Theorem no. 67: Reidemeister's Theorem

 Original sources for this theorem:
 Reidemeister, Kurt, "Elementare Begründung der Knotentheorie", Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, 5 (1), 1927, pp. 24–32; online (paywall).
 J.W. Alexander and G.B. Briggs, "On types of knotted curves", Annals of Mathematics, Vol. 28, No. 1/4, 1926–1927, pp. 562–586; online (paywall, 1.8MB pdf, March 2021).
 The independent discovery of the Reidemeister moves is analysed in M. Epple, "Knot invariants in Vienna and Princeton during the 1920s: epistemic configurations of mathematical research", Science in Context, Vol. 17, No. 1/2, 2004, pp. 131–164; online (paywall, reprint March 2021).
 Valuable resources for visualising knots and for knot theory generally are to be found at the knotplot website.
Theorem no. 68: The Stable Marriage Theorem

 Original source for this theorem: Gale, D. and Shapley, L. S., "College admissions and the stability of marriage", Rand Report no. P2240, 1961; online. Published in American Mathematical Monthly, 69 (1), 1962, pp. 9–14; online (paywall, but pdf downloads are easy to find by searching for the paper title).
 Closely related is the economist Thomas Schelling's work on segregation and diversity. A compelling illustration of these ideas is provided in animated form by The Parable of the Polygons by Vi Hart and Nicky Case.
 An excellent account of Lloyd Shapley and his work is given here by Joseph Malkevitch.
 An account of a real-life mate-finding algorithm is given here by Plus magazine.
Theorem no. 69: Arrow's Impossibility Theorem

 Original sources for this theorem: Arrow, Kenneth J., "The possibility of a universal social welfare function", RAND Report P41, 1948; online; Arrow, Kenneth J., "A Difficulty in the Concept of Social Welfare", J. Political Economy, Vol. 58, No. 4, 1950, pp. 328–346; online (paywall, but copies are not hard to find online, e.g. on the Wiki page for the theorem, April 2021).
 Arrow's theorem as a result in social science belongs to the general area of social choice theory. From this perspective, a good account is given here (Snapshot no. 9/2015) by Victoria Power (also available in German).
 And something less serious!
 This theorem is the choice of Belin Tsinnajinnie in Episode 56 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
Theorem no. 70: The 1-2-3 Conjecture

 Original source for this theorem: Michał Karoński, Tomasz Łuczak and Andrew Thomason, "Edge weights and vertex colours", J. Combin. Theory Ser. B, 91 (1), 2004, pp. 151–157; online. Subsequent progress up to 2012 is charted in Ben Seamone, "The 1-2-3 Conjecture and related problems: a survey", arxiv (which is the weblink from the theorem page).
 This is a 'theorem under construction': I hope to chart exciting developments here towards an eventual final version, which may or may not be \(K=3\).
 Jakub Przybyło has proved that the conjecture holds for regular graphs of sufficiently high degree and that \(K\leq 4\) for all regular graphs of degree 2 or more: "The 1–2–3 Conjecture almost holds for regular graphs", J. Comb. Theory, Series B, Vol. 147, 2021, pp. 183–200; online (paywall, arxiv).
Theorem no. 71: Sharkovsky's Theorem

 Original source for this theorem: Sharkovsky, A. N., "Coexistence of cycles of a continuous mapping of the line into itself", Ukrain. Mat. Zh., 16 (1), 1964, pp. 61–71. The Russian citation can be found on the Ukrainian Wiki page for the theorem. There is an English translation of the paper: J. Tolosa, International Journal of Bifurcation and Chaos, Vol. 05, No. 05, 1995, pp. 1263–1273; online (paywall). Various online links to the Russian original appear to be broken.
 @jakebarrrett has alerted me to the fact that a number of elementary proofs of this theorem are available. See, e.g., Chapter 1 of Louis Block & William Coppel, Dynamics in One Dimension, Springer, 2009. There are also some notable treatments by Bau-Sen Du which can be accessed online via his arxiv listing.
 The crucial (but weaker) result that period 3 implies periods of all orders was discovered independently by Tien-Yien Li and James A. Yorke, "Period Three Implies Chaos", The American Mathematical Monthly, Vol. 82, No. 10 (Dec., 1975), pp. 985–992; online (paywalled); there is a good description of their theorem by Padraic Bartlett here. Sharkovsky published his result in Russian at the height of the Cold War and it did not receive widespread attention until the 1970s; see his MacTutor entry.
 A number of valuable reprints of classics from the dynamical systems literature are available here, including Robert May's 1976 Nature article dealing with the logistic function.
Theorem no. 72: MacWilliams' Identity

 Original source for this theorem: F.J. MacWilliams, "A theorem on the distribution of weights in a systematic code", Bell System Techn. J., Vol. 42, Issue 1, 1963, pp. 79–94; online (paywall, facsimile).
 There is a one-variable version of this theorem corresponding to the non-homogeneous polynomial obtained by setting \(y=1\) in the definition of \(W_C(x,y)\). The identity becomes
\(W_{C^{\perp}}(x)=|C|^{-1}W_C(1-x,1+(q-1)x)\) which, using the single-argument version of \(W_C\), becomes \(W_{C^{\perp}}(x)=\dfrac{1}{|C|}(1+(q-1)x)^nW_C\left(\dfrac{1-x}{1+(q-1)x}\right)\).
 There is a very nice application of MacWilliams, due to Assmus and Maher, to prove nonexistence of projective planes of order 6 modulo 8. An excellent presentation of this work is given by Matroid Union.
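 The one-variable identity \(W_{C^{\perp}}(x)=\dfrac{1}{|C|}(1+(q-1)x)^nW_C\left(\dfrac{1-x}{1+(q-1)x}\right)\) can be checked by brute force on a small code. The following is only a sketch under stated assumptions: it takes \(W_C(x)=\sum_i A_i x^i\), with \(A_i\) the number of codewords of weight \(i\), and uses the binary [7,4] Hamming code as an example chosen here for illustration (it is not taken from the theorem page). Both sides are polynomials of degree at most \(n\), so agreement at \(n+1\) rational points establishes the identity for this code.

```python
from fractions import Fraction
from itertools import product

q, n = 2, 7
# Generator matrix of a binary [7,4] Hamming code (illustrative choice).
G = [
    (1, 0, 0, 0, 0, 1, 1),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 1, 1, 0),
    (0, 0, 0, 1, 1, 1, 1),
]

def span(rows):
    """All linear combinations of the rows over the field with q elements."""
    words = set()
    for coeffs in product(range(q), repeat=len(rows)):
        words.add(tuple(sum(c * r[i] for c, r in zip(coeffs, rows)) % q
                        for i in range(n)))
    return words

def dual(code):
    """All vectors orthogonal to every codeword (brute force over q^n vectors)."""
    return {v for v in product(range(q), repeat=n)
            if all(sum(a * b for a, b in zip(v, c)) % q == 0 for c in code)}

def weight_distribution(code):
    """A[i] = number of codewords of Hamming weight i."""
    A = [0] * (n + 1)
    for c in code:
        A[sum(1 for s in c if s != 0)] += 1
    return A

def W(A, x):
    """One-variable weight enumerator sum_i A[i] x^i."""
    return sum(a * x**i for i, a in enumerate(A))

C = span(G)
A = weight_distribution(C)
A_dual = weight_distribution(dual(C))

def rhs(x):
    """Right-hand side of the one-variable MacWilliams identity."""
    denom = 1 + (q - 1) * x
    return Fraction(1, len(C)) * denom**n * W(A, (1 - x) / denom)

# n + 1 distinct rational sample points, evaluated exactly.
points = [Fraction(k, 5) for k in range(n + 1)]
identity_holds = all(W(A_dual, x) == rhs(x) for x in points)
print(A, A_dual, identity_holds)
```

 Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point approximation.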
Theorem no. 73: Goodstein's Theorem

 Original source for this theorem: Goodstein, R., "On the restricted ordinal theorem", Journal of Symbolic Logic, 9 (2), 1944, pp. 33–41; online (paywall, 350KB pdf download March 2021).
 For some remarks on Goodstein's theorem in the context of the search for independence results for Peano arithmetic see Michael Rathjen, "Goodstein Revisited", Annals of Pure and Applied Logic, in press; arxiv.
Theorem no. 74: Gödel's First Incompleteness Theorem

 The theorem can be stated as "a consistent mathematical system contains assertions for which neither the assertion nor its negation can be proved from the axioms of the system". Indeed this is how Gödel himself stated his theorem, avoiding reference to mathematical truth. The version which says "contains truths which cannot be proved" is equivalent by virtue of the fact that, given consistency, nonprovability of a negated assertion is synonymous with truth of the assertion. See Peter Cameron's article in Gowers et al (eds.), The Princeton Companion to Mathematics, Princeton University Press, 2008, a preprint of which is here.
 Mark Chu-Carroll's blog Good Math, Bad Math has posted a very nice 4-part introduction to Gödel I: the final part, which indexes the other three, is here. A good overview with a philosophical slant is this from The Times Literary Supplement by Juliette Kennedy, which is now behind a paywall but Kennedy's webpage has much else of value. Natalie Wolchover has an excellent Quanta article on the proof of Gödel's theorem.
 A fascinating self-contained exposition of Gödel's two incompleteness theorems is given by Stanisław Świerczkowski here (Dissertationes Math., 422, 2003, click on the 'POBIERZ ZA DARMO' button) in terms of the theory of 'hereditarily finite sets'. These were the proofs which were chosen as suitable for machine checking in 2015: Lawrence C. Paulson (2015). "A mechanised proof of Gödel’s incompleteness theorems using Nominal Isabelle", Journal of Automated Reasoning. 55, no. 1: 1–37.
 A short sketch proof of Gödel's theorem by Raymond Smullyan (c.f. 5000 B.C. and Other Philosophical Fantasies, St. Martin's Press, 1983) is reproduced here.
 Marianne Freiberger has written an excellent account for Plus magazine of Harvey Friedman's work developing non-artificial examples of Gödel's theorem in action.
 The link between Gödel's theorem and proofs of undecidability (e.g. the halting problem) is subtle and is very well explored by Jørgen Veisdal here.
Theorem no. 75: Gödel's Second Incompleteness Theorem

 There is a delightful 1-page description of Gödel 2 in 'words of one syllable' by George Boolos here.
 See also note (3) to Gödel 1.
 The second incompleteness theorem was independently discovered by John von Neumann who, it appears, swiftly saw how it followed as a corollary of the first theorem, see p. 162 of Rebecca Goldstein, Incompleteness: The Proof and Paradox of Kurt Gödel, W. W. Norton & Company, 2005.
 A little footnote regarding SETI (featured somewhat gratuitously in this theorem page's illustration).
Theorem no. 76: Brahmagupta's Formula

 Original source for this theorem, apart from the Brāhmasphuṭasiddhānta (Wiki) and the origins of Heron's formula (Wiki): Coolidge, J. L., "A Historically interesting formula for the area of a quadrilateral", The American Mathematical Monthly, 46 (6), 1939, pp. 345–347; online (paywall).
 Coolidge attributes an equivalent to his quadrilateral area formula to Carl Anton Bretschneider and Friedrich (I think) Strehlke, both 1842. Wikipedia has an entry on Bretschneider's Formula which also credits (also 1842) von Staudt:
$$K=\sqrt{(s-a)(s-b)(s-c)(s-d)-abcd\cos^2\theta},$$ where \(\theta\) is half the sum of either pair of opposite angles
(in a cyclic quadrilateral opposite angles sum to \(180^{\circ}\) with the half angle giving zero cosine).
 Using the notation of the theorem description, Ptolemy's Inequality says \(ad+bc\geq ef\) with equality if and only if \(abcd\) is cyclic. So the subtracted term in Coolidge's square root is always nonnegative (this is immediate from Bretschneider's Formula).
 A clever trapezoid version of Heron's formula due to Miguel Ochoa Sanchez can be found at cut-the-knot.org.
 Also regarding Heron, see note (3) for Pythagoras.
 Fascinating material on Indian mathematicians' investigations of cyclic quadrilaterals may be found in Radha Charan Gupta, "Parameśvara's rule for the circumradius of a cyclic quadrilateral", Historia Mathematica, Vol. 4, Issue 1, February 1977, 67–74, online here. Parameśvara's (also known as L'Huilier's) rule states that the circumradius of a cyclic quadrilateral with sides \(a,b,c,d\) is obtained on dividing \(\sqrt{(ab+cd)(ac+bd)(ad+bc)}\) by \(4\times\mbox{area of quadrilateral}\).
 Another open archive article from Historia Mathematica deals directly with Brahmagupta's formula: Satyanad Kichenassamy, "Brahmagupta’s derivation of the area of a cyclic quadrilateral", Historia Mathematica, Vol. 37, Issue 1, February 2010, pages 28–61.
 As distinct from Brahmagupta's Formula, his Theorem usually refers to another result about a cyclic quadrilateral: if its diagonals are orthogonal and a line through their intersection is perpendicular to a side, then the continuation of this line bisects the side opposite.
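 The area and circumradius formulas quoted in these notes are easy to try out numerically. The sketch below is illustrative only: the function names and the two test shapes (a 3 by 4 rectangle, which is cyclic, and a unit rhombus with a 60° angle, which is not) are choices made here and do not come from the theorem page. Bretschneider's formula is taken in the form \(K=\sqrt{(s-a)(s-b)(s-c)(s-d)-abcd\cos^2\theta}\), and Brahmagupta's formula is the special case in which the cosine term vanishes.

```python
import math

def brahmagupta_area(a, b, c, d):
    """Area of a cyclic quadrilateral with consecutive sides a, b, c, d."""
    s = (a + b + c + d) / 2
    return math.sqrt((s - a) * (s - b) * (s - c) * (s - d))

def bretschneider_area(a, b, c, d, theta):
    """Area of a general quadrilateral; theta is half the sum of two opposite angles."""
    s = (a + b + c + d) / 2
    return math.sqrt((s - a) * (s - b) * (s - c) * (s - d)
                     - a * b * c * d * math.cos(theta) ** 2)

def parameshvara_circumradius(a, b, c, d):
    """Circumradius of a cyclic quadrilateral by the rule quoted in the notes."""
    numerator = math.sqrt((a * b + c * d) * (a * c + b * d) * (a * d + b * c))
    return numerator / (4 * brahmagupta_area(a, b, c, d))

# A 3 x 4 rectangle is cyclic: area 12, circumradius = half the diagonal = 2.5.
rect_area = brahmagupta_area(3, 4, 3, 4)
rect_radius = parameshvara_circumradius(3, 4, 3, 4)

# A unit rhombus with angles 60 and 120 degrees is not cyclic; opposite
# angles are equal, so theta = 60 degrees and the area should be sin(60 deg).
rhombus_area = bretschneider_area(1, 1, 1, 1, math.radians(60))

print(rect_area, rect_radius, rhombus_area)
```

 For the rhombus the cosine term is \(1/4\), so the area drops from the value 1 that Brahmagupta's formula alone would give to \(\sqrt{3}/2\).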
Theorem no. 77: Pick's Theorem

 Original source for this theorem: Pick, Georg, "Geometrisches zur Zahlenlehre", Lotos - Zeitschrift für Naturwissenschaften 47. Neue Folge XIX. Band., 1899, pp. 311–319; online (facsimile).
 Yiwang "Evan" Chen has a good presentation about Pick here (1.1MB pdf).
Theorem no. 78: The Three-Distance Theorem

 The theorem was proved independently and simultaneously by at least three people, as is well-documented in its Wikipedia page. Although sometimes referred to as 'the Steinhaus Conjecture', it doesn't seem to have survived long enough to warrant the title. Stanisław Świerczkowski sent me the following interesting background information on the history and his proof of this theorem:

"The Three Distances Theorem was conjectured by Hugo Steinhaus. When I gave him the proof, he checked it and asked me to write a report for the Academy of Sciences. He presented this report to the Academy (only members could do that) whereupon it was published in the BULLETIN DE L’ACADEMIE POLONAISE DES SCIENCES, Cl. III – Vol. IV, No.9, 1956. The date of his presentation to the Academy is: 29 June 1956. I have here an offprint. If you wish to look at it, and it is not in your library, I could scan it and send you an image by email.
About that time Vera Sós and her husband Prof. Turán visited our University, so they certainly heard about the Three Distances result directly from my colleagues or from me. Of course, Erdős was visiting us too many times. I remember Vera quite well.
In the announcement in the BULLETIN DE L’ACADEMIE POLONAISE mentioned above, there are four related theorems, and I postponed publishing the proofs of these to a later paper. By the time I wrote it, Vera Sós published her proof of the Three Distances Theorem, so I found it simpler to just refer to her paper. My proof (which I do not recall) almost certainly found its way into my Ph.D. thesis on the subject of cyclically ordered groups. This dissertation was submitted to the Polish Academy of Sciences. A recent inquiry disclosed that they cannot find it." 

These recollections can also be found in Świerczkowski's autobiography Looking Astern, which is available via his entry at the MacTutor archive (under Additional Resources). Notwithstanding, there is indeed a published proof by him: Świerczkowski, S., "On successive settings of an arc on the circumference of a circle", Fundamenta Mathematicae, 46 (2), 1958, pp. 187–189; online. In his paper Świerczkowski refers to yet another independent proof, unpublished, by Peter Szüsz, using continued fractions.
 There are generalisations of the theorem, see for instance J.F. Geelen and R.J. Simpson, "A two dimensional Steinhaus theorem", Australasian J. Combinatorics, vol. 8, 1993, 169–197, available online here.
 The theorem is often visualised as points plotted around a circle of unit circumference, see this blog entry from Σidiot's blog, for example.
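 The circle picture lends itself to a quick experiment. The sketch below places the points \(0,\alpha,2\alpha,\ldots,(n-1)\alpha\) (reduced modulo 1) on a circle of unit circumference and lists the distinct arc lengths between neighbours. It is an illustration under assumptions made here: the function name is hypothetical, and rational values of \(\alpha\) are used so that the arithmetic is exact. The theorem predicts at most three distinct lengths, the largest being the sum of the other two whenever three occur.

```python
from fractions import Fraction

def gap_lengths(alpha, n):
    """Sorted distinct arc lengths between neighbouring points k*alpha mod 1, 0 <= k < n."""
    points = sorted({(k * alpha) % 1 for k in range(n)})
    gaps = {b - a for a, b in zip(points, points[1:])}
    # Close the circle: the arc from the last point back round to the first.
    gaps.add(1 - points[-1] + points[0])
    return sorted(gaps)

def three_distance_holds(alpha, n):
    """Check the statement of the theorem for one choice of alpha and n."""
    g = gap_lengths(alpha, n)
    if len(g) > 3:
        return False
    return len(g) < 3 or g[2] == g[0] + g[1]

# 144/233 is a ratio of consecutive Fibonacci numbers, close to the
# reciprocal of the golden ratio; 31/100 is an arbitrary second choice.
for alpha in (Fraction(144, 233), Fraction(31, 100)):
    ok = all(three_distance_holds(alpha, n)
             for n in range(1, alpha.denominator + 1))
    print(alpha, ok)
```

 For example, `gap_lengths(Fraction(2, 5), 3)` gives the two lengths 1/5 and 2/5: the points are 0, 2/5 and 4/5.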
Theorem no. 79: The Fifteen Theorem

 Original sources: the weblink from the theorem page is to preprints of two chapters from Eva Bayer-Fluckiger and David Lewis (eds.), Quadratic Forms and Their Applications, American Mathematical Society, 2001. The first, by Conway, sets the scene. The second, by Manjul Bhargava, gives the first published proof of the 15 Theorem. It references Schneeberger's 1995 PhD dissertation which contains perhaps the first statement (and sketch proof) of the theorem.
 Manjul Bhargava won a Fields Medal in 2014 in part for his work on quadratic forms, including his proof with Jonathan Hanke of the Conway–Schneeberger 290 Conjecture. The citation is here, and he is interviewed by plus magazine on the subject here.
 The nine values specified in the statement of the 15 Theorem are entry A030050 at OEIS, while the 29 values for positive definite quadratic forms (the 290 Conjecture) in general are entry A030051.
Theorem no. 80: One-Factorisation of Regular Graphs

 This conjecture should probably be taken as originating with Amanda Chetwynd and Anthony Hilton's 1985 paper "Regular graphs of high degree are 1-factorizable", Proc. London Math. Soc., 50 (1985), 193–206; online (paywall). In Stiebitz et al, Graph Edge Coloring: Vizing's Theorem and Goldberg's Conjecture, Wiley-Blackwell, 2012, on p. 259, we find the comment that: "it [the one-factorisation conjecture] 'was going around' in the early 1950s, Hilton [136] was told by G.A. Dirac." And then, "fifteen years before the one-factorization conjecture was published by Chetwynd and Hilton, Nash-Williams [233] proposed a much stronger conjecture, which is also known as the hamiltonian factorization conjecture: Let \(G\) be a \(\Delta\)-regular graph on \(2n\) vertices, where \(\Delta\geq n\). Then \(G\) has a hamiltonian factorization, i.e., \(G\) is the edge-disjoint union of \(\Delta/2\) Hamilton cycles if \(\Delta\) is even or \((\Delta-1)/2\) Hamilton cycles and a linear factor if \(\Delta\) is odd." (The factorisation of \(K_2\times K_5\) in our theorem illustration is an example.) However, Anthony Hilton has told me that he has never seen any mention of his conjecture published prior to 1985 and considers it unlikely that it would have been thought about in the 1950s. Certainly Vizing's Theorem was not known until the 1960s. Ref [233] is C. St. J. A. Nash-Williams, "Hamiltonian lines in graphs whose vertices have sufficiently large valencies", in Combinatorial theory and its applications, III, North-Holland 1970, 813–819. The recent work of Csaba et al (see main theorem description) also resolves Nash-Williams' conjecture for large \(n\).
 Csaba et al is here (preprint) and has been published electronically (paywalled) as "Proof of the 1-factorization and Hamilton Decomposition Conjectures", Memoirs of the American Mathematical Society, 2016, Vol. 244, Number 1154.
Theorem no. 81: The Lutz–Nagell Theorem

 Original sources for this theorem:
 Trygve Nagell, "Solutions de quelques problèmes dans la théorie arithmétique des cubiques planes du premier genre", Wid. Akad. Skrifter Oslo, no 1, 1935, pp. 1–25; not online, it seems; zbMATH Open entry.
 Élisabeth Lutz, "Sur l'équation y^{2} = x^{3} - Ax - B dans les corps p-adiques", J. Reine Angew. Math., 177, 1937, pp. 237–247; online (paywall, facsimile copy).
 When I first posted this theorem I ignorantly assumed that my example elliptic curve \(y^2=x^3-43x+166\) crossed the horizontal axis at \((-8,0)\), giving a rational point of order 2 to illustrate the second case of the theorem. Shaun Stevens was kind enough to put me right, pointing out that this would imply a torsion group of order divisible by \(2\times 7=14\), contradicting Mazur's Torsion Theorem, under which the only possible orders are 1 to 10, 12 and 16.
 Tony Forbes' notes on "Elliptic curves, Factorization and Primality Testing" are very good for context on this theorem (200KB pdf download which will probably open in a separate application).
 Jennifer Balakrishnan contributes this entry (with a fine A3 poster version) on elliptic curves to the Oxford Mathematics Alphabet.
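 The point about the curve \(y^2=x^3-43x+166\) can be confirmed by direct computation. The sketch below is a minimal illustration, not the method of the theorem page: the helper names are hypothetical, and the point \((3,8)\) is an example chosen here because \(3^3-43\cdot 3+166=64=8^2\). It checks that the cubic has no rational root (so there is no rational point of order 2), that the value at \(x=-8\) is \(-2\) rather than 0, and that \((3,8)\) has order 7 under the usual chord-and-tangent addition.

```python
from fractions import Fraction

# Curve y^2 = x^3 + A*x + B with the coefficients quoted in the note above.
A, B = -43, 166

def cubic(x):
    return x**3 + A * x + B

def add(P, Q):
    """Chord-and-tangent addition; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 + y2 == 0:
        return None                       # P + (-P) = point at infinity
    if P == Q:
        lam = (3 * x1 * x1 + A) / (2 * y1)   # slope of the tangent at P
    else:
        lam = (y2 - y1) / (x2 - x1)          # slope of the chord through P and Q
    x3 = lam * lam - x1 - x2
    y3 = lam * (x1 - x3) - y1
    return (x3, y3)

def order(P, bound=16):
    """Order of P, or None if no multiple up to `bound` is the point at infinity."""
    Q, n = P, 1
    while Q is not None:
        if n >= bound:
            return None
        Q = add(Q, P)
        n += 1
    return n

# The cubic is monic with integer coefficients, so any rational root is an
# integer dividing the constant term B.
divisors = [d for d in range(1, abs(B) + 1) if B % d == 0]
rational_roots = [r for d in divisors for r in (d, -d) if cubic(r) == 0]

P = (Fraction(3), Fraction(8))            # assumed example point on the curve
disc = 4 * A**3 + 27 * B**2               # y^2 divides this for torsion points with y != 0

print(cubic(-8), rational_roots, order(P), disc % (8 * 8))
```

 Exact fractions are used so that the repeated additions cannot drift; the multiples of \((3,8)\) all turn out to have integer coordinates, as the Lutz–Nagell Theorem requires of torsion points.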
Theorem no. 82: Gruenberg's Theorem on Nilpotent Groups

 Original source for this theorem: K. W. Gruenberg, "Residual Properties of Infinite soluble groups", Proc. London Math. Soc., Vol. s3-7, Issue 1, 1957, pp. 29–62; online (paywall).
 The Heisenberg group \(H(n)\) is a group of upper triangular matrices under multiplication; thus for \(n=3\): $$\left(\begin{array}{ccc} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1\end{array}\right)\times \left(\begin{array}{ccc} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1\end{array}\right)
=\left(\begin{array}{ccc} 1 & a+x & c+z+ay \\ 0 & 1 & b+y \\ 0 & 0 & 1\end{array}\right). $$ The multiplication looks a bit obscure when expressed in terms of triples, as in our theorem description. However it is a natural convenience in the analysis of this group, c.f. this interesting paper.
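 The correspondence between the matrix product and the multiplication of triples can be checked mechanically. In the sketch below the triple \((a,b,c)\) stands for the matrix with \(a\) and \(b\) on the superdiagonal and \(c\) in the top right corner, so that \((a,b,c)(x,y,z)=(a+x,\,b+y,\,c+z+ay)\), as read off from the product displayed above. The function names are hypothetical and the ordering of the triple is an assumption made here; the theorem description may order the entries differently.

```python
from itertools import product

def mat(a, b, c):
    """Upper unitriangular 3x3 matrix for the triple (a, b, c)."""
    return ((1, a, c), (0, 1, b), (0, 0, 1))

def matmul(M, N):
    """Plain 3x3 integer matrix product."""
    return tuple(tuple(sum(M[i][k] * N[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def triple_mul(p, q):
    """Heisenberg multiplication on triples, read off from the matrix product."""
    (a, b, c), (x, y, z) = p, q
    return (a + x, b + y, c + z + a * y)

def triple_inv(p):
    """Inverse of a triple: (a, b, c)^(-1) = (-a, -b, a*b - c)."""
    a, b, c = p
    return (-a, -b, a * b - c)

def commutator(p, q):
    """p q p^(-1) q^(-1)."""
    return triple_mul(triple_mul(p, q), triple_mul(triple_inv(p), triple_inv(q)))

# Compare the two multiplications on a small box of integer triples.
box = list(product(range(-2, 3), repeat=3))
agree = all(matmul(mat(*p), mat(*q)) == mat(*triple_mul(p, q))
            for p in box for q in box)

# Commutators have the form (0, 0, *): they lie in the centre, which is
# what makes the group nilpotent of class 2.
central = all(commutator(p, q)[:2] == (0, 0) for p in box for q in box)
print(agree, central, triple_mul((1, 0, 0), (0, 1, 0)), triple_mul((0, 1, 0), (1, 0, 0)))
```

 The last two products differ only in the third coordinate, which shows concretely that the group is not abelian.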
Theorem no. 83: The Delsarte–Goethals–Seidel Theorem

 The original source for this theorem is P. Delsarte, J. M. Goethals, and J. J. Seidel, "Spherical codes and designs", Geom. Dedicata, 6 (1977), pp. 363–388; online (paywall).
 A fascinating (but 11MB!) account of Delsarte's approach is Florian Pfender and Günter M. Ziegler, "Kissing numbers, sphere packings, and some unexpected proofs", Notices of the AMS, vol. 51, no. 8, 2004, pp. 873–883; online.
Theorem no. 84: The Five Circle Theorem

 Original source for this theorem: Miquel, A., "Théorèmes de géométrie", Journal de Mathématiques Pures et Appliquées, Tome 3, 1838, pp. 485–487; online (facsimile by Gallica; the figures appear at the end of the volume, with figure 3, pertaining to the five-circle theorem, appearing here).
 Same comment as for Theorem 55 regarding Java; the app for this theorem, if you want to try, is here.
 The web link for this theorem, a presentation by Wolfgang Schief, is complemented by an article: W.K. Schief and B.G. Konopelchenko, "A novel generalization of Clifford's classical point–circle configuration. Geometric interpretation of the quaternionic discrete Schwarzian Kadomtsev–Petviashvili equation", Proc. R. Soc. A, Vol. 465, Issue 2104, 2009, pp. 1291–1308; online (paywall), arxiv.
 A converse to the five-circle theorem by Frank Morley is discussed in Tobias Dantzig, "Elementary Proof of a Theorem Due to F. Morley", The American Mathematical Monthly, Vol. 23, No. 7 (Sep., 1916), pp. 246–248; online. (Tobias was the father of George of simplex method fame.)
Theorem no. 85: Cayley's Theorem

 Original source for this theorem: A. Cayley, "On the theory of groups, as depending on the symbolic equation θ^{n} = 1", Philosophical Magazine, Series 4, Vol. 7, 1854, Issue 42, pp. 40–47; online (paywall). There is a very good companion to this paper: David J. Pengelly, "Arthur Cayley and the first paper on group theory", in From Calculus to Computers: Using the Last 200 Years of Mathematics History in the Classroom, Amy Shell-Gellasch and Dick Jardine (eds.), Mathematical Association of America, 2005; preprint (275KB pdf, May 2021). Although the above source is what concerns us here, Cayley's paper is in three parts, all with the same name, and appearing in the Philosophical Magazine: Part II is Vol. 7, 1854, Issue 47, pp. 408–409; online (paywall); Part III is Vol. 18, 1859, Issue 117; online (paywall).
 Our description and illustration of this theorem are limited to finite groups but it applies equally to infinite groups.
 Peter Neumann's assessment of Cayley's theorem is an example of what is known as Gromov's dichotomy "Any proposition concerning all countable groups is either trivially true or false". I wrote down Neumann's remarks at a talk he gave at Queen Mary University of London during a celebration to mark the 60th birthday of R.A. Bailey:
 Jonas Karlsson has sent me the following valuable contextual remarks on Cayley's theorem:

Regarding Peter Neumann's somewhat dismissive comment on Cayley's theorem, I think it's worth pointing out that said theorem is a special case of the Yoneda embedding. That is, regarding a group G as a one-object category, a set-valued presheaf on the group is a set with a G-action, and the content of the Yoneda lemma is that regarding the group itself as a G-set (under translation) constitutes an embedding, i.e. the map is injective; which is precisely Cayley's theorem.
As for the Yoneda lemma itself, its proof may be a triviality but the lemma is what legitimizes the functor-of-points approach to algebraic geometry, and this viewpoint seems to be spreading to other areas of modern geometry as well. From this perspective, the theorem was remarkably prescient! 

 Some other remarks on generalising Cayley are given here by Terence Tao.
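 The embedding behind Cayley's theorem, a group acting on itself by translation, is short enough to write out for a small example. The sketch below is an illustration under assumptions made here: the group is \(S_3\) (chosen because it is the smallest non-abelian group), its elements are numbered 0 to 5, and each element \(g\) is sent to the permutation \(h\mapsto gh\) of those six numbers. The function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

# The six elements of S_3, each a tuple giving the images of 0, 1, 2.
elements = sorted(permutations(range(3)))
index = {g: i for i, g in enumerate(elements)}

def left_translation(g):
    """Permutation of {0,...,5} induced by h -> g o h on the numbered elements."""
    return tuple(index[compose(g, h)] for h in elements)

images = {g: left_translation(g) for g in elements}

# Injective: distinct group elements give distinct permutations.
injective = len(set(images.values())) == len(elements)

# Each image really is a permutation of the six labels.
bijections = all(sorted(img) == list(range(6)) for img in images.values())

# Homomorphism: translating by g o h equals translating by h, then by g.
homomorphism = all(images[compose(g, h)] == compose(images[g], images[h])
                   for g in elements for h in elements)

print(injective, bijections, homomorphism)
```

 The three checks together say that \(S_3\) sits inside the symmetric group on its own six elements as an isomorphic copy, which is the finite case of the theorem.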
Theorem no. 86: Noether's Symmetry Theorem

 Original source for this theorem: Noether, E., "Invariante Variationsprobleme", Nachr. König. Gesell. Wissen. Göttingen, Math.–Phys. Kl. (1918), pp. 235–257; online at Göttinger Digitalisierungszentrum. A 1971 English translation by M.A. Tavel has been put on the arxiv here by Frank Y. Wang. I understand that the translation given in our recommended book, Yvette Kosmann-Schwarzbach (transl. Bertram E. Schwarzbach), The Noether Theorems: Invariance and Conservation Laws in the Twentieth Century, Springer, New York, 2011, is superior. The French original "Les Théorèmes de Noether: Invariance et lois de conservation au XXe siècle" provides a French translation of course.
 Some other excellent accounts of Noether's theorem can be found here at the Physics Mill blog and here at the Perimeter Institute.
 A valuable source on the background and impact of Noether's theorems is Raphaël Leone, "On the wonderfulness of Noether's theorems, 100 years later, and Routh reduction"; preprint.
 A 2018 centenary conference for Noether's theorems was held at Notre Dame university. Only the abstracts are online at the conference website but you can get a good overview of current interest in Noether's achievements.
Theorem no. 87: Lamé's Theorem

 Original source for this theorem: G. Lamé, "Note sur la limite du nombre des divisions dans la recherche du plus grand commun diviseur entre deux nombres entiers", Comptes rendus des séances de l'Académie des Sciences, 19 (1844), pp. 867–870. I don't find this issue on Gallica. However, there is a wonderful analysis of Lamé's theorem and its precursors in Jeffrey Shallit, "Origins of the analysis of the Euclidean algorithm", Historia Mathematica, Vol. 21, Issue 4, 1994, pp. 401–419; online.
 Lamé's theorem comes in many different guises, all essentially saying that Euclid's algorithm takes longest (relative to input size) when applied to consecutive Fibonacci numbers. An excellent survey is provided by cut-the-knot.
 Unsurprisingly The Fibonacci Quarterly has a number of interesting papers on Lamé and the Fibonacci numbers. See, for example, J. L. Brown, Jr. and R. L. Duncan, "The Least Remainder Algorithm", The Fibonacci Quarterly, Vol. 9, No. 4, 1971, pp. 347–350, 401; online, and follow back from the references.
 Another charming connection between Fibonacci and Euclid lies in the fact that the gcd of two Fibonacci numbers is the Fibonacci number of their gcd: \(\gcd(F_m,F_n)=F_{\gcd(m,n)}.\) A proof may be found here on cut-the-knot. Thanks to Joshua Zelinsky for telling me this.
 Also a good corrective to golden ratio hype is this by Chris Budd.
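 Both Fibonacci facts in these notes can be tested directly. The sketch below counts the division steps of Euclid's algorithm and checks three things: that consecutive Fibonacci numbers \(F_{n+2},F_{n+1}\) need exactly \(n\) steps, that no pair below 100 needs more steps than \((89,55)\), and that \(\gcd(F_m,F_n)=F_{\gcd(m,n)}\). It also checks one common form of Lamé's bound, namely that the number of steps is at most five times the number of decimal digits of the smaller input. The function names and the ranges searched are choices made here for illustration, with the convention \(F_1=F_2=1\).

```python
from math import gcd

def euclid_steps(a, b):
    """Number of division steps Euclid's algorithm takes on (a, b)."""
    steps = 0
    while b:
        a, b = b, a % b
        steps += 1
    return steps

def fib(n):
    """Fibonacci numbers with fib(0) = 0, fib(1) = fib(2) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Consecutive Fibonacci numbers F(n+2), F(n+1) take exactly n steps.
fibonacci_step_counts = [euclid_steps(fib(n + 2), fib(n + 1)) for n in range(1, 16)]

# Worst case over all pairs 1 <= b < a <= 100.
worst = max(euclid_steps(a, b) for a in range(2, 101) for b in range(1, a))

# Lamé's bound: steps <= 5 * (number of decimal digits of the smaller input).
lame_bound_holds = all(euclid_steps(a, b) <= 5 * len(str(b))
                       for a in range(2, 301) for b in range(1, a))

# gcd(F_m, F_n) = F_gcd(m, n).
gcd_identity_holds = all(gcd(fib(m), fib(n)) == fib(gcd(m, n))
                         for m in range(1, 31) for n in range(1, 31))

print(fibonacci_step_counts, worst, lame_bound_holds, gcd_identity_holds)
```

 The worst case below 100 is 9 steps, reached by \((89,55)=(F_{11},F_{10})\); ten steps would need a larger input of at least \(F_{12}=144\).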
Theorem no. 88: The Cauchy–Kovalevskaya Theorem

 Original sources for this theorem:
 Cauchy, Augustin, "Mémoire sur l'emploi du calcul des limites dans l'intégration des équations aux dérivées partielles", Comptes rendus hebdomadaires des séances de l'Académie des sciences, tome 15, July 1842; in Œuvres complètes, première série, tome 7, extrait 170, pp. 17–33; online (facsimile).
 von Kowalevsky, Sophie, "Zur Theorie der partiellen Differentialgleichung", Journal für die reine und angewandte Mathematik, 80, 1875, pp. 1–32; online (paywall, facsimile)
 Putting Kovalevskaya's work in a modern context is this HAL archives article: Elemer Elad Rosinger, "Can there be a general nonlinear PDE theory for existence of solutions?"
 Garry J. Tee kindly sent me a scan of his well-known article about Kovalevskaya which appeared in the Mathematical Chronicle in 1977. The Mathematical Chronicle Committee in turn kindly gave me permission to host a copy here. I have made pdf (about 3.7MB) and Powerpoint (about 9MB) versions. (The Mathematical Chronicle Committee is arranging for digitisation of Mathematical Chronicle).
Theorem no. 89: The Abel–Hurwitz Binomial Theorem

 Original sources for this theorem:
 Abel, N. H., "Beweis eines Ausdrucks, von welchem die Binomial-Formel ein einzelner Fall ist", J. reine angew. Math., 1, 1826, pp. 159–160; online (paywall, facsimile from Göttinger Digitalisierungszentrum).
 A. Hurwitz, "Über Abel’s Verallgemeinerung der binomischen Formel", Acta Mathematica, 26, 1902, pp. 199–203; online.
 Further generalising Abel and Hurwitz is the subject of Alexander Kelmans and Alexander Postnikov, "Generalizations of Abel’s and Hurwitz’s identities", European Journal of Combinatorics, Vol. 29, Issue 7, 2008, pp. 1535–1543; online.
 There is a beguiling entry in Gil Kalai's blog Combinatorics and More on Abel sums.
Theorem no. 90: Cardano's Cubic Formula

 A more explicit version of this formula (comparable in format to the quadratic formula) can be found here. This version is made fun of in this blog post.
 The case where a cubic equation has three real roots which may only be expressed in radical form using complex numbers is known as casus irreducibilis. That complex numbers cannot be avoided in this case was proved by Pierre Wantzel in 1843.
 Thony Christie offers a very detailed discussion of the skirmishes around ownership of solutions to the cubic here. A posting here by Plus magazine is also illuminating.
 Kellie Gutman has made an English translation of Tartaglia's verse-form solution of the cubic, brought to us by poetrywithmathematics.blogspot.fr.
 David Benjamin has this nice portrait for Plus magazine of Cardano and his family.
Theorem no. 91: Khinchin's Theorem on Continued Fractions

 Original source for this theorem: Khintchine, A., "Metrische Kettenbruchprobleme", Compositio Math., Tome 1, 1935, pp. 361–382; online
 A closely related result of Khinchin says that for almost all real numbers, the \(n\)th root of the denominator of the \(n\)th convergent of the continued fraction expansion tends in the limit to a fixed constant, known as Lévy's constant.
Theorem no. 92: The Quadratic Formula

 A valuable list of derivations of the quadratic formula is provided by CutTheKnot.
 As a footnote to Rob J. Low's advocacy of the quadratic equation expressed as \(ax^2+2bx +c=0\), an analysis is given by Tony Forbes in issue 204 of M500 magazine, pp. 22–23, of the probability of the solutions being real. In contrast to the 'standard' \(ax^2+bx +c=0\) where the probability is \((41+\ln 64)/72\), the Rob J. Low form has a rational probability: \(7/9\).
 There is an interesting exchange on Twitter (24 April, 2019) between @robinhouston and @MathPrinceps on the fact that Gauss and Lagrange disagreed on whether \(bx\) or \(2bx\) is preferable, with Gauss preferring the latter.
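The two probabilities quoted in the note on Tony Forbes's analysis can be checked numerically. What follows is a minimal Monte Carlo sketch, assuming (the note does not say so) that \(a\), \(b\) and \(c\) are drawn independently and uniformly from \([-1,1]\); the function name `real_root_probability`, the sample size and the seed are illustrative choices only.

```python
import math
import random


def real_root_probability(low_form, trials=200_000, seed=1):
    """Estimate P(real roots) for a, b, c uniform on [-1, 1] (an assumption).

    low_form=False: a x^2 +  b x + c = 0 has real roots iff b^2 >= 4ac.
    low_form=True:  a x^2 + 2b x + c = 0 has real roots iff b^2 >= ac.
    """
    rng = random.Random(seed)
    factor = 1.0 if low_form else 4.0
    hits = 0
    for _ in range(trials):
        a, b, c = (rng.uniform(-1.0, 1.0) for _ in range(3))
        if b * b >= factor * a * c:
            hits += 1
    return hits / trials


standard_exact = (41 + math.log(64)) / 72   # value quoted for ax^2 + bx + c
low_exact = 7 / 9                           # value quoted for ax^2 + 2bx + c

standard_estimate = real_root_probability(low_form=False)
low_estimate = real_root_probability(low_form=True)
print(standard_exact, standard_estimate)
print(low_exact, low_estimate)
```

Under the stated assumption the estimates should agree with the quoted values to about two decimal places.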
Theorem no. 93: The Generalised Hexachord Theorem

 Original source for this theorem: M. Babbitt, "Some aspects of twelve-tone composition", The Score, 12, 1955, pp. 53–61; not online, I think, see here for details of the journal.
 The application of Fourier analysis pioneered by Emmanuel Amiot and others, referenced on this theorem page, has evolved into a whole monograph by Amiot: Music Through Fourier Space: Discrete Fourier Transform in Music Theory, Springer, 2016.
 The music example in this theorem description was typeset using the CERL Sound Group LIME music notation software.
Theorem no. 94: Cayley's Formula

 The second proof of Cayley in Stanley's book, by André Joyal, has a nice application to automata theory by Peter Cameron, described here.
 Also in Cameron's blog: another lovely enumeration yielding \(n^{n-2}\).
Theorem no. 95: The Convolution Theorem

 Original source for this theorem: Cooley, James W. and Tukey, John W., "An algorithm for the machine calculation of complex Fourier series", Math. Comput., 19 (90), 1965, pp. 297–301; online. The relevant Wikipedia page is a rich source of historical and other sources, many open access.
 For Gauss and FFT see Michael T. Heideman, Don H. Johnson and C. Sidney Burrus, "Gauss and the History of the Fast Fourier Transform", IEEE ASSP Magazine, October 1984, 14–21. Online here. Lanczos's contribution is mentioned in his MacTutor entry.
 The continuous convolution operator and related theorems date back to Euler and Laplace, as described by Alejandro Dominguez, "A History of the Convolution Operation", IEEE Pulse, January/February 2015; online.
 A lovely gentle introduction to Fourier transforms is given by betterexplained.com (who also provide the recommended web link on the theorem page). The recommendation used to be a very nice pdf download from Berkeley but Firefox warns me the link is risky. At your own risk it is here.
 Aled Walker contributes this entry (with a fine A3 poster version) on the Fourier transform to the Oxford Mathematics Alphabet.
Theorem no. 96: The Rule of Sarrus

 There is apparently a 4×4 version of the Rule of Sarrus called the 'Rule of Villalobos' due to the Mexican mathematician Gustavo Villalobos Hernández. See comment (4) here which links to a Spanish wikipedia entry which unfortunately appears to have been deleted.
Theorem no. 97: Nevanlinna's Five-Value Theorem

 The source for this theorem is R. Nevanlinna, "Eindeutigkeitssätze in der Theorie der meromorphen Funktionen,"
Acta Math., 48 (1926), pp. 367–391; online.
 The tribute of Lee Rubel reads "... my favorite theorem in all of mathematics is a theorem of R. Nevanlinna that two functions, meromorphic in the whole complex plane, that share five values must be identical. For real functions, there is nothing that even remotely corresponds to this." It is from the introduction to Entire
and Meromorphic Functions, by Lee A. Rubel with James E. Colliander, Springer-Verlag New York, 1996. Hermann Weyl described Nevanlinna's 1925 paper on meromorphic functions as "one of the few great mathematical events in our century." (Not having access to Weyl's book Meromorphic Functions and Analytic Curves, Princeton University Press, 1944, I am relying on the review of the book by Herbert Busemann and the Wikipedia article on Nevanlinna theory.)
Theorem no. 98: Cartwright's Theorem

 Original source for this theorem: "On analytic functions regular in the unit circle II", Quart. J. Math. Oxford Ser. (2) 6 (1935) 94–105; online. In his obituary of Cartwright (Bull. London Math. Soc. 34 (2002) 91–107) Walter K. Hayman comments "With this paper the author essentially created a new field. It was almost the only paper quoted by Littlewood in his book [Lectures on the Theory of Functions, OUP, 1944] and led me to ask Mary Cartwright to become my research supervisor." The citation appears on p. 232 of Littlewood's book. He writes "The proofs, due to Cartwright, are difficult, and depend on ideas unlike any we have been considering". The book is online here.
 There is a good popular account of this theorem by Jaz_Will here.
Theorem no. 99: The Happy Ending Problem

 Original source for this theorem: Erdős, P. and Szekeres, G., "A combinatorial problem in geometry", Compositio Mathematica, Tome 2 (1935), pp. 463–470; online. They wrote about the problem again in 1960: P. Erdős and G. Szekeres, "On some extremum problems in elementary geometry", Ann. Univ. Sci. Budapest. Eötvös Sect. Math., 3–4 (1960/1961), pp. 53–62; online (1.8MB pdf).
 The attribution to Turán of the solution to n = 5 for this problem I found in this article by Imre Bárány (section 10). More details are given by Szekeres and Peters in their article proving n = 6: Szekeres, G., Peters, L., "Computer solution to the 17-point Erdős–Szekeres problem", ANZIAM J., vol. 48(2), 2006, pp. 151–164; online.
 There is a very nice description of the Happy Ending problem and Suk's asymptotic solution here, by Quanta magazine.
Theorem no. 100: The Design of the Century

 The original source for this theorem is Forbes, A.D., Grannell, M.J. & Griggs, T.S. "The design of the century", Math. Slovaca, 57, 495–499 (2007); online (paywall); preprint.
Theorem no. 101: Kepler's Conjecture

 Original source for this theorem is: Thomas C. Hales, "A proof of the Kepler conjecture", Annals of Mathematics, Vol. 162, Issue 3, 2005, pp. 1065–1185; online. But note the extended proof process undertaken by Hales, alluded to below.
 This was a 'theorem under construction' pending Hales' Flyspeck project to reinforce his original proof with a machine-automated one. This was completed in 2014 but the proof was already made bulletproof with his 2012 publication of Dense Sphere Packings: A Blueprint for Formal Proofs. The formal proof is also described by Hales and twenty-one other authors in "A formal proof of the Kepler conjecture", Forum of Mathematics, Pi (2017), Vol. 5, e2; online.
 The story of Hales' original proof and publication of Kepler is well told here. A glimpse of the story lies in the 2005 Annals paper footnote (which most publishing mathematicians will empathise with!) "Received 1998, revised, 2003". Note that his main referee at Annals, namely Gábor Fejes Tóth, is the son of László Fejes Tóth who made the original breakthrough in the conjecture's resolution. By the way, a statement by the Editorial Board at Annals explicitly invites "computer-assisted proofs of exceptionally important mathematical theorems", an invitation which I understand to predate the Kepler submission.
 An earlier proof of Kepler, by Wu-Yi Hsiang, is generally regarded as incomplete. Details are given by Hales in section 4.4 of this preprint which is, generally, a first-rate introduction to the conjecture and his proof.
 There are some remarks about Flyspeck in Hales's contribution to an AMS Notices special issue on formal proof. The Flyspeck project is now at GitHub. (The old google code website is still accessible here, as of February 2021, but carries the warning "Google is shutting down google code and we are being evicted.")
Theorem no. 102: Viète's Formula

 Tom Osler has a more detailed account of the connections between Viète's formula and ruler and compass constructions in "Geometric constructions approximating pi", Mathematical Spectrum, Vol. 40, No. 3 (May 2008), 106–108; complete issue download (600KB).
Theorem no. 103: The Parking Function Formula

 Original sources for this theorem:
 Ronald Pyke, "The supremum and infimum of the Poisson process", Ann. Math. Statist. 30 (1959) 568–576; online preprint (900KB pdf).
 A. G. Konheim and B. Weiss, "An occupancy discipline and applications", SIAM J. Appl. Math., 14 (6), 1966, pp. 1266–1274; online (paywall).
 The bijection illustrated on the theorem page is from Philippe Chassaing and Jean-François Marckert, "Parking functions, empirical processes, and the width of rooted labeled trees", Electronic J. Combinatorics, Vol. 8, Issue 1, 2001, article R14; online.
 The bijection to labelled trees establishes the formula but the 'book' proof is due to Henry O. Pollak. An account is given in this wonderful presentation by Richard Stanley.
 The note by Peter Cameron which is the recommended weblink for this theorem was written for the Queen Mary Maths dept student newsletter. Its topic is very nicely explored in this presentation by Thomas Prellberg.
Theorem no. 104: The Goins–Maddox–Rusin Theorem for Heron Triangles

 Original sources for this theorem (the latter two are the weblinks from the theorem page):
 N. J. Fine, "On rational triangles", Amer. Math. Monthly, Vol. 83, No. 7, 1976, 517–521; online (paywall).
 David J. Rusin, "Rational triangles with equal area," New York Journal of Mathematics,
Vol. 4, 1998, 1–16; online
 Edray Herber Goins, Davin Maddox, "Heron triangles via elliptic curves", Rocky Mountain J. Math., Vol. 36, No. 5, 2006, pp. 1511–1524; online
 The triangle-to-curve conversion rule given in the text (from the Goins–Maddox paper) is guaranteed to produce an elliptic curve and a rational point on that curve. The actual curve and point depend on the order in which sides \(a,b\) and \(c\) are chosen. For example, the triangle with sides \(3, \frac{10}{3},\frac{17}{3}\), with area \(n=4\), gives \(\rho=1/4\) or \(2\) or \(2/9\), depending on which sides are called \(a,b\) and \(c\). And for the curve depicted, \(\rho=1/4\), neither ordering of the sides recovers the point \(P=(-8,24)\) on this curve (which gives the same triangle but with some negative sides); instead, the points \((2,6)\) and \((18,102)\) are discovered.
 The version of Heron's formula I have used is one of several given in the Wikipedia entry. It is useful for showing the invariance of the rule under sign changes.
 The paper by Nathan Fine cited on this theorem page is N. J. Fine, "On rational triangles", The American Mathematical Monthly, Vol. 83, No. 7 (Aug.–Sep., 1976), pp. 517–521; online (paywalled).
 There is a nice account here of searching for congruent numbers (integer areas of rational right triangles).
The associated sequence at oeis is A003273.
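The numbers in the note on the triangle with sides \(3, \frac{10}{3}, \frac{17}{3}\) can be checked in exact arithmetic. The sketch below is not taken from the Goins–Maddox paper: it assumes that the parameter has the form \(\rho = 4n/((a+b)^2-c^2)\) and that the curve is \(y^2 = x(x-n\rho)(x+n/\rho)\), forms which are not restated in these notes but which reproduce the three quoted values of \(\rho\). The \(x\)-coordinate of \(P\) is taken to be \(-8\), since \(x=8\) does not satisfy the assumed equation.

```python
from fractions import Fraction as F
from itertools import permutations

sides = (F(3), F(10, 3), F(17, 3))


def area_squared(a, b, c):
    """Heron's formula, returning the squared area so everything stays rational."""
    s = (a + b + c) / 2
    return s * (s - a) * (s - b) * (s - c)


def rho(a, b, c, n):
    """Assumed form of the conversion parameter: 4n / ((a + b)^2 - c^2)."""
    return 4 * n / ((a + b) ** 2 - c ** 2)


def on_curve(x, y, n, r):
    """Assumed curve equation: y^2 = x (x - n r) (x + n / r)."""
    return y * y == x * (x - n * r) * (x + n / r)


n = F(4)
area_sq = area_squared(*sides)                       # should equal n^2 = 16

rhos = {rho(a, b, c, n) for a, b, c in permutations(sides)}
print(sorted(rhos))

r = F(1, 4)
points = [(F(-8), F(24)), (F(2), F(6)), (F(18), F(102))]
checks = [on_curve(x, y, n, r) for x, y in points]
print(checks)
```

Only the choice of \(c\) affects \(\rho\) under this assumed form, which is why six orderings give just three values.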
Theorem no. 105: The Pumping Lemma

 A nice entry at Computational Complexity by Bill Gasarch poses a challenge "Find a nonreg lang that is not easily proven nonreg." Meaning, the pumping lemma plus some other basic tools always seems to be sufficient.
Theorem no. 106: Babbitt's Theorem

 I am unsure of the source of this theorem. It may well be mentioned in Babbitt's 1955 paper in The Score on interval analysis (see note (1) to Theorem 93).
 The music example in this theorem description was typeset using the CERL Sound Group LIME music notation software.
Theorem no. 107: The Tverberg Partition Theorem

 The original recommended web link for this page was Stephen Hell's magnificent dissertation
"Tverberg-type Theorems and the Fractional Helly Property". This became unavailable as a direct download but is now located at depositonce.tu-berlin.de/handle/11303/1761 and is highly recommended.
 Tverberg's Theorem is a special case of the so-called Topological Tverberg Conjecture. The conjecture is true when \(r\), the number of partition parts, is a prime power but was proved false in general in 2015. It is all very well described by Gil Kalai: follow the links back from here (you will also find a proof of Tverberg's Theorem itself).
 Also by Gil Kalai, a good overview, marking the birthday of Imre Bárány, of various 'Tverberg' results and conjectures
Theorem no. 108: The Analyst's Travelling Salesman Theorem

 Original sources for this theorem:
 Peter W. Jones, "Rectifiable sets and the traveling salesman problem", Inventiones Mathematicae, Vol. 102, Issue 1, 1990, pp. 1–16; online.
 Kate Okikiolu, "Characterization of subsets of rectifiable curves in \(\mathbb{R}^n\)", J. London Math. Soc., s2-46 (2), 1992, pp. 336–348; online.
 There is a good wiki page on ATST.
 Raanan Schul's Analyst's Traveling Salesman Theorems, a Survey is an excellent source of more detailed and more recent information.
 The Koch snowflake image in the illustration of this theorem was copied from a website called www.scientificweb.com/testreport/mathbench4/ which appears no longer to exist.
Theorem no. 109: The Beardwood–Halton–Hammersley Theorem

 Original source for this theorem: Beardwood, J., Halton, J. H. and Hammersley, J. M., "The shortest path through many points", Proc. Cambridge Philos. Soc., 55, 1959, pp. 299–327; online (paywall).
 Entry A073008 at oeis.org is "Decimal expansion of the Traveling Salesman constant" which appears to correspond to \(\beta_2\) on our page. The digits are conjectural, being the expansion of \((4/153)(1 + 2\sqrt{2})\sqrt{51}\). It is not clear whose conjecture this is; it is not in the cited article Stefan Steinerberger, "New bounds for the traveling salesman constant", Adv. in Appl. Probab., Vol. 47 (1), 2015, pp. 27–36; online (paywall, arxiv), which seems authoritative and relatively recent.
 An interesting counterexample of Alessandro Arlotto and J. Michael Steele puts a limit on how far the hypothesis on the variables \(X_1, X_2, \ldots\) forming the TSP tour may be relaxed.
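The conjectural closed form quoted from entry A073008 is easily evaluated. A minimal sketch (the variable name `beta_conjectured` is illustrative):

```python
import math

# Conjectural closed form quoted from OEIS entry A073008: (4/153)(1 + 2*sqrt(2))*sqrt(51).
beta_conjectured = (4 / 153) * (1 + 2 * math.sqrt(2)) * math.sqrt(51)
print(beta_conjectured)
```

The value is approximately 0.7148.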
Theorem no. 110: The Robbins Problem

 Original source for this theorem: W. McCune, "Solution of the Robbins Problem", J. Automated Reasoning, 19(3), 1997, pp. 263–276; online (paywall, preprint pdf 200KB).
 John Baez has a nice explanation here of a single-axiom definition of a lattice. I first saw this as a Baez Google+ post accompanied by his usual crop of quality comments, a couple of which I quote:
 David Tweed: "Related, there's a claim to a more human proof that the Robbins axioms describe boolean algebra at http://www.markability.net/robbins.htm . Frustratingly it doesn't say if this was derived from unintuitive computer proofs or independently. So a fact first proved by computer may later acquire a human proof."
 Robert Rothenberg: "It reminds me of work people have done to come up with single-axiom versions of various logics, I think Harris & Rezus. See A. Rezus, "On a Theorem of Tarski", Libertas Math. 2, pp. 63–97. Discussed in http://fitelson.org/ar.html ."
Theorem no. 111: De Morgan's Laws

 I found the following on Peter Cameron's mathematical quotations page (under logic):

" ... the contradictory opposite of a copulative proposition is a disjunctive proposition composed of the contradictory opposites of its parts.
... the contradictory opposite of a disjunctive proposition is a copulative proposition composed of the contradictories of the parts of the disjunctive proposition."
William of Ockham (Occam), Summa totius logicae, 14th century (transl. Philotheus Boehner 1955)
Note: This is Ockham's formulation of De Morgan's Laws, more than five hundred years before De Morgan. It is just as clear in his Latin text. 

Theorem no. 112: Bregman's Theorem

 Original source for this theorem: Bregman L., "Some Properties of Nonnegative Matrices and their Permanents", Dokl. Akad. Nauk SSSR, v. 211, No. 1, 1973, pp. 27–30; online. (English translation: Soviet Math. Dokl., v. 14, 1973, pp. 945–949.) The original conjecture of Minc appears in H. Minc, "Upper bounds for permanents of (0, 1)-matrices", Bull. Amer. Math. Soc., Vol. 69, Issue 6, 1963, pp. 789–791; online.
 A. Schrijver attributes to Brouwer (presumably L.E.J. Brouwer who was still alive when Minc originally conjectured what became Bregman's Theorem) the observation that an upper bound for permanents of arbitrary nonnegative matrices may be derived directly from the bound for (0,1)-matrices. Thus, if \(v = (b_1, b_2, \ldots, b_n)\) is a vector in descending order, and letting \(b_{n+1}=0\), define
$$g(v)=\sum_{k=1}^n(b_k-b_{k+1})(k!)^{1/k}.$$
Then the Minc bound is replaced by the product \(\prod_ig(v_i)\) over the rows \(v_i\) of the matrix. (A. Schrijver, "A short proof of Minc's conjecture", J. Combin. Theory, A, vol. 25, no. 1, 1978, 80–83; online.)
 Although computing \(\mbox{per}(M)\), the permanent of an arbitrary matrix \(M\), is hard (specifically, #P-hard) even if \(M\) is \(0\mbox{-}1\), in the case where \(M\) has nonnegative entries Jerrum, Sinclair and Vigoda have given a polynomial-time algorithm which gives a value arbitrarily close to \(\mbox{per}(M)\) with high probability (FPRAS).
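The row-product bound described in the note on Schrijver's paper can be tried on small matrices. The sketch below reads the summand of \(g(v)\) as \((b_k-b_{k+1})(k!)^{1/k}\) and computes the permanent by brute force; the function names and the matrix `mixed` are hypothetical illustrations, not taken from any of the cited papers.

```python
import math
from itertools import permutations


def permanent(matrix):
    """Permanent by direct summation over all permutations (fine for small n)."""
    n = len(matrix)
    return sum(
        math.prod(matrix[i][sigma[i]] for i in range(n))
        for sigma in permutations(range(n))
    )


def g(row):
    """g(v) = sum_k (b_k - b_{k+1}) (k!)^(1/k), with entries sorted in descending order."""
    b = sorted(row, reverse=True) + [0]          # append b_{n+1} = 0
    return sum(
        (b[k - 1] - b[k]) * math.factorial(k) ** (1.0 / k)
        for k in range(1, len(row) + 1)
    )


def row_product_bound(matrix):
    """Product of g over the rows of the matrix."""
    return math.prod(g(row) for row in matrix)


ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]     # (0,1)-matrix: g gives (3!)^(1/3) per row
mixed = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]    # hypothetical nonnegative example

for m in (ones, mixed):
    print(permanent(m), row_product_bound(m))
```

For a (0,1) row with \(r\) ones, only the \(k=r\) term of \(g\) survives, giving \((r!)^{1/r}\), so the product reduces to the Minc bound.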
Theorem no. 113: The van der Waerden Conjecture

 The probability calculations in this theorem description may be justified explicitly by observing that column sums add to unity, so that for, say, System 1,
$$\left(\frac12+\frac14+\frac14\right)\times\left(\frac13+\frac58+\frac{1}{24}\right)\times\left(\frac16+\frac18+\frac{17}{24}\right)=1,$$
which, expanding the brackets, equals the sum of the probabilities of every possible product of output failures over the three computers.
 Closely related are questions about permanents of matrices having all row and column sums equal, counting perfect matchings in regular bipartite graphs of equal-size parts. Lower bounds have received much attention, notably the Schrijver–Valiant Conjecture. Some recent information is here.
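The expansion of the System 1 product can be carried out explicitly. This is a minimal sketch in exact arithmetic; it assumes that the three bracketed sums are the columns of the matrix, in the order displayed, and it compares the permanent with the lower bound \(n!/n^n\) of the van der Waerden Conjecture, which applies because the rows of this particular matrix also sum to unity.

```python
from fractions import Fraction as F
from itertools import permutations, product
from math import factorial, prod

# Columns are the three bracketed sums for System 1 (layout assumed from the displayed product).
columns = [
    [F(1, 2), F(1, 4), F(1, 4)],
    [F(1, 3), F(5, 8), F(1, 24)],
    [F(1, 6), F(1, 8), F(17, 24)],
]
n = len(columns)

# Expanding the brackets: one entry chosen from each column, 3^3 = 27 products in all.
total = sum(prod(columns[j][rows[j]] for j in range(n))
            for rows in product(range(n), repeat=n))

# The permanent keeps only those products whose entries lie in distinct rows.
per = sum(prod(columns[j][rows[j]] for j in range(n))
          for rows in permutations(range(n)))

lower_bound = F(factorial(n), n ** n)
print(total, per, lower_bound)
```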
Theorem no. 114: The Lagrange Property for Moufang Loops

 The sources for this theorem are Alexander N. Grishkov and Andrei V. Zavarnitsine, "Lagrange's theorem for Moufang loops",
Mathematical Proceedings of the Cambridge Philosophical Society, Vol. 139, Issue 1, July 2005, pp. 41–57; online (paywall); preprint; and
S.M. Gagola III and J.I. Hall, "Lagrange's theorem for Moufang loops", Acta Sci. Math. (Szeged), 71 (2005), pp. 45–64; online. G. Eric Moorhouse has told me that his proof remained unpublished, having been the victim of unfortunate timing. It is cited in S.M. Gagola III and J.I. Hall as 'private communication'.
 The original weblink for this theorem page was Chein, O., Kinyon, M.K., Rajah, A. and Vojtěchovský, "Loops and the Lagrange Property", Results. Math., 43, (2003), pp. 74–78; online (paywall); preprint. While just predating the proof of the theorem it
nevertheless remains an excellent introduction.
Theorem no. 115: The Hardy–Ramanujan Asymptotic Partition Formula

 Hardy and Ramanujan published their asymptotic formula in 1918 in "Asymptotic Formulae in Combinatory Analysis", Proc. London Math. Soc., (2) 17, pp. 75–115. But they had published preliminary versions as early as 1916 and the abstract to the 1918 paper appeared in the records of LMS Proceedings for March 1st, 1917 (Hardy himself, as vice-president, taking the chair). The papers can be viewed online here.
Theorem no. 116: The Polynomial Coprimality Theorem

 The row and column labels in the illustration of this theorem have been suppressed for \(n=3\) to save space. They are, in order, \(x^3, x^3+1, x^3+x, x^3+x+1, x^3+x^2, x^3+x^2+1, x^3+x^2+x, x^3+x^2+x+1\), the 4th and 6th being irreducible (irreducible polynomials up to degree 5 over GF(2) are listed here).
 The theorem first appears in S. Corteel, C. Savage, H. Wilf, D. Zeilberger, "A Pentagonal number sieve", Journal of Combinatorial Theory, Series A, vol. 82, no. 2, 1998, 186–192. There is a reprint online here (scroll down to 1998). The constructive proof of Arthur T. Benjamin and Curtis D. Bennett is "The Probability of Relatively Prime Polynomials", Mathematics Magazine, vol. 80, no. 3 (2007), pp. 196–202; online.
 Thomas Hagedorn and Jeffrey Hatley have generalised this result to consider polynomials over the ring \(\mathbb{Z}_{p^k},\ p\) prime: "The probability of relatively prime polynomials in \(\mathbb{Z}_{p^k}[x]\)", Involve, vol. 3, no. 2, 2010, pages 223–232; online. Their paper reviews several other generalisations.
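The claim about the 4th and 6th labels can be confirmed by machine. In this minimal sketch a polynomial over GF(2) is stored as a bitmask (bit \(i\) is the coefficient of \(x^i\)), an encoding chosen here purely for illustration. The sketch also counts the coprime ordered pairs among the eight cubics; the notes do not restate the theorem, so that count is simply printed.

```python
def gf2_mod(a, b):
    """Remainder of a divided by b; polynomials over GF(2) as bitmasks (b != 0)."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a


def gf2_gcd(a, b):
    """Euclid's algorithm for polynomials over GF(2)."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a


def is_irreducible(f):
    """Trial division by every polynomial of degree 1 .. deg(f) // 2."""
    degree = f.bit_length() - 1
    return all(gf2_mod(f, d) != 0 for d in range(2, 1 << (degree // 2 + 1)))


n = 3
# 0b1000 is x^3, 0b1001 is x^3+1, ..., 0b1111 is x^3+x^2+x+1: the order used in the note.
monic = list(range(1 << n, 1 << (n + 1)))

irreducible_positions = [i + 1 for i, f in enumerate(monic) if is_irreducible(f)]
coprime_pairs = sum(1 for f in monic for h in monic if gf2_gcd(f, h) == 1)
print(irreducible_positions, coprime_pairs, len(monic) ** 2)
```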
Theorem no. 117: A Theorem of Erdős and Wilson on Edge Colourings

 Original source for this theorem: P. Erdős and Robin J. Wilson, "On the chromatic index of almost all graphs", Journal of Combinatorial Theory, Series B, Vol. 23, Issues 2–3, 1977, pp. 255–257; online (which is the paper linked from the theorem page, but this is the official version whereas the theorem page links to The Erdős Project at the Rényi Institute, which actually does not yet offer Erdős's complete output, currently running up to end of 1989).
Theorem no. 118: Catalan's Conjecture (Mihăilescu's Theorem)

 Original sources for this theorem:
 V.A. Lebesgue, "Sur l'impossibilité en nombres entiers de l'équation \(x^m = y^2 + 1\)", Nouv. Ann. Math., 9, 1850, pp. 178–181; online.
 J.W.S. Cassels, "On the equation \(a^x - b^y = 1\), II", Proc. Cambridge Philos. Soc., Vol. 56, Issue 2, 1960, pp. 97–103; online (paywall).
 Chao Ko, "On the Diophantine equation \(x^2 = y^n + 1,\ xy \neq 0\)", Sci. Sinica (Notes), 14, 1964, pp. 457–460.
 Preda Mihăilescu, "Primary cyclotomic units and a proof of Catalan’s conjecture", Journal für die reine und angewandte Mathematik, Vol. 2004, Issue 572, pp. 167–195; online (paywall).
 There is more on Pillai's conjecture and on the history of Catalan's conjecture here by Michel Waldschmidt.
Theorem no. 119: Kneser's Conjecture

 Lovász's proof of Kneser's conjecture appears in "Kneser's conjecture, chromatic number, and homotopy", J. Combin. Theory A, Vol. 25, issue 3, 1978, 319–324; online. In the same issue there is a half-page proof, inspired by Lovász's and also topological in nature, by Imre Bárány, "A short proof of Kneser's conjecture", J. Combin. Theory A, Vol. 25, issue 3, 1978, 325–326; online. The elementary proof of Jiří Matoušek appears in "A combinatorial proof of Kneser’s conjecture", Combinatorica, Vol. 24, Issue 1, 2004, 163–170; online (paywall, online preprint, March 2021, see under M).
Theorem no. 120: The Lovász Local Lemma

 Original sources for this theorem:
 The original Local Lemma appears in Erdős, P. and Lovász, L., "Problems and results on 3-chromatic hypergraphs and some related questions", in A. Hajnal; R. Rado; V. T. Sós (eds.), Infinite and Finite Sets (to Paul Erdős on his 60th birthday). II., North-Holland, 1975, pp. 609–627; online.
 The paper of Carsten Thomassen is "The even cycle problem for directed graphs", J. Amer. Math. Soc., Vol. 5, Number 2, 1992, pp. 217–229; online: the result on hypergraph colouring appears as Theorem 5.1.
 An earlier version of this theorem description may be found here. It uses an artificial number theory example to show non-independent sets which are nevertheless pairwise independent. But the conclusion of the Local Lemma is still true even though its hypotheses are false. The example replacing it gives us a false conclusion and is a bit less artificial (in fact it is based on a scenario from cryptography, described to me by Michelle Kendall, which is not artificial at all).
 The current example concerns an event \(A_{ij}\) that two multisets of size 2, each chosen with repetition from the set \(\{1,\ldots,n\}\), have nonempty intersection. The number of ways that such a pair of intersecting multisets can be chosen is
$${n+1\choose 2}^2-\left({n\choose 2}{n-1\choose 2}+n{n\choose 2}\right)=\frac12n\left(2n^2-n+1\right).$$
This is A081436 in oeis.org. The probability of \(A_{ij}\) is thus
\(\frac12n(2n^2-n+1)/{n+1\choose 2}^2\).
The probability of the triple event \(A_{ij}\cap A_{ik}\cap A_{jk}\) may be calculated as \(\frac12n(2n^3+2n^2-7n+5)/{n+1\choose 2}^3\). When \(n = 3\), this has value \(7/18\), much greater than \(8/27\), the cube of the probability of the single event.
 Anwer Khurshid and Hardeo Sahai, "On mutual and pairwise independence: some counterexamples", Pi Mu Epsilon Journal, Vol. 9, No. 9, 1993, 563–570, looks promising as a source of further false applications of the Local Lemma. Online here (10.3MB pdf).
 A constructive proof of the Local Lemma is given in Robin A. Moser and Gábor Tardos, "A constructive proof of the general Lovász Local Lemma", Journal of the ACM, Vol. 57, Issue 2, 2010; online (paywall, arxiv).
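The counts behind the multiset example can be confirmed by brute force for small \(n\). The following is a minimal sketch; it reads the two polynomials as \(\frac12n(2n^2-n+1)\) and \(\frac12n(2n^3+2n^2-7n+5)\), and the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, product
from math import comb


def intersecting_counts(n):
    """Count ordered pairs and ordered triples of pairwise-intersecting 2-multisets."""
    # A multiset such as {1, 1} is stored as the set {1}; only its support matters here.
    multisets = [set(m) for m in combinations_with_replacement(range(1, n + 1), 2)]
    pairs = sum(1 for a, b in product(multisets, repeat=2) if a & b)
    triples = sum(
        1 for a, b, c in product(multisets, repeat=3)
        if a & b and a & c and b & c
    )
    return pairs, triples


def pair_formula(n):
    return n * (2 * n**2 - n + 1) // 2


def triple_formula(n):
    return n * (2 * n**3 + 2 * n**2 - 7 * n + 5) // 2


n = 3
pairs, triples = intersecting_counts(n)
total = comb(n + 1, 2)                      # number of 2-multisets from {1, ..., n}
p_single = Fraction(pairs, total**2)
p_triple = Fraction(triples, total**3)
print(p_single, p_triple, p_single**3)      # 2/3 7/18 8/27
```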
Theorem no. 121: Lambert's Formula

 How this formula arose out of Lambert's attempts to prove Euclid's 5th postulate is beautifully described here by Evelyn J. Lamb...
 ...who also describes and links to a beguiling interactive tool for tiling the Poincaré disk, part of a fine non-Euclidean geometry resource by Malin Christersson.
Theorem no. 122: The Borsuk–Ulam Theorem

 The original source for this theorem is Karol Borsuk, "Drei Sätze über die n-dimensionale euklidische Sphäre",
Fundamenta Mathematicae, 20 (1933), pp. 177–190; online.
H. Steinlein, "Spheres and symmetry: Borsuk's antipodal theorem", Topol. Methods Nonlinear Anal., Vol. 1, Number 1 (1993), pp. 15–33; online; cites two original papers by Borsuk, an earlier having appeared in 1932.
 In one dimension the theorem is an easy corollary of the Intermediate Value Theorem, see this from Plus magazine, for example.
Theorem no. 123: The Wedderburn–Artin Theorem

 An interesting article on the significance of Wedderburn–Artin is provided by Quora (with companion entries on significance of several other theorems).
Theorem no. 124: The Lagrange Interpolation Formula

 The link between Lagrange interpolation and the Chinese Remainder Theorem is discussed here.
 Of course linear Lagrange interpolation for sets of points in \(\mathbb{R}^2\) extends in various directions. A pretty derivation for linear interpolation in \(\mathbb{R}^n\) by Kamron Saniee is given in "A Simple Expression for Multivariate Lagrange Interpolation", SIAM Undergraduate Research Online (SIURO), Vol. 1, Issue 1, 2008; online. These lecture notes by Kostas Kokkotas do an excellent job of giving the wider context for interpolation (see chapter 3).
 You can find interpolation calculators on the web. This, from dCode, a French ciphers and cryptogram site, for example; or Wolfram Alpha which will respond to something like "interpolate (0,1), (1,4),(2,5)".
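The sample query "interpolate (0,1), (1,4),(2,5)" can also be answered directly from the Lagrange formula. A minimal sketch in exact arithmetic (the function name `lagrange_interpolate` is illustrative):

```python
from fractions import Fraction


def lagrange_interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through the points."""
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                # Basis factor (x - x_j) / (x_i - x_j): equals 1 at x_i and 0 at x_j.
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total


points = [(0, 1), (1, 4), (2, 5)]
values = [lagrange_interpolate(points, x) for x in range(4)]
print(values)
```

For these three points the interpolating polynomial is \(-x^2+4x+1\), so the values at \(x=0,1,2,3\) are \(1,4,5,4\).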
Theorem no. 125: The Skolem–Noether Theorem

 Original sources for this theorem:
 Skolem, Thoralf, Zur Theorie der assoziativen Zahlensysteme, Skrifter, Oslo (12), 1927 (a 50-page monograph; I don't think it is available digitally).
 Noether, E., "Nichtkommutative Algebra", Math. Z., 37, 1933, pp. 514–541; online.
 For a short (but not self-contained) proof of Skolem–Noether see this from the stacks project; an elementary proof is given by Jenő Szigeti and Leon van Wyk, "A Constructive Elementary Proof of the Skolem-Noether Theorem for Matrix Algebras", American Mathematical Monthly, vol. 124, no. 10, 2017, pp. 966–968; preprint.
Theorem no. 126: The Asymptotic (Half) Liar Formula

 The exponential nature of the Dumitriu–Spencer asymptotic can be appreciated by noting that, while \(U_1(7)\geq 15\) (the example illustrating the theorem), the value of \(U_1(25)\) exceeds \(10^6\), i.e. asking 25 questions is guaranteed to find a value between 1 and a million, in the face of at most one lie. A nice account of this result by
Deryk Osthus and Rachel Watkinson is here.
 The exact value of \(U_1(7)\) is \(16\); \(U_1(25)=1290554\). With such rapidly growing values it is more convenient to ask the liar question the other way round: given a value for \(n\) what is the least number, \(q_k(n)\), of questions which will guarantee a number in the range \(1,\ldots,n\) is determined, in the face of \(k\) lies? This is the version discussed by Osthus and Watkinson. Thus \(U_1(7)=16\) because \(q_1(16)=7\) but \(q_1(17)=8\).
 A comprehensive and very readable survey, "Searching games with errors—fifty years of coping with liars" by Andrzej Pelc, Theoretical Computer Science, 270, 71–109 , is available here via Elsevier's Open Access Archive.
 A tangential but intriguing link regarding the Fano plane, which generates Peter Cameron's trick illustrating this theorem, is this on using the Fano plane to generate poetry.
 Although I attribute the Delphic oracle image used in my illustration of this theorem to the Staatliche Museen Berlin I think this attribution is based merely on a Google image search. I have elsewhere seen it attributed to the "Collection of Joan Cadden".
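The exact values \(U_1(7)=16\) and \(U_1(25)=1290554\), and the values of \(q_1\), can be reproduced from a closed-form criterion for the game with at most one lie. The sketch below assumes the criterion usually attributed to Pelc, which is not restated in these notes: \(q\) questions suffice for the range \(1,\ldots,n\) exactly when \(n(q+1)\leq 2^q\) for even \(n\), or \(n(q+1)+(q-1)\leq 2^q\) for odd \(n\). The function names are illustrative.

```python
def winnable(n, q):
    """Assumed one-lie criterion: do q yes/no questions suffice for the range 1..n?"""
    if n % 2 == 0:
        return n * (q + 1) <= 2 ** q
    return n * (q + 1) + (q - 1) <= 2 ** q


def U1(q):
    """Largest n for which q questions suffice against at most one lie."""
    n = 2 ** q // (q + 1)          # above this value both inequalities fail
    while n > 0 and not winnable(n, q):
        n -= 1
    return n


def q1(n):
    """Least number of questions guaranteeing success for the range 1..n."""
    q = 0
    while not winnable(n, q):
        q += 1
    return q


print(U1(7), U1(25), q1(16), q1(17))   # 16 1290554 7 8
```

Note that \(\lfloor 2^{25}/26\rfloor = 1290555\) is odd and fails the odd-case inequality, which is why the quoted value is one less.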
Theorem no. 127: The Lucas–Lehmer Test

 Original sources for this theorem are:
 É. Lucas, "Théorie des fonctions numériques simplement périodiques", American Journal of Mathematics, 1 (1878), pp. 184–240, 289–321; online, also at edouardlucas.free.fr (1.1MB pdf).
 D. H. Lehmer, "An extended theory of Lucas’ functions", Annals of Mathematics, 2nd Ser., 31 (1930), pp. 419–448; online (paywall).
 I like this short blog entry, written by a Mersenne prime hunter, on using the Lucas–Lehmer test.
 There is a good entry here by 3010tangents about how to prove Lucas–Lehmer.
 For an elementary proof of Lucas–Lehmer see J. W. Bruce, "A really trivial proof of the Lucas–Lehmer test", Amer. Math. Monthly, 100 (1993), 370–371; online.
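For readers who want to try the test discussed in the entries above, here is a minimal sketch in its standard form: for an odd prime \(p\), set \(s_0=4\) and \(s_{i+1}=s_i^2-2 \bmod (2^p-1)\); then \(2^p-1\) is prime exactly when \(s_{p-2}=0\). The function name `lucas_lehmer` is illustrative.

```python
def lucas_lehmer(p):
    """Return True iff the Mersenne number 2**p - 1 is prime, for an odd prime p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
mersenne_exponents = [p for p in odd_primes if lucas_lehmer(p)]
print(mersenne_exponents)  # [3, 5, 7, 13, 17, 19, 31]
```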
Theorem no. 128: The Euclid–Euler Theorem

 The MacTutor archive offers a good page on the history of perfect numbers. Fermat's contribution is discussed in Colin R. Fletcher, "A reconstruction of the Frenicle-Fermat correspondence of 1640", Historia Mathematica,
Vol. 18, Issue 4, 1991, pp. 344–351; online.
 I found this wonderfully open-ended exploration of perfect numbers (1.8MB pdf) by Oliver Knill (one of many he has posted). Patrick Honner, for Quanta magazine, has this.
 Article no. 1 at John Voight's site is a nice exploration of odd perfect number properties. The current lower bound on odd perfect numbers appears to be Pascal Ochem and Michaël Rao, "Odd perfect numbers are greater than \(10^{1500}\)", Mathematics of Computation, Vol. 81, Issue 279, 2012, 1869–1877; online. They have raised this subsequently to \(10^{2000}\) according to this Quanta article by Steve Nadis, which gives a good account of recent (September 2020) research.
 A charming fact about perfect numbers is that they are harmonic divisor numbers, a fact proved by Øystein Ore in 1948. Indeed, Ore conjectured that all such numbers other than 1 are even, which would imply the nonexistence of odd perfect numbers.
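Ore's observation is easy to see in small cases. This minimal sketch uses the usual definition: \(n\) is a harmonic divisor number when the harmonic mean of its divisors, \(n\,\tau(n)/\sigma(n)\), is an integer, where \(\tau\) counts the divisors and \(\sigma\) sums them. The search limit of 1000 and the function names are illustrative.

```python
def divisors(n):
    """All positive divisors of n, by trial division (adequate for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def is_harmonic_divisor_number(n):
    """True iff n * tau(n) / sigma(n), the harmonic mean of the divisors, is an integer."""
    ds = divisors(n)
    return (n * len(ds)) % sum(ds) == 0


def is_perfect(n):
    return sum(divisors(n)) == 2 * n


harmonic = [n for n in range(1, 1001) if is_harmonic_divisor_number(n)]
perfect = [n for n in range(1, 1001) if is_perfect(n)]
print(harmonic)
print(perfect)
```

For a perfect number \(\sigma(n)=2n\), so the harmonic mean is \(\tau(n)/2\); the converse fails, 140 being a harmonic divisor number that is not perfect.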
Theorem no. 129: The Ollerenshaw–Brée Formula

 The entry for Ollerenshaw at Agnes Scott is good on technical details of her work on magic squares and cites her main publications in the area (starting 1986). The definitive source for this theorem is her book with David Brée, which is the further reading link on the theorem page.
 Evelyn Lamb wrote this very nice tribute to Ollerenshaw on the occasion of her 100th birthday. Another tribute to Ollerenshaw's work on magic squares was carried by the Royal Society Publishing Blog
 A depiction of a most-perfect magic square dating from the 10th century is found in the Jain temple Parshvanatha at Khajuraho in Madhya Pradesh.
 A good source of information on magic squares generally is the website of Francis Gaspalou.
Theorem no. 130: A Theorem on Apollonian Circle Packings

 The papers that inspired this theorem description can be located on Ron Graham's website dated 2003–2006. The theorem as given is from R. L. Graham, J.C. Lagarias, C.L. Mallows, A.R. Wilks, and C.H. Yan, "Apollonian circle packings: number theory", J. Number Theory, Vol. 100, no. 1, 2003, 1–45; online.
 Jerzy Kocik gives another treatment of specifying all integral Apollonian circle packings in this preprint "On a Diophantine equation that generates all integral Apollonian Gaskets".
Theorem no. 131: The Existence Theorem for Orthogonal Diagonal Latin Squares

 Original source for this theorem (as cited in its illustration) is John Wesley Brown, Fred Cherry, Lee Most, Mel Most, E.T. Parker and W.D. Wallis, "Completion of the spectrum of orthogonal diagonal latin squares" in Rolf S. Rees (ed.), Graphs, Matrices, and Designs, Routledge, 1992, pp. 43–49. When I last checked (Dec. 2020) Amazon's 'Look inside' gave access to all but the last two pages of this paper.
Theorem no. 132: Theaetetus' Theorem on the Platonic Solids

 A nice account of the proof of this theorem using Euler's Polyhedral Formula is given by John D. Cook here.
 I recommend this intriguing discussion by Pat Ballew of which of the Platonic solids is 'most spherical'. Rather in the same vein, a picture on twitter (14 Jan 2021) from @KangarooPhysics shows the solids wearing circular belts around their 'waists'.
 This theorem is the choice of Justin Curry in Episode 8 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
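 The counting step in the Euler-formula proof can be sketched in a few lines of Python. Assume every face is a \(p\)-gon and \(q\) faces meet at every vertex, so that \(pF = 2E = qV\); substituting into \(V - E + F = 2\) gives \(1/p + 1/q = 1/2 + 1/E\). The sketch below (function name and search limit are arbitrary choices for this note) lists the pairs \((p, q)\) that satisfy this, which is only the necessary-condition half of the argument.

```python
from fractions import Fraction

def platonic_candidates(limit=20):
    # Search p-gon faces with q meeting at each vertex, 3 <= p, q < limit.
    found = []
    for p in range(3, limit):
        for q in range(3, limit):
            excess = Fraction(1, p) + Fraction(1, q) - Fraction(1, 2)
            if excess > 0:
                edges = 1 / excess            # from 1/p + 1/q = 1/2 + 1/E
                vertices = 2 * edges / q      # from 2E = qV
                faces = 2 * edges / p         # from 2E = pF
                found.append((p, q, int(vertices), int(edges), int(faces)))
    return found

for p, q, v, e, f in platonic_candidates():
    print(f"p={p} q={q}: V={v} E={e} F={f}")
```

 Exactly five pairs survive, matching the tetrahedron, octahedron, icosahedron, cube and dodecahedron.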
Theorem no. 133: The Total Probability Theorem

 In The Doctrine of Chances: Probabilistic Aspects of Gambling, Springer-Verlag, 2010, Stewart N. Ethier writes (p. 68) "The conditioning law (... often called the law of total probability) was used without comment by Montmort and De Moivre. It was formalized in the derivation of Bayes's law."
 An excellent and much more thorough application of probability theory to the deuce rule of tennis is provided here by Chalkdust magazine.
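 For a feel of how conditioning applies to deuce, here is a minimal Python sketch. It assumes, purely for illustration, that the server wins each point independently with a fixed probability \(p\); this simplification is mine and is not taken from the Chalkdust article. Conditioning on the next two points gives \(P = p^2 + 2pqP\) with \(q = 1 - p\), hence \(P = p^2/(p^2 + q^2)\).

```python
from fractions import Fraction

def win_from_deuce(p):
    # Probability of winning the game from deuce when each point is won
    # independently with probability p (an assumed, simplified model).
    q = 1 - p
    return p * p / (p * p + q * q)

p = Fraction(3, 5)
q = 1 - p
P = win_from_deuce(p)
# Total probability: win both points, or split them (back to deuce) and then win.
print(P, P == p * p + 2 * p * q * P)
```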
Theorem no. 134: The Change of Variables Theorem
 The Victor Katz article referred to on the theorem page is Victor J. Katz, "Change of variables in multiple integrals: Euler to Cartan", Mathematics Magazine, Vol. 55, Issue 1, 1982, pp. 3–11; online (paywall); in December 2020 there was a facsimile version here (2nd down).
Theorem no. 135: Praeger's Theorem on Bounded Movement
 Original source for this theorem: Cheryl E. Praeger, "On Permutation Groups with Bounded Movement", Journal of Algebra, Vol. 144, Issue 2, 1991, pp. 436–442; online.
Theorem no. 136: Theorems of Euler and Rényi on e
 Original sources for this theorem:

Euler published his result in "Calcul de la probabilité dans le jeu de rencontre", Mémoires de l'Académie des Sciences de Berlin, année 1751, 7 (1753) pp. 255–270; online. It is unfair to call it a 'theorem of Euler' since De Montmort had already derived it in 1713, although without proof. The rich history of the problem of derangements is admirably charted in Lajos Takács, "The problem of coincidences", Archive for History of Exact Sciences, Vol. 21, Issue 3, 1980, pp. 229–244; online (paywall, there was a copy here in January 2021).
 A. Rényi, "Some remarks on the theory of trees", Publ. Math. Inst. Hungar. Acad. Sci., 4 (1959), pp. 73–85. I cannot find an online copy but there are more details including a proof of Rényi's formula in Lajos Takács, "On Cayley's formula for counting forests", Journal of Combinatorial Theory, Series A,
Vol. 53, Issue 2, 1990, pp. 321–323; online.
 The theory of permutation groups is naturally a rich source of results on derangements, beginning perhaps with Jordan's 1873 theorem that a transitive group always has one. Peter Cameron's blog has much on the topic (type 'derang' into the search box).
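 The appearance of \(e\) in the problem of derangements is easy to see numerically. The sketch below uses the standard recurrence \(D_n = (n-1)(D_{n-1} + D_{n-2})\) and compares \(D_n/n!\) with \(1/e\); the function name is just a label chosen for this note.

```python
import math

def derangements(n):
    # Number of permutations of n items with no fixed point,
    # via D_n = (n - 1) * (D_{n-1} + D_{n-2}), with D_0 = 1 and D_1 = 0.
    previous, current = 1, 0
    if n == 0:
        return previous
    for k in range(2, n + 1):
        previous, current = current, (k - 1) * (previous + current)
    return current

for n in range(1, 11):
    ratio = derangements(n) / math.factorial(n)
    print(n, derangements(n), ratio, abs(ratio - 1 / math.e))
```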
Theorem no. 137: Wallis's Product
 There are two superb articles on Wallis's work by Jacqueline A. Stedall in the September 2000 issue of the Royal Society's Notes and Records. The quote on the theorem page is from the first of these. They are paywalled but seem to be made open-access from time to time, so worth checking.
 There are closely related product formulae for other constants such as \(e\). The most elegant is perhaps that of Nicholas Pippenger, "An infinite product for e", American Mathematical Monthly, vol. 87, no. 5, 1980, p. 391; online (paywall). See also this preprint by Jonathan Sondow and Huang Yi.
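 To get a sense of how slowly Wallis's product converges, here is a short Python sketch. It assumes the usual form of the product, \(\pi/2 = \prod_{k\geq 1} \frac{4k^2}{4k^2-1}\); the function name is illustrative only.

```python
import math

def wallis_partial_product(terms):
    # Product of 4k^2 / (4k^2 - 1) for k = 1..terms; increases towards pi/2.
    product = 1.0
    for k in range(1, terms + 1):
        product *= (4 * k * k) / (4 * k * k - 1)
    return product

for terms in (10, 1000, 100000):
    approx = wallis_partial_product(terms)
    print(terms, approx, math.pi / 2 - approx)
```

 The error shrinks only roughly in proportion to the reciprocal of the number of factors, so the product is of more theoretical than computational interest.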
Theorem no. 138: Vaughan Pratt's Theorem

 Original source for this theorem: Vaughan R. Pratt, "Every prime has a succinct certificate", SIAM J. Comput., Vol. 4, No. 3, 1975, pp. 214–220; online (paywall, a scanned copy was here, February 2021).
 Since polynomial-time solvable problems are automatically in NP this theorem is, since 2002, a corollary of the well-known algorithm of Agrawal, Kayal and Saxena (see also the weblink for this theorem).
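 A rough idea of what checking a succinct certificate involves can be given in Python. The sketch assumes the usual Lucas-style criterion: \(p\) is prime if some witness \(a\) satisfies \(a^{p-1}\equiv 1 \pmod p\) and \(a^{(p-1)/q}\not\equiv 1 \pmod p\) for every prime \(q\) dividing \(p-1\), with each such \(q\) certified recursively. The dictionary layout for the certificate is a hypothetical choice made for this note, not Pratt's own notation.

```python
def verify_certificate(p, cert):
    # cert maps each odd prime to (witness, list of distinct prime factors of p - 1).
    if p == 2:
        return True
    if p not in cert:
        return False
    witness, factors = cert[p]
    remaining = p - 1
    for q in factors:
        if remaining % q != 0:
            return False
        while remaining % q == 0:
            remaining //= q
    if remaining != 1:                      # the listed factors must account for all of p - 1
        return False
    if pow(witness, p - 1, p) != 1:
        return False
    if any(pow(witness, (p - 1) // q, p) == 1 for q in factors):
        return False
    return all(verify_certificate(q, cert) for q in factors)

# A certificate for 31: 30 = 2 * 3 * 5, with 3 as witness.
certificate = {31: (3, [2, 3, 5]), 3: (2, [2]), 5: (2, [2])}
print(verify_certificate(31, certificate))
```

 Every step is a modular exponentiation or a division, which is why verification takes time polynomial in the number of digits of \(p\).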
Theorem no. 139: Strassen's Matrix Theorem

 Original source for this theorem: Strassen, Volker, "Gaussian elimination is not optimal", Numer. Math., 13, 1969, pp. 354–356; online (paywall, a facsimile is available from Göttinger Digitalisierungszentrum).
 Virginia Williams' account of her reduction in ω can be found in overview and technical versions at her website. There is a very good account of her work at Gödel's Lost Letter by Richard Lipton. Until October 2020, the race for the bottom for ω was between Williams and François Le Gall, who also has an article with Florent Urrutia on nonsquare matrix multiplication; in October 2020 a short lead was taken by Williams, as well described by Kevin Hartnett for Quanta Magazine. There is a short discussion about the problem at cstheory.stackexchange.
 An interesting discussion by Bill Gasarch on limitations of current approaches to getting ω = 2 + ε can be found at Lance Fortnow & Bill Gasarch's Computational Complexity blog.
 The complexity of matrix multiplication is intimately connected with some conjectures in extremal set theory as discussed by Gil Kalai.
 Conventional wisdom says that Strassen is only advantageous for large matrices but this is challenged in Jianyu Huang, Tyler M. Smith, Greg M. Henry and Robert A. van de Geijn, "Strassen's Algorithm Reloaded", Proc. The International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), Salt Lake City, UT, November 2016. A preprint is to be found on Huang's webpage.
 Paolo D’Alberto has a whole blog devoted to technical issues of fast matrix multiplication.
 Veit Elser has this intriguing machine learning preprint: "A network that learns Strassen multiplication".
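 For readers who want to see the trick at the heart of Strassen's algorithm, here is a Python sketch of the \(2\times 2\) case using seven multiplications instead of eight. The seven products follow the form commonly given in textbooks; the variable names are mine. Applied recursively to block matrices, this is what yields the exponent \(\log_2 7\).

```python
def strassen_2x2(A, B):
    # Multiply two 2x2 matrices (nested lists) using seven multiplications.
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```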
Theorem no. 140: A Theorem on Rectangular Tensegrities

 Original sources for this theorem:
 Bolker, E. D. and Crapo, H., "How to brace a one-story building", Environment and Planning B: Planning and Design, Vol 4, Issue 2, 1977, pp. 125–152; online (paywall).
 Jenny A. Baglivo and Jack E. Graver, Incidence and Symmetry in Design and Architecture, Cambridge University Press, 1983, Chapter 3, Section 1.
 This post by Dave Richardson gives a nice introduction to the subject of graph theory and rigidity.
 Joseph Malkevitch has a valuable bibliography (up to 2001) of rigidity publications in which the work cited in this theorem description can be located.
Theorem no. 141: The Piff–Welsh Theorem

 Original source for this theorem: M. J. Piff and D. J. A. Welsh, "On the vector representation of matroids", J. London Math. Soc., Vol. 2, no. 2, 1970, pp. 284–288; online (paywall).
 A fast algorithm for constructing representations of transversal matroids is given in Rekab-Eslami, M., Esmaeili, M. & Gulliver, T.A., "A fast algorithm to construct a representation for transversal matroids", Japan J. Indust. Appl. Math., 33, 2016, pp. 207–226; online (paywall).
 An excellent online introduction to matroid theory is provided by Joseph Malkevitch here.
Theorem no. 142: Sylvester's Law of Inertia

 Original source for this theorem: J.J. Sylvester, "A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares", Philosophical Magazine IV, 1852, pp. 138–142; online (facsimile from the Hathi Trust digital library, there is a pdf via the recommended webpage for this theorem: www.maths.ed.ac.uk/~aar/sylv/).
Theorem no. 143: The Robinson–Schensted–Knuth Correspondence

 Original sources for this theorem:
 G. de B. Robinson, "On the representations of the symmetric group", Amer. J. Math., 60 (3), 1938; online (paywall, there was a copy here, January 2021). Robinson adds a Part II and a Part III to this paper, in vol. 69 (2), 1947, and in vol. 70 (2), 1948, but I think only Part I is relevant here, although Knuth's investigations (see below) overlapped with Part II.
 Schensted, C., "Longest increasing and decreasing subsequences", Canadian Journal of Mathematics, 13, 1961, pp. 179–191; online.
 D.E. Knuth, "Permutations, matrices, and generalized Young tableaux", Pacific J. Math., Vol. 34, Number 3 (1970), 709–727; online.
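 The connection with Schensted's title can be illustrated with a short Python sketch of row insertion, restricted for simplicity to sequences of distinct numbers. The first row of the resulting tableau has the length of a longest increasing subsequence, and the number of rows is the length of a longest decreasing one. Function names are chosen for this note; the quadratic-time `lis_length` is included only as an independent check.

```python
from bisect import bisect_right

def schensted_insert(tableau, x):
    # Row-insert x: in each row, bump the first entry larger than x to the next row.
    for row in tableau:
        i = bisect_right(row, x)
        if i == len(row):
            row.append(x)
            return
        row[i], x = x, row[i]
    tableau.append([x])

def insertion_tableau(seq):
    tableau = []
    for x in seq:
        schensted_insert(tableau, x)
    return tableau

def lis_length(seq):
    # Straightforward dynamic programme for the longest increasing subsequence.
    best = []
    for i, x in enumerate(seq):
        best.append(1 + max((best[j] for j in range(i) if seq[j] < x), default=0))
    return max(best, default=0)

perm = [3, 1, 4, 2, 5]
P = insertion_tableau(perm)
print(P, len(P[0]), lis_length(perm))
```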
Theorem no. 144: Lieb's Square Ice Theorem

 Elliott Lieb's original paper is "Residual Entropy of Square Ice", The Physical Review, vol. 162, no. 1, 1967, pp 162–172. Even after 50 years you still have to pay to read it online but the first two pages are displayed here. (For Russian readers it is translated online here.)
 I originally gave as a web link from the theorem page a very nice but technical article by Stefan Felsner, Florian Zickfeld: "On the number of planar orientations with prescribed degrees", Electronic Journal of Combinatorics, vol. 15, 2008; online. This deals with orientations of planar graphs in much more generality — Lieb's result appears in section 2.2.
 It has been discovered that water can form into square ice at room temperature by confining it using layers of graphene.
Theorem no. 145: The Contraction Mapping Theorem

 mathcounterexamples.net has a valuable collection of counterexamples showing that all the hypotheses of this theorem are needed.
 This theorem is the choice of Vidit Nanda in Episode 24 of Kevin Knudson and Evelyn Lamb's My Favorite Theorem podcast.
 John D. Cook has an intriguing blog entry on Kepler's use of Banach's theorem three centuries before it was discovered!
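 A minimal sketch of the kind of iteration involved, assuming the equation in question is what is usually called Kepler's equation, \(E = M + e\sin E\) with eccentricity \(0\leq e<1\): the map \(E\mapsto M + e\sin E\) is a contraction with constant \(e\), so repeated application converges to the unique solution from any starting value. The function name, tolerance and iteration cap below are arbitrary choices, and the code is not meant to reproduce Kepler's own procedure.

```python
import math

def solve_kepler(mean_anomaly, eccentricity, tol=1e-12, max_iter=1000):
    # Fixed-point iteration E <- M + e*sin(E); contraction constant is e < 1.
    E = mean_anomaly
    for _ in range(max_iter):
        E_next = mean_anomaly + eccentricity * math.sin(E)
        if abs(E_next - E) < tol:
            return E_next
        E = E_next
    raise RuntimeError("iteration did not converge")

E = solve_kepler(1.0, 0.2)
print(E, E - 0.2 * math.sin(E))
```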
Theorem no. 146: The Panarboreal Formula

 Original sources for this theorem: the theorem as given is from F. R. K. Chung and R. L. Graham, "On universal graphs for spanning trees", J. Lond. Math. Soc., Vol. s2-27, Issue 2, 1983, pp. 203–211; online (paywall, 280KB pdf, April 2021) but is first mentioned in their 1979 paper "On universal graphs", Annals of the New York Academy of Sciences, 319 (1979), 136–140; 220KB pdf, April 2021. The \(\frac12 n\log n\) lower bound on the size of a 'universal graph' is proved as Theorem 1 in F.R.K. Chung and R.L. Graham, "On graphs which contain all small trees", J. Combinat. Theory B, vol. 24, issue 1, 1978, pp 14–23; online.
 Some more work on panarboreal graphs and related issues is given in section 6.8 of "Spanning trees – A survey" by Kenta Ozeki and Tomoki Yamashita, 2010. A good source for recent research on universal graphs is Daniel Johannsen, Michael Krivelevich, Wojciech Samotij, "Expanders are universal for the class of all spanning trees", Combinatorics, Probability and Computing, Vol. 22, Issue 2, 2013, pp. 253–281; online (paywall, arxiv).
 The sequence of sizes of panarboreal graphs starts 0, 1, 2, 4, 6, 8, 11, 13, 16, 18, and is OEIS sequence A004401. This is the number of edges a graph needs to have in order to contain all \(n\)-vertex trees. This is possibly smaller than for the question asked in our presentation of Chung and Graham's theorem since they ask for the number of edges when we insist the graph must also have \(n\) vertices. The values are perhaps the same, as pointed out in the OEIS entry.
Theorem no. 147: The Sophomore's Dream

 The decimal expansion of \(\int_0^1 x^x dx\) and related material can be found at oeis.org/A083648; \(\int_0^1 x^{-x} dx\) is sequence A073009.
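 Both constants are easy to check numerically. The Python sketch below compares a crude midpoint-rule estimate of each integral with the corresponding series, \(\int_0^1 x^{-x}dx = \sum_{n\geq 1} n^{-n}\) and \(\int_0^1 x^{x}dx = \sum_{n\geq 1} (-1)^{n+1} n^{-n}\). The step count and the number of series terms are arbitrary choices made for this note.

```python
def midpoint_integral(f, steps=100000):
    # Midpoint rule on [0, 1]; avoids evaluating f at the endpoint x = 0.
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))

def dream_series(sign, terms=25):
    # sign=+1: sum of n^(-n); sign=-1: alternating sum (-1)^(n+1) n^(-n).
    return sum((1 if sign > 0 or n % 2 else -1) * float(n) ** (-n)
               for n in range(1, terms + 1))

integral_minus = midpoint_integral(lambda x: x ** (-x))
integral_plus = midpoint_integral(lambda x: x ** x)
print(integral_minus, dream_series(+1))
print(integral_plus, dream_series(-1))
```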
Theorem no. 148: A Theorem of Schur on Real-Rootedness

 Original source for this theorem: J. Schur, "Zwei Sätze über algebraische Gleichungen mit lauter reellen Wurzeln", J. Reine Angew. Math., 144, 1914, pp. 75–88; online (paywall, a facsimile is provided at Göttinger Digitalisierungszentrum). Schur's use of the initial "J" is commented upon in his wiki entry. The result of Ernest Malo is in "Note sur équations algébriques dont toutes les racines sont réelles", Journal de Mathématiques spéciales, (ser.4), t. 4, 1895, p. 7–10 (I don't find this online, even for ready money).
Theorem no. 149: Euclid's Triangular Prism

 The original weblink for this theorem page was to Richard Fitzpatrick's site which links to a dual-language complete Elements, nearly 5MB but a truly definitive web resource. This has been replaced on the page, just because it is more immediately accessible, by a link to David E. Joyce's page for Euclid 12.7.
Theorem no. 150: Woodall's Hopping Lemma

 Original source for this theorem is: D.R. Woodall, "The binding number of a graph and its Anderson number", J. Combinatorial Theory, Series B,
Vol. 15, Issue 3, December 1973, pp. 225–255; online. The theorem in question is Lemma 12.3.
 There is a good chapter on the Hopping Lemma in this 1995 LSE PhD dissertation of Sarah Jane Goodall.
 This arxiv preprint of Jan Kessler and Jens M. Schmidt seems of interest, offering "a polyhedral relative of Woodall's Hopping Lemma that allows cycle extensions through common neighbors of cycle vertex pairs even when none of these pairs have distance two in C". (See also this overview).
Theorem no. 151: The Small Prime Gaps Theorem

 Original source for this theorem: Daniel A. Goldston, János Pintz, Cem Yalçın Yıldırım, "Primes in tuples, I", Annals of Mathematics, Vol. 170, Issue 2, 2009, pp. 819–862; online. Three further papers in the series extract further implications from the same methods: "Primes in tuples, II", Acta Mathematica, Vol. 204, Issue 1, 2010, pp. 1–47; online. "Primes in tuples III: On the difference \(p_{n+\nu}-p_{n}\)", Funct. Approx. Comment. Math., 35, 2006, pp. 79–89; online; "Primes in tuples IV: Density of small gaps between consecutive primes", Acta Arithmetica, 160 (1), 2013, pp. 37–53; online.
 For the sake of contrasting this theorem with subsequent proofs that \(p_{n+1}-p_n\leq c\) infinitely often for a constant \(c\), its conclusion may be replaced by \(p_{n+1}-p_n\leq (\log p_n)^{1/2+\epsilon}\). The progress towards current knowledge is beautifully described by Terence Tao in this YouTube lecture and, for a more general audience, this lecture by Vicky Neale.
 The background to Zhang's proof of \(p_{n+1}-p_n\leq c\) and subsequent improvements by Maynard and Tao are described by John Friedlander in "Prime Numbers: A Much Needed Gap Is Finally Found", Notices of the AMS, Vol. 62, No. 6, June/July 2015, 660–664, online here. A more technical overview is given by Andrew Granville, "Primes in intervals of bounded length", Bull. Amer. Math. Soc., 52 (2015), 171–222, online here.
Theorem no. 152: De Moivre's Theorem

 The historical context of De Moivre's theorem is described in David R. Bellhouse and Christian Genest, "Maty’s Biography of Abraham De Moivre, Translated, Annotated and Augmented", Statistical Science, Vol. 22, No. 1, 2007, pp. 109–136; online. See, in particular, footnote 46 on p. 118.
 Here is a cute demonstration, using De Moivre, that the sequence \(\cos(1),\cos(2),\cos(3),\ldots\) does not converge.
Theorem no. 153: Euler's Partition Identity

 Original source for this theorem: Leonhard Euler, Introductio in Analysin Infinitorum, Vol. 1, 1748, Chapter 16, Section 326. A translation into English of the whole work (and Vol. 2) has been provided by Ian Bruce at his website 17centurymaths.com, where the original Latin may also be found, in pdf. The 'official' source for the work is here at the Euler Archive.
 D. H. Lehmer has provided a valuable explanation of two generalisations of Euler's identity, Glaisher's Theorem and Roger's Theorem in "Two nonexistence theorems on partitions", Bull. Amer. Math. Soc., Vol. 52, Number 6, 1946, pp. 538–544; online.
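 Euler's identity, in its usual form, says that the number of partitions of \(n\) into odd parts equals the number of partitions of \(n\) into distinct parts. A quick Python check of this for small \(n\), with function names invented for this note, counts both kinds of partition by dynamic programming:

```python
def partitions_into_odd_parts(n):
    # Each odd part may be used any number of times.
    ways = [1] + [0] * n
    for part in range(1, n + 1, 2):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]

def partitions_into_distinct_parts(n):
    # Each part may be used at most once (iterate totals downwards).
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(n, part - 1, -1):
            ways[total] += ways[total - part]
    return ways[n]

for n in range(1, 13):
    print(n, partitions_into_odd_parts(n), partitions_into_distinct_parts(n))
```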
Theorem no. 154: The Remainder Theorem

 Colin Beveridge has a good intuitive introduction to the remainder and factor theorems at FlyingColoursMaths.
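 The remainder theorem is what makes synthetic division (Horner's scheme) work: dividing \(f(x)\) by \(x - a\) leaves remainder \(f(a)\). A small Python sketch, with an illustrative function name and coefficients listed from the highest degree down:

```python
def divide_by_linear(coeffs, a):
    # Synthetic division of a polynomial (highest-degree coefficient first,
    # non-empty list) by (x - a). Returns (quotient coefficients, remainder).
    running = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c
        running.append(carry)
    return running[:-1], running[-1]

f = [1, -6, 11, -6]   # x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
print(divide_by_linear(f, 2))   # remainder 0, so (x - 2) is a factor
print(divide_by_linear(f, 4))   # remainder equals f(4)
```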
Theorem no. 155: A Tripartite Turán Theorem

 The source for this theorem is Adrian Bondy, Jian Shen, Stéphan Thomassé and Carsten Thomassen, "Density Conditions For Triangles In Multipartite Graphs", Combinatorica, Vol. 26, Issue 2, 2006, pp 121–131; online (paywall); preprint.
Theorem no. 156: The Lecture Hall Partition Theorem

 The source for this theorem is Bousquet-Mélou, M., Eriksson, K., "Lecture Hall Partitions", The Ramanujan Journal, 1, 1997, pp. 101–111; online.
Theorem no. 157: The Transversal Matroid Theorem

 Original sources for this theorem:
 Edmonds, J; Fulkerson, D.R., "Transversals and matroid partition", Journal of Research of the National Bureau of Standards, Section B, vol 69, issue 3, 1965, pp. 147–153; online;
 Mirsky, L. and Perfect, H., "Applications of the notion of independence to problems of combinatorial analysis", J. Combinatorial Theory, vol. 2, issue 3, 1967, pp. 327–357; online.
 There is a very valuable pair of posts on Qiaochu Yuan's Annoying Precision blog giving "proofs of the Sylow theorems which I am actually able to remember".
Theorem no. 158: The Albert–Brauer–Hasse–Noether Main Theorem

 The restriction of the Main theorem to number fields is essential: not every finite-dimensional division algebra is a cyclic algebra. A counterexample, due to A. Adrian Albert, is described on page 57 of Lewis's article (the weblink on the theorem page).
 There is more on Käte Hey's contributions to modern algebra and number theory in Peter Roquette's "Class Field Theory in characteristic p. Its origin and development".
Theorem no. 159: The McIver–Neumann Half-n Bound

 Original source for this theorem: A. McIver and P. M. Neumann, "Enumerating finite groups", Quart. J. Math. Vol. 38, Issue 4, 1987, pp. 473–488; online (paywall). Some background is given via Peter Cameron's blog.
 The Frobenius groups, of which \(F_{20}\) is used to illustrate this theorem page, have a nice description by Emmanuel Amiot here. \(F_{20}\) makes another appearance illustrating the third isomorphism theorem.
Theorem no. 160: The Classification of Archimedean 4-Polytopes

 Tony Phillips' review (a 2.4MB download from here) of Tony Robbin's Shadows of Reality, Yale University Press, 2006, contains much fascinating material on 4-dimensional solids. Also recommended is Snezana Lawrence, "Life, architecture, mathematics, and the fourth dimension", Nexus Network Journal, 17, 587–604 (2015); online.
 The Historia Mathematica article by Irene Polo-Blanco is chapter 5 of her excellent 2007 University of Groningen thesis Theory and History of Geometric Models which may be found online here.
 If only Alicia Boole Stott could have tried the virtual reality game Hypernom!
Theorem no. 161: Quadratic Nonresidue is Zero-Knowledge Provable

 There is at least one real-life illustration of zero-knowledge proofs in the field of nuclear disarmament. A follow up.
 Jeremy Kun has a fine series of blog posts on zero-knowledge proofs. Follow from here.
 There is a well-known presentation of the zero-knowledge paradigm: Quisquater J.-J. et al. (1990) "How to explain zero-knowledge protocols to your children". In: Brassard G. (ed.) Advances in Cryptology — CRYPTO ’89 Proceedings. CRYPTO 1989. Lecture Notes in Computer Science, vol 435. Springer, New York, NY; online (paywall); facsimile. The 'et al.' in the citation represents "Myriam Quisquater, Muriel Quisquater, Michaël Quisquater, Louis Guillou, Marie Annick Guillou, Gaïd Guillou, Anna Guillou, Gwenolé Guillou, Soazig Guillou" who I presume include the 'children' (of Jean-Jacques Quisquater and Louis Guillou). Tom Berson is credited with the English version.
Theorem no. 162: Heath’s Finitely Discontinuous Function Theorem

 Original source for this theorem is Jo Heath, "k-to-1 functions between graphs with finitely many discontinuities", Proc. AMS, vol. 103, no. 2, June 1988, pp. 661–666; online. The term 'wiggle' for limit constructions of continuous functions appears in print in, for example, John Baptist Gauci, Anthony J. W. Hilton and Dudley Stark, "Wiggles and finitely discontinuous k-to-1 functions between graphs", J. Graph Theory, Vol. 74, Issue 3, November 2013, pp. 275–308; online (paywall); but Anthony Hilton's use of the term dates at least as far back as 2008 when he talked about wiggles to the Combinatorics Study Group at Queen Mary University of London (see 12 December).
Theorem no. 163: The Friendship Theorem

 A useful little entry at The Futility Closet links to an elementary (purely graph-theoretic) proof of this theorem by Judith Longyear and Torrence Parsons.
Theorem no. 164: The Diaconis–Holmes–Montgomery Coin-Tossing Theorem

 Original source for this theorem is Persi Diaconis, Susan Holmes, and Richard Montgomery, "Dynamical bias in the coin toss", SIAM Review, 2007, Vol. 49, No. 2, pp. 211–235; online (paywall).
What I presume is a preprint is available here (5MB pdf file) on Holmes' webpage. My illustration of the theorem is partly based on figures 3 and 4 of this preprint.
Theorem no. 165: Lin McMullin's Theorem

 Original sources for this theorem are L. McMullin, A. Weeks, "The golden ratio and fourth degree polynomials", OnMath, Winter 2004–05, Vol., Number 2; and McMullin, L., "How I found the golden ratio on my CAS", The North Carolina Association of Advanced Placement Mathematics Teachers Newsletter, 13 (1) (Winter 2005) pp. 6–7. Neither article appears easy to track down now. The theorem appears to have been discovered before, by Herman Theodor Rendtorff Aude of Colgate University: HTR Aude, "Notes on Quartic curves", The American Mathematical Monthly, Vol. 56, Issue 3, 1949, pp. 165–170; online (paywall); see also, Reinert A. Rinvold, "Fourth degree polynomials and the golden ratio", The Mathematical Gazette, Vol. 93, Issue 527, July 2009, pp. 292–295; online (paywall).
Theorem no. 166: Haken's Unknot Theorem

 Original source for this theorem: Wolfgang Haken, "Theorie der Normalflächen: Ein Isotopiekriterium für den Kreisknoten", Acta Math., Vol. 105, Number 3–4 (1961), pp. 245–375; online.
 For more on the complexity of determining unknottedness there is a wiki page on the subject. Notably, a quasi-polynomial-time (i.e. \(n^{O(\log n)}\)) algorithm was announced in February 2021 by Marc Lackenby.
 For details of the implementation of recognition algorithms for the unknot, notably that of Joan Birman and Michael Hirsch, see Joan S. Birman, Marta Rampichini, Paolo Boldi and Sebastiano Vigna, "Towards an implementation of the BH algorithm for recognizing the unknot", J. Knot Theory and its Ramifications, vol. 11, no. 4, 2002, pp. 601–645; online (paywall); preprint. For more on Birman's work in low-dimensional topology generally see her MacTutor entry.
 Knot theory deals with embeddings of the 1-sphere \(S^1\) (a closed curve) into 1 + 2 = 3 dimensions. More generally we can embed the \(n\)-sphere \(S^n\) into \(n + 2\) dimensions. Thus \(S^2\), a hollow sphere, is embedded into 4-dimensional space. And we can ask about the decision problem: is our embedding the 'unknot'? For \(n\geq 3\) the answer is that the question is undecidable: this was proved in 1996 by Alexander Nabutovsky and Shmuel Weinberger. However, decidability in the case \(n=2\) remains an open problem. A wonderful discussion of these issues is given here by Bjorn Poonen.
Theorem no. 167: The Lindemann–Weierstrass Theorem

 The proof of the transcendence of \(e\), Charles Hermite's breakthrough 1873 result, is very clearly described here.
 The Dottie number has its own Wikipedia page.
Theorem no. 168: The Max-Flow Min-Cut Theorem

 Original sources for this theorem:
 Ford, L. R. and Fulkerson, D. R., "Maximal flow through a network", Canadian Journal of Mathematics, Vol. 8, 1956, pp. 399–404; online; previously appeared as RAND report P-605; online (paywall).
 P. Elias, A. Feinstein and C. Shannon, "A note on the maximum flow through a network", IRE Transactions on Information Theory, Vol. 2, Issue 4, 1956, pp. 117–119; online (paywall). A facsimile is available via semanticscholar.
 The Elias et al paper cites a third source: G. B. Dantzig and D. R. Fulkerson, "On the Max-Flow Min-Cut Theorem of Networks", in "Linear Inequalities", Ann. Math. Studies, no. 38, Princeton, New Jersey, 1956, which I presume to be a reincarnation of a RAND report of the same name: P-826, 1955; online (paywall); a facsimile can be read here (1.1MB pdf); the purport seems to be that Dantzig-Fulkerson is the first 'constructive' (algorithmic) proof of the Max-flow Min-cut theorem.
 There is more on the history of this theorem at Whitty, R. W. "Some comments on multiple discovery in mathematics", Journal of Humanistic Mathematics, Volume 7, Issue 1 (January 2017), pp. 172–188; online (see top of page 179).
 In "Some comments" there is an attribution to Anton Kotzig, which was repeated in earlier versions of this theorem page: "restricted as in our example to integer capacities, by A. Kotzig". I can no longer locate the source for this attribution; it is specific enough that I must have read it somewhere but I would expect such research to be mentioned in such an authoritative tribute as
Jaromír Abrham, Alexander Rosa, Gert Sabidussi and Jean M. Turgeon, "Anton Kotzig 1919–1991", Mathematica Slovaca, Vol. 42, No. 3, 1992, pp. 381–383; online.
Theorem no. 169: Sokal's Theorem on Chromatic Roots

 The original source for this theorem is Alan D. Sokal, "Chromatic roots are dense in the whole complex plane", Combinatorics, Probability and Computing, Vol. 13, Issue 2, March 2004, pp. 221–261; online (paywalled); preprint. Some interesting subsequent work by Adam Bohn is reported in "A dense set of chromatic roots which is closed under multiplication by positive integers", Discrete Mathematics, Vol. 321, 28 April 2014, Pages 45–52; online.
Theorem no. 170: Machin's Formula

 Machin's Formula has an alternative statement viz \(\tau/8=4\cot^{-1}5-\cot^{-1}239\), thanks to the relationship between \(\cot^{-1} x\) and \(\tan^{-1}(1/x)\). This version is somewhat neater (although the inverse cotangent function is not directly available on calculators or in spreadsheets) and is preferred by some writers, cf. Pat's Blog.
 Another entry at Pat's Blog offers an interesting synopsis of the history of Machin's series.
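 Since the inverse cotangent is rarely built in, a numerical check of the cotangent form of Machin's formula needs the identity \(\cot^{-1}x = \tan^{-1}(1/x)\) for \(x>0\). A minimal Python sketch (the helper name `acot` is my own):

```python
import math

def acot(x):
    # Inverse cotangent for x > 0, via acot(x) = atan(1/x).
    return math.atan(1.0 / x)

machin = 4 * acot(5) - acot(239)
print(machin, math.tau / 8, math.pi / 4)
```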
Theorem no. 171: The BEST Theorem

 Original sources for this theorem:
 Tutte, W. T. and Smith, C. A. B., "On unicursal paths in a network of degree 4", American Mathematical Monthly, Vol. 48, Issue 4, 1941, pp. 233–237; online (paywall).
 van Aardenne-Ehrenfest, T. and de Bruijn, N. G., "Circuits and trees in oriented linear graphs", Simon Stevin: Wis- en Natuurkundig Tijdschrift, 28, 1951, pp. 203–217; online.
 The BEST theorem finds an application in probability theory in an interesting contribution to exchangeability of random variables: Ivan Bardet, Cécilia Lancien, Ion Nechita, "de Finetti reductions for partially exchangeable probability distributions"; online preprint. The application has its origins in a paper from 1984 of Arif Zaman: "Urn models of Markov exchangeability", Ann. Probab., Vol. 12, No. 1 (1984), 223–229; online. Thanks to Ion Nechita for alerting me to this.
 A further rich source of connections to knot theory and graph polynomials is Richard Arratia, Béla Bollobás and Gregory B. Sorkin, "The interlace polynomial of a graph", Journal of Combinatorial Theory, Series B,
Vol. 92, Issue 2, 2004, pp. 199–233; online.
Theorem no. 172: The Dyson–Andrews–Garvan Crank

 Original sources for this theorem:
 F.J. Dyson, "Some guesses in the theory of partitions", Eureka, Vol. 8, 1944, pp. 10–15; online (the same edition has an elegant proof by Dyson of the Fundamental Theorem of Algebra).
 George E. Andrews and F. G. Garvan, "Dyson's crank of a partition", Bull. Amer. Math. Soc. (N.S.),
Vol. 18, No. 2 (1988), 167–171; online.
 The table illustrating this theorem page now seems to me rather impenetrable! The colour blocks (in row-major order) correspond to partitions of 17 whose \(M\) count (number of 1s) is \(17,16,15,\ldots, 1,0\). The lengths of the blocks are given by OEIS sequence A002865. The entries (crank values) are just \(-M \!\!\mod 11\) until the 3rd entry in row 2, which records the partition consisting of 8 1s and a 9. Here \(N=1\), so the crank is \(1-8 \!\!\mod 11=4\). The final 66 table entries correspond to partitions of 17 having zero 1s, with the crank value being the value of \(\lambda\). Partitions are arranged lexicographically, with the final few partitions being \((5\,6\,6),\, (5\,12),\,(6\,11),\,(7\,10),\,(8\,9),\,(17)\).
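 The table can be regenerated with a few lines of Python. The sketch below assumes the Andrews–Garvan definition of the crank: the largest part if the partition has no 1s, and otherwise \(N - M\), where \(M\) is the number of 1s and \(N\) the number of parts larger than \(M\). Function names are chosen for this note. It tallies the crank modulo 11 over all partitions of 17.

```python
from collections import Counter

def partitions(n, max_part=None):
    # Generate partitions of n as non-increasing tuples.
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def crank(partition):
    ones = partition.count(1)
    if ones == 0:
        return max(partition)
    larger = sum(1 for part in partition if part > ones)
    return larger - ones

all_partitions = list(partitions(17))
tally = Counter(crank(p) % 11 for p in all_partitions)
print(len(all_partitions), sorted(tally.items()))
```

 Each of the eleven residue classes receives the same number of partitions, which is the combinatorial explanation of \(p(17)\equiv 0 \pmod{11}\).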
Theorem no. 173: The Ramanujan Partition Congruences

 Original sources for this theorem:
 Ramanujan, S., "Some properties of p(n), the number of partitions of n", Proc. Cambridge Philosophical Society, 19, 1919, pp. 207–210; online (transcription at ramanujan.sirinudi.org).
 Ramanujan, S., "Congruence properties of partitions", Mathematische Zeitschrift, 9 (1–2), 1921, pp. 147–153; online (paywall, a transcription at ramanujan.sirinudi.org) (prepared from Ramanujan's manuscripts after his death by G.H. Hardy).
 There is a very nice account of the \(p=5\) congruence ('Ramanujan's most beautiful identity') by Christian Krattenthaler in the June 2017 issue of the European Mathematical Society Newsletter. A direct link to the pdf file is here (2.8MB), with Krattenthaler's article beginning on p. 41 and the Ramanujan part beginning on p. 47.
 A good discussion "Congruence properties of the partition function" by Tony Forbes is here (pdf, 0.5MB download, 100 pages long but the last 90 pages form an appendix listing computer-generated identities and can be ignored by most readers, I imagine).
Theorem no. 174: The Cameron–Fon-Der-Flaass IBIS Theorem

 Original source for this theorem: P.J. Cameron and D.G. Fon-Der-Flaass, "Bases for permutation groups and matroids", European J. Comb., Vol. 16, Issue 6, 1995, pp. 537–544; online.
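 The three congruences, \(p(5k+4)\equiv 0 \pmod 5\), \(p(7k+5)\equiv 0 \pmod 7\) and \(p(11k+6)\equiv 0 \pmod{11}\), are easily checked by machine for small arguments. A Python sketch, with an arbitrary cut-off of 300:

```python
def partition_numbers(limit):
    # p(0), ..., p(limit) by the standard coin-counting dynamic programme.
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p

p = partition_numbers(300)
for modulus, offset in ((5, 4), (7, 5), (11, 6)):
    ok = all(p[n] % modulus == 0 for n in range(offset, 301, modulus))
    print(modulus, offset, ok)
```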
 Peter Cameron has a nice tribute post on his blog which describes the IBIS theorem.
 The original weblink from this theorem page was a fine article "Quantifying symmetry" by Jonathan A. Cohen, The Australian Mathematical Society Gazette,
Vol. 32, Number 2, May 2005. It offers a very good introduction to bases of permutation groups but doesn't mention IBIS groups which is why eventually I preferred to link to a talk, by Cameron, which does.
Theorem no. 175: The Bungers–Lehmer Theorem on Cyclotomic Coefficients

 Original sources for this theorem:
 A. Migotti, "Zur Theorie der Kreisteilungsgleichung", Sitzber. Math.-Naturwiss. Classe der Kaiser. Akad. der Wiss., 87, 1883, pp. 7–14; I don't find this source online. The result appears to have been discovered independently by A. S. Bang, "Om Ligningen φ_{n}(x) = 0", Nyt tidsskrift for matematik, Vol. 6, Afdeling B, 1895, pp. 6–12; online (paywall). The paper is in Danish which I don't read; the attribution is in Marion Beiter, "The midterm coefficient of the cyclotomic polynomial F_{pq}(x)", The American Mathematical Monthly, Vol. 71, No. 7, 1964, pp. 769–770; online (paywall, 300KB pdf download, March 2021).
 Emma Lehmer, "On the magnitude of the coefficients of the cyclotomic polynomial", Bull. Amer. Math. Soc., 42 (6), 1936, pp. 389–392; online.
 Rolf Bungers' proof conditional on the infinitude of twin primes is cited by Lehmer as appearing in his 1934 Göttingen dissertation, which I think is "Über die Koeffizienten von Kreisteilungspolynomen", which shows up, minimally, on Google Books.
 Jiro Suzuki's proof that any integer is a cyclotomic coefficient is in "On coefficients of cyclotomic polynomials",
Proc. Japan Acad. Ser. A Math. Sci., 63(7), 1987, pp. 279–280; online. The result is strengthened in Chun-Gang Ji and Wei-Ping Li, "Values of coefficients of cyclotomic polynomials", Discrete Mathematics, Vol. 308, Issue 23, 2008, pp. 5860–5863; online.
 A tight superpolynomial bound on the growth of maximum absolute values of coefficients was obtained in 1949 by Paul Erdős (lower bound) and Paul T. Bateman (upper bound). See Erdős, P., "On the growth of the cyclotomic polynomial in the interval (0,1)", Glasgow Math. J., Vol. 3, Issue 2, 1957, pp. 102–104; online.
 The famous proof of Wedderburn's Little Theorem by Ernst Witt is based on cyclotomic polynomials and is a great tangent to follow. See the recommended weblink on that page.
 An exciting new development in the study of cyclotomic coefficients is the discovery by Gregg Musiker and Victor Reiner of a topological interpretation. See this (250KB pdf), for example.
Theorem no. 176: The Existence Theorem for Bachelor Latin Squares

 Original sources for this theorem: Evans, A.B., "Latin squares without orthogonal mates", Designs, Codes and Cryptography, Vol. 40, 2006, pp. 121–130; online (paywall); and Wanless, I.M. and Webb, B.S., "The existence of latin squares without orthogonal mates", Designs, Codes and Cryptography, Vol. 40, 2006, pp. 131–135; online (paywall); reprint.
Theorem no. 177: The Heine–Borel Theorem

 Original source for this theorem: Borel, Émile, "Sur quelques points de la théorie des fonctions", Annales Scientifiques de l'École Normale Supérieure, 3, 12, 1895, pp. 9–55; online. Nicole R. Andre, Susannah M. Engdahl and Adam E. Parker give a wonderful early history of this result in "An Analysis of the First Proofs of the Heine–Borel Theorem", Convergence, Vol. 9, 2012, online here. The Wiki page on compactness is also very good.
 The notion of compactness is given a very good intuitive treatment by Evelyn Lamb here.
Theorem no. 178: Sendov's Conjecture

 This is a 'theorem under construction': I hope to chart exciting developments here towards an eventual final version, which may or may not confirm Sendov's conjecture for polynomials of arbitrary degree.
 The proof of Sendov for degree 8 is: Johnny E. Brown and Guangping Xiang, "Proof of The Sendov Conjecture for Polynomials of Degree at Most Eight", Journal of Mathematical Analysis and Applications, Vol. 232, Issue 2, 1999, 272–292; online.
 Jérôme Dégot's proof of Sendov for high degree appeared in 2014 as "Sendov conjecture for high degree polynomials", Proc. AMS, vol. 142 (2014), 1337–1349. It is pay-to-view online but a preprint is here.
 There is a nice snapshot of Dégot's result at about p. 100 of this cornucopia (1.1MB pdf) by Pamela Gorkin.
 A paper by Zaizhao Meng on the arxiv claims a proof of Sendov for polynomials of degree 9. A paper by Dinesh Sharma Bhattarai claims a proof of Sendov for polynomials of degree 10. Also this and this claiming to prove the conjecture outright, but posting errors.
 Progress has been made by Robert Dalmasso for the case where the zeros of the polynomial are simple.
 Terence Tao has posted an unconditional proof of Sendov for high degree: "Sendov’s conjecture for sufficiently high degree polynomials". The introduction has a more complete review than the above of recent work on the conjecture.
Theorem no. 179: The Descartes Circle Theorem

 A proof 'from the book' of this theorem is given in Levrie, Paul, "A straightforward proof of Descartes's circle theorem", The Mathematical Intelligencer, 41:3 (2019), pp. 24–27; online.
Theorem no. 180: The Greibach Normal Form Theorem

 Original source for this theorem: Greibach, Sheila, "A new normal-form theorem for context-free phrase structure grammars", Journal of the ACM, vol. 12 (1), 1965, pp. 42–52; online (paywall).
 The algorithm converting \(G\) to Greibach Normal Form on \(O(|G|^4)\) symbols is given in Norbert Blum and Robert Koch, "Greibach Normal Form transformation revisited", Information and Computation, Vol. 150, Issue 1, 1999, pages 112–118; online.
 Prof. Greibach was kind enough to send me a few comments on this theorem which I quote below:

Although my original definition of GNF is as you describe, in my
class notes I now permit S -> emptystring if S does not appear on
the right hand side of any production and similarly for CNF (Chomsky
Normal Form) as in the notes you attach, so that all context-free
languages are covered.
As far as I know, GNF is not used in any grammar-pda transformations
directly; it is essential to conversion to a pda without epsilon-rules,
i.e. to a nondeterministic pda which must read a new input each unit
of time (I usually call this quasi-realtime). Indeed, the fact that
GNF suffices for context-free languages is equivalent to the fact
that nondeterministic pda are equivalent in power to quasi-realtime
pda.
My normal form was proven in 1962 and appears in my 1963 thesis but,
as you note, the first full publication was in 1965. 

Theorem no. 181: Archimedes’ Equiareal Map Theorem

 Bradley Carroll has a very nice series of pages on Archimedes' achievements, where we find (see this page) that Archimedes himself regarded his results on areas and volumes of curved bodies to be his finest work. The sphere-cylinder surface area ratio is quoted there as 2/3 whereas our version says the surface areas are equal; this is merely because our cylinder has no top or bottom. To be specific, the ratio for a sphere of radius \(r\) and a cylinder of radius \(r\) and height \(h\) is \(2\tau r^2/(\tau r h+\tau r^2)\); we have \(h=2r=2\) and omit the second term in the denominator.
 I could not resist invoking E.T. Bell's celebrated triumvirate in connection with this theorem. That is shameless popularism, though, since I entirely subscribe to Thony Christie's injunction "Context is everything".
 The attribution to Gauss-via-Newton of differential geometry is even more shameless. In Dirk Jan Struik's classic two-part "Outline of a History of Differential Geometry", the whole of part 1 is pre-Gauss (Isis, vol. 19, no. 1, 1933, pp. 92–120; online (paywall)) with the key modern players being Clairaut, Euler and Monge. However, Gauss's work in the 1820's pervades a large proportion of part 2 (Isis, vol. 20, no. 1, 1933, pp. 161–191; online (paywall)), and the first and second fundamental forms belong to intrinsic differential geometry which was intrinsically Gauss.
Theorem no. 182: The Girard–Newton Identities

 A short survey of elementary proofs of these identities, together with an elegant new proof using matrix algebra, is given in Dan Kalman, "A Matrix Proof of Newton's Identities", Mathematics Magazine, Volume 73, Number 4, October 2000, pp. 313–315 (online here, dated 8/16/99).
 The photo of Peter Cameron is a cropped version of one I found on his 60th birthday conference website, maintained by Robert Bailey. It is attributed to Adrian Bondy by Peter Cameron himself. He told me in an email that "the picture was taken by Adrian Bondy ... at the Victoria Arms in Oxford (at Dominic Welsh's retirement conference in 2005 (I think)." From a follow-up email from Bondy: "I don't recall having taken the photo, but it's possible." Since Bondy's photography is art (see his gallery website) I take the issue seriously!
Theorem no. 183: Theorema Egregium

 Original source for this theorem: Karl Friedrich Gauss, "Disquisitiones generales circa superficies curvas auctore Carolo Friderico Gauss. Societati regiæ oblatæ D. 8. Octob. 1827", Commentationes societatis regiæ scientiarum Gottingensis recentiores, Commentationes classis mathematicæ. Tom. VI. (ad a. 1823–1827), Gottingæ, 1828, pp. 99–146. An English translation with introduction is offered as a pdf (1MB) download by Project Gutenberg.
 An excellent and beautifully illustrated technical source on this theorem is Nigel Hitchin's notes on Geometry of Surfaces, under Teaching here.
Theorem no. 184: von Neumann's Minimax Theorem

 Original source for this theorem is: von Neumann, J., "Zur Theorie der Gesellschaftsspiele", Mathematische Annalen, 100 (1), 1928, pp. 295–320; online (paywall, facsimile at Göttinger Digitalisierungszentrum).
 A superb analysis of the origins of von Neumann's theorem is Tinne Hoff Kjeldsen, "John von Neumann’s Conception of the Minimax Theorem: A Journey Through Different Mathematical Contexts", Arch. Hist. Exact Sci. 56 (2001) 39–68. There is a reprint online here.
Theorem no. 185: Kőnig's Bipartite Matching Theorem

 This theorem is commonly referred to as the Kőnig–Egerváry theorem, having been discovered independently and simultaneously by Kőnig's compatriot Jenő Egerváry.
Theorem no. 186: The Insolvability of the Entscheidungsproblem

 Some contextual information is given in a presentation (500KB pdf) I gave at Rewley House on 23 June 2012.
 A good source of undecidable problems is this 2012 survey (450KB pdf) by Bjorn Poonen. Related links on Diophantine undecidability are: James P. Jones's paper "Diophantine Representation of the Fibonacci Numbers" (1.3MB pdf), and this paper by Yuri Matiasevich (in which some of the characters seem to print strangely but not unreadably).
 Details of Jack Copeland's Essential Turing are given here; Charles Petzold's reading guide to Turing's 1936 paper is listed here. Biographies of Turing are listed here.
 Hilbert's 1930 "Wir müssen wissen, Wir werden wissen" radio address is online here with a transcription and an accompanying English translation.
 There is a poetry version of the proof of non-decidability of the Halting problem here by Geoffrey K. Pullum, as I learnt from Pat'sBlog.
 At the heart of Turing's result is the demonstration that not all functions from the natural numbers to the natural numbers can be computed. Joel David Hamkins has the intriguing result that any function is computable if the right model of arithmetic is chosen. This means arithmetics which satisfy the axioms of the natural numbers but which contain additional 'nonstandard' numbers. A good introduction is provided by John Baez.
 This blog post from Gödel's Lost Letter is an excellent source on proving the unsolvability of the Halting Problem.
Theorem no. 187: Karp's Theorem

 Karp's original article is R.M. Karp, "Reducibility among combinatorial problems", in Complexity of Computer Computations (R.E. Miller and J.W. Thatcher, eds.), Plenum Press, 1972, pp. 85–103. It is reprinted with a nice introduction by Richard Karp in Michael Jünger, Thomas M. Liebling, Denis Naddef, George L. Nemhauser, William R. Pulleyblank, Gerhard Reinelt, Giovanni Rinaldi and Laurence A. Wolsey (eds.), 50 Years of Integer Programming 1958–2008: From the Early Years to the State-of-the-Art, Springer, 2010. There is a copy of the chapter online here (scroll down to Lecture 19, last checked December 2020).
Theorem no. 188: al-Kāshi's Law of Cosines

 Garry J. Tee kindly provided the following amplification on the history of trigonometric functions: "Hipparchus in (c. −130) invented the chord, the first trigonometric function, and he constructed a short table of values of the chord function. (On a sphere of radius R, the chord of angle x is the distance between 2 points on the sphere subtending angle x at the centre). By the 5th century, Hindu astronomers had replaced the chord by the more convenient sine function, with \(2R\sin x = \mbox{chord}(2x)\). In 499, Aryabhata commenced his renowned astronomical treatise “Aryabhatiya” with a short table of sines."
 As is often the case with formulae in Euclidean geometry there are spherical and hyperbolic versions of this theorem. The spherical law of cosines is described here; the hyperbolic has a Wiki page.
Theorem no. 189: The Handshaking Lemma

 Architectural historian Dr Lynn Pearson kindly sent me the following comments in response to a query regarding the origins and attribution of the tiling pattern illustrating this theorem:

"While investigating the 6/7 murals query, I chanced upon the QM Physics archives website; the brochure about the new Physics Building (1962) is available at
ph.qmul.ac.uk/sites/default/files/brochure1963.pdf;
this details the six panels. I too made it six as there are six architectural 'bays'. But I see what you mean about the different section you have used for your theorem. Looking at the six panels, the middle 4 have the main pattern in white on a blue ground, with various small sections of coloured tiling around. But the two end panels have the blue section, and another smaller vertical section on a gold background. The one furthest from the road has the precessing orbit, plus what looks to me like ellipses/catenary curves?? – see archive pic:
ph.qmul.ac.uk/sites/default/files/alumni/donat_01.jpg
and
ph.qmul.ac.uk/sites/default/files/alumni/donat_20.jpg.
Are these two elements of the panel connected maths-wise?
And the same goes for your panel, the one nearest the road – that's 'spreading of dislocations from a Frank-Read source' plus the diagram you have in your theorem. Are these things linked in some way? It seems that the designers chose to 'finish off' the series of murals with slightly more ornate ones at each end.
Anyway, all these panels were by Carter's, but my notes from the Carter's photographic archive at Poole Museum show that the firm worked closely with the building's architects, Playne & Lacey, on the project. A man called R. Khosla, who worked for the architects, helped in the design of the 6 panels, but left before they were complete. As the head of design at Carter's, A. B. Read did much of the mural work; I'd think he completed the job. The tile painting would have been done by the firm's (lady) artists. I think that is as good an attribution as you will get." 

Dr Pearson also provided a link to a relevant article of hers, although for copyright reasons its images cannot be displayed. And there is a little more in the Tile Gazetteer.
 From plus magazine: applying the double counting argument proof of the Handshaking Lemma to the complete graph on n + 1 vertices is a neat way of proving that $$1+2+\ldots + n = \frac12n(n+1).$$
 Other impressive applications of the Handshaking Lemma: the proof of Sperner's Lemma in 2D; and this proof that the so-called 'Lights-Out' game is solvable for any graph in which all vertices are initially 'turned on'.
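One way to spell out the double count for the complete graph is sketched here; the labelling \(v_0,\ldots,v_n\) of the vertices is an assumption made for illustration, and the plus magazine presentation may differ in detail. Every vertex of \(K_{n+1}\) has degree \(n\), while counting each edge at its higher-labelled endpoint gives \(0+1+\ldots+n\) edges, so the Handshaking Lemma reads
$$(n+1)\,n=\sum_{j=0}^{n}\deg(v_j)=2|E|=2\,(1+2+\ldots+n),$$
and dividing by 2 gives the formula above.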
Theorem no. 190: Jackson's Theorem on Compatible Euler Tours

 Original source for this theorem is Jackson, Bill, "A characterisation of graphs having three pairwise compatible Euler tours", J. Combinatorial Theory, Series B, Vol. 53, Issue 1, September 1991, pp. 80–92; online.
 The original weblink from this theorem page was to the fine set of notes (850KB pdf) posted here by Tero Harju. Still recommended, of course, but the replacement link to the Egerváry Research Group page is more directly relevant.
Theorem no. 191: L'Hospital's Rule

 A nice 'double' example of L'Hospital in action is the proof that \(\ln(x)\tan(x) \rightarrow 0\) as \(x\rightarrow 0\), starting in the \(\infty/\infty\) form as \(\ln(x)/\cot(x)\) and then transferring to the \(0/0\) form (twice).
 The expression \(0^0\) gets a thorough investigation by Michael Huber and V. Frederick Rickey in "What is \(0^0\)", Convergence, Vol. 5, 2008. Online here. Other good treatments are this blog post by David A. Tanzer (with over 50 very informative reader responses) and this from askamathematician (which has over 1000 responses, which I haven't read but I suppose there must be some interesting stuff there as well!)
 A nice short account of the Bernoulli vs L'Hospital ownership of this theorem is given here at Life Through a Mathematician's Eyes.
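For the 'double' example above, one possible chain of steps is sketched below, taking the limit from the right, \(x\rightarrow 0^+\) (an assumption, since \(\ln x\) is only defined for \(x>0\)). This route uses the \(\infty/\infty\) form once and then the \(0/0\) form once, so it may be shorter than the chain originally intended:
$$\lim_{x\to 0^+}\ln(x)\tan(x)=\lim_{x\to 0^+}\frac{\ln x}{\cot x}=\lim_{x\to 0^+}\frac{1/x}{-\csc^2 x}=\lim_{x\to 0^+}\frac{-\sin^2 x}{x}=\lim_{x\to 0^+}\frac{-2\sin x\cos x}{1}=0.$$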
Theorem no. 192: The Rotation-distance Bound

 Original source: Daniel D. Sleator, Robert E. Tarjan and William P. Thurston, "Rotation distance, triangulations, and hyperbolic geometry", J. Amer. Math. Soc., Vol. 1, No. 3, 1988, pp. 647–681; online. The original 1988 paper is also available at www.cs.cmu.edu/~sleator/papers/RotationDistance.htm.
Theorem no. 193: Frieze's Theorem on Expected Minimum Tree Length

 Original source: Alan M. Frieze, "On the value of a random minimum spanning tree problem", Discrete Applied Mathematics, 10 (1985), 47–56; online.
 A sharper asymptotic for Frieze's result is given in Colin Cooper, Alan Frieze, Nate Ince, Svante Janson, Joel Spencer, "On the length of a random minimum spanning tree", Combinator. Probab. Comp., 25 (2015) 89–107; online (paywalled), open-access preprint.
 Find more wonderful properties of \(\zeta(3)\) in this preprint by David Broadhurst.
Theorem no. 194: Wilson's Theorem

 The (contrapositive to the) 'if' converse to the theorem follows because if \(n\) is composite then some \(d\), \(1<d<n\), divides \(n\). Then \(d\) appears as a factor of \((n-1)!\) and therefore cannot divide \((n-1)!+1\), in which case, neither can \(n\).
 Fredrik Johansson describes a neat trick for reducing the computation required for testing primality via Wilson's Theorem.
 A combinatorial argument which seems in the same spirit as P.G. Anderson et al's combinatorial lemma is found in Szilárd András, "A combinatorial generalization of Wilson’s theorem", Australasian J. Comb., Vol. 49, 2011, pp. 265–272; online. (direct pdf, 125KB)
 It seems convenient to record here Gauss's generalisation of Wilson's Theorem: if \(n>2\) and \(p\) denotes an odd prime then
$$\prod_{\stackrel{k=1}{\gcd(k,n)=1}}^n\hspace{.2in}k = \left\{\begin{array}{rcl}
-1\mbox{ mod }n & & n=4, p^m, 2p^m,\\
1\mbox{ mod } n &&\mbox{otherwise.}\end{array}\right.$$
A very nice proof is given here by Pete L. Clark (under "Expository Papers").
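Both Wilson's Theorem and Gauss's generalisation can be checked numerically for small \(n\). The following is a minimal Python sketch; the function names are illustrative only, and the naive factorial loop is far too slow to be a practical primality test (it is not the trick described by Johansson).

```python
from math import gcd


def factorial_mod(m, n):
    """Return m! mod n, reducing at every step to keep the numbers small."""
    result = 1 % n
    for k in range(2, m + 1):
        result = (result * k) % n
    return result


def is_prime_wilson(n):
    """Wilson's test: n > 1 is prime exactly when (n-1)! is -1 mod n."""
    return n > 1 and factorial_mod(n - 1, n) == n - 1


def gauss_product(n):
    """Product of all k in 1..n with gcd(k, n) = 1, reduced mod n."""
    result = 1
    for k in range(1, n + 1):
        if gcd(k, n) == 1:
            result = (result * k) % n
    return result


def is_4_or_prime_power_form(n):
    """True if n is 4, p^m or 2p^m for an odd prime p (intended for n > 2)."""
    if n == 4:
        return True
    if n % 2 == 0:
        n //= 2
    if n % 2 == 0 or n == 1:
        return False
    # n is now odd and greater than 1: strip out its smallest prime factor
    p = next(d for d in range(3, n + 1, 2) if n % d == 0)
    while n % p == 0:
        n //= p
    return n == 1
```

For example, `[n for n in range(2, 30) if is_prime_wilson(n)]` lists the primes below 30, and `gauss_product(n)` can be compared with `n - 1` or `1` according to `is_4_or_prime_power_form(n)`.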
Theorem no. 195: The Erdős–Ko–Rado Theorem

 Original sources for this theorem:
 P. Erdős, Chao Ko and R. Rado, "Intersection theorems for systems of finite sets", Quart. J. Math., Oxford Ser. (2) 12, 1961, pp. 313–320; online. The paper was written in 1938, however, see P. Erdős, "My joint work with Richard Rado", in Surveys in combinatorics 1987, London Math. Soc. Lecture Note Ser., 123, pp. 53–80, Cambridge Univ. Press, 1987; online.
 The Katona proof was published in Katona, G.O.H., "A simple proof of the Erdős–Chao Ko–Rado theorem", Journal of Combinatorial Theory, Series B,
Vol. 13, Issue 2, 1972, pp. 183–184; online.
 An interesting discussion by John Mount of proofs of Erdős–Ko–Rado can be found in this Win-Vector blog entry.
 The requirement that \(n\geq 2k\) is necessary since when \(k>n/2\) any pair of subsets must necessarily intersect. This makes \(n/2\) a transition point where the size of a maximum intersecting family jumps up, as illustrated below for \(n=50\). The horizontal axis is \(k\); the red dots are values of \({n-1 \choose k-1}\); the blue dots are values of \({n \choose k}\).
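The jump at \(n/2\) can be tabulated directly, and the two regimes can be confirmed by exhaustive search when \(n\) is tiny. The Python sketch below is illustrative only: the function names are my own labels, and the brute-force search is exponential in the number of \(k\)-subsets, so it is usable only for about \(n\leq 5\).

```python
from itertools import combinations
from math import comb


def max_intersecting_family_size(n, k):
    """Erdos-Ko-Rado value when n >= 2k; all k-subsets when k > n/2."""
    return comb(n - 1, k - 1) if n >= 2 * k else comb(n, k)


def brute_force_max(n, k):
    """Largest pairwise-intersecting family of k-subsets of {0..n-1}.

    Tries every family of k-subsets, so only feasible for very small n.
    """
    subsets = [frozenset(c) for c in combinations(range(n), k)]
    best = 0
    for mask in range(1 << len(subsets)):
        family = [s for i, s in enumerate(subsets) if (mask >> i) & 1]
        if len(family) > best and all(a & b for a, b in combinations(family, 2)):
            best = len(family)
    return best


# The transition for n = 50: compare the values either side of k = n/2.
jump_table = [(k, max_intersecting_family_size(50, k)) for k in range(23, 29)]
```

Here `jump_table` holds the pairs \((k,\) maximum family size\()\) for \(k=23,\ldots,28\), straddling the transition at \(k=25\).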
Theorem no. 196: The Cantor–Bernstein–Schröder Theorem

 There is more here on the history of this theorem.
 The proof presented for this theorem is streamlined by appealing to the Knaster–Tarski fixed-point theorem which guarantees a fixed point for a monotone function on a complete lattice (in this case the power set lattice). More details at this MathWorld entry.
 CBS is, in the analysis of Willard Quine, one form of the 'law of comparability'. In Set Theory and Its Logic, Harvard University Press, revised edition, 1969, p. 208, he memorably says,
"Accidents of definition aside, there are three distinct things here: the Axiom of Choice, the Schröder–Bernstein Theorem, and triviality."
 A graph-theoretic proof of this theorem generalises to one about paths, as described in Reinhard Diestel & Carsten Thomassen, "A Cantor–Bernstein theorem for paths in graphs", American Mathematical Monthly, February 2006, online here.
 The proof structure of CBS is analysed in depth by Wilfried Sieg, "The Cantor–Bernstein theorem: how many proofs?", Philosophical Transactions of the Royal Society, A, Vol. 377 Issue 2140, March 2019; online.
Theorem no. 197: The Robin–Lagarias Theorem

 Lagarias's paper was published as Jeffrey C. Lagarias, "An elementary problem equivalent to the Riemann Hypothesis", Amer. Math. Monthly, 109 (2002), 534–543. The recommended weblink on the theorem page is the arxiv preprint. Guy Robin's breakthrough is Robin, Guy (1984), "Grandes valeurs de la fonction somme des diviseurs et hypothèse de Riemann", Journal de Mathématiques Pures et Appliquées. Neuvième Série 63 (2), 1984, pp. 187–213. I think it may not exist online but check this math.stackexchange entry for updates. The earlier sources for this theorem are well charted in Lagarias's paper.
 A collection of assertions equivalent to the Riemann Hypothesis is given here.
 A nice Quora entry by Alon Amit discusses the implications of RH being false. In passing he points out that Lagarias's version of RH is elementary enough that a counterexample to RH would mean a disproof in Peano arithmetic, not something that can be easily deduced from finding a zero off the critical line.

Out of curiosity I plotted the divisor function \(\sigma(n)\) (red) against the RHS of Lagarias's inequality \(H_n+\ln(H_n)\exp(H_n)\) for the first \(10^5\) values of \(n\). I find it hard to imagine betting against RH!
 On the subject of computational data on RH, Dave Platt and Tim Trudgian have a preprint "The Riemann hypothesis is true up to 3×10^{12}". It is a good source for related work.
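The comparison of \(\sigma(n)\) with \(H_n+\ln(H_n)\exp(H_n)\) described above can be reproduced in a few lines. This is a sketch only, assuming a simple divisor-sum sieve and floating-point harmonic numbers; the names `sigma_table` and `lagarias_rows` are illustrative and do not come from any of the sources cited.

```python
from math import exp, log


def sigma_table(limit):
    """sigma[n] = sum of the divisors of n for 1 <= n <= limit (sigma[0] is unused)."""
    sigma = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for multiple in range(d, limit + 1, d):
            sigma[multiple] += d
    return sigma


def lagarias_rows(limit):
    """Yield (n, sigma(n), bound) where bound = H_n + exp(H_n) * ln(H_n)."""
    sigma = sigma_table(limit)
    harmonic = 0.0
    for n in range(1, limit + 1):
        harmonic += 1.0 / n  # running harmonic number H_n
        bound = harmonic + exp(harmonic) * log(harmonic)
        yield n, sigma[n], bound


# Values of n (if any) where sigma(n) exceeds the bound, up to a small limit.
violations = [n for n, s, b in lagarias_rows(5000) if s > b + 1e-9]
```

The tolerance `1e-9` is there because the case \(n=1\) gives equality, which floating-point rounding could otherwise disturb.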
Theorem no. 198: The Art Gallery Theorem

 Original sources for this theorem:
 Chvátal, V., "A combinatorial theorem in plane geometry", J. Comb. Theory, Series B, Vol. 18, Issue 1, 1975, pp. 39–41; online.
 Fisk, S., "A short proof of Chvátal's watchman theorem", J. Comb. Theory, Series B, Vol. 24, Issue 3, 1978, p. 374; online.
 There is a good account of the Art Gallery Theorem by Erica Klarreich for Quanta Magazine here. This talk (1.7MB pdf) by Hemanshu Kaul gives a good overview of theorems and algorithms in the 'Art Gallery' field.
 The theorem as proved here guards an art gallery with 'vertex guards', who are stationed at vertices of the polygon, although it is stated in terms of 'point guards', who may be anywhere inside (or on the boundary of) the polygon. The example chosen is of an 'orthogonal' gallery, in which all walls meet at angle \(\tau/4\). In fact, for orthogonal galleries a sharper result is possible: \(\lfloor n/4 \rfloor\) vertex guards are sufficient (and a square version of the 'comb' polygon shown on the theorem page proves necessity). See J. Kahn, M. Klawe and D. Kleitman, "Traditional galleries require fewer watchmen", SIAM Journal on Algebraic Discrete Methods, 1983, vol. 4, No. 2, pp. 194–206; online (paywalled).
 A natural extension of the problem guards an \(n\)-vertex polygon containing \(h\) disjoint polygons ('holes'). Here the tight bound for point guards is \(\lfloor (n+h)/3 \rfloor\). See I. Bjorling-Sachs & D. L. Souvaine, "An efficient algorithm for guard placement in polygons with holes", Discrete & Computational Geometry, vol. 13, pp. 77–109 (1995); online.
In fact \(\lfloor (n+h)/3 \rfloor\) is conjectured to be sufficient even if only vertex guards are allowed (Shermer, 1982). Again for orthogonal galleries the denominator is conjectured to be 4, for vertex guards (Shermer again).
A good survey is given by Paweł Żyliński "Placing guards in art galleries by graph coloring", chapter 13 in Marek Kubale (ed.), Graph Colorings, American Mathematical Society, 2004.
Theorem no. 199: Fermat's Two-Squares Theorem

 The Wiki page for this theorem gives a good account of its history.
 The representation of a prime \(p=4n+1\) as a sum of two squares is unique (up to order of summands). A nice proof using Gaussian primes is given here.
 Strictly speaking, Lagrange's Lemma only goes one way: if \(p\) is congruent to 1 mod 4 then \(-1\) is a quadratic residue mod \(p\). Lagrange used his lemma in 1773 to give a simpler proof than Euler's of Fermat's theorem. There is a short discussion in chapter 6 of Stillwell's Elements of Number Theory, Springer 2003 (see sections 6.5 and 6.8 and chapter 9 for the converse). The lemma is an instance of the Law of Quadratic Reciprocity (see this, for example).
 There is a famous 'one-sentence' proof of this theorem by Don Zagier, "A one-sentence proof that every prime \(p \equiv 1 \pmod 4\) is a sum of two squares", Amer. Math. Monthly, Vol. 97, No. 2, 1990, p. 144; online (paywall; pdf download here, April 2021), and there is a good explanation of the proof by James McIvor here (Lecture 6).
 In fact, there is a Wiki page devoted to proofs of this theorem.
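The uniqueness of the two-squares representation, and the one-way lemma on \(-1\) being a quadratic residue, are easy to check by brute force for small primes. The Python sketch below is illustrative only: the function names are hypothetical and the search is naive, not any of the proofs referred to above.

```python
from math import isqrt


def is_prime(n):
    """Trial division; adequate for the small values used here."""
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))


def two_square_representations(p):
    """All pairs (a, b) with 0 < a <= b and a*a + b*b == p."""
    reps = []
    for a in range(1, isqrt(p // 2) + 1):
        b_squared = p - a * a
        b = isqrt(b_squared)
        if b * b == b_squared:
            reps.append((a, b))
    return reps


def minus_one_is_residue(p):
    """True if some x in 1..p-1 satisfies x*x = -1 mod p."""
    return any((x * x) % p == p - 1 for x in range(1, p))
```

For instance, `two_square_representations(13)` returns `[(2, 3)]`, reflecting \(13=2^2+3^2\).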
Theorem no. 200: Minkowski's Convex Body Theorem

 Our proof of Fermat's 2-Squares theorem (theorem no. 199) uses Minkowski's theorem (i.e. is a theorem in his 'geometry of numbers'). Some good notes (125KB pdf) by Pete L. Clark prove Lagrange's 4-squares theorem in a similar manner, as well as giving a thorough background on Minkowski's theorem.
Theorem no. 201: Jensen's Inequality

 This theorem's illustration originally featured soup cans based on Andy Warhol's Campbell's Soup pop art. However, Campbell Soup Company declined to give me permission for this use "in part because the image you have used is not a reproduction of our famous trademarks, but rather what we consider a 'mutilation' of our marks." If you would like to view this mutilation for yourself let me know and I will smuggle you a copy!
 A good reallife application from the world of finance is given here, taken from Sam L. Savage, The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty, John Wiley & Sons, paperback edition, 2012.
 Meanwhile the issue in our illustration, that of choosing height and radius for a tin can, is given reallife treatment in this charming tweet from @mathematicsprof.
Theorem no. 202: The Friedlander–Iwaniec Theorem

 The web link for this theorem is to an overview of its proof. The actual proof is set out in a subsequent paper of almost a hundred pages: John Friedlander and Henryk Iwaniec, "The polynomial X^{2}+Y^{4} captures its primes", Annals of Mathematics, Vol. 148, no. 3, 1998, pp. 945–1040; preprint.
 The question of whether there are infinitely many primes of the form \(n^2+1\) is the first of Landau's Problems, all unsolved as of October 2020. See this by János Pintz for more details.
Theorem no. 203: Euler's Continued Fraction Correspondence

 There is more on the \(\tau\) continued fraction here. Tony Foster gives a continued fraction in terms of cubes for \(\pi=\tau/2\) via a nice exploitation of Nilakantha's series in the same vein as the derivation by Douglas Bowman. This suggests a nice exercise: give the corresponding result for \(\tau\). He also has a similar derivation for the golden ratio, in contrast to the simple continued fraction (all 1's) which everyone knows.
Theorem no. 204: Singmaster's Binomial Multiplicity Bound

 Original sources for this theorem:
 Singmaster, D., "Research Problems: How often does an integer occur as a binomial coefficient?", American Mathematical Monthly, 78 (4), 1971, pp. 385–386; online (paywall, there is a copy here at fermatslibrary.com).
 H. L. Abbott, P. Erdős, D. Hanson, "On the number of times an integer occurs as a binomial coefficient", American Mathematical Monthly, 81 (3), 1974, pp. 256–261; online (paywall, there is a copy here at The Erdős Project).
 Daniel M. Kane, "Improved bounds on the number of ways of expressing t as a binomial coefficient", Integers, Vol. 7, 2007, pp. 1–7; online.
 This is a 'theorem under construction': I hope to chart exciting developments here towards an eventual final version, which may or may not be \(N(k)\leq 8\) for all \(k\).
 There is a nice entry on Singmaster's conjecture at Gödel's Lost Letter.

A version of Singmaster's conjecture in terms of algebraic geometry is given in Hugo Jenkins, "Repeated binomial coefficients and high-degree curves", Integers, vol. 16, 2016.
Theorem no. 205: The Classification of the Semiregular Tilings

 Original sources for this theorem:
 Kepler's Harmonices Mundi has its own Wiki page.
 Paul Robin, "Carrelage illimité en polygones réguliers", La Nature, 1887 : Quinzième année, deuxième semestre : n. 731 à 756, pp. 95–96; online (facsimile).
 A. Andreini, "Sulle reti di poliedri regolari e semiregolari e sulle corrispondenti reti correlative", Mem. Società Italiana delle Scienze, Ser. 3, 14, 1905, pp. 75–129; online (facsimile via oeis.org, 4.5MB pdf).
 D. M. Y. Sommerville, "Semiregular networks of the plane in absolute geometry", Earth and Environmental Science Transactions of The Royal Society of Edinburgh, Vol. 41, Issue 3, 1906, pp. 725–747; online (paywall, facsimile).
 There is more on Kepler's investigation into plane tilings in this lovely article by plus magazine. In fact they have a whole collection of tiling articles!
 The general subject of plane tilings is deep and wide: for example, the decidability of whether a given single tile will tile the plane is an open question (very elegantly introduced by Chaim Goodman-Strauss in "Can't Decide? Undecide!", Notices of the AMS, Vol. 57, No. 3, 2010, 343–356, online here).
Theorem no. 206: Euler's Formula

 A sweet, and sweetly presented, proof of Euler's Formula, given by Math With Bad Drawings, makes a very nice example of solving differential equations by separation of variables.
 The appearance of a protractor (angle measurer) in the illustration of this theorem gives me an excuse to link to this delightful little History of Protractors from Life Through a Mathematician's Eyes.
Theorem no. 207: The Eratosthenes–Legendre Sieve

 Original source for this theorem: A. M. Legendre, Essai sur la théorie des nombres, Courcier, Paris, second edition, 1808. I owe this citation to James A. Farrugia's excellent dissertation "Brun's 1920 Theorem on Goldbach's Conjecture", Utah State University, 2018; online (see footnote, page 6).
 A very nice motivation for sieving in number theory is given by Terence Tao here.
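As a concrete illustration of sieving by inclusion-exclusion, here is a minimal Python sketch of a Legendre-style prime count. It assumes the usual formulation (the integers up to \(x\) divisible by no prime \(\leq\sqrt{x}\) are 1 together with the primes in \((\sqrt{x},x]\)), which may be worded differently on the theorem page, and the function names are hypothetical.

```python
from math import isqrt


def primes_up_to(n):
    """Sieve of Eratosthenes: the list of primes <= n."""
    if n < 2:
        return []
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for p in range(2, isqrt(n) + 1):
        if flags[p]:
            for multiple in range(p * p, n + 1, p):
                flags[multiple] = False
    return [k for k, is_p in enumerate(flags) if is_p]


def legendre_count(x):
    """Number of primes <= x (x >= 1), by inclusion-exclusion over primes <= sqrt(x)."""
    small = primes_up_to(isqrt(x))

    def survivors(limit, index):
        # Number of integers in 1..limit divisible by none of small[index:]
        total = limit
        for i in range(index, len(small)):
            if small[i] > limit:
                break  # all remaining terms floor(limit / d) would be zero
            total -= survivors(limit // small[i], i + 1)
        return total

    # survivors(x, 0) counts 1 and the primes in (sqrt(x), x]:
    # remove the 1 and add back the small primes themselves.
    return survivors(x, 0) - 1 + len(small)
```

The recursion simply expands the alternating sum of \(\lfloor x/d\rfloor\) over squarefree products \(d\) of the small primes, stopping early once \(d\) exceeds \(x\).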
Theorem no. 208: Torricelli's Trumpet

 In response to my query, Paolo Mancosu kindly gave me the following comments on the origins of Torricelli's result and his methods:

"I am quite sure [that] Oresme and Fermat and Roberval certainly did not anticipate Torricelli's discovery. Fermat wrote about similar solids (de infinitis hyperbolis) after Torricelli. As for Roberval, I cite that report of Mersenne (given by Torricelli) where it is reported that, according to Mersenne, Roberval had written some kind of speech claiming that Torricelli's result was impossible! Had he anticipated him, he would certainly have claimed priority rather than trying to prove that the result was impossible. You are right that Torricelli does not prove that the lateral surface is infinite. I do not know who first did that." 

 In his review (Notre Dame J. Formal Logic, vol. 40, no. 3, 1999, 447–454) of Mancosu's Philosophy of Mathematics & Mathematical Practice in the Seventeenth Century, Craig Fraser says (p. 448) "Torricelli discovered the remarkable fact that the solid of revolution obtained by rotating the hyperbola \(y=1/x\) about the \(x\)-axis has finite volume and infinite surface area." I have found no other evidence that Torricelli calculated the surface area of his solid, however.
 Dave Richeson's well-known Division by Zero blog has provided a template for constructing a paper version of Torricelli's Trumpet (under the pseudonym Gabriel's Horn). (Possibly the arxiv version of Richeson's paper is more of a permalink than the one given in the blog entry.)
 A very nice animated illustration of indivisibles, applied to circular area, is given by Matt Henderson here.
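For reference, the modern calculus version of the two facts discussed above can be sketched as follows. This is not Torricelli's own method of indivisibles, and it assumes the usual convention that the solid is generated by \(y=1/x\) for \(x\geq 1\):
$$V=\frac{\tau}{2}\int_1^\infty \frac{dx}{x^2}=\frac{\tau}{2},\qquad A=\tau\int_1^\infty \frac{1}{x}\sqrt{1+\frac{1}{x^4}}\,dx\;\geq\;\tau\int_1^\infty\frac{dx}{x}=\infty.$$
The volume is finite because \(1/x^2\) is integrable on \([1,\infty)\), while the lateral surface area is bounded below by a divergent logarithmic integral.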
Theorem no. 209: The Erdős Discrepancy Problem

 This was originally posted as a 'theorem under construction': September 2015 brought news of Terence Tao's complete resolution of the conjecture. This made it the 2nd 'declassification', after Kepler's Conjecture. The role of the Polymath in Tao's proof has been commented on helpfully by Gowers here.
 There is much more to the background to the conjecture, and approaches to it, at the official Polymath 5 page.
 The sequence of length 1160 appearing in the table in this theorem description, reproduced from Konev and Lisitsa's paper arxiv.org/abs/1402.2184, is available in Excel 2003 here. The first 11 terms:
− + + − + − − + + − +
happen to constitute a maximum-length sequence with discrepancy C=1. The terms sum to +1 so continuing with a +1 would give a summation to 2; the even-index terms sum to −1, so continuing with a −1 would give a summation to −2. The sequence 0,11,1160, ... is oeis.org/A237695; it is known (Konev and Lisitsa) that the next term exceeds 130000.
 There is a very good description of Konev and Lisitsa's proof of \(C=2\) by Richard Lipton and Ken Regan here.
 Erdős mentions Nikolai Chudakov (spelt 'Tchudakoff') in connection with this conjecture here (in Problem 49).
 Although Mathias' paper was published in a 1997 tribute volume for Erdős, this was actually the proceedings of Erdős' 80th birthday celebration, held in March 1993.
 Terence Tao has a valuable blog entry on 'near-counterexamples' to the conjecture and its unexpected relationship to the 'Elliott Conjecture'. (But see note 1!)
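 The claim about the first 11 terms can be checked mechanically. The following is a minimal sketch; the function name `discrepancy` and the list `first_eleven` are illustrative choices, and the discrepancy is taken to be the largest absolute value of any sum \(x_d+x_{2d}+\ldots+x_{kd}\).

```python
def discrepancy(seq):
    """Largest |x_d + x_2d + ... + x_kd| over all step sizes d and lengths k."""
    n = len(seq)
    worst = 0
    for d in range(1, n + 1):
        partial = 0
        for idx in range(d, n + 1, d):  # 1-based indices d, 2d, 3d, ...
            partial += seq[idx - 1]
            worst = max(worst, abs(partial))
    return worst


# The first 11 terms of the length-1160 sequence, as listed in the note above.
first_eleven = [-1, +1, +1, -1, +1, -1, -1, +1, +1, -1, +1]

print(discrepancy(first_eleven))          # discrepancy of the 11 terms
print(discrepancy(first_eleven + [+1]))   # after appending +1
print(discrepancy(first_eleven + [-1]))   # after appending -1
```

 Appending either sign pushes some sum to absolute value 2, which is why no sequence with discrepancy 1 can have a twelfth term.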
Theorem no. 210: The Basel Problem

 The proof given in the description of this theorem is called the 'Lewin argument' by Kalman and McKinzie who cite its first known appearance as Leonard Lewin's Polylogarithms and Associated Functions, Elsevier Science, 1981 (although they stress that Lewin did not claim credit). The book is an update of Lewin's earlier Dilogarithms and Associated Functions, Macdonald, 1958, in which the same material may be found (chapter 1, section 3.1). Both books are out of print, sadly. There is a valuable review of the former by Richard Askey in Bull. AMS, vol. 6, no. 2.
 Regarding Euler's solution to the Basel Problem, the accepted sequence of events appears to be: discovered in 1734, presented in 1735, published in 1740.
The Euler Archive gives the date of presentation of Euler's paper as December 5, 1735. It might appear that 'December 5, 1734' is the correct date. In "Euler and the Zeta Function" Raymond Ayoub gives 1734 as the year of "Euler's first triumph" and says "Euler communicated his result to Daniel Bernoulli and, while unfortunately this letter has been lost, the reply does exist". However, the reply is dated in the Euler Archive as 12th September 1736. Like many of Bernoulli's letters to Euler it deals with several matters, giving no clue as to when they arose, but it would seem more consistent with a 1735 letter from Euler than a 1734 one.
Theorem no. 211: Willans' Formula

 The source for this page is C. P. Willans, "On formulae for the nth prime number", The Mathematical Gazette, Vol. 48, No. 366 (Dec., 1964), pp. 413–415. Not free to view online but the first page can be previewed here.
 The reduction, for composite \(k=a\times b\), of \(\sin^2\left((1+(k-1)!)\tau/4k\right)\) to \(\sin^2(\tau/4k)\) requires justification. Indeed \(1+(k-1)!\equiv 1\!\!\!\mod k\) because both \(a\) and \(b\) will divide \((k-1)!\). And the quotient \((k-1)!/k\) will be a multiple of 4 when there are sufficient even factors in \((k-1)!\), which is when \((k-1)/2>2+\log_2 k\). This occurs for \( k >12\) (but \( k = 6, 9,10,12\) all reduce to \(\tau/4k\) since the greatest power of 2 in their factorisations is low).
 See note 1 for Theorem 194 regarding the computation required to implement Willans' formula.
 The famous 26-variable polynomial of Jones–Sato–Wada–Wiens whose positive values, over the positive integers, are precisely the prime numbers, is also based on Wilson's Theorem. There is a nice explanation here. Another nice paper of Tsangaris and Jones, by the way, describes how a 19th century summation formula for GCD, due to Mathias Jacob Hacks, can be fashioned into summation formulae for \(\pi(x)\), nth prime number and next prime number: Panayiotis G. Tsangaris and James P. Jones, "An Old Theorem on the GCD and Its Application to Primes", Fibonacci Quarterly, Vol 30, No. 3, 1992, pp. 194–198; online.
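 The reduction of \(\sin^2\left((1+(k-1)!)\tau/4k\right)\) to \(\sin^2(\tau/4k)\) for composite \(k\), discussed in the second note above, can be checked in exact integer arithmetic, avoiding floating-point sines of enormous angles. The sketch below rests on two facts about \(\sin^2\) that are my own framing rather than the note's: it has period \(\tau/2\), and \(\sin^2(x)=\sin^2(\tau/2-x)\). The function names are illustrative.

```python
from math import factorial


def is_composite(k):
    """True for composite k >= 4 (trial division; fine for small k)."""
    return k > 3 and any(k % a == 0 for a in range(2, int(k ** 0.5) + 1))


def reduces_to_tau_over_4k(k):
    """Exact test that sin^2((1+(k-1)!) tau/4k) equals sin^2(tau/4k)."""
    f = factorial(k - 1)
    # The two angles differ by f*tau/(4k); this is a whole number of
    # periods tau/2 of sin^2 exactly when 2k divides f.
    differ_by_periods = f % (2 * k) == 0
    # Alternatively the angles may add up to a multiple of tau/2,
    # which also gives equal values of sin^2.
    mirror_images = (f + 2) % (2 * k) == 0
    return differ_by_periods or mirror_images


composites = [k for k in range(4, 61) if is_composite(k)]
print(all(reduces_to_tau_over_4k(k) for k in composites))
```

 For prime \(k\) the test fails, as it should, since then \(1+(k-1)!\) is a multiple of \(k\) by Wilson's Theorem.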
Theorem no. 212: Vizing's Theorem

 Details regarding Gupta's discovery of this theorem are supplied in the preface of Michael Stiebitz, Diego Scheide, Bjarne Toft and Lene M. Favrholdt, Graph Edge Coloring: Vizing's Theorem and Goldberg's Conjecture, Wiley-Blackwell, 2012:

"Vizing's bound was discovered independently by Ram Prakash Gupta during his Ph.D. studies, mostly at the Tata Institute of Fundamental Research in Bombay, 1965–1967, supervised by Sharadchandra Shankar Shrikhande, and stimulated by Claude Berge. Also Gupta's proof was based on a variation of the fan idea (discovered independently by Gupta), and it was extended to locally bounded infinite graphs i.e. infinite graphs with a finite maximum degree." 

(Curiously, Gupta's rather nominal Wikipedia entry says his supervisor was C.R. Rao, at the Indian Statistical Institute, Calcutta, and their Math Genealogy entries support this. However, Rao is a statistician while Shrikhande is a combinatorialist. I suspect there are two R.P. Guptas.)
 The 'fan idea' is the basis of textbook proofs of Vizing's theorem (but not the proof from Schrijver which I have chosen to link to) and extends to prove the generalisation to graphs with multiple edges: \(X'(G)\leq \Delta +m\), where \(m\) is the maximum edge multiplicity of \(G\). A good account is here.
Theorem no. 213: The 6-Circles Theorem

 This theorem is sometimes referred to as the Money-Coutts Theorem although it is not clear why Money-Coutts deserves more credit than Evelyn or Tyrrell. The name 'Six Circles' is a bit ambiguous: cut-the-knot, for instance, has three quite distinct theorems which qualify for the name (located here, alphabetically).
 Although Tyrrell has been described as a 'professional' and Evelyn and Money-Coutts as 'amateurs', an obituary of Evelyn (by Tyrrell) appears in Bull. London Math. Soc., vol. 9, no. 3, 1977. He published professional-level work during the 1930s and then again in the 1960s.
 A generalisation by Serge Tabachnikov in a different direction from that discussed in this theorem description, namely from triangles to n-gons, is given in "Going in Circles: Variations on the Money-Coutts Theorem", Geometriae Dedicata, 80, 2000, 201–209, online (paywall); preprint.
Theorem no. 214: A Theorem on Maximal Sum-Free Sets in Groups

 Original source for this theorem (which is the weblink from the theorem page): Michael Giudici and Sarah Hart, "Small maximal sum-free sets", Electronic J. Comb., Vol. 16, Issue 1, 2009, Article R59; online.
 Finite groups do not necessarily have large sum-free sets: W.T. Gowers, "Quasirandom groups", Combinatorics, Probability and Computing, Vol. 17, Issue 3, 2008, pp. 363–387; online (paywall, arXiv).
 Extremal problems concerning sum-free sets in abelian groups are the subject of a blog entry by Terence Tao.
 This theorem gets a neat description in the context of Sarah Hart's other mathematical activities here at Gödel's Lost Letter.
Theorem no. 215: Wedderburn's Little Theorem

 Multiplication in the quaternions is described in the description of Moufang's Theorem; you can check that the given multiplication table for the Dickson nearfield of order 9 is identical to quaternion multiplication under the isomorphism:
$$\left(\begin{array}{rrrrrrrr}
1 & a & b & c & d & e & f & g \\
1 & -1 & i & j & k & -i & -k & -j
\end{array}\right)$$
whereby the multiplication is seen to be almost commutative in the sense that the table is skew symmetric.
 Zinovy Reichstein has drawn my attention to an uncomfortable but unavoidable footnote: an elegant, one-page, group-theoretic proof of Wedderburn's Little Theorem was published by the so-called Unabomber, Ted Kaczynski, while a PhD student at the University of Michigan. A reference can be found here.
 One-page proofs continue to appear. E.g. John Schue, "The Wedderburn Theorem of Finite Division Rings", Amer. Math. Monthly, Vol. 95, No. 5 (May, 1988), pp. 436–437 (using properties of field extensions); Nicolas Lichiardopol, "A New Proof of Wedderburn's Theorem", Amer. Math. Monthly, Vol. 110, No. 8 (Oct., 2003), pp. 736–737 (ring theory, exploiting, like Kaczynski's proof, an initial lemma from number theory).
 Regarding the original discovery and proof of Wedderburn's theorem, Karen Parshall is the authority: “In Search of the Finite Division Algebra Theorem and Beyond: Joseph H. M. Wedderburn, Leonard E. Dickson, and Oswald Veblen”, Archives Internationales d’Histoire des Sciences, vol. 35 (1983), pages 274–299. There is a nice exploration of one aspect by Michael Adam and Birte Julia Mutschler: "On Wedderburn's theorem about finite division algebras"; paper 99 here.
 It long remained an intriguing circumstance that Wedderburn's theorem gave an algebraic proof that Desargues' theorem implies Pappus's for finite projective planes, and that no geometric proof was known (see, e.g., Peter Cameron's Projective and Polar Spaces, chapter 2, page 23). John Bamberg and Tim Penttila have resolved the issue by providing a geometric proof of Wedderburn, "Completing Segre's proof of Wedderburn's little theorem", Bull. Lond. Math. Soc., vol. 47, no. 3, 2015, pp. 483–492; preprint. (Additionally, the paper is an excellent source on Wedderburn's theorem generally.)
Theorem no. 216: Irrationality of Circumference of Unit Circle

 Original sources for this theorem:
 Lambert, Johann Heinrich, "Mémoire sur quelques propriétés remarquables des quantités transcendentes circulaires et logarithmiques", Histoire de l'Académie Royale des Sciences et des Belles-Lettres de Berlin, 17, 1768, pp. 265–322; online (facsimile). It is reproduced in J. Lennart Berggren, Jonathan M. Borwein and Peter B. Borwein, Pi: A Source Book, 3rd edition, Springer-Verlag, New York, 2004, followed by a translation into English of the part relating to Lambert's irrationality proof. They give the date for Lambert's first announcement of his proof as 1766, but published in 1770 in "Vorläufige Kenntnisse für die, so die Quadratur und Rectification des Circuls suchen", Beyträge zum Gebrauche der Mathematik und deren Anwendung, Berlin, 1770, 140–169; online (facsimile by Göttinger Digitalisierungszentrum).
 Charles Hermite, "Extrait d'une lettre de Mr. Ch. Hermite à Mr. Borchardt", Journal für die reine und angewandte Mathematik, 76, 1873, pp. 342–344; online (facsimile by DigiZeitschriften); he had laid the groundwork in an earlier article in the same volume, "Extrait d'une lettre de Monsieur Ch. Hermite à Monsieur Paul Gordan", pp. 303–311.
 A more detailed account of Lambert's irrationality proof is given here at math.stackexchange. Wikipedia has a page on the proof of irrationality of Pi.
 Featured in Math Scholar's thread Simple proofs of great theorems.
Theorem no. 217: Taylor's Theorem

 An annotated English translation of Brook Taylor's Methodus Incrementorum Directa & Inversa can be found here at Ian Bruce's invaluable 17centurymaths.com.
 An alternative justification for Hugh Worthington's Rule is given by Colin Beveridge here. The explanation illustrating theorem no. 217 is by Tony Forbes, M500 magazine, issue 260, 2014, p. 17. He observes that using degrees instead of radians allows an even better approximation: \(\displaystyle \tan^{-1}\frac{a}{b}\approx \frac{172a}{b+2c}\) where \(c=\sqrt{a^2+b^2}\).
 A step-by-step proof of the Lagrange remainder form of Taylor's theorem is given by Gowers here.
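 The degree version of the approximation quoted above, \(\tan^{-1}(a/b)\approx 172a/(b+2c)\), is easy to try out numerically. The following is a small sketch; the function name and the sample triangles are my own choices, and \(a\) is taken to be the shorter of the two legs.

```python
import math


def worthington_degrees(a, b):
    """Approximate arctan(a/b) in degrees by 172a/(b+2c), c the hypotenuse."""
    c = math.hypot(a, b)
    return 172 * a / (b + 2 * c)


# Compare with the library arctangent for a few right triangles (a <= b).
for a, b in [(1, 4), (3, 4), (1, 1)]:
    exact = math.degrees(math.atan2(a, b))
    approx = worthington_degrees(a, b)
    print(a, b, round(approx, 3), round(exact, 3), round(abs(approx - exact), 3))
```

 For these samples the error stays below a tenth of a degree.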
Theorem no. 218: The Riemann Rearrangement Theorem

 Riemann's habilitation thesis "Ueber die Darstellbarkeit einer Function durch eine trigonometrische Reihe" was published posthumously in 1867. A facsimile can be found here and the text is transcribed here. An English translation is available although it may be out of print. Riemann's habilitation work is discussed in detail in Detlef Laugwitz (transl. Abe Shenitzer), Bernhard Riemann 1826–1866: Turning Points in the Conception of Mathematics, Birkhäuser, 2nd printing, 2008. A French translation is here (§1–§8) and here (§9–§13) (presumably the one by Darboux and Houel, cf. these notes, although no credit is given).
 It seems worthwhile to give an English translation of Riemann's proof of his rearrangement theorem (from §3 of his thesis):

"In Crelle’s Journal in January 1829 a memoir by Dirichlet appeared in which rigorous conditions were established for representing, by trigonometric series, functions which are integrable and which do not possess infinitely many maxima or minima.
"He discovered the correct path to follow to solve this problem by consideration of the fact that infinite series fall into two classes according to whether or not they remain convergent when all their terms are made positive. In the first class, the terms may be permuted in an arbitrary manner; whereas in the second class, the value of the series depends on the ordering of the terms. Indeed, if one denotes, in a series of the second class, the positive terms by
$$a_1,a_2,a_3,\ldots,$$
and the negative terms by
$$b_1,b_2,b_3,\ldots,$$
it is clear that \(\sum a\), and similarly \(\sum b\), must be infinite; for if both sums were finite then the series would still be convergent on giving all terms the same sign; if just one of the sums were infinite, then the series would diverge. It is now clear that the series, if its terms are placed in a suitable order, may take an arbitrary given value \(C\); for if one takes alternately the positive terms of the series until its value exceeds \(C\), and then the negative terms until the value falls below \(C\), the difference between this value and \(C\) will never exceed the value of the term immediately preceding the most recent change of sign. Now the \(a\) values, and similarly the \(b\) values must eventually become infinitesimally small as their indices increase, and thus the differences between the series sum and \(C\) must also become infinitesimally small, as one extends the series sufficiently long, which is to say that the series converges to \(C\).
"It is only series of the first class which are amenable to the laws governing finite sums; only they may be considered as the collection of their terms; those of the second class may not be so considered: a circumstance which was missed by mathematicians of the last century, in the main because series which extend according to ascending powers of a variable belong, generally speaking (which is to say, with the exception of certain exceptional values of that variable), to the first class.
" 

 Regarding the rearrangements of Leibniz's series given in figure A of the theorem description, it is remarkable that closed forms may be given to their sums (allowing for special functions). Thus for the highest valued rearrangement shown (approx. 0.95868) we have (thanks to Maple):
$$\sum_{k=0}^{\infty}\left(\frac{1}{8k+1}+\frac{1}{8k+5}-\frac{1}{4k+3}\right)=-\frac14\gamma-\frac34\ln 2+\frac{1}{16}\tau-\frac18\Psi\left(\frac18\right)-\frac18\Psi\left(\frac58\right),$$
where \(\gamma\) is the Euler–Mascheroni constant, \(\tau\) is the circumference of the unit circle, and \(\Psi\) is the digamma function (the slope of the log of the gamma function).
 The alternating harmonic series provides an even more fascinating example of Riemann's theorem in the hands of Larry Riddle in this article which originally appeared in Kenyon Mathematics Quarterly, vol. 1, no. 2 (1990), 6–21. It is also given an animation here by CindyJS.
 In another version, Riemann's theorem tells us that a series is absolutely convergent if and only if every rearrangement converges. This becomes a test for absolute convergence, in principle, if it can be shown that a finite number of convergent rearrangements is enough. This proposed 'rearrangement number' is the object of study by Andreas Blass, Will Brian, Joel David Hamkins, Michael Hardy and Paul Larson.
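 The recipe in Riemann's proof, translated above (take positive terms until the value exceeds \(C\), then negative terms until it falls below \(C\), and so on), is short enough to run. Here is a minimal sketch applied to the alternating harmonic series \(1-\frac12+\frac13-\ldots\); the function name `rearrange_to` and the choice of series are illustrative assumptions.

```python
from itertools import count


def rearrange_to(target, n_terms):
    """Greedily rearrange 1 - 1/2 + 1/3 - ... so that partial sums approach target."""
    positives = (1.0 / k for k in count(1, 2))    # 1, 1/3, 1/5, ...
    negatives = (-1.0 / k for k in count(2, 2))   # -1/2, -1/4, -1/6, ...
    total, terms = 0.0, []
    for _ in range(n_terms):
        # Riemann's rule: positive terms while at or below the target, negative terms otherwise.
        term = next(positives) if total <= target else next(negatives)
        total += term
        terms.append(term)
    return total, terms


total, terms = rearrange_to(1.5, 20000)
print(total, terms[:8])
```

 As the proof says, the distance from the target is bounded by the size of the term preceding the latest change of sign, so it shrinks as more terms are used.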
Theorem no. 219: Integration by Parts

 A very attractive discussion about "striking applications of integration by parts" is ongoing at stackexchange.
 Ian Bruce wrote to me of his experience with his valuable project 17centurymaths: "Most of the elementary calculus material can be found in Euler's Differential and Integral Calculus books, and in fact he starts Book I on Integration with integration by parts; Ch.1 of this book is Top of the Pops in my line of business, and gets first place consistently in downloads, followed by Newton's definitions & Axioms ...
Euler's work is still highly readable, and more so than others of that age and before; in fact he seems to have set the standard for generations of mathematicians to come."
 Ernst Hairer has provided an elegant geometrical interpretation of integration by parts, which may be viewed here.
 There is a nice description here by Murray Bourne of an alternative to integration by parts called the Tanzalin Method which is apparently commonly used in Indonesia. As you will see, it too can lead to infinite series!
Theorem no. 220: The Pappus–Guldin Theorems

 According to Andrew Leahy's article, no proof by Pappus of his theorems has been discovered and Paul Guldin gave no proof, the first known proof being supplied by Giannantonio Rocca in 1644.
 There is a very nice discussion of the 17th century pre-calculus debate in Chapter 5 of Amir Alexander's Infinitesimal: How a Dangerous Mathematical Theory Shaped the Modern World, Oneworld Publications, 2014, with a generous extract here.
 Peter Harremoës has drawn my attention to a little irony: the surface area of the torus is usually given in terms of its major radius \(R\) and minor radius \(r\), as \(\tau^2rR\). You can instead use inner radius \(a=R-r\) and outer radius \(A=R+r\) and in this case surface area is given as \(\pi^2(A^2-a^2)\). Rather sneakily it is the latter, less standard, presentation which is used about 2.5 mins into this film debate on the π vs τ question as an argument that pi makes things simpler!
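 The equivalence of the two presentations of the torus surface area is a one-line expansion, using inner radius \(R-r\), outer radius \(R+r\) and \(\tau=2\pi\):
$$\pi^2\left(A^2-a^2\right)=\pi^2\left((R+r)^2-(R-r)^2\right)=4\pi^2rR=\tau^2rR.$$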
Theorem no. 221: The Inclusion-Exclusion Principle

 Attributions of Inclusion-Exclusion often include the name of Poincaré (e.g. 'formule du crible de Poincaré') and this seems a bit obscure. In Encyclopaedia of Mathematics, Supplement III: 3 (ed. Michiel Hazewinkel, Springer, 2002) this attribution carries a reference to Poincaré's book Calcul des probabilités, Gauthier-Villars, 1896, and this book may have been influential in making the principle widely known in France.
 Inclusion-Exclusion may be generalised in several ways. A good example is given here by Stewart Weiss; probably the most famous is due to Gian-Carlo Rota and is described by Peter Cameron in Lecture 9 of this course. Rota's original article can (and should!) be read here.
 Stewart N. Ethier has contributed the following: "the expected number of boxes needed for a full set of coupons has the nice formula \(\displaystyle n\!\left(1+\frac12+\frac13+\ldots+\frac{1}{n}\right)\), which either can be derived from [the inclusion-exclusion formula] (via Theorem 1.4.2 of my book [The Doctrine of Chances: Probabilistic Aspects of Gambling]), or can be derived directly by writing the random variable of interest as the sum of \(n\) independent geometric random variables with success probabilities \(1, (n-1)/n, (n-2)/n, \ldots, 1/n\) (and using the fact that a geometric(\(p\)) random variable has mean \(1/p\))."
 It should perhaps be recorded that the number of surjections from \(m\) objects onto \(n\) is directly expressible in terms of Stirling numbers of the second kind, \(S(m,n)\) being the number of ways to partition \(m\) objects into \(n\) nonempty subsets: thus \(S(m,n)\) counts all ways to choose which objects will map to the same image point, and then \(n!S(m,n)\) incorporates the order in which we choose the image points.
 A nice application of Inclusion-Exclusion is to vary the standard combinatorial question "How many ways to put n indistinguishable balls into k distinguished boxes" by adding "so that no box gets more than C balls". An excellent explanation of the answer is given here by Brian M. Scott.
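 Two of the identities in these notes can be confirmed in exact arithmetic: the coupon collector's expectation \(n(1+\frac12+\ldots+\frac1n)\) against an inclusion-exclusion sum, and the surjection count against \(n!S(m,n)\). The sketch below uses one standard inclusion-exclusion route to the expectation, which is not necessarily the one in Ethier's book; all function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def expected_boxes_harmonic(n):
    """n(1 + 1/2 + ... + 1/n), the sum of the means of the n geometric variables."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))


def expected_boxes_inclusion_exclusion(n):
    """Expectation as the sum over t of P(set incomplete after t boxes).

    Inclusion-exclusion gives P(incomplete after t) as the sum over k of
    (-1)^(k+1) C(n,k) (1 - k/n)^t; summing the geometric series in t
    leaves a factor n/k for each k.
    """
    return sum((-1) ** (k + 1) * comb(n, k) * Fraction(n, k) for k in range(1, n + 1))


def surjections(m, n):
    """Surjections from m objects onto n, counted by inclusion-exclusion."""
    return sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n + 1))


def stirling2(m, n):
    """Stirling number of the second kind via S(m,n) = n S(m-1,n) + S(m-1,n-1)."""
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    return n * stirling2(m - 1, n) + stirling2(m - 1, n - 1)


print(expected_boxes_harmonic(6), expected_boxes_inclusion_exclusion(6))
print(surjections(5, 3), factorial(3) * stirling2(5, 3))
```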
Theorem no. 222: Faulhaber's Formula

 For a general survey of properties of Bernoulli numbers, Pascal Sebah and Xavier Gourdon's article here is excellent.
 There is a fascinating investigation of what Faulhaber achieved and how he achieved it in Knuth, D.E., "Johann Faulhaber and sums of powers", Math. Comp., Vol. 61, No. 203, 1993, pp. 277–294; online.
 Seki's discovery of the Bernoulli numbers is described in Silke Wimmer-Zagier and Don Zagier's chapter in Eberhard Knobloch, Hikosaburo Komatsu and Dun Liu (eds.), Seki, Founder of Modern Mathematics in Japan: A Commemoration on His Tercentenary, Springer, 2013. Online here (April 2021). There is some further information at the beginning of Tsuneo Arakawa, Tomoyoshi Ibukiyama and Masanobu Kaneko, Bernoulli Numbers and Zeta Functions, Springer, 2014, which can be read via the Look Inside feature at the amazon.co.uk entry (via bibliography).
Theorem no. 223: Tutte's Golden Identity

 Tutte's Golden Inequality appears in W.T. Tutte, "On chromatic polynomials and the golden ratio", J. Comb. Theory, Vol. 9, Issue 3, 1970, pp. 289–296; online, inspired by his investigations with Gerald Berman reported in G. Berman and W.T. Tutte, "The golden root of a chromatic polynomial", J. Comb. Theory, Vol. 6, Issue 3, 1969, pp. 301–302; online. The Golden Identity appears in W.T. Tutte, "The golden ratio in the theory of chromatic polynomials", Ann. New York Acad. Sci., Vol. 175(1), 1970, pp. 391–402; online (paywall).
 There is a description of Tutte's work on graph polynomials in the excellent obituary by Arthur Hobbs and James Oxley which appeared in Notices Amer. Math. Soc. 51 (2004), 320–330.
Theorem no. 224: Green's Theorem

 The original source for this theorem is "An Essay on the Application of mathematical Analysis to the theories of Electricity and Magnetism" which Ralf Stephan has transcribed here. It was published by Green at his own expense but received little attention until William Thomson (later Lord Kelvin) rediscovered it and arranged for its publication in Crelle's Journal in the 1840s.
 Paul Nahin, whose Inside Interesting Integrals is the recommended further reading for this theorem, also writes interestingly about its background in Chapter 7 of An Imaginary Tale: The Story of \(\small\underline{\sqrt{-1}}\), Princeton University Press, 1998.
 An important exhibition commemorating Green was held at the University of Nottingham in the autumn of 2014 and the curator's blog post is very interesting. Much of historical as well as scientific interest can be found in Lawrie Challis and Fred Sheard, "The Green of Green Functions", Physics Today, 56, 12, 2003, 41–46; a reprint here.
Theorem no. 225: The Spherical Law of Cosines

 You can find latitudes and longitudes of cities, and compute great-circle distances between them here.
 Pat Ballew has drawn my attention to a dual version of this theorem, relating three angles A, B, C and one side, say c:
$$\cos(C)=-\cos(A)\cos(B)+\sin(A)\sin(B)\cos(c),$$
(this is referred to by Van Brummelen in Heavenly Mathematics as the "Law of Cosines for Angles").
 A mnemonic of Napier for spherical trigonometry (also from Van Brummelen's book, I think) has been nicely summarised by John D. Cook here.
 plus magazine have provided a very nice introduction to longitude and latitude.
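 The great-circle calculation mentioned in the first note is a direct application of the spherical law of cosines to the triangle formed by the two places and the North Pole. What follows is a sketch under stated assumptions: the Earth is treated as a sphere of radius 6371 km (an assumed mean value), and the function name is illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # Sides from the pole are the co-latitudes, the angle at the pole is the
    # longitude difference, so cos(c) = sin(p1)sin(p2) + cos(p1)cos(p2)cos(dlon).
    cos_c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    cos_c = max(-1.0, min(1.0, cos_c))  # guard against rounding just outside [-1, 1]
    return EARTH_RADIUS_KM * math.acos(cos_c)


# A quarter of the equator and a pole-to-pole half meridian.
print(great_circle_km(0, 0, 0, 90))
print(great_circle_km(90, 0, -90, 0))
```

 For points very close together the arccosine loses accuracy, which is why practical calculators often prefer the haversine form.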
Theorem no. 226: Wolstenholme's Theorem

 The converse of Wolstenholme's Theorem, that \({2n-1\choose n-1}\not\equiv 1 \hspace{0.05in}\mod n^3\) for all composite values of \(n\), is a famous open question. It is known to be true for even \(n\) and for all \(n<10^9\). See for example, Vilmar Trevisan and Kenneth Weber, "Testing the converse of Wolstenholme's theorem", Matemática Contemporânea, 21 (2001), 275–286; online. Recent progress on the conjecture is described in Saud Hussein, "A note on the converse of Wolstenholme’s Theorem", Integers, vol. 18 (2018), Paper No. A94; online, where it is attributed to James P. Jones.
 A generalisation of Wolstenholme, due to James Whitbread Lee Glaisher in 1900, says that \({kp-1\choose p-1}\equiv 1 \hspace{0.05in}\mod p^3\) for any prime \(p\geq 5\) and any positive integer \(k\). In this case the converse does not hold. Small counterexamples exist for \(p=4,9,25\), for example (thus \(p=4,k=33\) gives \({131\choose 3}\equiv 1 \hspace{0.05in}\mod 64\)).
 A proof of Babbage's \(p^2\) prototype of the theorem is given here.
 More on the 'harmonic numbers' context for Wolstenholme can be found in Zhi-Wei Sun, "Arithmetic theory of harmonic numbers", Proc. Amer. Math. Soc., 140, no. 2, 2012, 415–428, online.
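 The congruences in these notes are easily tested with exact binomial coefficients. Here is a small sketch (function names are illustrative) checking Wolstenholme's congruence for a few primes, the converse for small composites, and the Glaisher counterexample \(p=4, k=33\) quoted above.

```python
from math import comb


def wolstenholme_holds(n):
    """True if C(2n-1, n-1) is congruent to 1 mod n^3."""
    return comb(2 * n - 1, n - 1) % n ** 3 == 1


def glaisher_holds(p, k):
    """True if C(kp-1, p-1) is congruent to 1 mod p^3."""
    return comb(k * p - 1, p - 1) % p ** 3 == 1


def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


print([p for p in range(5, 60) if is_prime(p) and wolstenholme_holds(p)])
print([n for n in range(4, 300) if not is_prime(n) and wolstenholme_holds(n)])
print(glaisher_holds(4, 33), comb(131, 3) % 64)
```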
Theorem no. 227: Cauchy's Theorem in Group Theory
 A thorough analysis of the origin of Cauchy's 1845 'Mémoire sur les arrangements ...', in which his theorem is asserted, has been given by Peter M. Neumann, 'On the date of Cauchy's contributions to the founding of the theory of groups', Bulletin of the Australian Mathematical Society, vol. 40, 1989, 293–302; online.
 Incidental to the choice of \(D_{10}\) to illustrate this theorem is a corollary to Cauchy's theorem: any group of order twice an odd prime is either cyclic or dihedral. This is Prop. 3.34 in our recommended book, Smith and Tabachnikova's Topics in Group Theory, Springer, London, 2000.
 Michael Meo claims, in his article "The mathematical life of Cauchy's Group Theorem", Historia Mathematica, vol. 31, issue 2, pp. 196–221; online, that Cauchy's proof of his theorem contains an 'egregious error' and that a subsequent attempt by Dedekind in the 1850s is also incomplete. This would suggest that the first complete proof of the theorem comes as a corollary to its own generalisation (Sylow's theorems of 1872). However it seems fairer to assume that Cauchy was at least completely in command of his material.
Theorem no. 228: Fisher's Inequality

 Original source for this theorem: R.A. Fisher, "An examination of the different possible solutions of a problem in incomplete blocks", Annals of Eugenics, vol. 10, 1940, pp. 52–75; online. Bose's paper containing his short proof of the inequality is R. C. Bose, "A note on Fisher's inequality for balanced incomplete block designs", Ann. Math. Statist., Vol. 20, Number 4 (1949), pp. 619–620; online.
Theorem no. 229: Poncelet's Porism

 An elegant and (relatively) simple proof is given by Lorenz Halbeisen and Norbert Hungerbühler in "A Simple Proof of Poncelet’s Theorem (on the occasion of its bicentennial)", American Mathematical Monthly, in press, (preprint). For a modern proof, from algebraic geometry, see this by David Speyer.
 A bicentennial survey of past and current research into Poncelet's theorem is given by Vladimir Dragović and Milena Radnović in "Bicentennial of the Great Poncelet Theorem (1813–2013): Current advances", Bull. Amer. Math. Soc., Vol. 51, No. 3, 2014, 373–445.
 The example given of a quadrilateral inscribed in and circumscribing two ellipses is a cheat! The parameters were chosen by trial and error to give an adequate illustration: outer ellipse is centered on the origin and inclined at \(\tau/8\) to the \(x\)-axis; major radius \(a = 9.3\), minor radius \(b = 4.1\); inner ellipse is centered at \((1.05046,1.3)\) and has no inclination; major radius \(c = 4.0448\), minor radius \(d = 3.22\).
 Some very good notes by Tony Forbes are available here (about 1.5MB), including detailed instructions for creating genuine examples for all combinations of conics (not just ellipses) and also giving a brief description of the link to elliptic curves.
 Poncelet's Porism is also known as his Closure Theorem for reasons made beautifully clear in Jonathan King, "Three Problems in Search of a Measure", The American Mathematical Monthly, Vol. 101 (1994), pp. 609–628, online here. We find in the same article that the theorem is intimately related to Gelfand's Question!
 There is a French version of this theorem description. If you read French the weblink from the French version is a very fine popular account of Poncelet's porism.
Theorem no. 230: Ore's Theorem in Graph Theory

 Original source for this theorem: Ore, Ø., "Note on Hamilton circuits", American Mathematical Monthly, 67 (1), 1960, p. 55; online (paywall).
 Bondy's short proof appears in "Short proofs of classical theorems", J. Graph Theory, Vol. 44, No. 3, 2003, 159–165; online (paywall, there was a pdf copy here, January 2021). The algorithmic interpretation given here is similar in spirit to an adaptation of Ore's original proof by E.M. Palmer, "The hidden algorithm of Ore's theorem on Hamiltonian cycles", Computers & Mathematics with Applications, Vol. 34, No. 11, 1997, 113–119; online.
 Apart from the leftmost, the graphs illustrating this theorem were generated in Maple. However I manually replaced the vertices in order to get the permuted numberings (I was too lazy to work out how to get Maple to do this).
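 For readers who want to experiment, here is a sketch of the well-known gap-closing procedure that underlies algorithmic readings of Ore's theorem, in the spirit of Palmer's paper cited above. It is not claimed to be identical to the interpretation on the theorem page. It assumes a graph on \(n\geq 3\) vertices, given as a list of neighbour sets, in which every pair of non-adjacent vertices has degree sum at least \(n\); the function name and the example graph \(K_{3,3}\) are my own choices.

```python
def ore_hamiltonian_cycle(adj):
    """Turn an arbitrary cyclic ordering into a Hamiltonian cycle, gap by gap."""
    n = len(adj)
    cycle = list(range(n))
    while True:
        # A gap is a pair of cyclically consecutive vertices that are not adjacent.
        gap = next((i for i in range(n) if cycle[(i + 1) % n] not in adj[cycle[i]]), None)
        if gap is None:
            return cycle
        # Rotate so that the gap lies between the last and the first entry.
        cycle = cycle[gap + 1:] + cycle[:gap + 1]
        for j in range(1, n - 2):
            # Ore's condition guarantees some j with last ~ cycle[j] and first ~ cycle[j+1].
            if cycle[j] in adj[cycle[-1]] and cycle[j + 1] in adj[cycle[0]]:
                # Reversing the initial segment replaces the gap by two genuine edges.
                cycle[:j + 1] = reversed(cycle[:j + 1])
                break
        else:
            raise ValueError("no suitable pair found: Ore's condition must fail")


# K_{3,3}: every vertex has degree 3, so non-adjacent pairs have degree sum 6 = n.
k33 = [{3, 4, 5}, {3, 4, 5}, {3, 4, 5}, {0, 1, 2}, {0, 1, 2}, {0, 1, 2}]
print(ore_hamiltonian_cycle(k33))
```

 Each pass removes at least one gap and creates none, so at most \(n\) passes are needed.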
Theorem no. 231: Sophie Germain's Identity

 The correspondence of Sophie Germain is online at the Bibliothèque nationale de France via gallica. The letter reproduced here is located by searching for '9118' under 'Manuscripts'.
 Leonard Dickson's History of the Theory of Numbers, Volume I: Divisibility and Primality can be read online here courtesy of archive.org. The references to Euler and Germain are on pages 381 and 382, respectively.
 The letter from Euler to Goldbach cited by Dickson can be read at the Euler Archive (August 28 in the 1742 correspondence). Not every letter from Euler to Goldbach of that year is online but it seems clear that this is the one which Dickson intends.
Theorem no. 232: The Riemann Explicit Formula

 Original source for this theorem is Riemann's 1859 paper "Ueber die Anzahl der Primzahlen unter einer gegebenen Grösse". The paper is so famous as to have its own Wiki page!
 An excellent account of the explicit formula, starting from scratch, is this at medium.com by Jørgen Veisdal.
 The relationship between the distribution of primes and (logarithmic) spirals has a rich history. A good example is given by Matthew Watkins here; the idea of spotting patterns in prime spirals goes back to (at least) Ulam in 1963. A nice variant by Edmund Harriss can be found here, and there is a very elegant 3D conical spiral by Dan Bach here.
 The weblink for this theorem by Matthew Watkins offers a very clear account of Riemann's formula in the Chebyshev \(\psi\) function version (as preferred in the Wikipedia entry for example). He has much more on the Riemann Hypothesis here; and indeed, the whole prime distribution story is the subject of his trilogy of books (with illustrator Mark Tweed) Secrets of Creation.
 There is a famous Bonn University inaugural lecture by Don Zagier on the subject of Riemann's prime counting function which can be found in English translation here (pdf, 2.5MB).
 Much intriguing recent commentary on the Riemann Hypothesis, including an extended essay by Alain Connes, can be found starting at this post from Not Even Wrong.
Theorem no. 233: The Circle Area Theorem

 The Archimedes proof of this theorem still qualifies as a textbook one, e.g. here, although calculus variants of the \(\int_0^{r\tau}\frac12r\mbox{dt}\) variety are presumably more respectable from a modern perspective.
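 Written out, the calculus variant mentioned above reads as follows, on the assumption that \(t\) measures arc length around the circumference, so that each element \(\frac12r\,dt\) is the area of a thin triangle of base \(dt\) and height \(r\):
$$\int_0^{r\tau}\frac12r\,dt=\frac12r\cdot r\tau=\frac12\tau r^2=\pi r^2.$$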
Theorem no. 234: A Generalised Hlawka Inequality

 The original source for Hlawka's Inequality is Hans Hornich, "Eine Ungleichung für Vektorlängen", Mathematische Zeitschrift, Volume 48, Issue 1, (1942/43), 268–274. It may be viewed online here thanks to the Göttinger Digitalisierungszentrum. (Hornich says merely "For the special case m = 1, n = 2, Herr Hlawka has given me a purely algebraic proof..." so that the name Hornich–Hlawka as preferred by de.wikipedia.org seems more appropriate. However 'Hlawka' seems to be the generally adopted nomenclature.)
 The original result of Dragomir Djoković appeared in "Generalizations of Hlawka's inequality", Glasnik Matematičko-Fizički i Astronomski, Ser. II, vol. 18, (1963), issue 3, 169–175; online (direct 1.4MB pdf download; only Glasnik Matematički, the successor to Glasnik Matematičko-Fizički i Astronomski, appears to be fully online). D.M. Smiley & M.F. Smiley's paper is "The polygonal inequalities", Amer. Math. Monthly, Vol. 71, No. 7 (1964), 755–760; online (paywall). In both papers something more general is proved for the sequence of \(n\) vectors: that for \(2\leq k<n\) we have $$d_k\leq {n-2\choose k-2}d_n+{n-2\choose k-1}d_1,$$ using the notation in the statement of the theorem. The inequality as stated is found by summing over \(k\). Djoković and Smiley & Smiley also gave conditions for equality.
 A nice derivation of Hlawka's Inequality from the Ptolemaic Inequality is given by Alice Simon and Peter Volkmann in Annales Mathematicae Silesianae, Vol. 9, 1995, 137–140. The article is online here.
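 The more general inequality \(d_k\leq{n-2\choose k-2}d_n+{n-2\choose k-1}d_1\) can be spot-checked numerically. The sketch below assumes that \(d_k\) denotes the sum of the lengths of all sums of \(k\) distinct vectors from the sequence (my reading of the notation, which is defined on the theorem page rather than here) and works with random vectors in the plane; the function names are illustrative.

```python
import random
from itertools import combinations
from math import comb, hypot


def d(vectors, k):
    """Sum of the lengths of all sums of k distinct vectors (assumed meaning of d_k)."""
    return sum(
        hypot(sum(x for x, _ in subset), sum(y for _, y in subset))
        for subset in combinations(vectors, k)
    )


def djokovic_gap(vectors, k):
    """Right-hand side minus left-hand side; non-negative if the inequality holds."""
    n = len(vectors)
    rhs = comb(n - 2, k - 2) * d(vectors, n) + comb(n - 2, k - 1) * d(vectors, 1)
    return rhs - d(vectors, k)


random.seed(0)
vectors = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(6)]
for k in range(2, 6):
    print(k, djokovic_gap(vectors, k))
```

 With \(n=3\) and \(k=2\) the inequality is Hlawka's own, and equal vectors give equality for every \(k\).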
Theorem no. 235: A Theorem on Modular Fibonacci Periodicity

 The period lengths of the modulo-reduced Fibonacci sequences continue to be the subject of intensive research. A good recent (2012) example is here. They also go by the name of Pisano periods (after Leonardo Pisano, aka Fibonacci). They are sequence A001175 at oeis.org.
 Of particular interest is the so-called 'Wall's Question': for a prime \(p\), is it possible that the period mod \(p\) and mod \(p^2\) should be equal? Such a prime is termed a Wall–Sun–Sun prime. The question has links to Fermat's Last Theorem via Germain's Theorem. See Klaška, J., "Criteria for testing Wall's question", Czechoslovak Mathematical Journal, vol. 58 (2008), issue 4, pp. 1241–1246, online.
 D.D. Wall's paper is "Fibonacci series modulo m", The American Mathematical Monthly, Vol. 67, No. 6, 1960, 525–532; online (paywall). Covering rather the same material is a roughly contemporary paper, "The Fibonacci matrix modulo m" by the Caltech physicist David W. Robinson. This was published in the 2nd ever issue of Fibonacci Quarterly and is free online here.
 The papers of Morgan Ward on linear recurrences are a good source of information on modular periodicity. They appear in Transactions of the American Mathematical Society and are free online here (1931) and here (1933). The main result from 1931 is that if \(m\) has prime decomposition \(p_1^{a_1}p_2^{a_2}\cdots p_r^{a_r}\) then period length mod \(m\) is equal to the LCM of period lengths mod \(p_i^{a_i}, i=1,\ldots, r\).
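The facts quoted in these notes are easy to experiment with. What follows is a minimal sketch, not taken from any of the cited papers: the function names (`pisano_period`, `prime_power_parts`, `ward_lcm_holds`, `wall_candidates`) are made up for illustration, and the period is found by brute force, waiting for the pair (0, 1) to recur.

```python
from math import gcd


def pisano_period(m):
    """Period of the Fibonacci sequence reduced mod m, for m >= 2 (brute force)."""
    a, b, steps = 0, 1, 0
    while True:
        a, b = b, (a + b) % m
        steps += 1
        if (a, b) == (0, 1):
            return steps


def prime_power_parts(m):
    """The prime-power factors p_i^{a_i} of m, by trial division."""
    parts, p = [], 2
    while p * p <= m:
        if m % p == 0:
            q = 1
            while m % p == 0:
                m //= p
                q *= p
            parts.append(q)
        p += 1
    if m > 1:
        parts.append(m)
    return parts


def ward_lcm_holds(m):
    """Check that the period mod m is the LCM of the periods mod p_i^{a_i}."""
    expected = 1
    for q in prime_power_parts(m):
        period = pisano_period(q)
        expected = expected * period // gcd(expected, period)
    return pisano_period(m) == expected


def wall_candidates(limit):
    """Primes p < limit whose period mod p equals the period mod p^2."""
    primes = [p for p in range(2, limit) if prime_power_parts(p) == [p]]
    return [p for p in primes if pisano_period(p) == pisano_period(p * p)]
```

For example, `[pisano_period(m) for m in range(2, 13)]` reproduces the opening terms of A001175 (after the initial 1), and `wall_candidates(100)` searches the primes below 100 for an answer to Wall's Question.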
Theorem no. 236: Kemeny's Constant

 Original source for this theorem, as indicated on the theorem page, is John G. Kemeny and J. Laurie Snell, Finite Markov Chains, Van Nostrand, Princeton, NJ, 1960, Chapter 4, section 4.10 (this will have changed in the new Springer edition found in our bibliography).
 The directed graph modelling Alice's casino is a finite automaton which finds the remainder of an input binary number (with \(H=1\) and \(T=0\)) mod 8. Doubling appends a zero to a binary number; adding 1 thereafter appends instead a 1, so the action of the automaton is the same as step (3) in the casino game. Another example of such an automaton illustrates The Pumping Lemma. There is a nice nonbinary take on this (for mod 7, but see comments) by David Wilson guesting at Tanya Khovanova's Math Blog.
 If you operate a casino and would like to compete with Alice using Kemeny's constant, Tony Forbes has offered a neater and more intuitive version of her game (750KB pdf, see p. 20) in M500 magazine.
 There is an interesting contrast between \(K\), the expected time to reach the stationary distribution, and the probability of reaching the distribution in fewer than \(K\) steps. The latter will be greater than \(1/2\) (to compensate for the occasional long runs). So Bob will often find himself losing money to Alice but he will be seduced by the prospect of a long run, just as in any lottery you hardly ever win anything but play for the prospect of a jackpot.
 An interesting question from Piers Myers is: can other averages for time to stationary distribution, e.g. median, also have constant values for Markov chains? For the 8-state chain used here the answer for median values appears to be, roughly, yes, according to simulations: 2500 runs from each starting state to a target state selected u.a.r. gave median times 5,4,5,4,4,4,4,5. But Piers points out that this cannot hold in general: the chain \(\left(\begin{array}{cc} 0 & 1 \\ 1/100 & 99/100\end{array}\right)\) has stationary distribution \((1/101, 100/101)\); Kemeny's constant is \(100/101\); but median time to reach stationary distribution is 1 from state 1 and 0 from state 2.
 A very nice Markov chain animation provided by setosa.io might be of use for 'visualising' Kemeny's constant for small chains.
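The constancy of Kemeny's constant can be checked exactly for the two chains mentioned in these notes. The sketch below is illustrative only and makes some assumptions: the hitting time from a state to itself is taken to be 0, the 8-state chain is modelled as 'double, then add a fair coin flip, mod 8' with a uniform stationary distribution, and the helper names (`hitting_times`, `kemeny_from`, `doubling_chain`) are invented here. Expected hitting times are found by solving the usual linear equations over exact fractions.

```python
from fractions import Fraction


def hitting_times(P, target):
    """Expected steps to first reach `target` from each state (0 from target itself)."""
    others = [s for s in range(len(P)) if s != target]
    size = len(others)
    # Augmented system: m_i - sum_{k != target} P[i][k] * m_k = 1 for each i != target.
    rows = [
        [(1 if i == k else 0) - P[i][k] for k in others] + [Fraction(1)]
        for i in others
    ]
    for col in range(size):  # Gauss-Jordan elimination
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    times = [Fraction(0)] * len(P)
    for idx, s in enumerate(others):
        times[s] = rows[idx][size]
    return times


def kemeny_from(P, pi, start):
    """Expected time from `start` to a target state drawn from the distribution pi."""
    return sum(pi[j] * hitting_times(P, j)[start] for j in range(len(P)))


def doubling_chain(modulus=8):
    """Assumed model of the mod-8 automaton driven by a fair coin."""
    P = [[Fraction(0)] * modulus for _ in range(modulus)]
    for s in range(modulus):
        P[s][(2 * s) % modulus] += Fraction(1, 2)
        P[s][(2 * s + 1) % modulus] += Fraction(1, 2)
    return P


two_state = [[Fraction(0), Fraction(1)], [Fraction(1, 100), Fraction(99, 100)]]
two_state_pi = [Fraction(1, 101), Fraction(100, 101)]
```

With these definitions `kemeny_from(two_state, two_state_pi, s)` gives the same value for `s = 0` and `s = 1`, and `kemeny_from(doubling_chain(), [Fraction(1, 8)] * 8, s)` does not depend on `s` either.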
Theorem no. 237: Sylvester's Catalecticant

 The last section of "On the Alexander–Hirschowitz Theorem" by Maria Chiara Brambilla and Giorgio Ottaviani is very good on the history of Waring's problem for forms. Also very good are the opening pages of Power Sums, Gorenstein Algebras, and Determinantal Loci, Springer 1999, by Anthony Iarrobino and Vassil Kanev. This paper by Zach Teitler and Alexander Woo gives a good picture of the current state of play.
 Zach Teitler provided me with much help in getting to grips with the subtleties of Sylvester's work to the point where I felt it worth quoting his comments verbatim, in the form of some additional notes.
 A classic paper in the theory of binary forms is Joseph P. S. Kung and Gian-Carlo Rota, "The invariant theory of binary forms", Bull. Amer. Math. Soc. (N.S.), Vol. 10, No. 1, (1984), 27–85, online here.
 Bruce Reznick draws my attention to the charming description of Sylvester on his work on binary forms:

"I discovered and developed the whole theory of canonical binary forms
for odd degrees, and, as far as yet made out, for even degrees too, at one
evening sitting, with a decanter of port wine to sustain nature's flagging
energies, in a back office in Lincoln's Inn Fields. The work was done, and
well done, but at the usual cost of racking thought—a brain on fire,
and feet feeling, or feelingless, as if plunged in an ice-pail. That night
we slept no more." 

(which can be found on p. xxiv of The Collected Mathematical Papers of James Joseph Sylvester: Volume 4, 1882–1897). Bruce observes, aptly I think, "If he had been known as a writer, rather than as a mathematician, this would be a famous
quote!" (Of course Sylvester was proud of, although not remembered for, his poetry; see Chapter 8 of Karen Hunger Parshall, James
Joseph Sylvester: Jewish Mathematician in a Victorian World, The Johns
Hopkins University Press, 2006.) By the way, an excellent slideshow by Bruce Reznick on representations of forms can be found here (1.2MB pdf).
Theorem no. 238: Euler's Even Zeta Formula

 Euler discovered his formula in 1739 and it appeared in De seriebus quibusdam considerationes which can be read in the original Latin and in German or English translation as entry E130 at the Euler Archive. The role of the Bernoulli numbers was made explicit by Euler in his 1755 classic textbook Institutiones calculi differentialis cum eius usu in analysi finitorum ac doctrina serierum, volume 1 which is entry E212.
 Max Woon (publishing as See Chin Woon) gave his binary tree generation of the sequence of Bernoulli numbers in "A Tree for Generating Bernoulli Numbers", Mathematics Magazine,
Vol. 70, No. 1, 1997, 51–56; online (paywall). A generalisation to arbitrary complex sequences using elementary methods has been given by Petr Fuchs: "Bernoulli numbers and binary trees", Tatra Mountain Mathematical Publications, 20 (2000), 111–117, online (postscript) here.
 Although there may be no direct calculations of \(\zeta(2n+1)\) in terms of Bernoulli numbers, there are infinite series formulae. The most famous approach is probably Ramanujan's, see Section 3 of this nice overview of Ramanujan's notebooks by Bruce C. Berndt: the approach is given a thorough workout by Marc Chamberland and Patrick Lopatto in "Formulas for Odd Zeta Values and Powers of \(\pi\)", Journal of Integer Sequences, Vol. 14 (2011), Article 11.2.5, online here. The best-known formula is a special case of Ramanujan's first discovered by Mathias Lerch in 1901: if \(n\) is odd then $$\zeta(2n+1)=\frac12\tau^{2n+1}\sum_{k=0}^{n+1}(-1)^{k+1}\frac{B_{2k}}{(2k)!}\frac{B_{2n+2-2k}}{(2n+2-2k)!}-2\sum_{k=1}^{\infty}\frac{k^{-2n-1}}{e^{k\tau}-1},$$ whereby \(\zeta(2n+1)\), for large, odd \(n\), is very close to a rational multiple of \(\tau^{2n+1}\).
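Lerch's formula is easy to test numerically. The following is a sketch only: the names `bernoulli_numbers`, `lerch_rational` and `zeta_odd` are invented for this illustration, the Bernoulli numbers come from the standard recurrence (with \(B_1=-1/2\), although only even indices are used), and the infinite sum is truncated at 50 terms, which is far more than double precision needs because of the exponential in the denominator.

```python
from fractions import Fraction
from math import comb, exp, factorial, pi

TAU = 2 * pi


def bernoulli_numbers(count):
    """B_0, ..., B_{count-1} as exact fractions."""
    B = []
    for m in range(count):
        if m == 0:
            B.append(Fraction(1))
        else:
            B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def lerch_rational(n):
    """The finite Bernoulli sum in Lerch's formula, as an exact fraction."""
    B = bernoulli_numbers(2 * n + 3)
    return sum(
        Fraction((-1) ** (k + 1))
        * B[2 * k] / factorial(2 * k)
        * B[2 * n + 2 - 2 * k] / factorial(2 * n + 2 - 2 * k)
        for k in range(n + 2)
    )


def zeta_odd(n, terms=50):
    """zeta(2n+1) for odd n via Lerch's formula, truncating the infinite sum."""
    if n % 2 == 0:
        raise ValueError("Lerch's formula is stated for odd n only")
    tail = sum(k ** -(2 * n + 1) / (exp(k * TAU) - 1) for k in range(1, terms + 1))
    return 0.5 * TAU ** (2 * n + 1) * float(lerch_rational(n)) - 2 * tail
```

For \(n=1\) the rational factor is \(7/720\), so that \(\zeta(3)\) differs from \(\frac{7}{180}\pi^3\) only by the rapidly convergent exponential sum.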
Theorem no. 239: Kuratowski's 14-Set Theorem

 This theorem was apparently made famous when it featured as an exercise in John L. Kelley's General Topology (first published by Van Nostrand, 1955). It again appears as an exercise in James Munkres' Topology (chapter 2, section 17, exercise 21) where it is also required to find a set which generates the maximum of 14. The answers to Munkres' exercises are helpfully provided by Jesper Michael Møller here.
 The particular 14-set chosen for my illustration was generated with the help of Mark Bowron's fun interactive diagram.
 There is actually no need to state this as a theorem about topological spaces. P. C. Hammer, "Kuratowski’s closure theorem", Nieuw Archief voor Wiskunde, 7 (1960),
74–80, has shown that Kuratowski's theorem remains true for a more abstract closure operator defined set-theoretically. There is a nice discussion by Jeffrey Shallit and Ross Willard here.
 Joshua Zelinsky has brought to my attention an attractive paper of David Sherman generalising Kuratowski in terms of number of operators and number of sets: David Sherman, "Variations on Kuratowski's 14-set theorem", American Math. Monthly, Vol. 117, no. , 2010, pp.
113–123; online (paywall, there was a reprint on Sherman's home page, Jan, 2021).
 For French speakers, this blog account of the theorem by Blogdemaths is very fine.
Theorem no. 240: The Jones Knot Polynomial Theorem

 Strictly speaking, our presentation of this theorem uses the 'normalised bracket polynomial'. The substitution \(x=q^{1/4}\) is used in the Jones polynomial proper (as recorded in the Knot Atlas, for example). I asked Louis Kauffman about this and he explained it as "a historical accident having to do with the fact that Jones defined the invariant in a different way (via a representation of the braid group to a Temperley–Lieb algebra) than I did by using the bracket state sum. The state sum is close to physics via ideas in statistical mechanics. The Temperley–Lieb algebra is close to physics also."
 The definitive published resource for relationships between knot theory and physics must be Louis Kauffman's Knots and Physics, World Scientific, 4th revised edition, 2013. Kauffman's webpage is also an essential visit, with such gems as his "New Invariants in the Theory of Knots" (an Amer. Math. Monthly writeup of his 1987 breakthrough).
 There is apparently no convention for orienting links when calculating the writhe of a multicomponent link. However, if a link has more than one component then the orientations of individual components only change the value of the Jones polynomial by a power of its variable. See, for example, Sandy Ganzell, Janet Huffman, Leslie Mavrakis, Kaitlin Tademy and Griffin Walker, "Jones polynomials of unoriented links", preprint online here.
 One of the biggest questions in knot theory is whether the Jones polynomial distinguishes the unknot, that is, can any knot \(K\) other than the unknot have \(J(K)=1\)? In the case of links with more than one component the Jones polynomial cannot distinguish the unlink, as shown for example by Shalom Eliahou, Louis H. Kauffman and Morwen B. Thistlethwaite in "Infinite families of links with trivial Jones polynomial", Topology, vol. 42, no. 1, 2003, 155–169, online via Elsevier Open access. It is known (Haken's Unknot Theorem) that distinguishing the unknot is decidable, but by an algorithm that is vastly more complex than evaluating the Jones polynomial.
 Erica Klarreich has a good short article on knot invariants in Quanta magazine.
Theorem no. 241: The Large Prime Gaps Theorem

 Original source for this theorem: Kevin Ford, Ben Green, Sergei Konyagin, James Maynard and Terence Chi-Shen Tao, "Long gaps between primes", J. Amer. Math. Soc., 31, no. 1, 2018, pp. 65–105; online (paywall, AMS preprint version, 500KB pdf).
 This is a 'theorem under construction': I hope to chart exciting developments here towards an eventual final version, which may or may not mean something approaching Cramér's 1936 conjecture.
 The composite sequence in our example is \(m+k\) where \(m=293357\) and \(k=1,\ldots,25\). The fact that \(Y(17)=25\) does not mean that larger \(k\) values will give primes: in fact \(293357+k\) is composite until \(k=42\). By the way, online Chinese Remainder Theorem solvers generally don't appear to accept congruences of the form \(m=-a_p\!\!\mod p\) (this by MathCelebrity.com is an exception) but solving with positive \(a_p\) and then negating the answer is fine, as can be seen immediately at (Theorem 5, notes(1)).
 Work on long prime gaps has historically used the Jacobsthal function: \(j(n)\), for positive integer \(n\), is the smallest positive integer \(m\), such that every sequence of \(m\) consecutive integers contains an integer coprime to \(n\) (alternatively, \(j(n)\) is the maximal gap between integers coprime to \(n\)). The first thirty values (A048669) are \(1, 2, 2, 2, 2, 4, 2, 2, 2, 4, 2, 4, 2, 4, 3, 2, 2, 4, 2, 4, 3, 4, 2, 4, 2, 4, 2, 4, 2, 6,\ldots\). Ford et al's paper observes that \(Y(x)=j(P(x))-1\) (with \(P(x)\) the product of primes not exceeding \(x\)).
 An excellent article about Cramér's model of the prime numbers and his conjecture is this by Andrew Granville, hosted at Chance News, who also have a whole series of lectures on probabilistic number theory, with part 2 focussing on Cramér's work.
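The Jacobsthal function described above can be computed directly from its 'maximal gap' characterisation, which also gives a check on the value \(Y(17)=25\) via \(Y(x)=j(P(x))-1\). This is a brute-force sketch under that characterisation only; `jacobsthal` and `primorial` are names chosen here for illustration. Since 1 and \(n+1\) are both coprime to \(n\), and coprimality is periodic with period \(n\), it suffices to scan the integers from 1 to \(n+1\).

```python
from math import gcd


def jacobsthal(n):
    """Largest gap between consecutive integers coprime to n."""
    coprime = [a for a in range(1, n + 2) if gcd(a, n) == 1]
    return max(b - a for a, b in zip(coprime, coprime[1:]))


def primorial(x):
    """P(x): the product of the primes not exceeding x."""
    product = 1
    for p in range(2, x + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):
            product *= p
    return product
```

For example, `[jacobsthal(n) for n in range(1, 31)]` regenerates the thirty values listed above, and `jacobsthal(primorial(17)) - 1` evaluates \(Y(17)\).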
Theorem no. 242: The Pólya–Redfield Enumeration Theorem

 Original sources for this theorem:
 J. Howard Redfield, "The theory of group-reduced distributions", Amer. J. Math.,
Vol. 49, No. 3, 1927, pp. 433–455; online (paywall).
 J. Howard Redfield, "Enumeration by frame group and range groups", J. Graph Theory, Vol. 8, No. 2, 1984, pp. 205–223; online (paywall). Accompanied by a modern reading of Redfield's original 1940 submission: J. I. Hall, E. M. Palmer and R. W. Robinson, "Redfield's lost paper in a modern context", J. Graph Theory, Vol. 8, No. 2, 1984, pp. 225–240; online (paywall). The E. Keith Lloyd paper providing one of the weblinks from the theorem page, is also indispensable: "Redfield's contributions to enumeration", MATCH Communications in Mathematical and in Computer Chemistry, Vol. 46, 2002, pp. 215–233; online.
 G. Pólya, "Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen",
Acta Math., 68, 1937, pp. 145–254; online.
 Space did not allow for the evaluation of the cycle index for the group action on the edges of the tetrahedron. The calculation gives \({b}^{6}+{b}^{5}r+2\,{b}^{4}{r}^{2}+4\,{b}^{3}{r}^{3}+2\,{b}^{2}{r}^{4}+b{r}^{5}+{r}^{6}\), whence the determination that there are, up to rotational symmetry, four colourings with three red and three blue edges. The total number of colourings is \(1+1+2+4+2+1+1=12\), as was already established by using the Cauchy–Frobenius Lemma.
 The description of this theorem makes a simplification by going straight from a set of labels \(L\) to the formal power sums \(\sum x_i^k, x_i\in L,\) substituted into the cycle index. More properly, we should associate a weight with each label and it is power sums of weights which are substituted. Thus, for example, our red-blue edge colourings of the tetrahedron each 'choose' a subset of the edges (say, the blue edges). We can think of this as having a label 'absent' and a label 'present' with weights 1 and \(x\), respectively. And we can enumerate, say, the number of different 3-sets up to tetrahedral symmetry by substituting into the cycle index the polynomials \(1+x^k, k=1,\ldots,3\). More formally still, we can define a 'figure-counting series' \(A(t)=\sum a_it^i\) in which \(a_i\) is the number of 'figures' (labels) having weight \(i\). Then what is substituted into the cycle index are the polynomials \(A(t^k)\). This allows \(A(t)\) to be an infinite sum (with constant coefficients!). In Peter Cameron, Permutation Groups, Cambridge University Press, 1999, section 5.13, this approach gives the enumeration of \(n\)-sets up to symmetry via the figure-counting series \(A(t)=t^0+t^1=1+t,\) the weights being 0 and 1.
 There is a lovely body of theory called 'orbital combinatorics' which combines enumeration up to both symmetry and structure. It originates in Peter J. Cameron, Bill Jackson and Jason D. Rudd, "Orbit-counting polynomials for graphs and codes", Discrete Mathematics, Vol. 308, Issues 5–6, 2008, 920–930, online here. There is a more up-to-date overview on Cameron's blog and see also this presentation.
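The evaluation of the cycle index for the edges of the tetrahedron, with the substitution \(1+x^k\) described above, can be carried out mechanically. The sketch below assumes that the rotation group acts on the four vertices as the even permutations (the alternating group \(A_4\)); the function names are illustrative only. Each rotation contributes the product of \(1+x^k\) over the cycles of length \(k\) that it induces on the six edges, and the contributions are averaged over the group.

```python
from itertools import combinations, permutations

VERTICES = range(4)
EDGES = list(combinations(VERTICES, 2))  # the 6 edges of the tetrahedron


def is_even(perm):
    """True if the permutation has an even number of inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def edge_cycle_lengths(perm):
    """Cycle lengths of the edge permutation induced by a vertex permutation."""
    image = {e: tuple(sorted((perm[e[0]], perm[e[1]]))) for e in EDGES}
    seen, lengths = set(), []
    for start in EDGES:
        if start in seen:
            continue
        length, current = 0, start
        while current not in seen:
            seen.add(current)
            current = image[current]
            length += 1
        lengths.append(length)
    return lengths


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def pattern_inventory():
    """Coefficient of x^i = number of i-subsets of edges up to rotation."""
    rotations = [p for p in permutations(VERTICES) if is_even(p)]
    total = [0] * (len(EDGES) + 1)
    for rot in rotations:
        term = [1]
        for k in edge_cycle_lengths(rot):
            term = poly_mul(term, [1] + [0] * (k - 1) + [1])  # multiply by 1 + x^k
        total = [t + c for t, c in zip(total, term)]
    return [t // len(rotations) for t in total]
```

The coefficient list returned by `pattern_inventory()` can be compared term by term with the polynomial in \(b\) and \(r\) quoted above, reading \(x^i\) as 'exactly \(i\) blue edges'.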
Theorem no. 243: A Theorem of Anderson, Cameron and Preece on Groups of Units

 Original sources for this theorem (which are the weblinks from the theorem page):
 D. A. Preece and Ian Anderson, "Obtaining all or half of U_{n} as ⟨ x ⟩ × ⟨ x + 1 ⟩", Integers, Vol. 12, 2012, paper #A52; online.
 P. J. Cameron and D. A. Preece, "Three-factor decompositions of U_{n} with the three generators in arithmetic progression", arxiv.
 A good presentation of this work in context by Peter Cameron, as well as much else of relevance and interest, are linked from his blog entry on the Donald Preece Memorial Day.
 The fact that \(-3\) is a quadratic residue mod \(p\) for an odd prime \(p\) if and only if \(p\equiv 1 \pmod 6\) is a textbook exercise, cf. Problems 9.3, no. 5(a) in David M. Burton, Elementary Number Theory 7th edition, McGraw-Hill, 2010.
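The quadratic residue criterion for \(-3\) can be spot-checked with Euler's criterion: for an odd prime \(p\), a number \(a\) not divisible by \(p\) is a quadratic residue exactly when \(a^{(p-1)/2}\equiv 1 \pmod p\). A small sketch, with made-up function names and a naive primality test:

```python
def is_prime(n):
    """Naive trial-division primality test."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def minus_three_is_residue(p):
    """Euler's criterion for -3 modulo an odd prime p (False when p = 3)."""
    return pow(-3 % p, (p - 1) // 2, p) == 1


def criterion_holds(limit):
    """Check '-3 is a residue iff p = 1 mod 6' for all odd primes below limit."""
    return all(
        minus_three_is_residue(p) == (p % 6 == 1)
        for p in range(3, limit)
        if is_prime(p)
    )
```

By contrast, \(+3\) is a quadratic residue mod 11 (since \(5^2=25\equiv 3\)) although \(11\not\equiv 1 \pmod 6\), so the sign matters.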
Theorem no. 244: The LYM Inequality

 There is a classic probabilistic proof of the LYM inequality due to Peter Frankl, "A probabilistic proof for the lym-inequality", Discrete Math., Vol. 43, Issues 2–3, 1983, p. 325; online. A slightly more accessible account is here "Notes and Exercises 4".
 How LYM is situated in the study of antichains in partially ordered sets is very well explained by Dominic Yeo in this Eventually Almost Everywhere post.
Theorem no. 245: The Alternating Series Test

 A good source of information on the origins of the concept of series convergence is Giovanni Ferraro, "Convergence and formal manipulation in the theory of series from 1730 to 1815", Historia Mathematica, Vol. 34, Issue 1, 2007, 62–88, online here.
 There are some good discussions about alternating series at math.stackexchange.com: for example this, and this.
 A wonderful compilation of proofs of divergence of the harmonic series is: Steven J. Kifowit and Terra A. Stamps, "The Harmonic Series Diverges Again and Again", The AMATYC Review, Vol. 27, No.2, Spring 2006. A preprint can be found here. The first proof of divergence is attributed to Nicolas Oresme in the 14th century.
 A well-known occurrence of the harmonic series is in creating a large overhang when stacking overlapping blocks: the stack remains stable when the overlaps are, successively, \(1/2,1/4,1/6,\ldots\), giving a total overhang of \(\frac12 H_n\), \(H_n\) being the \(n\)th partial sum of the harmonic series. Thus an unbounded overhang is possible. But in fact much more can be achieved, as demonstrated by Paterson and Zwick in 2007. See this preprint, for example.
 Mathwithbaddrawings has a lovely entry about how slowly the harmonic series diverges and the curious fact that omitting terms containing the digit 9 restores convergence (the Kempner series).
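How slowly the harmonic stack grows is easily seen by counting the blocks needed for a given overhang \(\frac12 H_n\). A minimal sketch using exact fractions, with overhang measured in block lengths; the function name `blocks_needed` is just for illustration, and it covers only the classical one-block-per-level stack, not the Paterson–Zwick constructions.

```python
from fractions import Fraction


def blocks_needed(overhang):
    """Smallest n for which the classical stack's overhang H_n / 2 exceeds `overhang`."""
    n, total = 0, Fraction(0)
    while total <= overhang:
        n += 1
        total += Fraction(1, 2 * n)  # the n-th overlap, 1/(2n) of a block length
    return n
```

An overhang of more than one block length takes `blocks_needed(1)` blocks; doubling the target to `blocks_needed(2)` needs many more, reflecting the logarithmic growth of \(H_n\).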
Theorem no. 246: Euler's Product Formula for ζ(s)

 The convergence of Euler's formula for real \(s>1\) is sometimes attributed to Kronecker in the 1870s. However, it seems likely that convergence issues would have been resolved before that, even by the time of Riemann's investigations in the 1850s. Some good background is given here.
 Euler's formula is the wellspring from which emerged group representations and harmonic analysis in the masterful account by Anthony Knapp in the April 1996 issue of the AMS Notices.
Theorem no. 247: Euler's Product Formula for Sine

 Euler's cotangent series can also be derived from a 'half-angle' formula: \(f(x)=\frac12\left(f(x/2)+f\left((x\pm 1)/2\right)\right).\) See, e.g., Konrad Knopp (transl. Young), Theory And Application Of Infinite Series, Dover edition, 1990. The book is online here via archive.org and the relevant text can be found on page 205ff. My presentation (with thanks to Andy Rich) is essentially the same and needs to be acknowledged as what is referred to by Knopp as an "in general faulty mode of passage to the limit" (his emphasis), which he spends two pages making rigorous (even confined to the reals). The derivation of Euler's cotangent formula from the half-angle formula is attributed by Knopp to Heinrich Schröter, 1868. There is more on its history in Chapter 11 (§ 2) of Reinhold Remmert's, Theory of Complex Functions, Springer, 1991, (transl. Robert Burckel).
 A very interesting account of the Mittag-Leffler theorem, which is the 'ultimate' generalisation of Euler's formula, is available in this dissertation by Laura E. Turner.
 An explanation by Jim Belk of why convergence of infinite products is subsumed within convergence of infinite series.
 Euler's formula's LHS may be replaced with the more elegant (from this website's perspective!) \(\sin\tau x\) but at the expense of elegance in the RHS, where the factors in the product become \(1-4x^2/k^2.\)
Theorem no. 248: A Theorem about Gaussian Moats
 This is a 'theorem under construction': I hope to chart exciting developments here towards an eventual final version, which may or may not mean a proof that Gaussian moats can have unbounded width.
 Original source for this theorem: Ellen Gethner, Stan Wagon, and Brian Wick, "A stroll through the Gaussian primes", American Mathematical Monthly, vol. 105 (1998), pp. 327–337; online (this is the recommended weblink from the theorem page).
 The widest Gaussian moats found thus far have width \(D=6\), found in 2004 by Nobuyuki Tsuchimura: "Computational Results for Gaussian Moat Problem", IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Volume E88-A Issue 5, May 2005, 1267–1273; online (paywalled); preprint (400KB pdf).
 An arxiv preprint by Madhuparna Das asserts that there is no sequence of Gaussian primes of form \(a^2+b^2\) with bounded distance between consecutive entries. Das's arxiv account lists three or four other articles on Gaussian moats.
 Ellen Gethner has kindly supplied information on the origins of Gordon's Gaussian Moat problem:

"One of the most difficult aspects of the problem was in finding out who actually posed it; the paper by Jordan and Rabung attributes that to Erdős. I had the opportunity to ask
Erdős about the problem at an analytic number theory conference at the University of Illinois in 1995. I remember going to a conference party at someone’s home, and there was Erdős on a chaise lounge in the middle of an enormous back yard with no other chairs in sight and 100+ mathematicians milling around him. I had a fairly lengthy conversation with him about the Gaussian Moat problem; I learned right away that he hadn’t posed the problem and he wasn’t sure who had. I asked him if he thought the conjecture was true (i.e., that there are indeed arbitrarily wide Gaussian Moats) and his response was a pause followed by “what do YOU think?” I answered that I thought the conjecture was correct, to which he responded “so do I!”
In the meantime, later on at the same conference, I happened to be in a car with several other mathematicians on the way to a session. One of the mathematicians was Basil Gordon; I had heard from one of his PhD students that Gordon might be able to help me find who had posed the problem. During that car ride, I asked my question, and oddly enough, Gordon turned out to be the original poser! I think (I’m a little fuzzy here) that he said that he had posed the problem during a session of one of the ICMs. In any case, all of the encounters were purely serendipitous and in looking back on the whole thing, I’m surprised that I succeeded in solving some of the mysteries." 

(the "paper by Jordan and Rabung" is J. H. Jordan and J. R. Rabung, "A conjecture of Paul Erdős concerning Gaussian primes", Math. Comp., vol. 24 (1970) 221–223. They construct a 4-moat, i.e. they show that steps of size at least \(D=4\) are required to reach infinity.)
 I found these notes (pdf) on Gaussian integers by Christian Wuthrich of great value, as also these by Noah Snyder.
Theorem no. 249: Bézout's Identity

 It should be noted that the discrete logarithm problem is solved by quantum computers: Peter W. Shor, "Algorithms for quantum computation: discrete logarithms and factoring", in Proc. 35th Annual Symposium on Foundations of Computer Science (Shafi Goldwasser, ed.), IEEE Computer Society Press, 1994, 124–134 (the last time I looked, in Dec. 2020, there was an online copy here: it is the last entry under 'Quantum Computation', and the pages of the pdf version appear in reverse order for some reason). Shor's arxiv version is also available but this is expanded from the FOCS publication and is a bit less accessible, in my opinion.
 I find that the photograph featuring Merkel and Obama which accompanied the whitehouse.org blog entry cited on the theorem page, does so no longer.
Theorem no. 250: The Power of a Point Theorem

 Michael N. Fried, the author of "Mathematics as the science of patterns", the recommended weblink from this theorem, gave me the following insight, which I find charming and valuable:

\(h\), as you defined it, can be thought of in a slightly different way. Consider the function \(f(x,y)=(x-a)^2+(y-b)^2-r^2\). Its zeros are the points on a circle with center \((a,b)\) and radius \(r\); but its value at an arbitrary point \(P(x,y)\) is the power of \(P\) with respect to that circle. Students typically learn to solve equations \(f(x,y)=0\), that is, to find the curve they represent, but then ignore the values of \(f\) at other points. For example, the curve given by \(f(x,y)=ax+by+c=0\) (normalized so that \(a^2+b^2=1\)) is of course a straight line \(L\), while the value of \(f\) at other points is the distance between those points and \(L\) (including of course those points whose distance from \(L\) is zero!). 

Michael Fried has, by the way, a fascinating Youtube presentation comparing the relevant bits of Euclid with Steiner's geometry.
 CutTheKnot offers a very nice application of the Intersecting Chords Theorem.
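Michael Fried's observation, that the circle's defining function evaluated at a point \(P\) off the circle gives the power of \(P\), can be illustrated numerically. In the sketch below (function names invented for the purpose) a line through \(P\) in direction \(\theta\) is intersected with the circle; the signed distances \(t_1,t_2\) from \(P\) to the two intersection points are the roots of a quadratic in \(t\) whose constant term is exactly \(f(P)\), so their product is \(f(P)\) whatever the direction.

```python
import math


def power_of_point(px, py, a, b, r):
    """f(P) = (x-a)^2 + (y-b)^2 - r^2, the power of P w.r.t. the circle."""
    return (px - a) ** 2 + (py - b) ** 2 - r ** 2


def chord_product(px, py, a, b, r, theta):
    """Product of signed distances from P to the circle along direction theta.

    Returns None if the line through P in that direction misses the circle.
    """
    dx, dy = math.cos(theta), math.sin(theta)
    # Points P + t(dx, dy) lie on the circle when t^2 + 2*half_b*t + f(P) = 0.
    half_b = (px - a) * dx + (py - b) * dy
    disc = half_b ** 2 - power_of_point(px, py, a, b, r)
    if disc < 0:
        return None
    t1 = -half_b + math.sqrt(disc)
    t2 = -half_b - math.sqrt(disc)
    return t1 * t2
```

For a point inside the circle every direction gives a chord and the product is negative (the two intersection points lie on opposite sides of \(P\)); for a point outside, the product is positive for every secant.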
Theorem no. 251: The Hanani–Tutte Theorem

 Hanani's original 1934 paper, published under his Polish birth name of Chaim Chojnacki as "Über wesentlich unplättbare Kurven im dreidimensionalen Raume", Fundamenta Mathematicae, Vol. 23, Issue 1, 1934, pages 135–142, can be read online here. Bill Tutte's 1970 paper "Toward a theory of crossing numbers", Journal of Combinatorial Theory, Vol. 8, Issue 1, 1970, pages 45–53, can be read online here.
 The algebraic specification of planarity is independently credited to Wen-Jun Wu, "On the planar imbedding of linear graphs", Journal of Systems Science and Mathematical Sciences, 1985 Issue 4, pages 290–302, with independent work of some others preceding it. More details at the Wikipedia entry for the theorem.
 Although we can solve equations to turn a graph drawing into one which has only evenly crossing independent edge pairs it is not obvious how to turn the result into a drawing with no crossings at all. For this we need a direct algorithmic proof of Hanani–Tutte and this was first provided by Michael J. Pelsmajer, Marcus Schaefer and Daniel Štefankovic in "Removing even crossings", Journal of Combinatorial Theory B, Vol. 97, Issue 4, 2007, pages 489–500, online here.
 For small graphs, testing solvability of the Tutte–Wu equation system allows planarity testing without recourse to graph algorithms or data structures. However, for large graphs this does not compete realistically with known linear algorithms since the worst-case running time is \(O(n^6)\) where \(n\) is the number of vertices of the graph (strictly, the number of edges is involved in the running time but a nonlinear number of edges guarantees nonplanarity by the \(3n-6\) bound, see Kuratowski's Theorem).
Theorem no. 252: Bertrand's Ballot Theorem

 Original source for the Cycle Lemma is A. Dvoretzky and Th. Motzkin, "A problem of arrangements", Duke Math. J., Vol. 14, No. 2 (1947), 305–313; online (paywall). A good secondary source is Nachum Dershowitz and Shmuel Zaks, "The cycle lemma and some applications", Europ. J. Combinatorics, vol. 11, no. 1, 1990, pp. 35–40; online.
 As so often, the theorem is not accurately named, since it was apparently first stated by William Allen Whitworth in 1878. The general form I have given is due to neither, having evolved from the original case \(k=1\). Details may be found in Marc Renault, "Four Proofs of the Ballot Theorem", Mathematics Magazine, vol. 80, no. 5, 2007, pp 345–352; online via Renault's Ballot Problem page (which is the recommended weblink from the theorem page).
 Just to fill in the details, there is a claim on the theorem page that removing a sequence of \(k\) \(a\)'s and a \(b\) from a cycle with \(n\) \(b\)'s and \(m=n(k-1)+S\) \(a\)'s, "gives a new cycle in which the surplus \(S\) is reduced by exactly 1." Indeed, with \(M=m-k\) \(a\)'s and \(N=n-1\) \(b\)'s remaining, we have \(M=n(k-1)-k+S=N(k-1)+k-1-k+S=N(k-1)+(S-1).\) (The proofs of the Cycle Lemma I have seen invoke the pigeon hole principle to repeatedly remove the \(a\)\(b\) sequences. But I find this obscure: what are the boxes? Saying 'a counting argument shows...' would seem more appropriate. And I have gone so far as to actually spell out the counting argument.)
Theorem no. 253: The Third Isomorphism Theorem

 This page has been separated off from an earlier combined description of the 2nd and 3rd isomorphism theorems, see Theorem 35, notes(1).
 There is a temptation, when dealing with quotient groups to use shorthand group notation as in \(F_{20}/C_5\cong C_4\). It has to be kept in mind, however, that quotienting by isomorphic subgroups need not result in isomorphic quotient groups. See this by mathcounterexamples.net, for example.
Theorem no. 254: Kasteleyn's Theorem

 Original source for this theorem is: P. W. Kasteleyn, "Dimer statistics and phase transitions", J. Math. Phys., 4, 1963, pp. 287–293; online (paywall). I have not seen this paper but its abstract certainly seems to concern the general planar graph version of Kasteleyn's theorem. But the theorem seems more generally attached to the paper "Graph theory and crystal physics", in Graph Theory and Theoretical Physics, F. Harary, ed., Academic Press, London, 1967, pp. 43–110. There is a Wiki page for the theorem which calls it the FKT algorithm, giving equal billing to Fisher and Temperley. Sources for their earlier contributions can be found there.
 This theorem is sometimes stated with the argument of the square root being the absolute value of the determinant. This protects against an eventuality, of the determinant being negative, which in fact cannot arise, since the matrix in question is necessarily real, skew symmetric and thus has nonnegative determinant.
 Donald Knuth gives some valuable history of the Pfaffian function in "Overlapping Pfaffians", Electronic Journal of Combinatorics, vol. 3, no. 2, 1996; online.
 David E. Speyer has given short topological proofs of Kasteleyn's theorem, and variants of it, in "Variations on a theme of Kasteleyn, with application to the totally nonnegative Grassmannian", Electronic Journal of Combinatorics, vol. 23, no. 2, 2016; online.
Theorem no. 255: Countability of the Rationals

 I am not sure if Cantor ever explicitly published the countability of the rationals as a theorem. It is mentioned in the Wiki article on Cantor's first (1874) set theory paper as arising in correspondence between Dedekind and Cantor. It follows at once from more general results such as the countability of a countable union of countable sets (but this latter result requires the axiom of choice!)
 The version of the Stern–Brocot tree used in my illustration is often attributed to Neil Calkin and Herbert Wilf (2000). However it has been traced back to George Raney (1973) by Alessandro De Luca and Christophe Reutenauer in "Christoffel words and the Calkin–Wilf tree", The Electronic Journal of Combinatorics, 18(2), 2011, P22; online.
Theorem no. 256: Perfect's Necklace Formula

 Hazel Perfect's formula was published as a note in The Mathematical Gazette: "2588. Concerning Arrangements in a Circle", Vol. 40, No. 331 (Feb., 1956), pp. 45–46. Unfortunately it is not free-to-view online, but a 1-page preview gives you everything you need!
 The best-known algorithm for generating necklaces for a given number of beads and colours is by Harold Fredricksen, Irving J. Kessler and James Maiorana and is known as the FKM algorithm (you can see it in action on Jason Davies' necklaces page). An interesting alternative and a good source of information is Frank Ruskey, Terry Min-Yih Wang and Carla D. Savage, "Generating Necklaces", Journal of Algorithms, vol. 13, no. 3, 1992, 414–43; preprint.
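 For readers who want to experiment, here is a minimal sketch of the FKM idea in its usual iterative form. It takes necklaces to be strings of \(n\) beads in \(k\) colours considered up to rotation only, which is an assumption about the setting; the names `fkm_necklaces` and `necklace_count` are invented here. The count used for comparison is the standard Burnside-style rotation count, not necessarily the form in which Perfect states her formula.

```python
from math import gcd


def fkm_necklaces(n, k):
    """Yield one representative (the lexicographically least rotation)
    of each n-bead necklace over the colours 0..k-1, in lexicographic order."""
    a = [0] * n
    yield tuple(a)
    while True:
        # Find the rightmost bead that can still be incremented.
        i = n
        while i > 0 and a[i - 1] == k - 1:
            i -= 1
        if i == 0:
            return
        a[i - 1] += 1
        # Extend the first i beads periodically to length n.
        for j in range(n - i):
            a[j + i] = a[j]
        # The string is a necklace representative exactly when i divides n.
        if n % i == 0:
            yield tuple(a)


def necklace_count(n, k):
    """Number of necklaces up to rotation: (1/n) * sum of k^gcd(i, n) for i = 1..n."""
    return sum(k ** gcd(i, n) for i in range(1, n + 1)) // n


for word in fkm_necklaces(4, 2):
    print("".join(map(str, word)))
print(necklace_count(4, 2))
```

 The generator visits every intermediate string (prenecklace) but only emits those that are genuine necklace representatives, so its output length can be compared directly with `necklace_count`.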
Theorem no. 257: Distribution of Local Maxima in Random Samples

 The original publication is T. Austin, R. Fagen, T. Lehrer, and W. Penney, "The Distribution of the Number of Locally Maximal Elements in a Random Sample", Annals of Mathematical Statistics, Vol. 28, Number 3 (1957), 786–790; online.
 T. Lehrer is apparently the Tom Lehrer famous as a satirical singer-songwriter. The website thetomlehrer.weebly.com mentions the above and another article with the somewhat mysterious commentary "Unfortunately these mathematical publications did not have a lasting effect on society."
 At any rate, the Austin et al. article provoked a reaction in the profession: M.O. Glasgow, "Note on the Factorial Moments of the Distribution of Locally Maximal Elements in a Random Sample", Ann. Math. Statist., Vol. 30, Number 2 (1959), 586–590; online.
 There is a nice tribute to Lehrer at Gödel's Lost Letter which links to a previous short entry on his theorem with Austin et al.
Theorem no. 258: Sylow's Theorems

 The original publication is L. Sylow, "Théorèmes sur les groupes de substitutions", Mathematische Annalen, Vol. 5, 1872, pp. 584–594; online (paywall); at Göttinger Digitalisierungszentrum. An English translation is provided here.
 Geoff Smith, whose book is the recommended reading for this theorem, gave me the following nice picture of \(A_4\) not having an order-6 subgroup:

When discussing the nonexistence of a subgroup of order 6 in \(A_4\), you do have the option to geometrize. Colour the vertices of a cube red and blue so that no vertices joined by an edge are the same colour. The group of rotations of the cube which preserve the blue vertices is a copy of \(A_4\) (label the blue vertices 1 to 4). The elements of this \(A_4\) are then rotations about grand diagonals and rotations through pi using skewers centre-face to centre-face. This enables one to reason geometrically about \(A_4\) and to "see" what is going on. This has the disadvantage that people with poor geometric intuition will melt, but the advantage of dealing with things more concrete than permutations. 

 The bell-ringing illustration for this theorem deserves a little amplification, which space on the page itself did not allow. The Plain Bob method starts with the 2-Sylow subgroup and is completed by switching to its cosets in \(\mbox{Sym}_4\): permutations 8 to 15 are the left coset by \((2 4 3)\); permutations 16 to 23 are the left coset by \((3 4)\). Bell ringing in general provides good examples of Lagrange's theorem in action! There is a good introductory article in this vein, "Bells, Motels and Permutation Groups" by Gary McGuire.
 On the subject of bell ringing, kudos to ringingroom.com, a site "built for change ringers to continue ringing with one another even when socially distanced".
 A couple of very valuable blog posts on Sylow appeared close on each other's heels at the end of 2020: this by Daniel Litt and this by Qiaochu Yuan.
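 For those whose geometric intuition does melt, the absence of an order-6 subgroup in \(A_4\), mentioned in Geoff Smith's note above, can also be confirmed by exhaustive search. This is only a brute-force sketch: the names `compose`, `is_even` and `subgroups_of_order` are invented here, and it relies on the standard fact that a nonempty finite subset of a group closed under multiplication is a subgroup.

```python
from itertools import combinations, permutations


def compose(p, q):
    """Return the permutation 'apply q, then p' on {0, 1, 2, 3}."""
    return tuple(p[q[i]] for i in range(len(q)))


def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0


A4 = [p for p in permutations(range(4)) if is_even(p)]
IDENTITY = tuple(range(4))


def subgroups_of_order(group, order):
    """All subsets of the given size containing the identity and closed under composition."""
    others = [g for g in group if g != IDENTITY]
    found = []
    for rest in combinations(others, order - 1):
        candidate = set(rest) | {IDENTITY}
        if all(compose(a, b) in candidate for a in candidate for b in candidate):
            found.append(candidate)
    return found


counts = {d: len(subgroups_of_order(A4, d)) for d in (1, 2, 3, 4, 6, 12)}
print(counts)
```

 The search runs over every divisor of 12, so it shows at a glance that 6 is the one divisor for which the converse of Lagrange's theorem fails in \(A_4\), while the Sylow orders 3 and 4 are both realised.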
Theorem no. 259: Schur's Commuting Matrices Bound

 Original sources for this theorem:
 Schur, J., "Zur Theorie der vertauschbaren Matrizen", Journal für die reine und angewandte Mathematik, Vol. 130, 1905, pp. 66–76; online (paywall; facsimile at Göttinger Digitalisierungszentrum). The first few paragraphs are translated into English here.
 Jacobson, N. "Schur's theorems on commutative matrices", Bull. Amer. Math. Soc., Vol. 50, Number 6, 1944, pp. 431–436; online.
 A short proof of Schur's theorem (in Jacobson's extension to arbitrary fields) is given in M. Mirzakhani, "A simple proof of a theorem of Schur", American Math. Monthly, Vol. 105, no. 3, 1998, pp. 260–262; online.
 In a general modern setting, this theorem is about the dimension of various kinds of subalgebra. See this arxiv post by J. Szigeti, J. van den Berg, L. van Wyk and M. Ziembowski, for example.
 The values of \(\lfloor n^2/4\rfloor+1\) are sequence A033638 at OEIS where many further references may be found.
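 The bound \(\lfloor n^2/4\rfloor+1\) is easy to tabulate, and the standard family that attains it is easy to build. The sketch below assumes the theorem is read as a bound on the number of linearly independent, pairwise commuting \(n\times n\) matrices; the names `schur_bound`, `extremal_basis` and `mat_mul` are invented here. The family used is the identity together with the matrix units \(E_{ij}\) whose row lies in the top \(\lfloor n/2\rfloor\) rows and whose column lies in the remaining columns.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def schur_bound(n):
    """floor(n^2 / 4) + 1."""
    return n * n // 4 + 1


def extremal_basis(n):
    """Identity plus the matrix units E_ij with i < floor(n/2) <= j."""
    top = n // 2
    basis = [[[1 if i == j else 0 for j in range(n)] for i in range(n)]]
    for i in range(top):
        for j in range(top, n):
            unit = [[0] * n for _ in range(n)]
            unit[i][j] = 1
            basis.append(unit)
    return basis


print([schur_bound(n) for n in range(8)])
for n in range(2, 7):
    basis = extremal_basis(n)
    commuting = all(mat_mul(x, y) == mat_mul(y, x) for x in basis for y in basis)
    print(n, len(basis), schur_bound(n), commuting)
```

 The listed matrices have their single nonzero entries in distinct positions (apart from the identity), so they are visibly linearly independent; the product of any two of the matrix units is zero, which is why everything commutes.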
Theorem no. 260: Dunn and Pretty's Triangle-Halving Deltoid

 Original sources for this theorem:

Dunn, J.A. and Pretty, J.E., "Halving a triangle", Math. Gaz., Vol. 56, No. 396, 1972, pp. 105–108; online (paywall).
 The history of triangle, and tetrahedron, bisection is authoritatively traced in W. A. Beyer and Blair Swartz, "Bisectors of triangles and tetrahedra", The American Mathematical Monthly, Vol. 100, No. 7, 1993, pp. 626–640; online (paywall). To quote from their introductory remarks, "... the problems have a much older history in hydrostatics and naval architecture, as they are also connected with the orientation and stability of floating bodies." They formulate a version of the deltoid theorem which they say 'extends' various textbook entries. To give a flavour using sources available online: p. 190 of George Greenhill, A treatise on hydrostatics, MacMillan, London, 1894; online; and p. 232, e.g. 3 of Horace Lamb, Statics, including hydrostatics and the elements of the theory of elasticity, Cambridge University Press, 1912; online.
 Subsequent work on the bisection deltoid is recorded in Allan Berele and Stefan Catoiu, "Bisecting the perimeter of a triangle", Mathematics Magazine, Vol. 91, Issue 2, 2018, pp. 121–133; online.
 Variations on the triangle area bisection theme may be found on blogs and forums: math.stackexchange, Saving School Math, wolfram.com.
 A description of how the hyperbolae were plotted in the illustration for this theorem page is given here and how the bisecting line of arbitrary slope was plotted is described here.
Theorem no. 261: Euclid's Pythagorean Formula

 Original sources for this theorem: David E. Joyce is the recommended source for an English language reading of Euclid. The relevant page is here. Richard Fitzpatrick's dual-language source allows something like the original Greek to be viewed.
 St Exupéry's problem appears in various places on the internet, for instance on the official Antoine de Saint-Exupéry website, whence the French version of the problem text. A variant of the problem is found here, which may actually be the original, the Egyptian story having been added by popularisers subsequently. His Pharaoh problem has led to his name being given to those integers which are products over Pythagorean triples, the Saint-Exupéry numbers being entry A057096 at OEIS. It is observed there that it is unknown whether two triples can yield the same Saint-Exupéry number.
 Antoine de Saint-Exupéry was obviously at ease with elementary number theory. As a little footnote, an article in the International Herald Tribune, "350 Years Later, Math Conundrum Bites the Dust," by Gina Kolata, June 25, 1993, celebrated Andrew Wiles's announcement of a proof of Fermat's Last Theorem. In response came a letter (IHT, July 29, 1993) from Isia Leviant (who I think must be the Israeli artist of that name):

Antoine de Saint-Exupéry, the famous French writer, was a fan of mathematics. In April 1943, a few months before being downed over the Mediterranean, he had lunch with me in an Algiers bistro. Knowing my training in math, he wanted to show me that he had solved the Fermat riddle. He started writing a series of equations on a paper napkin. Unfortunately, I had to stop him halfway: There was a mistake in his calculations. My own mistake was not to have kept the paper napkin with its precious manuscript.
ISIA LEVIANT.
Paris. 

Theorem no. 262: The Polygonal Number Theorem

 Original sources for this theorem:
 A.L. Cauchy, "Démonstration du théorème général de Fermat sur les nombres polygones", Mémoires de la classe des Sciences mathématiques et physiques de l'Institut de France, 14 (1813–1815), pp. 177–220; I don't find this online. Cauchy's result generally seems to be credited to him with the date 1813. However, catalogue entries for this paper cite it as "lu à l'académie, le 13 novembre 1815", e.g. here.
 Melvyn B. Nathanson, "A short proof of Cauchy's polygonal number theorem", Proc. Amer. Math. Soc., Vol. 99, No. 1, 1987, pp. 22–24; online.
 My presentation of Nathanson's proof risks giving an impression of circularity: locate \(b\) such that quadratic equations in \(b\) specify an interval allowing \(b\) to be located. Lack of space prevented me from clarifying: for a given \(n\) and \(m\), an interval for \(b\) may be expressed, via the quadratic formula, purely in terms of \(n\) and \(m\). Namely, the interval \([1/2+\sqrt{6(n/m)-3}, 2/3+\sqrt{8(n/m)-8}\,]\) is guaranteed to be bounded by the zeros of the quadratics and to have length at least 4 for \(n\geq 120 m\).
 Nathanson's proof gives a stronger version of Cauchy's theorem: any nonnegative integer may be written as a sum of \(m+2\) polygonal numbers of order \(m+2\), at most four of which are greater than 1. (Nathanson says \(m+1\) polygonal numbers but I don't see this for \(m=3\), when his \(0\leq r\leq m-3\) will fail to give residue zero.)
 Polygonal numbers are a special case of figurate numbers. The weblink from the theorem page is to a presentation of Elena Deza and Michel Marie Deza's, Figurate Numbers, World Scientific, 2012.
 The diagrammatic construction of pentagonal numbers illustrated on the theorem page follows the presentation here at mathandmultimedia.com (observe that their formula for the "\(k\)th pentagonal number" equates 'order \(n\)' with our 'order \(m+2\)': their \(P(n,k)\) is our \(P_{n-2}(k)\)).
 Some more context for this theorem is given in this presentation (600KB pdf).
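 The interval in Nathanson's proof and the statement of Cauchy's theorem can both be explored numerically. The sketch below takes the interval to be \([1/2+\sqrt{6(n/m)-3},\ 2/3+\sqrt{8(n/m)-8}\,]\) and the polygonal numbers of order \(m+2\) to be \(P_m(k)=\frac{m}{2}(k^2-k)+k\); the function names are invented here, and the dynamic-programming check is a brute-force illustration for small \(n\), not a rendering of Nathanson's argument.

```python
from math import sqrt


def polygonal(m, k):
    """k-th polygonal number of order m + 2: (m/2)(k^2 - k) + k."""
    return m * k * (k - 1) // 2 + k


def nathanson_interval(n, m):
    """Endpoints of the interval for b, expressed purely in terms of n and m."""
    ratio = n / m
    return 0.5 + sqrt(6 * ratio - 3), 2 / 3 + sqrt(8 * ratio - 8)


def fewest_polygonal_summands(m, limit):
    """fewest[n] = least number of positive polygonal numbers of order m + 2
    that sum to n, for every n from 0 to limit."""
    values = []
    k = 1
    while polygonal(m, k) <= limit:
        values.append(polygonal(m, k))
        k += 1
    fewest = [0] + [limit + 1] * limit
    for n in range(1, limit + 1):
        fewest[n] = 1 + min(fewest[n - v] for v in values if v <= n)
    return fewest


for m in (3, 4, 5):
    low, high = nathanson_interval(120 * m, m)
    worst = max(fewest_polygonal_summands(m, 300))
    print(m, round(high - low, 3), worst, m + 2)
```

 For each \(m\) the printed interval length at \(n=120m\) should be at least 4, and the worst case over \(n\leq 300\) should not exceed \(m+2\), since zero is allowed as a summand in the theorem.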
