Mathematics / Critical Essay

Vol. 7, NO. 3 / October 2022

Reflections on an Essay by Wigner

Sergiu Klainerman


In the spring of 1960, Eugene Wigner delivered a lecture at New York University. Published as an essay the following year under the title “The Unreasonable Effectiveness of Mathematics in the Natural Sciences,”1 Wigner’s remarks sparked a debate that continues to the present day. Indeed, the significance and implications of the essay have been discussed far beyond the realms of mathematics and physics.

Wigner’s essay has long been a source of fascination for me. I was a graduate student when I first read the essay and I have returned to it many times over the intervening years. Although I have often found myself admiring the clarity and articulation of Wigner’s observations, it is the mystery he pointed to that first caught my imagination and is at the heart of its enduring appeal.

The mystery Wigner described can be stated as follows: mathematical concepts introduced for solving specific problems turn out to have unexpected and mysterious consequences in seemingly unrelated areas. This is a form of mathematical entanglement that both mathematicians and theoretical physicists are familiar with.

Wigner gives as an example the appearance of the number $\pi$, the ratio of the circumference of a circle to its diameter, in the Gaussian distribution formula,

\[f(x)=\frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}},\]

where $f(x)$ denotes the population density function.2 “Surely the population,” he goes on to add in the voice of a simple-minded character, “has nothing to do with the circumference of [a] circle.”3
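Wigner's puzzle can even be checked by machine: the $\sqrt{2\pi}$ factor is precisely what makes the Gaussian density integrate to one. A minimal sketch in Python, using a simple midpoint quadrature (the function names are my own):

```python
import math

def gaussian(x, mu=0.0, sigma=1.0):
    """Gaussian density; the sqrt(2*pi) factor normalizes the total mass to 1."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, n=100_000):
    """Composite midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(gaussian, -10, 10)
print(round(total, 6))  # ≈ 1.0
```

Remove the $\sqrt{2\pi}$ and the total mass is no longer one: the circle constant really is doing work in the formula.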

Later in the lecture, Wigner marvels at the fundamental importance of complex numbers in quantum mechanics,4 when, at its origin, the number $i=\sqrt{-1}$ was nothing more than a fictitious quantity introduced by some clever sixteenth-century Italian mathematicians to solve cubic algebraic equations.

Wigner also offers a surprising characterization of mathematics as “the science of skillful operations with concepts and rules invented just for this purpose.”5

The mystery Wigner points out arises in part from the perennial question of whether mathematics is a science, advanced by exploration and discovery like the main physical theories, or whether it is an invention—a creation of the human mind. Wigner, like many other natural scientists,6 seems to have adopted the latter point of view: underlying concepts of algebra and analysis were inventions made by great mathematicians.7 I will argue here that, on the contrary, although human genius was at play, the “science of skillful operations” developed naturally through exploration and discovery. In the opening section of this essay I will attempt to demonstrate how the fundamental concepts of algebra and calculus have indeed developed organically, starting with the operational rules of numbers and leading all the way to differential equations—the most fundamental mathematical concept relevant to all fields of physics. To do justice to this enormous task would require a whole book; this section may be viewed as a first sketch of one.

At the center of Wigner’s mystery is the question of why so many mathematical concepts turn out to play such a fundamental role in physics. To understand this further, one has to probe the nature of the relationship between mathematics and natural sciences, a task that I will undertake in the latter part of this essay. Mathematics starts with the indispensable concepts of numbers and shapes. Physics adds to these primordial building blocks the elusive concept of time, as well as various observables—such as weight, density, velocity, acceleration and so on—associated with natural processes. These observables can be measured, compared, and related to each other by performing careful experiments. Beginning with numbers and the need to operate using them, mathematics and physics thus share a common origin; a starting point from which they both advanced, often in pursuit of different goals, but still sharing an absolute need for consistency.

It so happens that consistency, in the case of all basic physical theories, is ensured by a rigorous mathematical framework. Physics is indeed, as Wigner intimates, inconceivable without mathematics.

But why does physics rely on the kind of mathematics that it does, developed by mathematicians in pursuit of problems that may have had little, if anything, to do with the issues faced by physicists at any given time? Could they not have developed their own mathematics, as it was needed?

I do not have a fully satisfactory solution to propose, but insofar as mathematics developed from numbers and shapes, it may not be surprising that the mathematical theories most relevant to physics are to be found among those that have developed precisely in that fashion.8

Mathematics, however, was not only useful to physics by that mysterious process of entanglement. Indeed, it has often enriched itself by being devoted to problems motivated by physics. Nobody has described this better than Henri Poincaré:

The combinations that can be formed with numbers and symbols are an infinite multitude. In this thicket how shall we choose those that are worthy of our attention? Shall we be guided only by whimsy? … [This] would undoubtedly carry us far from each other, and we would rapidly cease to understand each other. But that is only the minor side of the problem. Not only will physics perhaps prevent us from getting lost, but it will also protect us from a more fearsome danger of turning around forever in circles. History [shows that] physics has not only forced us to choose [from the multitude of problems which arise], but it has also imposed on us directions that would never have been dreamed of otherwise … What could be more useful!9

Thus, while advancing according to different goals and methodologies, mathematics and physics have often crossed paths and deeply influenced one another. Moreover, as I will argue in this essay, at a certain stage in their development, the natural sciences themselves become part of mathematics. As a result, they come to be developed by processes typical of mathematics, such as the quest for generalization, completeness, and rigor, the freedom to ask seemingly unrelated questions and make connections to other parts of mathematics, and the search for formal beauty.

All this does not crack Wigner’s mystery, but it places it, I think, in a different perspective.

Science of Skillful Operations

Mathematics is the science of skillful operations with concepts and rules invented just for this purpose. The principal emphasis is on the invention of concepts.
—Eugene Wigner10

After briefly reviewing some of the main stages in the development of algebra and analysis,11 I will argue that what Wigner refers to as the “invention of concepts and rules” is rather the invention of good mathematical notation followed by the discovery of concepts that often lie behind them and the extension of preexisting rules. Definitions, reflecting human choices made by great mathematicians, also play an important role in pinning down a new concept, as revealed in the process of understanding a difficult mathematical problem.12 But most importantly, the domain of mathematics followed a natural process of exploration, extension, and completion, starting with its basic operations: the addition and multiplication of natural numbers.

It is not my intention here to offer an exhaustive historical account describing how these developments occurred,13 but rather to demonstrate how they were driven by simple mathematical necessity. I will limit myself to the description of the main developmental steps that led to the concepts of equations, both algebraic and differential—crucial concepts in the formulation of all known physical theories. In doing so, I will necessarily have to neglect other crucial aspects concerning the relevance of mathematics to the physical world, such as probability theory, statistics, or the development of efficient methods of calculation.14

In the beginning, there were only the natural numbers and the human need to manipulate them to solve real-world problems. The positive integers 1, 2, 3 … are the simplest and most intuitive mathematical objects. Among the elementary operations, addition and multiplication are also very intuitive, subtraction and division less so. The elementary operations are both commutative and associative, and, moreover, multiplication is distributive with respect to addition. These are the basic laws of arithmetic,15 akin, one might say, to the basic laws of a physical theory. These simple properties made it possible for the first mathematicians to devise ingenious algorithms for adding and multiplying large numbers.16 Subtraction and division, on the other hand, cannot always be performed within the framework of positive integers. To solve elementary word problems,17 the most obvious practical mathematical task of the time, early mathematicians had to perform these more difficult operations. This work led, in turn, to the introduction of the number zero,18 negative numbers, and fractions—that is, rational numbers.

The discovery of rational numbers, the first major accomplishment of mathematics, was achieved through the process of a simple algebraic extension. Namely, a process by which the original concept of numbers and the elementary operations have been extended to the simplest framework in which, with the exception of division by zero, all four basic operations can be performed. Moreover, and this is essential, the extended operations of addition and multiplication still verify the basic laws mentioned above. Thus, once the new numbers were introduced, mathematicians could operate with them by following the same familiar rules. In modern language, the set of rational numbers, $\mathbb{Q}$,19 together with the operations of addition and multiplication form a commutative division ring or field.20 The important thing here is that elementary word problems, formulated with numbers in $\mathbb{Q}$, are solvable in $\mathbb{Q}$.
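The closure of $\mathbb{Q}$ under the four operations can be illustrated with Python's exact rational arithmetic. A small sketch of an elementary linear word problem posed, and solved, entirely within $\mathbb{Q}$ (the particular equation is my own choice):

```python
from fractions import Fraction

# A linear "word problem" with rational data: 3x + 2 = 7/2.
a, b, c = Fraction(3), Fraction(2), Fraction(7, 2)
x = (c - b) / a          # all four operations stay inside Q
print(x)                 # 1/2
assert a * x + b == c    # the solution verifies the equation exactly
```

No rounding occurs anywhere: the problem, its intermediate steps, and its answer all live in the same field $\mathbb{Q}$.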

The heart of the matter is the basic arithmetic rules: associativity, commutativity and distributivity (ACD). The universality of these rules made algebra possible, since the most natural way to express them is to invoke adequate symbolic notations, i.e., for arbitrary numbers $a$, $b$, $c$,

$a+(b+c) = (a+b)+c,$ $a+b=b+a$
$a\times (b\times c) = (a\times b)\times c,$ $a\times b=b\times a$
$a\times (b+c) = a\times b +a\times c.$

Note that denoting numbers by $a, b, c$ is not that different from denoting addition by $+$ and multiplication by $\times$, or $\cdot$, the unit for addition by $0$, $a\times a$ by $a^2$, or by the conventions concerning brackets. That is, they are all useful notational conventions.

Yet, and this is of fundamental importance, good notations often reveal important concepts behind them.21 In this case they happen to reveal the algebra of polynomials. Indeed, we can formally repeat the same operations with the same letters, thus deriving a multitude of formal identities, such as the magical binomial formula

$(a+b)^2= a^2 +2ab +b^2$

and

$(a+b)\times (a-b)= a^2-b^2.$22

Monomials are simple expressions involving only products such as

$a^2bc=a \times a \times b \times c$

and polynomials are formal expressions involving sums of monomials, such as

$a^2+ a +1= 1\times a\times a + 1\times a +1.$

Two such formal expressions, polynomials, can also be summed or multiplied by making use of the ACD rules and one can easily check that these new operations with polynomials verify precisely the same ACD laws.23 As a result, we are able to manipulate abstract symbols as if they were numbers.
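The point that polynomial addition and multiplication follow mechanically from the ACD rules can be made concrete. In the sketch below, a one-variable polynomial is just a list of coefficients (index = power), and multiplication distributes every monomial over every other; the helper names are mine:

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (index = power)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """Multiply by distributing every monomial over every other (ACD rules)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b   # x^i * x^j contributes to the x^(i+j) term
    return out

x_plus_1  = [1, 1]    # 1 + x
x_minus_1 = [-1, 1]   # -1 + x
print(poly_mul(x_plus_1, x_plus_1))    # [1, 2, 1]  i.e. 1 + 2x + x^2
print(poly_mul(x_plus_1, x_minus_1))   # [-1, 0, 1] i.e. x^2 - 1
```

The two printed results are exactly the binomial formula and the difference of squares: the abstract symbols really are manipulated "as if they were numbers."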

One is naturally led—although this was, once again, a lengthy historical process—to the class of all polynomials in $n$-variables, meaning all formal expressions obtained by first taking arbitrary products of the variables $(x_1, x_2\ldots, x_n)$ denoted by $x_1^{\alpha_1} \ldots x_n^{\alpha_n}$, known as monomials, and then arbitrary combinations of these monomials

  1. \[ P(x_1,\ldots x_n)=\sum A_{\alpha_1\ldots \alpha_n} x_1^{\alpha_1} \ldots x_n^{\alpha_n}. \]

where $\alpha_1, \alpha_2\ldots \alpha_n$ are arbitrary positive integers and $A_{\alpha_1\ldots \alpha_n}$ are also arbitrary numbers.24 The polynomial is said to be of degree $k$ if the sum in (1) extends over all indices $\alpha_1, \alpha_2\ldots \alpha_n$ with $|\alpha|=\alpha_1+\alpha_2+\ldots \alpha_n\le k$. Note that at this level of abstraction there is no way to distinguish the variables $( x_1, x_2, \ldots, x_n)$ from the coefficients $A_{\alpha_1\ldots \alpha_n}$ in (1). The distinction only becomes manifest when we interpret $P$ as a function in the variables $(x_1, x_2, \ldots, x_n)$.

A Mighty Triad: Polynomials, Functions, and Equations

Formal manipulations with abstract symbols, such as polynomials, are an essential part of algebra, but by no means the main thing. The more important breakthrough occurred when mathematicians recognized that any elementary word problem can be formulated as an equation, or as systems of equations, using variables $(x_1, x_2, \ldots, x_n)$ with numerical coefficients, and solved by cleverly manipulating the ACD rules. To associate equations to formal symbols requires, implicitly,25 the even more abstract concept of function.

Thus, for example, the formal polynomial $2 x +3$ becomes the function that sends any number $x$ to the number $2\times x +3$, to which we can associate the equation $2x+3=0$. More generally, to any polynomial $P(x_1,\ldots x_n)$, as in (1), we associate the function that takes any $n$-tuple of numbers $(x_1, x_2\ldots, x_n)$ to the number obtained by replacing the formal variables of $P$ with the numbers of the $n$-tuple. Polynomials, as formal expressions, and their associated polynomial functions are, rightly, identified with each other, but, as a consequence, we tend to forget what an enormous conceptual advance this was. Once the identification is done, we can talk about the roots of the polynomial $P(x_1,\ldots x_n)=0$ with no hesitation.26 To hope to get a unique solution, or rather a finite set of solutions,27 we need $n$ such polynomials, $P_1, P_2, \ldots P_n$, and the associated system of equations, with $m=n$,28

  2. \[P_1(x_1,\ldots, x_n)= P_2(x_1,\ldots, x_n)=\cdots =P_m(x_1,\ldots, x_n)=0.\]

A solution of the system is any $n$-tuple of numbers that simultaneously verifies all $m$ equations. Generally speaking, such equations are extremely difficult to solve, not just from a technical point of view, but also conceptually. In pursuit of this goal, mathematicians had to first revolutionize our understanding of numbers.29
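The identification of a formal polynomial with its function can be illustrated in a few lines. The sketch below (the names are my own) turns a coefficient list into a function via Horner's scheme and checks the root of $2x+3=0$ exactly:

```python
from fractions import Fraction

def as_function(coeffs):
    """Identify a formal polynomial (coefficient list, index = power) with its function."""
    def P(x):
        result = 0
        for c in reversed(coeffs):   # Horner's scheme: ((a_n x + a_{n-1}) x + ...) 
            result = result * x + c
        return result
    return P

P = as_function([3, 2])       # the formal polynomial 2x + 3
root = Fraction(-3, 2)
print(P(root))                # 0 : the root of the equation 2x + 3 = 0
```

The same list of symbols serves both roles: as a formal expression it can be added and multiplied; as a function it can be evaluated and asked for its roots.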

There is, however, one very important case in which we can write down explicit solutions, even when we restrict the coefficients of our equations to rational numbers. This is the case when our polynomials are linear, i.e., the case when the corresponding monomials are just the formal variables $x_1,\ldots, x_n$,

$P_i(x_1,\ldots, x_n)=\sum_{j=1}^n a_{ij} x_j - c_i,$

where $a_{ij}$ and $c_i$ are given and fixed numbers—say, in $\mathbb{Q}$. Linear algebra is nothing other than a systematic theory of how we can solve such equations,30 and as such it has an enormous range of applications. At this point, it is very tempting to make a detour and analyze the conceptual framework of this extraordinarily beautiful and important theory, which provides many more examples of how artful notation has led the way to the discovery of new and powerful abstract concepts such as matrices, determinants, matrix algebra, vector spaces, linear operators, eigenvalues, and so on. Such a detour would make this essay unreasonably lengthy, so I will resist the temptation to delve any deeper here.
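As a small illustration of this explicit solvability, here is a sketch of Cramer's rule for a $2\times 2$ linear system over $\mathbb{Q}$, with exact rational arithmetic; the function name and the sample system are my own:

```python
from fractions import Fraction as F

def solve_2x2(a11, a12, a21, a22, c1, c2):
    """Cramer's rule for the 2x2 system  a_i1 x1 + a_i2 x2 = c_i  over the rationals."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("degenerate system")
    x1 = (c1 * a22 - c2 * a12) / det
    x2 = (a11 * c2 - a21 * c1) / det
    return x1, x2

# x + 2y = 5  and  3x + 4y = 6   ->   x = -4, y = 9/2
x1, x2 = solve_2x2(F(1), F(2), F(3), F(4), F(5), F(6))
print(x1, x2)
```

Rational data in, rational solution out: linear systems with coefficients in $\mathbb{Q}$ are solvable within $\mathbb{Q}$, exactly as the text claims.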

To summarize: The introduction of abstract symbols and the formal expressions that can be made with them, i.e., polynomials, have led, via their associated functions, to the fundamental concept of algebraic equations and systems. In turn, the study of linear algebraic systems has led, through the introduction of a remarkable number of skillful notations, to the beautiful, exotic, world of determinants, eigenvalues, matrix algebra, vector spaces and linear operators. All these provide brilliant examples of how notations, introduced first for formal convenience, turn out to reveal a higher form of mathematical reality.31 The powerful trio—formal expressions, functions and equations—represents an extraordinary conceptual breakthrough, possibly the most important one in the history of science,32 opening the way for an avalanche of other formal inventions, in particular and most importantly those that lie at the foundations of calculus.33

Beyond Rational Numbers

Rational numbers suffice to solve linear equations with rational coefficients, corresponding to the simplest possible word problems, i.e., linear ones. This is no longer the case with even the simplest nonlinear equations, such as $x^2- 2=0$, which has no solutions in the class of rational numbers. As is well known, this remarkable discovery, attributed to Pythagoras long before algebra was “invented,” shook the Greek mathematical community to its core.34 The solution of the problem, as we understand it today, requires the introduction of another clever process of algebraic extension, similar to that of passing from positive integers to rationals. Introduce $\sqrt{2}$, first as nothing more than an abstract symbol, verifying the convention $\sqrt{2}\times \sqrt{2}=2$. Next, consider all symbolic numbers of the form $a+\sqrt{2} b$, denoted by $\mathbb{Q}[\sqrt{2}]$, and extend formally the operations of addition and multiplication by following exactly the same ACD rules, keeping track of the additional convention that $\sqrt{2}\times \sqrt{2}= 2$. More precisely,

$(a+\sqrt{2} b)+( c+\sqrt{2} d) = a+c+ \sqrt{2}(b+d),$
$(a+\sqrt{2} b)\times( c+\sqrt{2} d) = ac+ 2 bd+\sqrt{2} (ad +bc).$

One can easily check that the extended operations verify the same ACD laws as those for $\mathbb{Q}$. Thus, at a formal level, calculations in $\mathbb{Q}[\sqrt{2}]$ are done in precisely the same manner as in $\mathbb{Q}$. Since no contradiction can arise in any calculation involving $\sqrt{2}$, we modern mathematicians have no problem considering it “real.”35 In fact, further simple manipulations allow us to place it on the real line somewhere between the rationals 1.41 and 1.42, or even more precisely, between 1.4141 and 1.4143.
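The formal extension can be carried out quite mechanically: represent $a+\sqrt{2}b$ as a pair of rationals and implement exactly the two rules displayed above. A minimal sketch (the class name is mine):

```python
from fractions import Fraction

class QSqrt2:
    """Symbolic numbers a + b*sqrt(2) with rational a, b; sqrt(2)*sqrt(2) -> 2."""
    def __init__(self, a, b):
        self.a, self.b = Fraction(a), Fraction(b)
    def __add__(self, o):
        # (a + sqrt2 b) + (c + sqrt2 d) = (a+c) + sqrt2 (b+d)
        return QSqrt2(self.a + o.a, self.b + o.b)
    def __mul__(self, o):
        # (a + sqrt2 b)(c + sqrt2 d) = (ac + 2bd) + sqrt2 (ad + bc)
        return QSqrt2(self.a * o.a + 2 * self.b * o.b,
                      self.a * o.b + self.b * o.a)
    def __eq__(self, o):
        return (self.a, self.b) == (o.a, o.b)
    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"

r2 = QSqrt2(0, 1)
print(r2 * r2)                        # 2 + 0*sqrt(2) : the defining convention
x = QSqrt2(1, 1) * QSqrt2(3, -2)      # (1 + sqrt2)(3 - 2*sqrt2) = -1 + sqrt2
print(x)
```

Nothing irrational is ever computed: $\sqrt{2}$ lives in the code only as a bookkeeping convention, yet every calculation comes out self-consistent.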

This formal procedure, which extends the notion of number together with the ACD rules and forces such equations to become solvable, all in a self-consistent manner, would not have satisfied Greek mathematicians. They only came to terms with the new numbers, which they named irrational, after they were able to make sense of them constructively. That is, by showing that they can be approximated by converging sequences of rational numbers. This procedure, attributed to Eudoxus, belongs properly to infinitesimal calculus.36 It was perfected much later by mathematicians working in the nineteenth century, such as Augustin-Louis Cauchy, Richard Dedekind and Georg Cantor, who refined it to include all real numbers. Today, the standard definition of a real number is based either on Dedekind cuts or, better in my view, the classes of equivalence of Cauchy sequences.37 Both procedures allow us to extend the operations of addition and multiplication, from $\mathbb{Q}$ to all real numbers, such that the same laws of arithmetic manipulations still hold true. The set of all real numbers thus obtained, denoted by $\mathbb{R}$, is a commutative division ring just like $\mathbb{Q}$. This defines a natural38 completion of $\mathbb{Q}$ in which, in particular, any polynomial equation $P(x)=0$, where the associated function $P(x)$ takes both positive and negative values, must have a “real” solution.39

Complex Numbers

What about other quadratic equations, such as $x^2+1=0$? Can this be treated in the same way? Obviously not, since the square of any number, positive or negative, is always positive. Yet some enterprising Italian mathematicians during the sixteenth century found it quite useful to introduce a fictitious number, denoted $i=\sqrt{-1}$, as well as all symbolic expressions of the form $a+i b$, with $a$, $b$ real numbers. Perfectly conscious that these expressions are “imaginary,” they proceeded nevertheless to manipulate them as if they were real, by defining addition and multiplication based on the ACD rules together with the convention that every time $\sqrt{-1}$ is multiplied with itself we replace the product by $-1$, i.e., $i^2=i\times i=-1$. Thus,

$(a+\sqrt{-1} \, b)+( c+\sqrt{-1} \, d) = a+c+ \sqrt{-1}(b+d),$
$(a+\sqrt{-1}\, b)\times( c+\sqrt{-1} \, d) = ac- bd+\sqrt{-1} (ad +bc).$

Just as in the case of the irrational numbers $\mathbb{Q}[\sqrt{2}]$, this procedure defines the set of all complex numbers

$\mathbb{C}=\mathbb{R}[\sqrt{-1}] =\Big\{ a+i b \,:\, a, b\in \mathbb{R}\Big\}.$

It is easy to check that these extended operations verify the ACD rules. Moreover, $\mathbb{C}$ is closed under division by nonzero elements, i.e., it is a commutative division ring. Note that though the extension procedure here is very similar to the one for $\mathbb{Q}[\sqrt{2}]$, there is a hugely consequential difference. Once $i=\sqrt{-1}$ has been introduced, the equation $x^2=i$ can also be solved in $\mathbb{C}$, while the same is not true for the equation $x^2=\sqrt{2}$, which cannot be solved in $\mathbb{Q}[\sqrt{2}]$.40 This is a simple manifestation of an even more miraculous fact. It turns out that all polynomial equations with coefficients $a_0, a_1,\ldots, a_n$, of the form

$P(x)=a_0 + a_1 x+\cdots + a_n x^n =0$

are solvable in $\mathbb{C}$. This is the mighty fundamental theorem of algebra, first proved by Gauss.41 In the particular case of the general quadratic equation $a x^2+ bx +c =0$, the solutions can be written explicitly in the form

$x=\frac{- b\pm \sqrt{\Delta}}{2a},$ $\qquad \Delta= b^2 - 4 a c.$

In fact complex numbers were originally introduced just to keep track of such expressions when the quantity $\Delta$ is non-positive,42 and thus to give a unified description of all possible cases.
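The unified description can be checked directly: with complex square roots the same formula covers negative $\Delta$ as well. A short sketch (the function name and sample equations are mine):

```python
import cmath

def quadratic_roots(a, b, c):
    """x = (-b ± sqrt(Δ))/(2a); cmath.sqrt handles Δ < 0 with no special case."""
    delta = b * b - 4 * a * c
    s = cmath.sqrt(delta)
    return (-b + s) / (2 * a), (-b - s) / (2 * a)

print(quadratic_roots(1, -3, 2))   # x^2 - 3x + 2 = 0 : roots 2 and 1 (Δ > 0)
print(quadratic_roots(1, 0, 1))    # x^2 + 1 = 0      : roots ±i      (Δ < 0)
```

One formula, all cases: the bookkeeping role complex numbers were invented for.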

There are many reasons why the introduction of complex numbers was more revolutionary than that of irrationals. To start with there was no obvious reason for their introduction. For, unlike the case of irrational numbers which were introduced because of the need to solve simple geometric problems, there is, of course, no meaningful word problem which would lead one to solve the equation $x^2=-1$. The magical “number” $i$ started its existence as nothing more than a modest convention.

Moreover, unlike irrationals which can be approximated by converging sequences of rational numbers, there is no such procedure for complex numbers. As such, ancient Greek mathematicians, even those after Eudoxus, would have found their use unacceptable. This also applies to many skeptical European mathematicians prior to Gauss and before complex numbers were given a geometric interpretation,43 the so-called complex plane interpretation, discovered by Jean-Robert Argand and Gauss. Mathematicians are not always so keen to “skirt the impermissible” after all.

To summarize: The need to solve nonlinear algebraic equations has forced mathematicians to extend their understanding of numbers beyond the rationals. The real numbers $\mathbb{R}$ are derived from $\mathbb{Q}$ by a well-defined process of mathematical completion.44 Yet the discovery of complex numbers $\mathbb{C}$ was entirely accidental. As such, it is the most striking example in the history of mathematics of a concept-revealing notation. It also provides the ultimate example of an algebraic extension, that is, an extension of $\mathbb{R}$ and its operations, such that:

  • The same, or very similar computational rules apply.45
  • We can solve a much larger class of equations in the extended context, in this case all non-constant polynomial equations of the form $P(x)=0$.
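Both points can be verified numerically for $\mathbb{C}$; the sketch below checks the defining convention $i^2=-1$ and the fact, noted above, that $x^2=i$ is solvable within $\mathbb{C}$ itself:

```python
import cmath
import math

i = complex(0, 1)
print(i * i)                     # (-1+0j) : the defining convention i^2 = -1

# Unlike x^2 = sqrt(2) in Q[sqrt(2)], the equation x^2 = i is solvable inside C:
x = cmath.sqrt(i)                # equals (1 + i)/sqrt(2)
print(abs(x * x - i) < 1e-12)    # True
print(abs(x - (1 + i) / math.sqrt(2)) < 1e-12)  # True
```

The extension is genuinely closed: taking square roots never forces us to leave $\mathbb{C}$, which is the elementary shadow of the fundamental theorem of algebra.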

The Cartesian Revolution and Differential Calculus

The introduction of abstract notations and the discovery of functions and equations opened the way to other extraordinary advances. The first of these involved applying the new concepts of algebra to geometry. It was René Descartes who first realized that any basic geometric figure in the plane or space can be represented by algebraic equations or systems of equations, via his brilliant idea of introducing coordinates. At the same time, abstract algebraic equations could now be visualized by geometric figures. Thus, for example, the circle of radius 1, centered at the origin, corresponds exactly to the set of points in the plane of coordinates $(x, y)$ which verify the equation $x^2+ y^2=1$. On the other hand any algebraic equation of the form

$ax^2+2 hxy+by^2 +2 gx +2f y +c=0,$

for any real coefficients $a, h, b, g, f, c$, represents a conic section, i.e., an ellipse, a parabola, or a hyperbola in the plane. In fact, any system of algebraic equations of the form $P_1=P_2=\ldots= P_m=0$, with the $P$’s polynomials as in equation (1) and $m< n$, has a geometric representation as an $(n-m)$-dimensional geometric object in $\mathbb{R}^n$.46

The unification Descartes achieved between geometry and algebra, two separate branches of mathematics, each with their own histories, must rank as high as any other major scientific revolution. It led, in relatively short order, straight to differential calculus which, in turn, made possible the new science of dynamics—the beginning of modern science. The ability to go back and forth between algebraic and geometric concepts continues to play a fundamental role in contemporary mathematics as well as in physics.47

The transition between Cartesian geometry and differential calculus was initiated by Fermat,48 who realized that, once you could represent geometric figures by equations, it was natural to also describe analytically the tangent direction to a curve at a given point, expressed in terms of the defining function of the curve.49 This leads straight to the definition of the derivative. It soon turned out that the same definition can also be used to define the instantaneous velocity of a particle whose position $x$ can be expressed as a function of time $x=f(t)$, that is,50 at $t=t_0$,

  3. \[f'(t_0) =\frac{d}{dt} f(t_0) =\lim_{t\to t_0} \frac{f(t)- f(t_0)}{t-t_0}.\]
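The limit defining the derivative can be probed numerically by taking $t$ close to $t_0$. The sketch below uses a symmetric difference quotient, a slight variant of the one-sided quotient in the limit above; the names and step size are my own:

```python
import math

def derivative(f, t0, h=1e-6):
    """Symmetric difference quotient (f(t0+h) - f(t0-h)) / 2h approximating f'(t0)."""
    return (f(t0 + h) - f(t0 - h)) / (2 * h)

# Velocity of a particle with position f(t) = t^2 at t0 = 3 is 2*t0 = 6.
print(derivative(lambda t: t * t, 3.0))   # ≈ 6.0
print(derivative(math.sin, 0.0))          # ≈ 1.0, i.e. cos(0)
```

Shrinking $h$ further drives the quotient toward the exact limit, which is precisely what the definition asserts.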

The definition of derivatives of functions led to the second major avalanche of formal inventions in the history of mathematics, after the introduction of algebra. Differential calculus operates with abstract functions instead of abstract numbers, by following new and specific rules. Thus, to be able to calculate derivatives of functions efficiently, it was useful to codify operations between functions in a similar way as was done earlier with numbers. Elementary functions,51 like polynomials, can be added and multiplied, verifying the ACD rules. In addition, elementary functions can be composed. That is, given two functions $f$ and $g$, one can define a third,

$(f\circ g)(t)= f(g(t)).$

Differentiation introduces a fourth and crucial operation, which takes a function $f$ into its derivative $\frac{d}{dt} f$. There are three new simple laws connecting differentiation to addition, multiplication, and composition. The simplest, the linearity property, states that for all real numbers $\lambda, \mu$,

$\frac{d}{dt} (\lambda f+ \mu g) = \lambda \frac{d}{dt} f +\mu \frac{d}{dt} g.$

The rules involving multiplication and composition are

$\frac{d}{dt} (fg) = f\, \frac{d}{dt} g+ g\, \frac{d}{dt} f,$
$(f\circ g)'(t) = f'( g(t) )\, g'(t).$

These laws, which are easy to deduce from the abstract definition of derivatives, are sufficient to calculate the derivative of any elementary function. Indeed, any complicated elementary function can be decomposed into simple pieces by addition, multiplication, and the composition of functions. Knowing how to differentiate the simple pieces, such as $t$ and $\sin t$, we can differentiate the more complicated functions, such as $\sin^2 t$ or $\sin(\sin t)$. We can also define higher derivatives of a function $f=f(t)$ recursively,

$\frac{d^{n+1}}{dt^{n+1}} f= \frac{d}{dt} \big(\frac{d^n}{dt^n} f\big).$

Also, in the prime notation,

$f''=\frac{d^{2}}{dt^{2}} f$, $\quad f'''=\frac{d^{3}}{dt^{3}} f.$
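The product and chain rules above can be verified numerically for particular functions. A sketch using symmetric difference quotients (the test point and tolerance are arbitrary choices of mine):

```python
import math

def deriv(f, t, h=1e-6):
    """Symmetric difference quotient approximating f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

t = 0.7
f, g = math.sin, math.cos

# Product rule: (fg)' = f g' + g f'
lhs_prod = deriv(lambda s: f(s) * g(s), t)
rhs_prod = f(t) * deriv(g, t) + g(t) * deriv(f, t)
print(abs(lhs_prod - rhs_prod) < 1e-6)   # True

# Chain rule: (f∘g)'(t) = f'(g(t)) g'(t)
lhs_chain = deriv(lambda s: f(g(s)), t)
rhs_chain = deriv(f, g(t)) * deriv(g, t)
print(abs(lhs_chain - rhs_chain) < 1e-6)  # True
```

This is exactly the decomposition strategy described in the text: complicated functions are differentiated by reducing them to simple pieces via the three laws.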

Integral Calculus

Integration theory has its origin in the straightforward and perfectly natural question:52 given a function $f=f(t)$, find a function $u=u(t)$ whose derivative is given by $f$,

  4. \[\frac{d}{dt} u= f.\]

The corresponding solution, $u$,53 is termed a primitive of $f$, denoted $\int f$. The rules of differentiation, mentioned above, have simple counterparts in rules of integration. For example, the linearity rule of differentiation becomes

\[\int( \lambda f+ \mu g)= \lambda \int f +\mu \int g.\]

To find the integral of a given function $f$ one can try to implement a method similar to the ones used for calculating derivatives, i.e., try to decompose $f$ into simple pieces whose primitives we know how to compute. But this is just the formal aspect of integration theory. The major breakthrough, made independently by Newton and Gottfried Wilhelm Leibniz,54 was the realization that one can use this formal inverse derivative operation to calculate areas of complicated geometric figures.55 Thus, as the story has it,56 was begotten the glorious era of calculus!
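The link between primitives and areas can be checked on a simple case: the area under $f(t)=t^2$ on $[0,1]$, computed by a Riemann sum, agrees with $u(1)-u(0)$ for the primitive $u(t)=t^3/3$. A minimal sketch (names and sample function are mine):

```python
def riemann(f, a, b, n=100_000):
    """Left Riemann sum approximating the area under f on [a, b]."""
    h = (b - a) / n
    return sum(f(a + i * h) for i in range(n)) * h

f = lambda t: t * t
primitive = lambda t: t ** 3 / 3              # a function u with u' = f
area = riemann(f, 0.0, 1.0)
print(abs(area - (primitive(1.0) - primitive(0.0))) < 1e-4)   # True
```

The geometric quantity (area) and the formal one (the inverse of differentiation) coincide: this is the Newton–Leibniz breakthrough in miniature.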

The relation between derivatives and integrals is less transparent for functions involving more variables. The integration theory of such general functions was perfected in the nineteenth and early twentieth century by mathematicians like Bernhard Riemann and Henri Lebesgue. It was later extended into the more abstract framework of measure theory with a vast number of applications,57 particularly in probability theory.

The Second Triad: Functions, Differential Operators and Differential Equations

A differential expression applied to a function $u=u(t)$ is a formal expression involving addition, multiplication, and derivatives of $u$ in the form

  5. \[P[u]= a_0 u + a_1 \frac{d}{dt} u+ a_2\frac{d^2}{dt^2} u +\cdots + a_m \frac{d^m}{dt^m} u,\]

where, in the simplest case, $a_0, a_1, \ldots, a_m$ are either given constants or functions of $t$. In both cases, the operator $P$ is linear, i.e., for any constants $\lambda$, $\mu$,

$P[\lambda u+\mu v]= \lambda P[u] +\mu P[v].$

In the general and nonlinear case, the coefficients may also depend on $u$ and its derivatives. Note that $P$ is an operator, or functional, i.e., it operates on functions and produces functions. The corresponding differential equation attached to the operator $P$ is the differential equation $P[u] =0$,58 whose solutions are functions.

To be more precise, solving the polynomial equation $P(x)=0$ meant finding a number $x$ such that $P$, understood as a function, vanishes when evaluated at $x$. To solve the differential equation $P[u]=0$ means, instead, to find a function $u$ such that $P$, understood as an operator, vanishes identically when evaluated at the function $u$. Though the words used to describe the two situations are similar, the difference in terms of the potential applications is enormous. Indeed, functions $u=u(t)$ can be used to describe the paths of a particle in motion, while the operator $P[u]$ is a mathematical representation of a given law of motion. The first derivative $u'(t)$ can then be interpreted as instantaneous velocity and the second derivative $u''(t)$ as instantaneous acceleration. Given a physical law, prescribed by the second order operator $P$,59 the solutions $u$ of $P[u]=0$ are all possible trajectories of particles within the physical process governed by the law $P$. Thus, unlike algebraic equations $P(x)=0$, in one variable $x$, which cannot have more solutions than the degree of the corresponding polynomial $P$, the differential equation $P[u]=0$ associated to the differential operator (5) has infinitely many solutions. We know from experience that the trajectories of particles in Newtonian mechanics depend only on their original positions and velocities.60 This has a simple and very general mathematical formulation in terms of what is known as the Cauchy problem. More precisely, under very reasonable smoothness assumptions on the defining function $P=P(u)$ in (5), solutions of the equation $P[u]=0$ are uniquely specified by the values of $u$ and its first $m-1$ derivatives at a fixed value of the parameter $t$. Since physical laws are given by second order operators, we see indeed that physical intuition in this case aligns perfectly with the mathematical properties of second order differential equations.

The formalism can be extended to systems of particles described by vector functions $u(t):=\big(u^1(t), u^2(t),\ldots, u^n(t)\big)$. The corresponding physical law, describing interactions between the $n$ particles, will then be represented by operators $P_1(u),P_2(u),\ldots, P_n(u)$, and the trajectories followed by each particle are solutions of the system of differential equations

  6. \[P_1[u]=P_2[u]=\ldots = P_n[u] =0.\]

In the particular case when the order of each $P_i(u)$, $1\le i\le n$, is one, we can rewrite the system,61 under some simple non-degeneracy condition, in the more convenient form

  7. \[\frac{d\textbf{u}}{dt} = f(\textbf{u}, t),\]

with $\textbf{u}=(u^1,\ldots, u^n)$ and $f:\mathbb{R}^n\times\mathbb{R}\rightarrow \mathbb{R}^n$. To solve the initial value problem for (7) means to find a solution $\textbf{u}$ taking a prescribed value at some $t=t_0$. Typical systems are autonomous, i.e., $f=f(\textbf{u})$.
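The reduction from a higher order equation to a first order system of the form (7) is mechanical: one takes $u$ and its derivatives as the components of $\textbf{u}$. A brief sketch (the pendulum-type law $u''=-\sin u$, the step size, and the names are illustrative assumptions):

```python
import math

# The second order equation u'' = -sin(u), rewritten as the first order
# autonomous system du/dt = f(u) with components u = (position, velocity).
def f(u):
    return (u[1], -math.sin(u[0]))

def step(u, dt):
    """One forward-Euler step of the system du/dt = f(u)."""
    return tuple(ui + dt * dui for ui, dui in zip(u, f(u)))

u = (0.1, 0.0)  # initial position and velocity
for _ in range(1000):
    u = step(u, 1e-3)
print(abs(u[0]) < 0.2)  # a small oscillation stays small
```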

In the linear case, when $f$ is a linear function, i.e., $f(\textbf{u})=A\textbf{u}$ with $A$ an $n\times n$ matrix, the system takes the form

  8. \[\frac{d\textbf{u}}{dt} = A\textbf{u}.\]

To solve the initial value problem for (8) means to find solutions $\textbf{u}$ such that $\textbf{u}(t_0)=\textbf{u}_0$ for an arbitrary vector $\textbf{u}_0\in \mathbb{R}^n$.

In the simplest case, when $n=1$ and $A$ is a constant $\lambda$, the equation

  9. \[\frac{d u}{dt} = \lambda u, \qquad u(0)=u_0,\]

can be easily solved by the exponential function $u=u_0e^{\lambda t}$. More generally, the vector $\textbf{u}= \textbf{u}_0 e^{\lambda t }$ is a solution of the linear system (8) if and only if $\textbf{u}_0$ is an eigenvector of the matrix $A$ with eigenvalue $\lambda$, i.e.,

$A\textbf{u}_0= \lambda \textbf{u}_0.$

With a little more work one can solve the general initial value problem for (8). In fact, that problem reduces to the problem of finding all eigenvectors of $A$ and the corresponding eigenvalues,62 which is a problem of linear algebra.
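For a concrete $2\times 2$ illustration (the matrix and the hand-computed eigenpairs below are assumptions made for this example, not data from the text): with $A$ having rows $(2,1)$ and $(1,2)$, the eigenpairs are $\lambda=3$ with $(1,1)$ and $\lambda=1$ with $(1,-1)$, and the initial value problem for (8) is solved by decomposing the data in this eigenbasis.

```python
import math

# Solving du/dt = A u, A = [[2, 1], [1, 2]], by the eigenvalue method.
# Eigenpairs, computed by hand: lambda = 3 with (1, 1); lambda = 1 with (1, -1).
def exact(u0, t):
    a = (u0[0] + u0[1]) / 2          # coefficient of (1, 1)
    b = (u0[0] - u0[1]) / 2          # coefficient of (1, -1)
    e3, e1 = math.exp(3 * t), math.exp(t)
    return (a * e3 + b * e1, a * e3 - b * e1)

def euler(u0, t, n=200000):
    """Direct numerical integration, as an independent check."""
    dt = t / n
    x, y = u0
    for _ in range(n):
        x, y = x + dt * (2 * x + y), y + dt * (x + 2 * y)
    return (x, y)

ue, un = exact((1.0, 0.0), 1.0), euler((1.0, 0.0), 1.0)
print(abs(ue[0] - un[0]) < 1e-2 and abs(ue[1] - un[1]) < 1e-2)
```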

The simplest example of a second order differential equation describes the motion of the linear harmonic oscillator,63

  10. \[\frac{d^2}{dt^2} u+ \omega^2 u=0.\]

Looking for solutions of the form $u= e^{\lambda t}$ we find that $\lambda^2+\omega^2=0$, a result that may seem discouraging because we are looking for real solutions. Yet this provides another powerful example of the usefulness of complex numbers. We find the complex solutions $e^{i \omega t}$ and $e^{-i\omega t}$. A general solution can then be written in the form

$u(t)= a e^{i \omega t}+ b e^{-i \omega t}.$

If the initial data for (10) are $u(0)= u_0, \partial_t u(0)= u_1$, with $u_0, u_1$ real, we can solve the system in $a, b$,

$u_0= a + b, \qquad u_1= ia \omega-i b\omega$

and, using Euler’s formula $e^{i\omega t}= \cos(\omega t)+ i\sin(\omega t)$, we find

$u(t) = \cos( t\omega) u_0+ \omega^{-1} \sin(t\omega) u_1.$
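One can check this closed-form solution directly by finite differences; a small sketch, with arbitrarily chosen sample values of $\omega$, $u_0$, $u_1$:

```python
import math

# Verify that u(t) = cos(omega t) u0 + omega^{-1} sin(omega t) u1
# satisfies u'' + omega^2 u = 0 together with u(0) = u0, u'(0) = u1.
omega, u0, u1 = 2.0, 0.5, -1.0   # arbitrary sample data

def u(t):
    return math.cos(omega * t) * u0 + math.sin(omega * t) * u1 / omega

h, t = 1e-4, 0.7
u_second = (u(t + h) - 2 * u(t) + u(t - h)) / h**2   # central difference for u''
print(abs(u_second + omega**2 * u(t)) < 1e-4)        # u'' + omega^2 u = 0
print(abs((u(h) - u(-h)) / (2 * h) - u1) < 1e-4)     # u'(0) = u1
```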

As was already the case for algebraic equations, nonlinear equations are rarely solvable in terms of specific formulas.64 This has forced the development of a powerful qualitative theory which allows us to infer various properties of solutions in the absence of specific representations. All qualitative studies of solutions start with the fundamental theorem of ODEs, according to which, under very broad assumptions on $f$ and for any initial data $u_0$, there exists a sufficiently small $\epsilon > 0$ and a unique solution $u : [t_0, t_0 + \epsilon)\rightarrow \mathbb{R}^n$ of the system (7) verifying $u(t_0) = u_0$.65
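The standard proof of the fundamental theorem runs through Picard iteration: one sets $u_{k+1}(t) = u_0 + \int_{t_0}^t f(u_k(s), s)\, ds$ and shows that the iterates converge on a small interval. A sketch for the simplest case $u' = u$, $u(0)=1$ (the grid size and function names are illustrative), where the $k$-th iterate is the degree-$k$ Taylor polynomial of $e^t$:

```python
import math

# Picard iteration for u' = u, u(0) = 1: u_{k+1}(t) = 1 + integral_0^t u_k(s) ds.
def picard(k, t, n=1000):
    """Return u_k(t), evaluating each integral by a left Riemann sum on n cells."""
    u = [1.0] * (n + 1)                     # u_0(t) = 1 on the grid
    for _ in range(k):
        integral, new_u = 0.0, [1.0]
        for i in range(1, n + 1):
            integral += u[i - 1] * (t / n)  # accumulate integral_0^{t_i} u_k
            new_u.append(1.0 + integral)
        u = new_u
    return u[-1]

# The iterates converge rapidly to the true solution e^t.
print(abs(picard(12, 1.0) - math.e) < 1e-2)
```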

Complex Differentiation

Once we know what it means to take the formal derivative of a real function $f(t)$, i.e., a function defined from an interval $I\subset \mathbb{R}$ with values in $\mathbb{R}$, denoted by $f:I\longrightarrow \mathbb{R}$, it makes sense, by pure analogy, to ask if we can perform a similar operation for a complex function $f(z)$, i.e., a function defined from a domain $D\subset \mathbb{C}$ with values in $\mathbb{C}$, $f:D\longrightarrow \mathbb{C}$. We try to mimic the definition (3) as follows:

  11. \[f'(z_0)=\lim_{z\to z_0} \frac{f(z)- f(z_0)}{z-z_0}.\]

Unlike the case of real functions, for which the limit exists whenever $f$ is smooth enough,66 this is absolutely not the case here. The limit exists for simple polynomial functions in $z=x+iy$, such as $z^n$, but not for other perfectly smooth functions, such as polynomials in $\overline{z}= x-iy$. Functions $f:D\longrightarrow \mathbb{C}$ for which the limit exists at all points of $D$ are called holomorphic. These functions are the object of study in one of the most beautiful and consequential branches of mathematics: complex analysis.
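The failure of (11) for functions of $\overline{z}$ can be seen numerically: the difference quotient approaches different values along different directions. A small sketch (the sample point and step are arbitrary choices):

```python
# Difference quotients (f(z0 + dz) - f(z0)) / dz along two directions.
z0, h = 1.0 + 2.0j, 1e-6

def quotient(f, direction):
    dz = h * direction
    return (f(z0 + dz) - f(z0)) / dz

for name, f in [("conj(z)", lambda z: z.conjugate()), ("z^2", lambda z: z * z)]:
    q_real = quotient(f, 1.0)    # approach along the real axis
    q_imag = quotient(f, 1.0j)   # approach along the imaginary axis
    agree = abs(q_real - q_imag) < 1e-5
    print(name, "has a direction-independent limit:", agree)
```

For $\overline{z}$ the two quotients come out as $1$ and $-1$; for $z^2$ both approach $2z_0$.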

If we decompose a holomorphic function $f$ into its real and imaginary parts, i.e., $f=u+iv$, and interpret both $u$ and $v$ as functions in the variables $x, y$, we find that they must satisfy the following compatibility conditions:67

  12. \[\frac{\partial}{\partial x } u =\frac{\partial}{\partial y} v, \qquad \frac{\partial}{\partial y} u =-\frac{\partial}{\partial x} v.\]

These are the so called Cauchy-Riemann (CR) equations, the simplest classical system of partial differential equations.68 The restrictions imposed by the CR equations lead to an extraordinary number of remarkable properties.

  • The CR equations are not only linear, i.e., linear combinations of solutions are themselves solutions, but they also have the remarkable property that the product and the composition of two CR maps, i.e., maps $(x, y)\longrightarrow (u, v)$, are again CR maps.69
  • CR maps are conformal, i.e., they preserve the angle between planar curves. Moreover, according to the celebrated Riemann mapping theorem,70 any simply connected domain in $\mathbb{R}^2$,71 different from $\mathbb{R}^2$ itself, can be mapped by a CR map onto the unit disk of $\mathbb{R}^2$.
  • Both components of a CR map are harmonic, i.e., they verify the Laplace equation,
    13. \[\Delta u=\Delta v=0, \qquad \Delta= \frac{\partial^2}{\partial x^2}+ \frac{\partial^2}{\partial y^2}.\]
    This ties complex function theory to potential theory.72
  • The most useful property of holomorphic functions is the powerful Cauchy formula, which relates the value of a holomorphic function at a point to its integral along a closed curve around the point.73
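The Cauchy formula lends itself to a direct numerical check. The sketch below (the choice $f=\exp$, the contour radius, and the discretization are assumptions for illustration) approximates $\frac{1}{2\pi i}\oint \frac{f(z)}{z-z_0}\,dz$ over a circle around $z_0$ and compares it with $f(z_0)$:

```python
import cmath, math

def cauchy_value(f, z0, radius=1.0, n=2000):
    """Approximate (1 / 2 pi i) * (contour integral of f(z)/(z - z0))
    over the circle |z - z0| = radius, by a uniform Riemann sum."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        z = z0 + radius * cmath.exp(1j * theta)
        dz = 1j * radius * cmath.exp(1j * theta) * (2 * math.pi / n)
        total += f(z) / (z - z0) * dz
    return total / (2j * math.pi)

z0 = 0.3 + 0.1j
print(abs(cauchy_value(cmath.exp, z0) - cmath.exp(z0)) < 1e-8)
```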

What is remarkable about the theory of complex holomorphic functions, initiated by Cauchy in the first few decades of the nineteenth century,74 is that, unlike differential calculus, which was intimately tied to the contemporaneous development of physics, the work in this area began as a mathematical enterprise driven by pure mathematical considerations: intellectual curiosity, analogies, careful analysis. That the theory turned out to have such an extraordinary range of applications, including modern physics, is another brilliant illustration of the miracle described by Wigner.

Partial Differential Equations

In the same way that polynomials can be extended from one variable to many, differential equations can be extended to functions depending on many variables, $u=u(x^1, \ldots, x^n)$. Given such a function we can take partial derivatives $\partial_i u =\frac{\partial}{\partial x^i} u$ obtained by differentiating $u$ with respect to the $x^i$ variable while keeping all the others fixed. A higher order mixed partial derivative of $u$ can be written in the form $\partial^\alpha u=\partial_1^{\alpha_1}\partial_2^{\alpha_2}\cdots \partial_n^{\alpha_n} u$ with $\alpha=(\alpha_1, \alpha_2,\ldots, \alpha_n)$,75 where $|\alpha|=\alpha_1+\alpha_2+\cdots+\alpha_n$ reflects the order of differentiation. Denoting by $\Lambda^k u$ the set of all partial derivatives of $u$ of order $\le k$, a general partial differential expression takes the form

  14. \[P[u]= A\big(x,\Lambda^k u(x)\big),\]

where $A$ is a given function. We associate to the formal expression the corresponding differential operator $u\rightarrow P[u]$ and the partial differential equation

  15. \[P[u]=0.\]

The equation is said to be linear if the operator $ P[u]$ is linear, i.e.,

$P[\lambda u +\mu v]=\lambda P[u]+\mu P[v].$

These operators $P$ can then be written in the form

  16. \[P[u] = \sum_{|\alpha|\le k } a_{\alpha} \partial^\alpha u,\]

where the coefficients $a_{\alpha}$ are functions of the variables $x=(x_1,\ldots, x_n)$. One can study scalar equations, such as (14), or systems of equations where $u$ is itself a vector $u=(u_1, \ldots, u_k)$ verifying the multiple equations

$P_1[u]=P_2[u]=\ldots = P_m[u]=0.$

There is very little one can say about partial differential equations (PDEs) at this level of generality. Unlike ordinary differential equations (ODEs), for which we have a satisfactory general local existence and uniqueness result, no such result is known in the context of general PDEs.76 Even in the summary style of this essay it is impossible to describe anything remotely substantial about the enormous range of PDEs.77 I will restrict myself here to a few brief remarks.

  • While ODEs provide the right mathematical framework for describing the motion of point particles, PDEs provide the perfect language to describe the motion of a continuum of particles, or the more modern physics concept of fields. Thus the fundamental equations in continuum mechanics, electrodynamics, and general relativity are all PDEs.
  • The only general class of equations for which one can develop a sufficiently general theory is that of linear equations with constant coefficients (LCCE), i.e., equations of the form (16) for which the coefficients $a_\alpha$ are constant. In that case, the Fourier transform method, first developed by Joseph Fourier at the beginning of the nineteenth century during his studies of the heat equation, provides a very powerful and general tool. But even in that case, developing a general theory is prohibitively complicated, cumbersome, and not particularly illuminating.78 It is a lot more useful to concentrate instead on special equations using either Fourier methods or other methods based on the construction of a fundamental solution. Indeed, among the LCCE class there are a few equations that are ubiquitous throughout mathematics and physics, and disproportionately important in classifying and understanding larger classes of equations.79 Thus, for example, the Laplace operator

    $\Delta =\partial_1^ 2 +\partial_2^2 +\partial_3^2$

    is typical of the class of elliptic equations, while the D’Alembertian operator

    $\square=-\partial_0^2+\partial_1^ 2 +\partial_2^2 +\partial_3^2$

    is typical of wave equations. Other examples of second order scalar equations are the heat operator $\mathscr{H}=-\partial_t+\Delta$ and the Schrödinger operator ${\mathcal S}=-i\partial_t+\Delta$. At the level of systems of equations, the Cauchy-Riemann equations, see (12),

    $\partial_2 u_1+ \partial_1 u_2=0,$ $\partial_1 u_1-\partial_2 u_2=0,$

    and the Maxwell equations are by far the most conspicuous.
  • The range of relevance for specific PDEs is phenomenal. Indeed, specific PDEs are at the heart of fully-fledged areas of physics. Thus one can argue that hydrodynamics is defined as the body of results, both theoretical and experimental, concerning the Navier-Stokes and incompressible Euler equations. In the same spirit, electrodynamics deals with the Maxwell equations, and general relativity is really the study of the Einstein field equations, a geometric PDE par excellence. Similar remarks can be made about magneto-hydrodynamics and non-relativistic quantum mechanics. Moreover, entire fields of mathematics such as complex analysis, several complex variables, minimal surfaces, harmonic maps, connections on principal bundles, Kähler and Einstein geometry, and geometric flows are also organized around specific PDEs or classes of PDEs.
  • Since very few PDEs can be explicitly solved, mathematicians have been forced to develop indirect methods that allow them to describe the most important properties associated with solutions of the important equations.80 Thus, even though developing a meaningful general theory is a pipe dream, mathematicians have been able to develop an impressive body of methods and techniques which are applicable to various equations.81
  • While the range of all possible equations is enormous, only a few special ones appear in physics. It is remarkable that the most important such equations can be derived using another unreasonably effective formal procedure known as the variational principle.82
  • Despite the enormous progress made with PDEs during the last two centuries, there remain a large number of fundamental problems for which our understanding is very limited. The problem of turbulence, as it manifests itself in the simplest mathematical context of the Navier-Stokes equations, or the cosmic censorship conjecture in general relativity are but two of the most conspicuous examples.
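To illustrate the variational principle mentioned above with the simplest possible example (a standard textbook computation, included only as a sketch, not a claim from the essay): the linear harmonic oscillator equation arises as the Euler–Lagrange equation of a quadratic action.

```latex
% Lagrangian and action for the linear harmonic oscillator:
L(u, \dot u) = \tfrac{1}{2}\,\dot u^{2} - \tfrac{\omega^{2}}{2}\,u^{2},
\qquad
S[u] = \int L(u, \dot u)\, dt .

% The Euler--Lagrange equation of S,
\frac{d}{dt}\,\frac{\partial L}{\partial \dot u}
  - \frac{\partial L}{\partial u}
  = \ddot u + \omega^{2} u = 0,

% recovers exactly the harmonic oscillator equation discussed earlier.
```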

One of the most impressive mathematical results of the last hundred years is the solution to the Poincaré conjecture using Hamilton’s Ricci heat flow.83 This is a huge achievement and one in which the modern theory of PDEs played a crucial role. It is a dramatic demonstration of Wigner’s reflections on how ideas originating in specific areas of mathematics or physics percolate into other seemingly unrelated areas.84 The mystery can be stated as follows: how does a heat flow, originating in Joseph Fourier’s study of heat conduction, have anything to do with the Poincaré conjecture about the topological properties of the 3-dimensional sphere? The main results are due to Grigori Perelman, but his achievement ought to be viewed as the culmination of the immense progress made during the last century with elliptic and parabolic PDEs. The introduction of the Ricci flow itself,85 and the first important results based on it, are due to Richard Hamilton. The geometrization conjecture, which put the Poincaré conjecture in a full classification setting for compact 3-manifolds, was introduced by William Thurston. The development of techniques for dealing with nonlinear parabolic and elliptic equations in order to analyze general solutions of the Ricci flow is due to great mathematicians such as Sergei Bernstein, Ennio de Giorgi, David Hilbert, Eberhard Hopf, Jürgen Moser, John Nash, Louis Nirenberg, Aleksei Pogorelov, Poincaré, Riemann, Juliusz Schauder, Sergei Sobolev, Hermann Weyl, and many others throughout the last century. The more recent blending of Riemannian geometry with PDEs was pioneered by mathematicians such as Thierry Aubin, Richard Schoen, Karen Uhlenbeck, and Shing-Tung Yau.

Manifolds and Tensor Calculus

In our subject of differential geometry, where you talk about manifolds, one difficulty is that the geometry is described by coordinates, but the coordinates do not have meaning. They are allowed to undergo transformation. And in order to handle this kind of situation, an important tool is the so-called tensor analysis, or Ricci calculus, which was new to mathematicians.
—S. S. Chern86

As part of the earlier discussion about the revolutionary fusion between geometry—with its lines, circles, and triangles—and algebra—with its abstract equations—the crucial contribution of Descartes was noted. Namely, his insight that geometric figures could be described by equations and vice versa. This is true, but there are, in fact, many ways to describe a given geometric object by equations. Depending on where a system of Cartesian coordinates is centered, the standard sphere of radius 1, denoted $\mathbb{S}^2$, can be expressed by the equation

$x^2+ y^2 + z^2 =1,$

as well as

$(x-x_0)^2+ (y-y_0)^2 + (z-z_0)^2 =1.$

But there are also non-Cartesian systems of coordinates, such as polar coordinates,

$z=r\cos \theta, \quad x=r \sin\theta \cos\varphi, \quad y=r \sin\theta \sin\varphi,$

in which case the sphere becomes simply $r=1$. The problem becomes far more acute if the usual calculus is extended to functions on the sphere. If $f$ is such a function, $f:\mathbb{S}^2\longrightarrow \mathbb{R}$, how are derivatives defined along $\mathbb{S}^2$? For this purpose it is necessary to parametrize the sphere. For example, near its north pole

$N=(x=0, y=0, z=1),$

we can use the parametric equations

$x=u, \quad y=v,\quad z=\sqrt{1-u^2-v^2}.$

The function $f$ on $\mathbb{S}^2$ can then be described as the composition

$(u, v) \longrightarrow f(u, v, \sqrt{1-u^2-v^2}),$

which can then be differentiated with respect to $u$, $v$ as many times as needed. The problem is that the same function can be represented in many other ways depending on which parametrization is used. Thus, for example, the polar coordinates $\theta$, $\varphi$ could also have been used, in which case the function $f$ would be represented by

$(\theta, \varphi)\longrightarrow f( \sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta).$

These are just two possible parametric representations of this function, but there are, in fact, infinitely many possible parametric representations of the sphere. Thus there are also infinitely many possible representations of the function and infinitely many ways to differentiate it. The problem is that the result of differentiation depends heavily on which parametrization is chosen and it is cumbersome to pass from one expression to another. As there is no a priori reason to prefer one over another, how should a parametrization be chosen? Is there a good definition of the derivatives of $f$ that makes it possible to pass easily from one parametrization to another?
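The dependence on the chosen parametrization is easy to exhibit numerically. In the sketch below (the function $f(x,y,z)=z$ and the sample point are arbitrary choices), the same function on the sphere is differentiated in the two charts described above, with visibly different results at the same point:

```python
import math

# f(x, y, z) = z on the sphere, expressed in two different parametrizations.
def f_graph(u, v):
    return math.sqrt(1 - u**2 - v**2)   # graph chart: (u, v) -> (u, v, sqrt(1-u^2-v^2))

def f_polar(theta, phi):
    return math.cos(theta)              # polar chart: z = cos(theta)

# The same point x = 0.3, y = 0: (u, v) = (0.3, 0) and (theta, phi) = (asin(0.3), 0).
h = 1e-6
d_graph = (f_graph(0.3 + h, 0.0) - f_graph(0.3, 0.0)) / h
theta0 = math.asin(0.3)
d_polar = (f_polar(theta0 + h, 0.0) - f_polar(theta0, 0.0)) / h
print(abs(d_graph - d_polar) > 0.005)   # the two "derivatives of f" disagree
```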

Tensorial calculus was developed by mathematicians precisely in order to solve this problem. A proper definition of tensorial quantities starts with an abstract definition of a manifold, completely removed from any visualization, as in the case of the sphere.87 The concept, first envisioned by Riemann,88 is founded on the notion that smooth geometric objects can be described locally purely in terms of local parametrizations, also known as local coordinates, and transformation maps between them. Tensor-fields on a manifold are quantities that also transform according to simple transformation laws. To define a good notion of the differentiation of tensors, which transform as tensors, it is necessary to endow the manifold with an additional structure called a connection—an innovation introduced much later by Weyl. At the time of his own work, however, Riemann knew nothing of it.

In order to define a notion of distance between points, Riemann also endowed his manifold with a metric, which turns out to be itself a tensorial quantity, i.e., it transforms tensorially with respect to coordinate transformations. Starting with a given metric, he was then able to generalize the notion of curvature for embedded 2-surfaces, a notion discovered by Gauss as part of his famous investigations of the geometry of surfaces.89 This turns out to be the most important and non-trivial tensorial quantity—the mighty Riemann curvature tensor.90 The metric also defines a unique and compatible connection, the Levi-Civita connection, with respect to which full-fledged tensorial calculus on a Riemannian manifold can then be performed. It is important to note that the curvature tensor depends on two derivatives of the metric, while the Levi-Civita connection, which is not a tensorial quantity itself, depends on one derivative of the metric. Moreover, the curvature tensor has a simple expression in terms of the connection and its first derivatives.

The curvature tensor is at the heart of Albert Einstein’s theory of general relativity, but not exactly the way Riemann defined it. In attempting to generalize Gauss’s theory of surfaces, Riemann naturally assumed his metric to be positive definite, while Einstein’s theory deals with so-called Lorentzian metrics.91 The passage from the Riemannian case to the Lorentzian goes through Hermann Minkowski, who was able to reformulate special relativity in terms of precisely such a metric, the simplest, known as the Minkowski metric. It turns out that special relativity can be described in its entirety by the Minkowski metric together with a version of tensor calculus restricted to the linear changes of coordinates which preserve the metric, i.e., Lorentz transformations. General relativity is an extension of special relativity where Minkowski space is replaced by a general Lorentzian manifold, Lorentz transformations are replaced by arbitrary changes of coordinates, and all physically relevant quantities are tensor-fields. This latter statement is the mathematical embodiment of Einstein’s equivalence principle.92 Finally, the relation between the metric and various matter fields acting on the manifold is expressed in terms of an equation, namely the Einstein field equations:

$G = 8\pi T$.

The tensor $G$ on the left, which Einstein referred to as being made of marble, depends only on the metric and its curvature, while $T$, the so-called energy momentum tensor of matter, depends on the particular type of matter carried by the spacetime. Einstein referred to the right-hand side as being made of wood, i.e., reflecting our contingent and imperfect understanding of it. Remarkably—yet another miracle—the Einstein field equations can themselves be derived by a variational principle.

Conclusions for this Section

Formal manipulation with abstract symbols led to the first fundamental triplet: an algebra of formal expressions, functions, and algebraic equations. In the same manner, formal manipulation with functions, including differentiation, leads to the second fundamental triplet: differential calculus, differential operators, and differential equations—including PDEs. It is important to note that both developments were intrinsic to mathematics, that is, they followed the inner logic of formal mathematical processes.93 The need to extend differential calculus and differential equations to manifolds has also led, by a similar process, to tensorial calculus, which has had a profound impact on modern physics. The fact that ODEs, respectively PDEs, and their modern tensorial reformulations, turned out to provide the perfect language for Newtonian mechanics, respectively the right formalism for continuum mechanics, electrodynamics, and general relativity is, of course, part and parcel of Wigner’s mystery.

Limits of the Permissible

The great mathematician fully, almost ruthlessly, exploits the domain of permissible reasoning and skirts the impermissible. That his recklessness does not lead him into a morass of contradictions is a miracle in itself: certainly it is hard to believe that our reasoning power was brought, by Darwin’s process of natural selection, to the perfection which it seems to possess.
—Eugene Wigner94

As can be seen from the exposition in the earlier sections of this essay, the history of mathematics offers plenty of examples where, in the pursuit of specific problems, mathematicians are forced to transcend what is permissible by extending the objects and rules with which they operate.

The expansion of the concept of numbers from positive integers to rational, real, and complex numbers is an obvious example. An even more dramatic example occurred in connection to differential equations. To start with, even to define the derivative of a function is not at all obvious, since it requires taking a limit of fractions where the denominator converges to zero—meaning that one has to make sense of division by 0, a seemingly impermissible task. One has to understand what it means to take a limit—that is, making sense of an infinite process95—and give precise definitions of intuitive notions, such as continuity and various degrees of smoothness for functions.96

The need to clarify these issues was spurred by applications, yet the way mathematicians dealt with them is typical of the inner workings of mathematics: precise definitions, simple examples, generalizations, analogies, symmetry considerations, and a quest for completeness. That is, the search for the broadest setting in which various operations make sense. A striking example of the process of completeness, as we have seen, is the development of real and complex numbers. In order to solve specific examples of polynomial equations, mathematicians were led to the modern concepts of real and complex numbers.

A similar development has occurred in the theory of functions. To understand the broadest setting in which integration and differentiation make sense, mathematicians developed Lebesgue integration theory, distributions, and various function spaces,97 such as the conspicuous Sobolev spaces.98 Most importantly, this led to the ability to formulate precise and general notions of solutions to differential equations. This is a truly remarkable development because, as was already apparent for algebraic equations,99 only a very small number of equations can be solved explicitly. To be able to show that solutions exist, without the ability to determine them explicitly in terms of elementary functions, is one of the most important achievements of mathematics during the last two centuries.

The driving idea behind this landmark development was that, despite the inability to explicitly represent solutions, it should still be possible to describe their essential properties. The starting point of such a process is the development of precise notions of general solutions, as mentioned above. Once such solutions are shown to exist, by an elaborated convergence process, one can then extract specific qualitative features of the solutions such as uniqueness, continuous dependence on initial conditions, smoothness, specific bounds, asymptotic behavior, the presence of singularities, and so on.

Wigner’s Great Mystery and Notions of Reality

The first point is that the enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious and that there is no rational explanation for it.
—Eugene Wigner100

As I wrote in the introduction, I am not aware of a fully rational explanation for Wigner’s mystery. Moreover, I am doubtful that one can ever be found at all. It is, after all, a mystery as great as any of the other great puzzles confronting us, such as the fine tuning of our universe, the origins of life, or why there is something rather than nothing.101 Wigner’s mystery is at the heart of what we are and how we interact with the external world. The best one can hope for is to shed some light on the “how” rather than the “why.”

If one postulates that physical reality exists as an all-encompassing reality, perceptible through our senses, but independent of them,102 the mystery pointed out by Wigner not only lacks a credible rational explanation, but seems entirely incomprehensible. Indeed, according to this viewpoint, mathematics deals with abstractions of the mind, which are nothing but a function of the brain—that is to say, a highly organized and evolved part of the same physical reality we are trying to describe. Yet if the brain is a natural evolutionary machine, as claimed by some modern neurobiologists,103 how can these mathematical abstractions, developed within it by purely physiological processes, attain that marvelous effectiveness described by Wigner in his lecture? “It is hard to believe,” he observes, “that our reasoning power was brought, by Charles Darwin’s process of natural selection, to the perfection which it seems to possess.”104 How, he marvels, is the natural selection mechanism capable of explaining the amazing ability of mathematicians to weave together thousands of abstract logical steps without falling into a “morass of contradictions”?105 In connection with the surprising presence of complex numbers at the heart of quantum mechanics,106 he writes:

It is difficult to avoid the impression that a miracle confronts us here, quite comparable in its striking nature to the miracle that the human mind can string a thousand arguments together without getting itself into contradictions, or to the two miracles of laws of nature and of the human mind’s capacity to divine them.107

Can that “natural evolutionary machine” concept of the mind explain all this? Can it explain why, as has often happened in the history of physics, abstractions pursued by mathematicians for seemingly esoteric reasons happen to be exactly what is needed for a new theory of the natural world?

There are other issues with the materialistic viewpoint. Not least among them is the fact that whatever we mean by the term physical reality is intimately tied to our own experiences of space-time and causality. As general relativity made clear, these are concepts that can only be described precisely through mathematics. But more about this later.

Physical Reality

There is another view, brilliantly illustrated by Plato in his allegory of the cave, according to which physical reality is itself simply a shadow of a non-physical reality, which can only be accessed by the mind. Plato’s insight is largely dismissed nowadays, partly as a matter of principle in favor of materialism, but also because it leads to an artificial proliferation of rather vague ideal forms, such as that of a chair or a house, or highly elusive ones, such as justice or beauty.

Yet it is hard not to take Plato’s ideas seriously when it comes to mathematics. In a famous passage in his Republic, he points to the fact that mathematical objects, such as the circle, seem to have an objective reality independent of our own.108 We associate objectivity with physical things and processes first, because of our senses. We see a glass of wine, we can touch it, smell and taste the wine within it. We can either drink the wine or hear the sound made by breaking the glass. We can also share our experience with friends and not be at all surprised that they have impressions identical to ours. We can also leave the glass on the table and discover a day later that it is still in the same place. It is this rigorous coherence and consistency between the various ways we experience the glass that gives it its sense of objectivity, that is, reality.

It makes sense to attempt to define the objectivity, or reality, of a physical object or process by the coherence and consistency of all our sensory experiences,109 including the exchange of impressions with other people. But what about things we consider real, but which cannot be experienced directly through our senses, such as viruses and bacteria, or stars invisible to the naked eye? Those can be brought within the same definition of reality with the help of instruments, such as a microscope or telescope, which vastly enhance our senses. But it is difficult to extend this definition to even smaller things like atoms, electrons, or quarks, for which microscopes are powerless, or massive things like black holes, which are intrinsically not directly observable.110

To account for their reality we need to extend our notion of objectivity by making a huge stretch. We consider them objective not because they are directly coherent and consistent with our senses, which they are not, but because they have a measurable effect, through an observation or experiment, within the framework of an accepted physical theory. These measurable effects may be quite remote from our senses; they need only be consistent, through logical inference, with all other known facts of the theory. But this is not enough. An acceptable physical theory must not only be consistent with all the measurable effects alluded to above, as well as all previously accepted physical facts, but also with itself—that is, with its entire logical framework. This is a mighty task and one that only mathematics is able to accomplish.111 Mathematics indeed provides an unambiguous and highly efficient language to tie together our various physical experiences into coherent physical laws and use them to make precise measurable predictions which can then be confronted by experiments.112

In his lecture, Wigner offers two examples of the fundamental role played by mathematics in the formulation of physical law—planetary motion and quantum mechanics. Special and general relativity provide equally striking examples. Geometry, that is, Euclidean geometry, is itself the first known example of a physical theory. But this is not all. Not only are physical theories formulated in the language of mathematics; even more remarkably, new physical theories are almost always first designed in the laboratory of mathematics to explain facts unaccounted for by old theories and to make unexpected predictions which can then be tested experimentally.113 String theorists even argue, or at least have argued in the past, that the mathematical difficulties involved in reconciling quantum mechanics with general relativity are so formidable that only a unique theory—that elusive theory of everything—would be able to accomplish this feat.114 The obvious and paradoxical corollary of such a statement is that the mathematical design of the theory may suffice without immediate regard for experimental verification. A few generations earlier, Einstein offered the equally striking observation that new physics would have to wait for revolutionary new progress in mathematics.115

Mathematical Reality

What about mathematics? Plato argues that mathematical objects, such as the circle, are not only themselves real, but that they are in fact more real than those we experience by our senses. For anyone other than a mathematician or a physicist this may seem hard to swallow. Is not the circle just an unsubstantiated idealization of the real circles of our natural world? Plato argues, however, that it is the ideal circle that has true reality and that those we deem real are, in fact, but its imperfect embodiments. Leaving aside this claim of a more perfect reality, are mathematical objects real in the same way as our glass of wine? If one insists that reality has to be defined as coherence and consistency of sensory experiences, the answer is no.116 But this definition is much too restrictive for a meaningful understanding of the physical world. If one accepts, however, the broader point of view that the reality of an object is defined by the consistency of our experiences with it, whether physical or mental, then mathematical objects, such as the circle, have a powerful claim to reality. Remarkably, one can spend years studying various properties of the circle, together with other geometrical concepts such as points, lines, triangles, ellipses, and parabolas, and never arrive at a contradiction. Or one can try, in ignorance, to prove a false statement about these or other more abstract mathematical objects such as groups, manifolds, differential equations, and so on—only to realize that the resistance encountered is harder than that of any rock. Or consider the extraordinary sense of satisfaction experienced when two completely different calculations arrive at the same result. Though people usually disagree on almost any issue not directly verifiable by their senses, a theorem like that of Pythagoras, proved more than two thousand years ago, is still recognized as valid today by anybody who cares to go through its proof.
For those of us who have dedicated enough time to the pursuit of mathematics there is no doubt that mathematics deals with real, self-consistent objects that are imperceptible to the senses, but comprehensible to the mind.

An argument has been made above that an object is physically real only if it leads to observable and measurable effects consistent with all the other facts of an acceptable physical theory. This is, of course, a contingent definition; physical theories may change as new observable facts are brought to light. Mathematical reality, on the other hand, has only to be consistent within itself—that is, within the realm of its own definitions, concepts, and theorems. Now, and this may bring us closer to the essence of Wigner’s mystery, the acceptable physical theory, needed as a crucial ingredient to describe physical reality, is itself a mathematical object—that is, an object which has mathematical coherence, hence reality, independent of its relevance to physics.117 One can study classical geometry, celestial mechanics, and quantum mechanics or relativity as pure mathematical theories, without, if one so wishes,118 any regard for their applications to physics.119 Moreover, while physical reality is naturally constrained by our intuitive representations of space, time, and causality, the mathematical world is free of any such considerations. Not only are mathematical objects causally unrelated; the very concept of spacetime is itself a mathematical object, or rather objects. Mathematics is able to unambiguously define and study not one, but various versions of spacetime, of which only one can claim physical reality. It is this freedom, removed from the innate intuition of spacetime we acquire through the sensory world, that made possible the revolutionary reinterpretation of spacetime in special and general relativity. While this intuition led Newton and Immanuel Kant to postulate an absolute notion of space and time, independent of the physical objects they contain, the new relativistic understanding makes spacetime another physical object in active interaction with all other physical objects within it.
This radical change of view would have been inconceivable without the mathematician’s ability to freely play with concepts and theories as objects of the mathematical world. Quantum mechanics offers an even more radical departure from sensory-based physical intuition. The duality between waves and particles, the uncertainty principle, and entanglement are incomprehensible outside the mathematical framework of the theory. Thus, in the words of Werner Heisenberg, “the smallest units of matter are not physical objects in the ordinary sense; they are forms, ideas.”120

Conclusions for this Section

Advances in theoretical physics are easier to fathom if we give up on any transcendental notion of reality,121 such as that of an eternal material world, independent of our perceptions of it, according to which the human mind is but one of its manifestations. This point of view is ultimately circular and, like the ether in pre-relativistic physics, more cumbersome than helpful. If, instead, reality is to be defined by the consistency and coherence of experiences, physical or mental, then mathematical objects have an equal or even better claim to it. While our senses can be illusory, logic applied to well-defined mathematical objects is infallible. Moreover, the physical objects of modern physics, such as electrons, quarks, strings, or black holes are themselves mathematical objects, impossible to fathom outside their natural mathematical framework.122 Though it is prudent to keep insisting on a fundamental distinction between physical and mathematical reality, it is hard not to notice that the more advanced a physical theory is, the more elusive this distinction becomes. And here, maybe, lies the crux of the mystery, in a more outrageous form: are the two so distinct after all?

In his remarkable book The Road to Reality, Roger Penrose gives an interesting illustration of the mysterious and paradoxical relations between the three worlds:123 the Platonic realm of mathematical forms, the physical, and the mental. A first mystery, the one pointed out by Wigner, is, in Penrose’s account, the fact that the physical world is entirely “illuminated” by a portion of the mathematical one.124 The second mystery is that the mental world is itself entirely “explained,” or determined, by a portion of the physical one, while the third holds that the mathematical world is entirely accessible to a limited portion of the mental one.

By anchoring the definition of reality to objectivity, meaning coherence and consistency of representations both sensory and mental,125 I find myself strongly in favor of the notion that mathematics is a science,126 in that it deals with mathematical discoveries rather than inventions,127 or creations of the human mind,128 by following its own version of the scientific method. Is there, however, a role for human creativity in mathematics? Of course there is. As is the case in any other science, faith in a certain outcome, the determination and persistence to pursue it, as well as the ability to change course when facts prove one wrong, are also part and parcel of mathematical research. But there is more, something unique to mathematics. Poincaré described it as “the feeling of mathematical beauty, of the harmony of numbers and forms, of geometric elegance.”129 It is, he added, “a genuinely aesthetic feeling which all mathematicians know.” According to Poincaré, it is this aesthetic sensibility that guides the mathematician to make inspired choices when faced with myriads of possible avenues in solving a problem.130 Similar aesthetic considerations are also at play when one chooses which problems to work on in the first place. Mathematics is thus both science and art; truth and beauty joined together in the most successful and inspiring saga of the human spirit.131

Closing Remarks

Modern physics leads to a conception of reality in which objectivity, measured by the consistency of our representations of it, is the ultimate arbiter. In that sense, mathematical objects are no less real than physical ones, although we still make an essential distinction. Physics starts with the raw notion of reality based on our direct experience of it, through our senses, and proceeds to extend its domain by incorporating any observable and measurable effects consistent with an accepted mathematical framework. At times, when new observations or experiments are found to be inconsistent with one or some of the laws, it reformulates them by adopting a new mathematical framework. Incompatibilities between theories used to describe different domains of physical reality,132 such as that between quantum mechanics and general relativity, are also powerful drivers in the pursuit of new mathematical theories in which the incompatibilities may be resolved. Mathematics, on the other hand, is only constrained by logical consistency.133 Its various branches are never incompatible with each other.134 This gives mathematics an enormous amount of freedom to explore and develop in many possible directions.

Upon closer inspection, however, mathematics does not deal with random abstract concepts, but has in fact begun its development from the most primitive notions of numbers and shapes. Starting with numbers and the practical need to manipulate them, mathematicians were able to extract the simple ACD laws of addition and multiplication. As I have argued in this essay, algebra begins with a progressive awareness of these laws and the extraordinary convenience of expressing them using simple abstract symbols. A related development occurred in geometry.135 Though initially very different disciplines, algebra and geometry were brought together when Descartes and others realized that all the elementary shapes of geometry can be described using algebraic equations. It was this momentous discovery that made calculus, with its unlimited number of applications, possible. Once the notion of derivatives was introduced as a formal expression for the tangent to a curve, mathematics followed a pattern of discovery similar to that of algebra, leading to the second formal triad mentioned in this essay: differential calculus, differential operators, and differential equations.136 The principal focus of both triads is on equations; algebraic in the first case, differential in the second. I am not too far from the truth, I think, in saying that solving equations, algebraic, differential, and otherwise,137 is the primary business of mathematics.138 Solving equations is crucial to all applications of mathematics; essentially all word problems occurring in engineering, the physical sciences, statistics, biology, and economics can be translated into equations. And, of course, in classical physics,139 the basic laws are nothing but differential equations.140 It thus seems appropriate to update Pythagoras’s simple organizing belief that “All is number,” or Galileo’s “All is Geometry,”141 to the post-Newtonian “All is Equation.”

Acknowledgement: I am grateful to David Berlinski for his patient reading of a previous version of the text, constructive criticism, and numerous suggestions. I am also grateful to the editors of Inference for their assistance in preparing the essay.


  1. Eugene Wigner, “The Unreasonable Effectiveness of Mathematics in the Natural Sciences. Richard Courant Lecture in Mathematical Sciences Delivered at New York University, May 11, 1959,” Communications on Pure and Applied Mathematics 13, no. 1 (1960), doi:10.1002/cpa.3160130102. 
  2. He could have given many more examples. In his own area of expertise, for example, he could have equally well marveled at how the notion of group, introduced by Évariste Galois to study the issues of solvability of algebraic equations, turns out to play such an important role in particle physics. Or, in an even more striking example, how the introduction of the notion of curvature of a surface by Carl Friedrich Gauss, extended later by Bernhard Riemann to the abstract concept of n-dimensional manifolds—in an act of pure mathematical curiosity at the time—turned out to be exactly the concept Albert Einstein needed to describe gravitational forces in his theory of general relativity. Moreover, this act of unparalleled magic did not occur before Hermann Minkowski was able to rephrase the theory of special relativity in a purely geometric language. 
  3. Wigner, “The Unreasonable Effectiveness of Mathematics,” 1. 
  4. Wigner writes:
    Let us not forget that the Hilbert space of quantum mechanics is the complex Hilbert space, with a Hermitean scalar product. Surely to the unpreoccupied mind, complex numbers are far from natural or simple and they cannot be suggested by physical observations. Furthermore, the use of complex numbers is in this case not a calculational trick of applied mathematics but comes close to being a necessity in the formulation of the laws of quantum mechanics.
    Wigner, “The Unreasonable Effectiveness of Mathematics,” 7. 
  5. He also asserts that “the great mathematician fully, almost ruthlessly, exploits the domain of permissible reasoning and skirts the impermissible.” 
  6. Consider the following famous quote by Albert Einstein: “To the extent that the laws of mathematics are certain, they do not refer to reality; and to the extent that they refer to reality, they are not certain.” The quote appears in Albert Einstein, “Geometry and Experience,” Address to the Prussian Academy of Sciences in Berlin on January 27th, 1921. David Berlinski has remarked: “I wonder whether [Einstein] appreciated the devastating consequences of his own argument?” Berlinski goes on to observe that
    [i]f the numbers are creations of the human mind, then it follows that without human minds, there are no numbers. In that case, what of the assertion that there is a natural number between three and five? It is true now; but at some time before the appearance of human beings on the earth, it must have been false. The proposition that there exists a natural number between three and five cannot be both true and false, and so it must be essentially indexical, its truth value changing over time. Napoleon being alive is accordingly true during his life and false before and afterwards. But if the proposition that there exists a natural number between three and five is false at some time in the past, the laws of physics must have been false as well, since the laws of physics appeal directly to the properties of the natural numbers.
    David Berlinski, “Mathematics and Its Applications,” in Mathematics, Substance and Surmise: Views on the Meaning and Ontology of Mathematics, ed. Ernest Davis and Philip Davis (Cham, Switzerland: Springer, 2015), 101–31. 
  7. Wigner also states that: “The principal emphasis is on the invention of concepts.” I attribute this phrasing to careless metaphorical language—he certainly knew that mathematicians don’t make up rules like in the game of chess. 
  8. This statement presupposes that there are parts of mathematics that have developed in a different fashion. Mathematical logic might be one example. 
  9. Donald Albers, Gerald Alexanderson, and Constance Reid, International Mathematical Congresses: An Illustrated History 1893–1986 (New York: Springer-Verlag, 1987), 5. The original appears in Henri Poincaré, “Sur les rapports de l’analyse pure et de la physique mathématique: Conférence de M. H. Poincaré au congrès international des mathématiciens, à Zürich, en 1897,” Acta Mathematica 21 (1897): 337, doi:10.1007/BF02417984. 
  10. Wigner, “The Unreasonable Effectiveness of Mathematics,” 2. 
  11. For historical comments about algebra I refer to John Derbyshire, Unknown Quantity: A Real and Imaginary History of Algebra (Washington DC: National Academies Press, 2006). 
  12. Such is the notion of an abstract group introduced by Galois to understand the quintic equation. Or the correct definitions of limits, continuity and differentiability of functions made by Cauchy after almost two centuries of arguments made by intuition, without rigor, and often faulty. 
  13. Thus, for example, the number zero and negative numbers were introduced rather late in the development of algebra, long after mathematicians were already comfortable with symbolic manipulation. Babylonian mathematicians, on the other hand, were apparently able to solve some quadratic equations before having a proper formalism in place. 
  14. By producing clever ways to manipulate numbers, calculate distances between far away objects using trigonometry and later in myriad numerical calculations in engineering, weather prediction, economics, and so on. The algorithmic nature of mathematical calculations is the real secret as to why computers have been so enormously successful. 
  15. As David Berlinski reminded me, these, of course, are not so basic after all as they can be deduced from the far more basic Dedekind–Peano axioms, which do not make any reference to these operations. For an excellent account of how this is done starting with just human need and the ability to count see David Berlinski, One, Two, Three: Absolutely Elementary Mathematics (New York: Vintage Books, 2012). 
  16. Finding optimal ways to represent and perform large calculations has a long and fascinating history in which human ingenuity played a major role. This is ultimately a matter of good notation, such as the use of zero, first introduced by Indian mathematicians and later expanded by Arab mathematicians through positional notation, that is, expansion in powers of 10. For a more detailed account, see Berlinski, One, Two, Three. 
  17. These are problems which can be expressed, in modern language, as systems of linear equations with integer coefficients. 
  18. Zero as algebraic unit for the addition—different from the zero as positional symbol introduced by Indian mathematicians. 
  19. The correct, modern, definition of fractions, which identifies the fraction $p/q$ with the equivalence class of all pairs of integers $(p', q')$ such that $p\times q' = p'\times q$, illustrates the importance of good definitions, as discussed in Berlinski, One, Two, Three. 
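The equivalence-class definition in the note above is easy to make concrete. Below is a minimal Python sketch; the names `equivalent` and `canonical` are illustrative, not from the text.

```python
from math import gcd

def equivalent(p, q, p2, q2):
    # (p, q) and (p2, q2) represent the same fraction exactly when
    # cross-multiplication agrees: p * q2 == p2 * q.
    return p * q2 == p2 * q

def canonical(p, q):
    # A canonical representative of the equivalence class:
    # lowest terms, positive denominator.
    g = gcd(p, q)
    p, q = p // g, q // g
    return (-p, -q) if q < 0 else (p, q)
```

Every pair in a class reduces to the same representative, which is one way of seeing that the definition is consistent.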
  20. The extension implicitly assumes the existence of additive and multiplicative inverses, that is, $a+(-a)=0$ and $a\times a^{-1}=1$. It is also notable that this procedure of extension, starting with the natural numbers, is unique. 
  21. The introduction of abstract symbols in calculations followed a very long and painful historical process. Though the Babylonians seem to have been close to developing it, the first known mathematician who used a letter to express an unknown quantity was Diophantus, in the second or third century CE. He is considered by many to be the father of algebra. Finally, it was François Viète, towards the end of the sixteenth century, who introduced letter symbols for both the coefficients and unknowns of equations. 
  22. I still remember the excitement I felt seeing these formulas for the first time as a kid. It was like an entire magical new world was open to me. 
  23. Verification should not come as a surprise given the way these operations are defined. We can also define units for addition and multiplication of polynomials. Note, however, that the quotient of two polynomials is not in general a polynomial; the set of polynomials endowed with the operations of addition and multiplication forms what is called a commutative ring. 
  24. For the moment we can take them to be rationals, but any other type of numbers, such as reals or complex, will do. 
  25. The actual concept of a function was developed much later by René Descartes, Gottfried Wilhelm Leibniz, Leonhard Euler, et al., as an essential concept of calculus. 
  26. For $n=1$ we expect at most a finite number of solutions, while for $n\ge 2$ we can have infinitely many solutions, or none at all, as in the case of the equation $x_1^2+x_2^2=-1$. 
  27. The number of solutions depends also on the degree of the polynomials involved. 
  28. The other cases, $m < n$ (underdetermined) and $m > n$ (overdetermined), are also studied. 
  29. That is to say, such equations can rarely have rational solutions. Number theory, at least as it is traditionally understood, is the branch of mathematics dedicated to finding integer solutions of polynomial equations, i.e., solving Diophantine equations. 
  30. This includes the extraordinary ability to provide efficient algorithms to solve highly complex and elementary word problems. 
  31. See the section “Mathematical Reality” for a discussion of what I mean by this term. 
  32. Once the power of symbolic manipulation and its relevance in solving complex word problems, by interpreting them as algebraic equations, was finally understood, all other abstract developments of mathematics occurred with extraordinary speed—almost instantaneously by comparison. 
  33. This, however, could not have been done before mathematicians were able to extend the notion of numbers beyond rationals. 
  34. The problem arose when trying to calculate the diagonal of a square of side 1. 
  35. The same process can be repeated with other equations, such as $x^2-3=0$ or $x^3=2$. 
  36. In the case of $\sqrt{2}$, his procedure amounts to finding more and more precise approximations between two infinite sequences of rationals. 
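The squeezing of $\sqrt{2}$ between ever better rational approximations can be sketched with the Babylonian (Newton) iteration, kept in exact rational arithmetic. This is a minimal Python illustration of the idea, not the historical procedure itself.

```python
from fractions import Fraction

def sqrt2_approximations(n):
    # Babylonian iteration x -> (x + 2/x) / 2, carried out in exact
    # rational arithmetic so that every iterate is itself rational.
    x = Fraction(3, 2)
    iterates = [x]
    for _ in range(n):
        x = (x + 2 / x) / 2
        iterates.append(x)
    return iterates

# x**2 - 2 shrinks rapidly: 1/4, 1/144, 1/166464, ...
errors = [abs(float(x * x - 2)) for x in sqrt2_approximations(4)]
```

The iterates 3/2, 17/12, 577/408, … are rationals whose squares close in on 2, which is exactly the sense in which a real number can be pinned down by sequences of rationals.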
  37. Both definitions meet the requirement of making calculus rigorous. In the seventeenth century, Descartes’s understanding of real numbers was restricted to the roots of polynomials. Such a definition contains, according to modern terminology, only the algebraic real numbers. These, as Georg Cantor later proved, are only countable, while the reals, as we understand them today, are uncountable. To prove this famous result Cantor needed the same precise understanding of real numbers as used in real analysis texts today. 
  38. A functor in the language of category theory. 
  39. In particular, an equation of the form $ x^k=a$, where $ a, k$ are positive rationals, admits a unique, real solution. 
  40. For that we need an extension corresponding to the equation $x^4 =2$. 
  41. He produced, in fact, five different and beautiful proofs. 
  42. Thus, once more a symbolic notation turned out to unveil the extraordinary reality of complex numbers. Note also the remarkable fact that complex numbers appear naturally in the general Cardano formulas for solving cubic equations, in situations when the final solutions are in fact real! 
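The phenomenon in the note above, complex numbers arising in Cardano’s formula even when every root is real, is visible in Bombelli’s classic example $x^3 - 15x - 4 = 0$. The following Python sketch traces the computation; the variable names are mine.

```python
import cmath

# Depressed cubic x**3 + p*x + q = 0: Bombelli's example x**3 - 15*x - 4 = 0,
# whose three roots, 4 and -2 +/- sqrt(3), are all real.
p, q = -15.0, -4.0

# Cardano: x = u - p/(3*u), where u**3 = -q/2 + sqrt(q**2/4 + p**3/27).
disc = q * q / 4 + p ** 3 / 27     # = 4 - 125 = -121: negative!
u = (-q / 2 + cmath.sqrt(disc)) ** (1 / 3)   # cube root of 2 + 11j, namely 2 + 1j
x = u - p / (3 * u)
# the imaginary parts cancel, and the formula lands on the real root 4
```

The intermediate quantity $u$ is genuinely complex, yet the final answer is real, which is precisely why sixteenth-century algebraists could not simply dismiss these "fictitious" numbers.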
  43. It is interesting to note that their mathematical reality was rejected by many, not because they would lead to a contradiction, they did not, but because, until the discovery of the complex plane, they could not be given an interpretation in terms of familiar mathematical, and geometrical objects. In a similar manner, Newton refused to use calculus in his Philosophiæ Naturalis Principia Mathematica, relying instead on classical geometric constructions and arguments. 
  44. That associates a unique real number to any Cauchy sequence of rationals. 
  45. $\mathbb{C}$ is in fact a commutative division ring. There are, in addition to $\mathbb{R}$ and $\mathbb{C}$, two more division rings: the quaternions $\mathbb{H}$ and the octonions $\mathbb{O}$. The former is not commutative and the latter also fails to be associative. Some intrepid mathematicians have tried to relate these four to the four fundamental forces known in physics. 
  46. Or in $\mathbb{C}^n$ where the system has many more solutions and is easier to analyze. Modern algebraic geometry is almost always studied in $\mathbb{C}^n$. 
  47. An excellent account can be found in Alain Connes’s expository article, “A View of Mathematics,” Inspire HEP, 2004. 
  48. Here is what Newton wrote later in a letter: “I had the hint of this method [of fluxions] from Fermat’s way of drawing tangents, and by applying it to abstract equations, directly and invertedly, I made it general.” Quoted in Louis Trenchard More, Isaac Newton: A Biography (New York: Dover Publications, 1962). Another important early developer of the notion of derivatives was Isaac Barrow, the immediate precursor of Newton. 
  49. The concept of a function had appeared already in Descartes’s work in that an equation in two variables corresponding to a plane curve, i.e., $F(x, y)=0$, indicates such a dependence, i.e., we can write $y=f(x)$ or $x=g(y)$. The term function was first used by Leibniz in 1673, and the concept was continually expanded, notably in his correspondence with Johann Bernoulli, until it reached its general modern form in the work of Richard Dedekind, “Was sind und was sollen die Zahlen?” (1888). The notation $y=f(x)$ is due to Euler. The modern notation $f:I\longrightarrow\mathbb{R}$ specifies both the domain $I\subset \mathbb{R}$ and the range $\mathbb{R}$ of the function. 
  50. The first notation, with $f'$ or $\dot{f}$, is that of Isaac Newton; the second, $\frac{d}{dt} f$, is due to Leibniz. This is another example where good notation can make a big difference. It is said that mathematics stagnated in Great Britain for close to two centuries after Newton because people insisted on sticking to the notation used by the master. One obvious reason for the superiority of Leibniz’s notation is that it naturally suggests the crucial fact that $\frac{d}{dt}$ is itself an operator, independent of the function to which it is applied. This was another major conceptual revelation. 
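The observation that $\frac{d}{dt}$ is an operator, independent of the function it acts on, has a direct analogue in code, where differentiation becomes a higher-order function. A numerical sketch; the names are illustrative.

```python
def D(f, h=1e-6):
    # d/dt as an operator: it consumes a function and returns a new
    # function, here approximated by a symmetric difference quotient.
    return lambda t: (f(t + h) - f(t - h)) / (2 * h)

square = lambda t: t * t
velocity = D(square)         # approximately t -> 2*t
acceleration = D(D(square))  # the operator applied twice: roughly the constant 2
```

Nothing about `D` refers to `square`; it can be applied to any function, and composed with itself, exactly as Leibniz's notation suggests.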
  51. The only functions considered at the time of Descartes, called elementary functions today, were those obtainable by combining polynomials and the basic trigonometric functions by the simple operations of addition, multiplication, and composition. 
  52. The equation (4) can also be interpreted as the simplest differential equation. 
  53. Or rather solutions. Different solutions differ by constants. 
  54. The debate about who was first seems unimportant. Given the immense conceptual and formal advances made beforehand, some by Newton and Leibniz themselves, it was just a matter of time before another mathematician discovered the compelling connection between areas and the inverse derivative. 
  55. This is another extraordinary example of how a problem—areas of geometric figures—in one area of mathematics finds its correct solution in the development of another branch of mathematics—the new calculus. 
  56. I beg to differ somewhat on this. In my view, the most important early discovery of calculus was that of the magnificent universe of differential equations with their enormous range of applications. The fundamental theorem of calculus happens to be just one such application. 
  57. In works by Émile Borel, Henri Lebesgue, Johann Radon, René Maurice Fréchet, et al. The mathematical framework developed by these mathematicians was later applied to probability theory by Andrey Kolmogorov. 
  58. When $P$ is linear, the corresponding equation $P[u]=0$ is called a linear differential equation. 
  59. Note that Newton’s laws restrict the operators under consideration to second order, that is, not more than two derivatives are allowed. 
  60. Such as particles moving in the gravitational field of the Earth. 
  61. The general case can also be expressed in the form (6) by introducing many more variables. 
  62. More correctly, the problem reduces to finding the Jordan canonical form of $A$. 
  63. More generally, we can have $ \frac{d^2}{dt^2} u+V(u)=0$ with $V$ a given function of $u$. This is a nonlinear equation. 
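An equation of the form $\frac{d^2}{dt^2}u + V(u) = 0$ is routinely solved numerically after rewriting it as a first-order system, the reduction mentioned in note 61. A sketch for the linear case $V(u) = u$, using a classical Runge–Kutta step; the integrator choice is mine, not the text’s.

```python
import math

def rk4_step(f, y, t, h):
    # one classical fourth-order Runge-Kutta step for the system y' = f(t, y)
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

# d^2u/dt^2 + u = 0 rewritten as the first-order system du/dt = v, dv/dt = -u
oscillator = lambda t, y: [y[1], -y[0]]

y, t, h = [1.0, 0.0], 0.0, 0.01   # u(0) = 1, u'(0) = 0; exact solution u = cos t
for _ in range(100):              # integrate up to t = 1
    y = rk4_step(oscillator, y, t, h)
    t += h
```

After 100 steps the numerical state `y` agrees with the exact pair $(\cos t, -\sin t)$ at $t = 1$ to high accuracy.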
  64. There are, however, a few, exceptional and completely integrable cases—i.e., when the solution can be expressed by specific elementary functions—such as the Kepler two body problem solved by Newton in his Philosophiæ Naturalis Principia Mathematica
  65. The result is sharp, as can be easily seen in the case of the $1D$ equation $\frac{d}{dt} u= u^2$. 
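The blow-up in the note can be made explicit: for $\frac{d}{dt}u = u^2$ with $u(0) = u_0 > 0$, the solution is $u(t) = u_0/(1 - u_0 t)$, which escapes to infinity as $t \to 1/u_0$. A quick numerical check; the sample points are arbitrary.

```python
u0 = 2.0
u = lambda t: u0 / (1 - u0 * t)   # explicit solution of du/dt = u**2, u(0) = u0

# verify the equation at a sample point with a symmetric difference quotient
h, t = 1e-7, 0.25
lhs = (u(t + h) - u(t - h)) / (2 * h)   # approximately du/dt
rhs = u(t) ** 2
# the solution escapes to infinity as t approaches 1/u0 = 0.5
near_blowup = u(0.4999)                 # already of order 10**4
```

Local existence up to time $1/u_0$ but no further is exactly the sharpness claimed in the note.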
  66. That is, $x=f(t)$ defines a smooth curve near $(t_0, f(t_0))$ in the Cartesian plane $(t, x)$. 
  67. Assuming also that we can define the partial derivatives of $u, v$ when we keep one of the variables $x, y$ fixed. 
  68. This ought to place complex analysis as a branch of PDE, really. 
  69. This is not at all the case for solutions of the Laplace equation, and no other linear system of PDEs has such properties. 
  70. The result can be generalized into one of the most beautiful, deep, and consequential results in geometry, the so-called uniformization theorem for compact two-dimensional Riemannian manifolds. 
  71. A simple topological condition on the domain which, roughly, means that the domain has no holes. 
  72. A theory that was developed naturally in the context of continuum mechanics and electrostatics. 
  73. Unlike in the case of harmonic functions, for which the Dirichlet principle holds true, one cannot prescribe the values of a holomorphic function on a closed contour. Note, however, that any harmonic function $u$ on a simply connected domain can be realized as the real part of a holomorphic function $u+iv$. 
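The relation between harmonic and holomorphic functions is easy to probe numerically for $u = x^2 - y^2$ with conjugate $v = 2xy$, so that $u + iv = (x + iy)^2$. A sketch with difference quotients; the sample point is arbitrary.

```python
u = lambda x, y: x * x - y * y   # harmonic: u_xx + u_yy = 0
v = lambda x, y: 2 * x * y       # its conjugate: u + i*v = (x + i*y)**2

h = 1e-5
dx = lambda f, x, y: (f(x + h, y) - f(x - h, y)) / (2 * h)
dy = lambda f, x, y: (f(x, y + h) - f(x, y - h)) / (2 * h)

x0, y0 = 1.3, -0.7
cr1 = dx(u, x0, y0) - dy(v, x0, y0)   # Cauchy-Riemann: u_x = v_y
cr2 = dy(u, x0, y0) + dx(v, x0, y0)   # Cauchy-Riemann: u_y = -v_x
lap = (u(x0 + h, y0) + u(x0 - h, y0) + u(x0, y0 + h) + u(x0, y0 - h)
       - 4 * u(x0, y0)) / h ** 2      # five-point Laplacian of u
# all three quantities vanish up to rounding error
```

Differentiating the Cauchy-Riemann equations once more is exactly what shows that both $u$ and $v$ must be harmonic.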
  74. In an 1826 letter to his teacher, Niels Henrik Abel wrote from Paris: “He [Cauchy] is moreover the only one today working in pure mathematics. Poisson, Fourier, Ampère, etc. etc. occupy themselves with nothing other than magnetism and other physical matters.” Quoted in Umberto Bottazzini, The Higher Calculus: A History of Real and Complex Analysis from Euler to Weierstrass (New York: Springer-Verlag, 1986), 85–86. 
  75. Note that partial derivatives commute, i.e., $\nabla_i\nabla_j u=\nabla_j \nabla_i u$. 
  76. The Cauchy-Kowalewski theorem, however, holds true only in the restrictive case of equations with analytic coefficients and analytic initial data. Even linear equations cannot in general be solved. There are, in fact, many explicit examples of non-solvable equations, away from the analytic context. 
  77. For an attempt to give a compact, coherent, view of the subject, see Sergiu Klainerman, “PDE as a Unified Subject,” in Visions in mathematics, Noga Alon et al., eds. (Basel: Birkhäuser, 2010), 279–315. 
  78. See, in particular, the first two volumes of Lars Hörmander, The Analysis of Linear Partial Differential Operators (Berlin: Springer-Verlag, 2003). 
  79. The fact that the same basic equations turn out to be implicated in describing many different physical phenomena, even when these phenomena seem entirely unrelated from firsthand experience, is a striking example of the universality of mathematical concepts. The wave equation, for example, was first introduced by Jean le Rond d’Alembert to describe the motion of a vibrating string and was later found to be connected to the propagation of sound as well as electromagnetic and gravitational waves. The heat equation, first introduced by Fourier to describe heat propagation, appears as well in many other situations in which dissipative effects play an important role, such as the description of random walks in probability. The same can be said of the Laplace, Schrödinger, and other basic equations. 
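The connection in the note between random walks and the heat equation can be glimpsed in a single number: the return probability of a simple $\pm 1$ walk approaches the value predicted by the Gaussian heat kernel. A small Python check; the step count is arbitrary.

```python
import math

n = 1000                                 # an even number of fair +/-1 steps
exact = math.comb(n, n // 2) / 2 ** n    # probability the walk is back at 0
gaussian = math.sqrt(2 / (math.pi * n))  # heat-kernel (Gaussian) prediction
ratio = exact / gaussian                 # approaches 1 as n grows
```

The agreement to within a fraction of a percent at $n = 1000$ is the discrete shadow of the fact that the walk's distribution solves a discretized heat equation.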
  80. It is interesting to note that even in cases when we have an explicit solution, it is often easier to describe its properties by these indirect techniques than from the explicit solution itself. 
  81. See discussions in Klainerman, “PDE as a Unified Subject.” 
  82. Good equations ought to have special symmetries. The variational principle, also called the principle of least action, is a scheme for generating equations with prescribed symmetries. It allows us to associate to any Lagrangian $L$ a system of partial differential equations, called the Euler–Lagrange equations, which inherit the symmetries built into $L$. In view of Noether’s principle, to any continuous symmetry of the Lagrangian there corresponds a conservation law for the associated Euler–Lagrange PDE. Thus, the variational principle generates equations with desired conservation laws, such as energy and linear and angular momenta. Modern physical theories start by defining an appropriate Lagrangian. The variational principle has an uncanny ability to generate many of our most important equations. 
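    To sketch the scheme in the simplest setting, for a scalar field $u$ with Lagrangian density $L(u, \partial u)$ the Euler–Lagrange equations take the form

\[\frac{\partial L}{\partial u} - \sum_{\mu} \partial_\mu \left( \frac{\partial L}{\partial (\partial_\mu u)} \right) = 0.\]

    Taking, for instance, $L = \tfrac{1}{2}(\partial_t u)^2 - \tfrac{1}{2}|\nabla u|^2$ yields the wave equation $\partial_t^2 u - \Delta u = 0$, and the time-translation symmetry of $L$ yields, via Noether’s principle, the conserved energy $E = \int \tfrac{1}{2}(\partial_t u)^2 + \tfrac{1}{2}|\nabla u|^2 \, dx$.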
  83. A famous conjecture made by Poincaré about the topological characterization of the $3$-sphere, whose statement has nothing to do with PDEs! 
  84. Something that could be described as some form of “mathematical entanglement.” 
  85. This is a geometric flow of parabolic type. The simplest example of such a flow is provided by the heat equation. 
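    Schematically, the analogy can be displayed side by side:

\[\partial_t u = \Delta u, \qquad \partial_t g_{ij} = -2 R_{ij}.\]

    The heat equation evolves a function $u$, while the Ricci flow evolves the metric $g_{ij}$ itself; in suitable coordinates the Ricci curvature $R_{ij}$ behaves to leading order like a Laplacian of the metric, which is what gives the flow its parabolic character.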
  86. Allyn Jackson, “Interview with Shiing Shen Chern,” Notices of the AMS 45, no. 7 (1998), 861. 
  87. Or any other surface embedded in the Euclidean space $\mathbb{R}^3$. 
  88. The modern definition, due to Hassler Whitney, appeared almost a century later. It is a marvelous example of an inspired definition, not indispensable to make tensorial calculus work, but extremely convenient. 
  89. These were surfaces embedded in the 3D Euclidean space. Gauss defined a notion of curvature of the surface, which made use of the normal to the surface in the Euclidean space. To his own great surprise, he was then able to show that the curvature depended only on the geometry intrinsic to the surface, through the metric induced on it by the Euclidean metric. Gauss, who was known to be rather terse in his writings, named the result the Theorema Egregium, the “remarkable theorem.” 
  90. Tensorial calculus was originally developed by Gregorio Ricci-Curbastro and Tullio Levi-Civita precisely in order to understand the nature of Riemann’s magical quantity. 
  91. At every point on a manifold the metric is, in fact, a symmetric matrix that is positive definite in the Riemannian case and has the signature $(-, +, +, +)$ in the Lorentzian case. 
  92. We are allowed to wonder why it is that among all possible theories nature chose to express itself through this most symmetric and beautiful mathematical theory. 
  93. Even though the great mathematicians of the time were often driven in their research by specific applications to geometry and the new science of dynamics. A great exception is complex analysis which, as mentioned earlier, was developed by Cauchy without any direct relation to problems originating in physics. 
  94. Wigner, “The Unreasonable Effectiveness of Mathematics,” 3. 
  95. One has to note in this regard that Greek philosophers, such as Zeno, had serious problems with this. Unlike the notions of continuity or smoothness, the precise concepts of limits and differentiability are far less intuitive, and as such they require precise definitions. 
  96. Analysis is the branch of mathematics that has developed to make sense of these and myriad other questions regarding functions. 
  97. Nothing better illustrates this quest for completeness than the notion of a distribution through which mathematicians perform the miracle of differentiating functions that are in no way differentiable. 
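    A minimal illustration: the Heaviside step function $H$, equal to $0$ for $x<0$ and $1$ for $x \geq 0$, is differentiated by moving the derivative onto a smooth, compactly supported test function $\varphi$:

\[\langle H', \varphi \rangle = -\langle H, \varphi' \rangle = -\int_0^\infty \varphi'(x)\, dx = \varphi(0) = \langle \delta, \varphi \rangle.\]

    In the sense of distributions, the derivative of the step function is thus the Dirac delta.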
  98. The process was spurred by the need to solve ordinary and partial differential equations. Yet as often happens in mathematics, it was, with time, delegated and compartmentalized into various subfields, often with no particular regard for the original motivation. 
  99. Before Abel and Galois, up to the beginning of the nineteenth century, people had the naive notion that all polynomial equations in one variable could be solved by formulas involving radicals, similar to those for quadratic, cubic, or quartic equations. Abel and Galois were able to show, independently, that polynomials of degree five and higher cannot be solved in terms of radicals. Galois’s method, in particular, has led to the development of group theory, which, in the spirit of that mysterious mathematical entanglement described by Wigner, has found applications in many other areas of mathematics and physics. 
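    For the quadratic $ax^2 + bx + c = 0$, the radical formula in question is

\[x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},\]

    and analogous, if far more involved, formulas exist in degrees three and four. The Abel–Galois result asserts that no such formula, built from the coefficients by arithmetic operations and radicals alone, can exist for the general equation of degree five or higher.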
  100. Wigner, “The Unreasonable Effectiveness of Mathematics,” 3. 
  101. It is at best unproductive, and possibly meaningless, to search for an explanation of these. Some level of mystery will always be with us. One can even argue that, as we advance in knowledge, the extent of what we do not know, and the sense of pervading mystery accompanying it, grows at an even faster rate. The question of why there is something rather than nothing is a perfect example of what Wittgenstein calls philosophical problems that arise “when language is forced from its proper home into a metaphysical environment, where all the familiar and necessary landmarks and contextual clues are removed.” 
  102. According to the prevalent materialistic point of view—dominant in biology, neuroscience, and so on—physical reality is postulated to exist as an entity completely independent of our perceptions of it. According to that view, consciousness is itself nothing but “a highly organized version of matter”—the highest, as I was taught in the dialectical materialism classes of my youth. It is thus both distinct from and part of the same elusive notion of matter. An obvious difficulty with that point of view, besides its apparent circularity, is that, as in the case of Kant’s antinomies, we have absolutely no way to either prove or disprove any statement about it. Moreover, like the ether in pre-relativistic physics, the materialist conceit serves ultimately no role in interpreting physical processes. We lose nothing in our ability to understand the physical world by giving up on the pretense that everything, including our mind, can be explained by a reductionist approach to the motion and interaction of material entities. In this essay, I take the more modest point of view that reality is defined by objectivity, i.e., the consistency of perceptions, either sensory or mental, as argued by Alain Connes in this remarkable exchange: Jean-Pierre Changeux and Alain Connes, Conversations on Mind, Matter, and Mathematics, ed. and trans. M. B. DeBevoise (Princeton, NJ: Princeton University Press, 1995). 
  103. The reality, existence, coherence, and rigidity of mathematics are, according to Changeux, pure products of the human mind and a posteriori results of evolution. See Changeux and Connes, Conversations, 36. Does it then follow that, if evolution were to have followed a different path, mathematics may have reached different conclusions? That the prime decomposition theorem, for example, may have been false? 
  104. Wigner, “The Unreasonable Effectiveness of Mathematics,” 4. 
  105. In fact, as any mathematician knows, one single contradiction would suffice to obliterate the entire edifice of a mathematical proof. 
  106. “The use of complex numbers,” he also writes, “is in this case not a calculation trick of applied mathematics but comes close to being a necessity in the formulation of the laws of quantum mechanics.” 
  107. Wigner, “The Unreasonable Effectiveness of Mathematics,” 7. 
  108. His theory of ideal forms was most probably inspired by the existence of mathematical objects and then applied to provide objective support for non-mathematical, complex, and difficult to pin down concepts, such as truth, beauty and justice. 
  109. Coherent refers here to the agreement of different sensory experiences of the same object. By consistent, I mean the fact that when I wake up in the morning my glasses happen to be precisely where I left them before going to sleep. 
  110. For any observer outside the black hole. 
  111. The slightest inconsistency that cannot be ultimately reconciled with accepted facts leads to the demise of the theory. 
  112. It is this latter fact that is the most miraculous of all for, as it happened a few glorious times in the history of physics, a new mathematical theory, consistent with all known facts at a given time, often leads to unexpected predictions, such as the existence of electromagnetic waves by James Clerk Maxwell, that of the electron by Paul Dirac, or the existence of black holes and gravitational waves by Einstein. 
  113. In the words of Freeman Dyson: “As usual when a profound new theory is born, the equations come first and a clear understanding of their physical meaning comes later.” Freeman Dyson, Infinite in All Directions. 
  114. The statement betrays an unflinching faith that such a mathematical theory exists and that we humans are capable of discovering it. 
  115. Here is the quote:
    I am convinced that we can discover by means of purely mathematical constructions the concepts and the laws connecting them with each other, which furnish the key to the understanding of natural phenomena … The creative principle resides in mathematics.
    Quoted in Julian Schwinger, Einstein’s Legacy: The Unity of Space and Time (New York City: W.H. Freeman, 1986), 237–38. See also Albert Einstein, “Remarks on Bertrand Russell’s Theory of Knowledge,” in Paul Arthur Schilpp, ed., The Philosophy of Bertrand Russell (Chicago: Open Court Publishing, 1944). 
  116. In his exchange with Changeux, Connes writes:
    What proves the reality of the material world, apart from our brain’s perception of it? Chiefly the coherence of our perceptions, and their permanence—more precisely, the coherence of touch and sight that characterizes the perceptions of a single person, and the coherence that characterizes the perceptions of several persons. 
    Changeux and Connes, Conversations, 22. 
  117. A theory, such as classical mechanics, may become obsolete as a valid description of nature and yet remains a rich and highly consequential mathematical theory. 
  118. By no means do I claim that this attitude is desirable, although the freedom of mathematical investigation, unencumbered by the immediate need to be relevant, has often led to spectacular results. See Henri Poincaré, The Foundations of Science, trans. George Bruce Halsted (New York City: The Science Press, 1913). 
  119. Note that our position here is different from the more modern attempts to affirm the reality of mathematical objects. According to W. V. O. Quine’s famous indispensability argument, the reality of mathematical theories is only guaranteed by the reality of the physical theory to which they are applied. That is, the mathematics needed to formulate a scientific theory confirmed by experience is itself confirmed. According to this point of view, or rather my own oversimplified version of it, the Lie group $SU(3)\times SU(2)\times U(1)$ is real, because it plays a fundamental role in the standard model of particle physics, while $SU(4)$, or any other group, may not. More precisely, their reality cannot be guaranteed according to the Quine–Putnam theory. By the same logic, Riemannian and Lorentzian geometry only acquired certified proof of reality after Einstein used them in general relativity! 
  120. Werner Heisenberg, Natural Law and the Structure of Matter (London: Rebel Press, 1970). 
  121. In his exchange with Connes, Changeux denounces the notion of mathematical reality espoused by the former as a form of unscientific and mystical transcendentalism. Yet Connes carefully defines what he means by mathematical reality, anchored in his clearly stated notion of objectivity, while, on the contrary, it is Changeux who relies on a vague definition of matter as an entity in itself, independent from our observations of it. 
  122. The non-mathematical descriptions of these objects, given to the public at large, are always misleading, as they try to associate them with familiar, naturalistic, representations. 
  123. The drawing in Penrose’s book, fig. 1.3, has three balls representing the three worlds and the paradoxical relations between them. Roger Penrose, The Road to Reality: A Complete Guide to the Laws of the Universe (London: Jonathan Cape, 2004), 18. 
  124. If by “illumination” Penrose means inclusion, then we have a real paradox, for how can they all be equal? If, however, the word means a one-to-one correspondence, the paradox is avoided only if they are all of infinite extent. 
  125. Understanding reality in this way allows us to easily dismiss that old query attributed to George Berkeley: “If a tree falls in a forest and no one is around to hear it, does it make a sound?” 
  126. Indeed, the queen of science, according to Gauss. 
  127. An unfortunate and disastrous consequence of this point of view is the notion that mathematics is contingent on the culture in which its inventor lives. This leads, in extreme cases, to noxious distinctions such as Aryan mathematics or, as we often hear today, white-supremacist mathematics and its opposite, anti-racist mathematics. As an adolescent in communist Romania, I still remember accusations of “bourgeois mathematics” for various developments in Western mathematics that were deemed too formalistic. 
  128. Some mathematicians, noticing the importance of aesthetic considerations in mathematical discoveries, would like to treat mathematics purely as a form of art. G. H. Hardy, who declared “there is no permanent place in this world for ugly mathematics,” may have been the closest serious mathematician to that point of view. 
  129. Henri Poincaré, “Mathematical Creation,” The Monist 20, no. 3 (1910): 331, doi:10.5840/monist19102037. 
  130. See also Jacques Hadamard, The Mathematician’s Mind: The Psychology of Invention in the Mathematical Field (Princeton, NJ: Princeton University Press, 2020). 
  131. Alas, one also least understood by the public at large. 
  132. These incompatibilities are also mathematical in nature, that is, each theory is based on a different mathematical framework. 
  133. It needs to be stressed that intuition, leaps of faith, and mathematical experiments are immensely important in arriving at mathematical regularities which can later be formulated as theorems and rigorously proved. Moreover, modern-day computers have vastly enhanced our ability to perform powerful experiments. In that sense, mathematics follows its own version of the scientific method. 
  134. Thus, both quantum mechanics and general relativity are perfectly valid mathematical theories. The difficulty only arises when we try to design a new theory consistent with both. 
  135. Geometry became a specific discipline when it was able to formulate its basic axioms and develop procedures to establish less intuitive properties of elementary shapes by logical inference from these postulated statements. 
  136. The discovery of the magnificent world of differential equations, as mentioned before, is the true great defining achievement of calculus. 
  137. Such as non-local equations involving also integrals. 
  138. Mathematicians pursue, of course, many other goals, such as describing and classifying various objects of interest: finite groups, Lie groups and Lie algebras, manifolds, $C^*$ and Von Neumann algebras, graphs, and so on. But even in these cases, often the main steps reduce to solving equations. 
  139. The basic equations in the non-relativistic quantum mechanics of a finite system of particles are also based on PDEs, i.e., Schrödinger-type equations for interacting particles. 
  140. I have purposefully avoided discussing the typical equations that arise in quantum mechanics, which require a higher level of formalism. This is particularly the case for relativistic quantum field theory, for which a rigorous mathematical formalism is not even available. The evidence accumulated so far points toward a third fundamental triad akin to those described in this article. 
  141. As Galileo writes in The Assayer:
    Philosophy is written in that great book which ever lies before our eyes—I mean the universe—but we cannot understand it if we do not first learn the language and grasp the symbols, in which it is written. This book is written in the mathematical language, and the symbols are triangles, circles and other geometrical figures …

Sergiu Klainerman is Higgins Professor of Mathematics at Princeton University.



Copyright © Inference 2024

ISSN #2576–4403