(4) Infinity
Around 1900, along with the struggle over axiomatic set theory, mathematicians also struggled with the concept of infinity.
We say that two sets A and B have the same
cardinality, essentially are the same
size, if they can be put into 1-1
correspondence.
Okay, set B is a subset of set A
provided each element of B is an element
of A. B is a proper subset of A if A
contains some elements not in B, that is,
if B is not all of A.
Then a set is infinite if it can be put
into 1-1 correspondence with a proper
subset of itself. A common example is to
let A be the set of natural numbers and B
be the set of even (can be divided by 2
with no remainder -- 4th grade math)
natural numbers. Then B is a proper
subset of A, and, for each natural number n
in A, let it correspond to the natural
number 2n in B. Presto, bingo, we have put
A and B into 1-1 correspondence. So, set
A is infinite.
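The pairing n <--> 2n above can be sketched in a few lines of Python:

```python
# Pair each natural number n with the even natural number 2n.
# Every n gets exactly one partner, and every even natural
# number is the partner of exactly one n: a 1-1 correspondence.
def partner(n):
    return 2 * n

naturals = range(1, 11)
pairs = [(n, partner(n)) for n in naturals]
print(pairs)  # (1, 2), (2, 4), ..., (10, 20)
```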
A set that is not infinite is finite.
Similarly sets of integers, rationals,
reals, and complex numbers are all
infinite.
It is not very difficult to see that the set of natural numbers has the same cardinality as the set of integers, that is, that the natural numbers and the integers can be put into 1-1 correspondence.
Any set that can be put into 1-1 correspondence with the natural numbers, that is, has the same cardinality as the natural numbers, is said to be countably infinite.
Curiously, the rationals are countably infinite. To show this, use Cantor's clever diagonal enumeration of the fractions (lots of hits at Google).
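The usual enumeration walks the diagonals p + q = 2, 3, 4, ... of the grid of fractions p/q, skipping fractions not in lowest terms. A minimal Python sketch of that walk (positive rationals only; handling 0 and the negatives is an easy extension):

```python
from fractions import Fraction
from math import gcd

def rationals(count):
    """Enumerate the positive rationals by walking the diagonals
    p + q = 2, 3, 4, ... of the grid of pairs (p, q), skipping
    fractions that are not in lowest terms."""
    out = []
    s = 2  # p + q along the current diagonal
    while len(out) < count:
        for p in range(1, s):
            q = s - p
            if gcd(p, q) == 1:
                out.append(Fraction(p, q))
                if len(out) == count:
                    break
        s += 1
    return out

print(rationals(8))  # 1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2
```

Every positive rational shows up exactly once, so the walk is a 1-1 correspondence with the natural numbers.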
Is the set of reals countably infinite? No, and there is a short, clever argument, Cantor's diagonal argument, that shows this. So, the cardinality of the set of reals is larger (after we handle some details) than that of the set of natural numbers.
Is there a set D with cardinality greater
than the natural numbers but smaller than
the cardinality of the set of real
numbers? Difficult question. That there
cannot be such a set D is the continuum
hypothesis (CH). As one can see at Wikipedia, this question was settled by work of Kurt Gödel in 1940 and Paul Cohen in 1963: The result is that the CH is independent of Zermelo–Fraenkel set theory with the axiom of choice (ZFC). By independent we mean that, assuming ZFC is consistent, we can take the CH as true or as false and in neither case get a contradiction. This work has been a bit amazing, even philosophical.
Understanding infinity is important, and
we have done a good job at that. Unless
we are down in the basement of
foundations, we at most rarely consider
the continuum hypothesis.
There is a theorem
that for any natural number n,
we have
1^3 + 2^3 + ... + n^3 =
n^2(n + 1)^2 / 4
We can prove this by mathematical induction.
We want a proof that works for
any natural number n, and for this
we use the definition (above) of the
natural numbers.
So, first we check if the equation
is true for n = 1:
1^3 = 1
and when n = 1
n^2(n + 1)^2 / 4 = 1 (2)^2 / 4 = 1
So, the equation holds for n = 1.
Suppose the equation holds for some
n and check the equation for
n + 1:
1^3 + 2^3 + ... + n^3 + (n + 1)^3 =
n^2(n + 1)^2 / 4 + (n + 1)^3 =
n^2(n + 1)^2 / 4 + 4(n + 1)(n + 1)^2 / 4 =
(n^2 + 4(n + 1))(n + 1)^2 / 4 =
(n + 1)^2 (n^2 + 4(n + 1)) / 4
But
n^2 + 4(n + 1) =
n^2 + 2n + 1 + 2(n + 1) + 1 =
((n + 1) + 1)^2
So
(n + 1)^2 (n^2 + 4(n + 1)) / 4 =
(n + 1)^2((n + 1) + 1)^2 / 4
and the equation also holds
for n + 1.
Then by the definition of the
natural numbers, the equation
holds for all natural numbers.
Done.
So, that's an example
of proof by mathematical
induction.
(5) Fundamental Theorem of Arithmetic
A prime number is a natural number greater than 1, like 2, 3, 5, 7, 11, 13, 17, 19, 23, ..., that is evenly divisible by only itself and 1.
Prime numbers have been studied since the ancient Greeks, and there are many difficult questions about them that are easy to state; by now we have answers to some of the questions.
Consider 45:
45 = 3 * 3 * 5
So, we have written 45 as a product of
prime numbers. It is easy enough to see that any natural number greater than 1 can be written as a product of prime numbers.
The Fundamental Theorem of Arithmetic is that each natural number greater than 1 can be written as a product of prime numbers in just one way, that is, one way once we neglect the order of the prime numbers in the product.
So,
45 = 3 * 3 * 5
is the only way to write 45 as a product
of prime numbers.
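A minimal trial-division sketch in Python recovers such factorizations (fine for small numbers; serious factoring needs much better algorithms):

```python
def prime_factors(n):
    """Factor n >= 2 into primes by trial division, smallest first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:           # whatever remains is itself prime
        factors.append(n)
    return factors

print(prime_factors(45))   # [3, 3, 5]
```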
(6) High School Algebra
A quadratic is an algebraic expression
of the form
ax^2 + bx + c
A polynomial is the same except it can have higher natural number powers of x, e.g.,
7x^5 - 4x^3 + 2x - 1
Polynomials play an important role in
linear algebra -- e.g., via determinants
and eigenvalues.
Polynomials are also used in error
correcting codes and related work.
There have been attempts to use polynomials in curve fitting, but for large natural numbers n the x^n term means that, as we go a long way from 0 on the real line, commonly the value of the polynomial goes a very long way from 0, so far from 0 that it is usually unrealistic for applications. That is, polynomials don't provide a good, versatile means of curve fitting.
Also interesting and useful is the
binomial, for natural number n and
numbers x and y
(x + y)^n
Presumably the name binomial is from having the two terms, x and y.
Curiously, surprisingly, this binomial is
important in various approximations and in
combinations, permutations, probability,
and statistics.
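The binomial theorem says (x + y)^n is the sum of C(n, k) x^(n-k) y^k over k = 0, ..., n, where C(n, k) is the binomial coefficient. A quick Python check:

```python
from math import comb   # comb(n, k) is the binomial coefficient C(n, k)

def binomial_expansion(x, y, n):
    """Sum of C(n, k) x^(n-k) y^k, which the binomial theorem
    says equals (x + y)^n."""
    return sum(comb(n, k) * x**(n - k) * y**k for k in range(n + 1))

for n in range(6):
    assert binomial_expansion(3, 4, n) == (3 + 4)**n
print("binomial theorem checked for n = 0, ..., 5")
```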
(7) Second Year High School Algebra
A root of a polynomial is a value of x
that makes the value of the polynomial 0.
The fundamental theorem of algebra is that each polynomial of degree n >= 1 can be written as
a(x - r_1)(x - r_2) ... (x - r_n)
where a is the leading coefficient and r_1, r_2, ..., r_n are the roots of the polynomial. The theorem holds for polynomials with either real or complex coefficients. The roots, even with only real coefficients, can be complex.
I've omitted some details due to the challenge of typing math.
Mathematicians worked for centuries to get
good proofs of this result.
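For a concrete check, take the hypothetical example p(x) = x^2 - 3x + 2, which has roots 1 and 2 and factors as (x - 1)(x - 2):

```python
# p(x) = x^2 - 3x + 2 should equal (x - 1)(x - 2) everywhere.
def p(x):
    return x**2 - 3*x + 2

def factored(x):
    return (x - 1) * (x - 2)

assert p(1) == 0 and p(2) == 0          # 1 and 2 are roots
for x in range(-5, 6):
    assert p(x) == factored(x)          # same polynomial
print("x^2 - 3x + 2 = (x - 1)(x - 2) checked")
```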
Also covered might be the algorithms for the least common multiple and the greatest common divisor. These were known to the ancient Greeks and are still used currently.
We can define a function f by, for any
number x,
f(x) = ax^2 + bx + c
So, function f is a look-up table -- plug
in a value for x and get back the value of
ax^2 + bx + c
Since the x is a number, we say that the domain of the function f is a set of numbers, usually the set of real numbers. In this case, the range of function f is also a set of real numbers.
We can have look-up tables for telephone
numbers -- the domain a set of names of
people and the range a set of telephone
numbers. We can have a function for Internet addresses, with domain a set of URLs and range, say, the set of numerical IP addresses (32 bits for IPv4, 128 bits for IPv6).
So, functions have enormous generality -- given any two non-empty sets A and B, we can have a function with domain A and range B, and to indicate that f is a function with domain A and range B we write
f: A --> B
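The look-up table view is easy to make concrete in Python: a dict is exactly a function from its set of keys (the domain) to its values (names and numbers below are made up, for illustration only):

```python
# A hypothetical phone book: domain is a set of names,
# range is a set of telephone numbers.
phone = {
    "Alice": "555-0100",
    "Bob": "555-0101",
}

def f(name):
    # plug in a name, get back a telephone number
    return phone[name]

print(f("Alice"))  # 555-0100
```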
(8) Plane Geometry
High school plane geometry contains some nice, useful math and an excellent first lesson in mathematical proofs.
(9) Abstract Algebra
In grade school, we learned that
2 + 3 = 5
Abstract algebra calls the +, addition, an
operation. Multiplication is also an
operation.
We learned that for real numbers a and b
a + b = b + a
So, addition is a commutative operation.
It is also associative: For real
numbers a, b, c we have
a + (b + c) = (a + b) + c
Multiplication is also commutative and
associative.
For addition, 0 is the identity element
since for any number a we have
a + 0 = 0 + a = a
And -a is the inverse of a since
a + (-a) = 0.
Etc. with the rest of the usual properties
of the operations of addition and
multiplication we knew in grade school.
Well, abstract algebra studies operations
on sets more general than the numbers we
have considered so far. So, there are
groups, rings, fields, etc. The
rationals, reals, and complex numbers are
all examples of fields.
E.g., a group is a nonempty set with
just one operation. The operation is
associative and there is an identity
element and inverses.
Some groups are commutative, and some are
not. Some groups are on a set that is
infinite and some are on a set that is
finite.
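A small finite example: addition modulo 5 on the set {0, 1, 2, 3, 4}. The group axioms can be checked by brute force in Python:

```python
# Addition modulo 5 on G = {0, 1, 2, 3, 4}: check the group axioms.
G = range(5)

def op(a, b):
    return (a + b) % 5

# associative
assert all(op(a, op(b, c)) == op(op(a, b), c)
           for a in G for b in G for c in G)
# 0 is the identity element
assert all(op(a, 0) == a and op(0, a) == a for a in G)
# every element has an inverse
assert all(any(op(a, b) == 0 for b in G) for a in G)
# this particular group is also commutative
assert all(op(a, b) == op(b, a) for a in G for b in G)
print("addition mod 5 forms a commutative group")
```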
Given a common definition of vectors, with
vector addition the set of all the vectors
forms a commutative group.
Similarly for the rings, fields, etc. --
that is, we have sets with operations and
properties.
There are some important applications,
e.g., in error correcting coding.
(10) Calculus
Calculus was invented mostly by Newton (and, independently, Leibniz), for Newton mostly for understanding the motions of the planets etc.
The first part of calculus has to do with
derivatives. So, suppose we have an
object that for each time t is at distance
t^2. Then the speed of the object at time
t is 2t and the acceleration is 2. The
acceleration is directly proportional to
the force on the object.
We got 2 and 2t from t^2 by taking
calculus derivatives.
We write
d/dt t^2 = 2t
d/dt 2t = 2
So at time t, consider increment in time
h. Then at time t + h, the object is at
position (t + h)^2. So in time h, the
object has moved from t^2 to (t + h)^2,
that is, has moved distance
(t + h)^2 - t^2
= t^2 + 2th + h^2 - t^2
= 2th + h^2
Since speed is distance divided by time,
the object has moved at speed
(2th + h^2) / h
= 2t + h
Then as h is made small and moves to 0,
the speed moves to just
2t
So for very small h, the speed is
approximately just 2t, and as h moves to 0
the speed is exactly 2t.
So, 2t is the instantaneous speed at
time t.
So, 2t is the calculus derivative of
t^2.
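The limit is easy to watch numerically; for t = 3, the difference quotient 2t + h should approach 2t = 6 as h shrinks:

```python
def quotient(t, h):
    # average speed over [t, t + h]: ((t+h)^2 - t^2) / h = 2t + h
    return ((t + h)**2 - t**2) / h

t = 3.0
for h in [1.0, 0.1, 0.001, 1e-6]:
    print(h, quotient(t, h))   # approaches 2t = 6 as h shrinks
```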
To reverse differentiation we can do the
second half of calculus, do integration,
start with the 2t and get back to t^2.
Calculus is the most important math in
physics.
Later chapters in a calculus book show how
to find various areas, arc lengths, and
surface areas.
Calculus with vectors is crucial for heat
flow, fluid flow, electricity and
magnetism, and much more.
Einstein's E = mc^2 can be derived from
just some thought experiments and the
first parts of calculus. There is an amazing video on this.
(11) Linear Algebra
This topic starts with systems of linear equations in several variables, e.g., for two equations in three variables:
2x + 5y - 2z = 21
3x - 2y + 2z = 12
The system has no solutions, exactly one, or infinitely many. To see this, we can use Gauss elimination. To justify that: multiplying one of the equations by a non-zero number does not change the set of solutions, and neither does adding a multiple of one of the equations to another. So, with these two tools, we just rewrite the equations so that the set of solutions is obvious.
To save space here,
I will leave the rest of the
details to the Internet.
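For this particular system, adding the two equations eliminates z, which leads to one particular solution; sliding along a direction that satisfies both equations with right side 0 then gives infinitely many. A Python check (the particular solution and direction below were found by hand):

```python
# 2x + 5y - 2z = 21 and 3x - 2y + 2z = 12: two equations,
# three unknowns, so one free parameter is expected.
def check(x, y, z):
    return (abs(2*x + 5*y - 2*z - 21) < 1e-9 and
            abs(3*x - 2*y + 2*z - 12) < 1e-9)

# one particular solution, found by elimination by hand
x0, y0, z0 = 3.0, 6.0, 7.5
assert check(x0, y0, z0)

# (6, -10, -19) satisfies both equations with right sides 0,
# so adding any multiple of it gives another solution
for t in [-2.0, 0.0, 1.0, 10.0]:
    assert check(x0 + 6*t, y0 - 10*t, z0 - 19*t)
print("one solution plus a free direction: infinitely many solutions")
```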
A solution can be regarded as a vector,
e.g.,
(x, y, z)
A function f is linear provided for x
and y in the domain of f and numbers a and
b we have
f(ax + by) = af(x) + bf(y)
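The definition is easy to test numerically: f(x) = 3x passes it, and g(x) = x^2 fails it (the sample values below are arbitrary):

```python
def f(x):
    return 3 * x      # linear

def g(x):
    return x * x      # not linear

a, b, x, y = 2, 5, 1.5, -4.0
assert f(a*x + b*y) == a*f(x) + b*f(y)   # f passes the linearity test
assert g(a*x + b*y) != a*g(x) + b*g(y)   # g fails it
print("f is linear; g is not")
```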
It can be claimed that the two pillars of
the analysis part of math are continuity
and linearity.
Linearity is enormously important,
essentially because it is such a powerful
simplification.
The linear equations of linear algebra are
linear in this sense.
Differentiation and integration in
calculus are linear.
Linearity is a key assumption in physics
and engineering, especially signals and
signal processing.
In linear algebra we take the two equations in three unknowns and define matrices so that we can write the equations as
[ 2  5 -2 ] [ x ]   [ 21 ]
[ 3 -2  2 ] [ y ] = [ 12 ]
            [ z ]
So, left to right, we have three matrices.
The first one has 2 rows and 3 columns and, thus, is a 2 x 3 (read, 2 by 3)
matrix. The second one is 3 x 1. The
third one is 2 x 1.
We define matrix multiplication so that
multiplying the first and second matrix
gives the same results as in the linear
equations:
2x + 5y - 2z = 21
3x - 2y + 2z = 12
In principle we could do all of linear algebra with just such linear equations and without matrices, but the effort would be very clumsy. Matrix notation,
multiplication, etc. are terrific
simplifications.
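A sketch of that matrix multiplication in Python, with the 2 x 3 coefficient matrix and a 3 x 1 vector of unknowns; row i of the product is the left side of equation i:

```python
A = [[2, 5, -2],
     [3, -2, 2]]          # the 2 x 3 coefficient matrix

def matvec(A, v):
    # row i of A times the column vector v
    return [sum(A[i][j] * v[j] for j in range(len(v)))
            for i in range(len(A))]

v = [3.0, 6.0, 7.5]       # one solution of the system
print(matvec(A, v))       # [21.0, 12.0], the right-hand sides
```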
The 3 x 1 and 2 x 1 matrices are examples
of vectors.
Matrix multiplication is linear.
The known properties of matrices and their
multiplication and addition and of
associated vectors are amazing, profound,
and enormously powerful.
There is a joke, surprisingly close to
true, that all math, once applied, becomes
linear algebra.
(12) Differential Equations
In the subject differential equations,
we are given some equations satisfied by
some function and its derivatives and
asked to find the function.
The classic application was for the motion
of a planet: Newton's laws of motion
specified the derivatives of the position
of the planet as a function of time, and
the goal was to find the function, that
is, predict where the planet would be.
Science and engineering have many
applications of differential equations.
Some of the models for the spread of a
disease are differential equations.
We have seen that
d/dt t^2 = 2t
Well, we may have a function of two variables
f(s, t) = st^2
We can still write
d/dt f(s, t) = d/dt st^2 = 2st
but usually we say that we are taking the partial derivative of f(s, t) with respect to t.
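A finite-difference check of that partial derivative, holding s fixed and nudging t:

```python
def f(s, t):
    return s * t**2

def partial_t(s, t, h=1e-6):
    # forward-difference approximation to the partial w.r.t. t
    return (f(s, t + h) - f(s, t)) / h

s, t = 3.0, 2.0
assert abs(partial_t(s, t) - 2 * s * t) < 1e-4   # 2st = 12
print(partial_t(s, t))
```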
(13) Mathematical Analysis
This topic is really part of calculus and
has the fine details on the assumptions
that guarantee that calculus works. The
main important assumption is continuity.
The next most important assumption is that
the function has domain a closed
interval or a generalization to what is
called a compact set; a closed
interval is, for real numbers a and b,
{x| a <= x <= b}
Also important are infinite sequences and
series that make approximations as close
as we please to something we want. So, in
this way we define sine, cosine, etc.
(14) Advanced Calculus
This topic is also part of calculus but concentrates on surfaces, volumes, and maybe the math of flows of heat, water, etc.
Commonly this part of calculus makes heavy
use of vectors and matrices.
Also covered can be Fourier series and
integrals.
A pure tone in music, say, from an organ,
has a fundamental frequency, say, 440
cycles per second and also overtones at
natural number multiples of that
fundamental frequency. Each of the overtone frequencies works like an axis of orthogonal coordinates, and this is a terrific way to analyze or synthesize such a tone.
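The orthogonality is easy to check numerically: sample the harmonics over one period of the fundamental, and distinct harmonics have (near) zero inner product, while a harmonic with itself gives N/2:

```python
from math import sin, pi

N = 1000  # samples over one period of the fundamental

def tone(k):
    """k-th harmonic sampled at N points over one fundamental period."""
    return [sin(2 * pi * k * j / N) for j in range(N)]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

# different harmonics: orthogonal (inner product 0, up to rounding)
assert abs(inner(tone(1), tone(2))) < 1e-6
# a harmonic with itself: inner product N/2
assert abs(inner(tone(1), tone(1)) - N / 2) < 1e-6
print("harmonics act like orthogonal coordinate axes")
```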
The Fourier transform is closely related
and has an orthogonal coordinate axis for
each real number.
Engineering is awash in linear systems
that do not change their properties over
time -- time invariant linear systems.
Such systems merely adjust some Fourier
coefficients. This is a powerful result.
(15) Measure Theory
This topic develops integral calculus with
a lot more generality.
(16) Probability Theory
In probability we imagine that we do
some experimental trials and observe the
results. We have a sample space, our
set of trials.
We regard the set of all trials like a
region with area, and we say that the area
is 1.
Maybe we flip a coin and get Heads. The set of all trials that yield Heads is the event Heads.
The event Heads has area, probability
P(Heads)
With a fair coin we have
P(Heads) = 1/2
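A quick simulation of the fair coin (seeded so the run is reproducible); the fraction of Heads should be near 1/2:

```python
import random

random.seed(0)   # seeded so the run is reproducible
flips = [random.choice(["Heads", "Tails"]) for _ in range(10_000)]
p_heads = flips.count("Heads") / len(flips)
print(p_heads)
assert abs(p_heads - 0.5) < 0.03   # close to the fair-coin value 1/2
```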
A random variable X is some outcome of a
trial. Usually X is a number. In this
case we can write
P(X >= 2)
for the probability that random variable X
is >= 2.
Events A and B are independent provided
P(A and B) = P(A)P(B)
In probability we can have algebraic
expressions with random variables, have a
distance between random variables, have
a sequence of random variables converge
to a random variable, etc.
Given a real valued random variable X, for
real number x we can let
F_X(x) = P(X <= x)
Then F_X is the cumulative distribution
of X, and
f_X(x) = d/dx F_X(x)
is, when the derivative exists, the probability density of X.
One of the major results is the central
limit theorem that shows a distribution
converging to the Gaussian bell curve.
Two more results are the weak and strong
laws of large numbers that show
convergence to average values (means,
expectations).
One more result, amazing, is the
martingale convergence theorem.
The Radon-Nikodym theorem of measure
theory provides an amazing general theory
of conditional expectation which is one
way to look at the value of information.
In statistics we are given the values of
some random variables and try to infer
the values of something related.
The more serious work in probability and
statistics is done on the foundation of
measure theory.
(17) Optimization
We have a furniture factory and a new
shipment of wood. We have some freedom in
what we do with this wood. An
optimization problem is how to use the
freedom to make the greatest revenue from
the wood.
More generally we have some work to do
with some freedom in how we do the work,
and an optimization problem is how best to
exploit the freedom to have the best
outcome, profit,
for the work.
If the profit is linear and if the work
and freedom are described by linear
equations, then we have the topic of
linear programming.
Linear programming has played a role in
the Nobel prizes in economics.
The main means of solving linear
programming problems is the simplex
algorithm which is a not very large
modification of Gauss elimination.
In practice the simplex algorithm is
shockingly fast; in theory its worst case
performance is exponential.
In the furniture making, we can't sell a
fraction, say, 1/3rd, of a dining room
chair. So, we want our solution to
consist of integers. The simplex
algorithm does not promise us integers.
So we have a problem in integer linear
programming.
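At toy scale, an integer linear program can even be solved by brute force. A hypothetical furniture instance (all numbers made up): a chair uses 4 units of wood and sells for 30, a table uses 9 units and sells for 80, and we have 100 units:

```python
# Maximize 30*chairs + 80*tables subject to 4*chairs + 9*tables <= 100,
# with chairs and tables whole numbers, by enumerating everything.
WOOD = 100
best = (0, 0, 0)          # (revenue, chairs, tables)
for chairs in range(WOOD // 4 + 1):
    for tables in range(WOOD // 9 + 1):
        if 4 * chairs + 9 * tables <= WOOD:
            revenue = 30 * chairs + 80 * tables
            best = max(best, (revenue, chairs, tables))
print(best)               # (880, 0, 11): make 11 tables
```

Here brute force finds 11 tables and no chairs, for revenue 880; with 600,000 variables such enumeration is hopeless, which is why the simplex algorithm and relaxations matter.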
In practice from the simplex algorithm we
may get to make 601 + 1/3rd dining room
chairs and round that to 601 or just 600.
No biggie.
If we insist on integers, then, surprisingly, even shockingly, we have one of the early examples of an NP-complete problem.
In practice there are still ways to proceed. Often we can get the optimal integer solutions we want easily; sometimes such solutions are challenging; sometimes we have to settle for approximations. Often the approximations can be quite good, like forgetting about 1/3rd of a chair. In theory, though, the known ways have worst case performance that is exponential and nearly useless.
One integer linear programming problem I
attacked has 600,000 variables and 40,000
constraints. I wrote some software that
in 600 seconds on a slow computer found an
integer solution within 0.025% of
optimality. The solution made heavy use
of the simplex algorithm, non-linear
duality theory, and something called
Lagrangian relaxation.
In addition to optimization, the simplex
algorithm and related math make nice
contributions to geometry, e.g., prove
theorems of the alternative used in
proving the famous Kuhn-Tucker conditions
necessary for optimality.
Since it is possible that the work we are
trying to optimize is complicated, its
mathematical description may also be
complicated and the optimization effort,
difficult.
Optimization problems are common in animal
feeding, cracking crude oil,
transportation problems, and much more.
Game theory can be regarded as a special
case. One proof of the famous Nash result
(featured in the movie about Nash, A
Beautiful Mind) is via linear
programming.
Some problems are to find curves, say, how
to climb, cruise, and descend an airplane
to minimize fuel burned or the best curve
for a missile to attack an enemy fighter
jet. Newton found the brachistochrone
curve solution to the problem of the
frictionless curve that would let a bead
slide down in minimum time.
Relevant fields are calculus of variations
and deterministic optimal control. One
famous result is the Pontryagin maximum
principle.
Some problems have uncertainty; so the
optimization may be to find the solution
with the best expected outcome. For
such problems there is the field of
stochastic optimal control.