
Part II

(4) Infinity

In the years around 1900, along with the struggles over axiomatic set theory, there were also struggles with the concept of infinity or the infinite.

We say that two sets A and B have the same cardinality, essentially are the same size, if they can be put into 1-1 correspondence.

Okay, set B is a subset of set A provided each element of B is an element of A. B is a proper subset of A if A contains some elements not in B, that is, if B is not all of A.

Then a set is infinite if it can be put into 1-1 correspondence with a proper subset of itself. A common example is to let A be the set of natural numbers and B be the set of even (can be divided by 2 with no remainder -- 4th grade math) natural numbers. Then B is a proper subset of A, and, for each natural number n in A, let it correspond to the natural number 2n in B. Presto, bingo, we have put A and B into 1-1 correspondence. So, set A is infinite.
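
As a quick sanity check, here is a Python sketch of the correspondence n <-> 2n and its inverse (the finite window is only for illustration, since we cannot loop over all naturals):

```python
# Sketch: the pairing n <-> 2n between the naturals and the even naturals.
def to_even(n):
    # maps a natural number in A to an even natural number in B
    return 2 * n

def from_even(m):
    # inverse map: recovers n from 2n
    return m // 2

# On any finite window, each n pairs with exactly one even number and back.
assert all(from_even(to_even(n)) == n for n in range(1, 1000))
assert len({to_even(n) for n in range(1, 1000)}) == 999   # no collisions
```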

A set that is not infinite is finite.

Similarly sets of integers, rationals, reals, and complex numbers are all infinite.

It is not very difficult to see that the set of natural numbers has the same cardinality as the set of integers, that is, that the natural numbers and the integers can be put into 1-1 correspondence.

Any set that can be put into 1-1 correspondence with the natural numbers, that is, has the same cardinality as the natural numbers, is said to be countably infinite.

Curiously, the rationals are countably infinite. To show this, use the clever Cantor enumeration that zigzags through the grid of fractions p/q (lots of hits at Google).
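
A Python sketch of one standard such enumeration, walking the diagonals p + q = 2, 3, 4, ... of the grid of positive fractions p/q and skipping repeats (this covers only the positive rationals; interleaving 0 and the negatives finishes the job):

```python
from fractions import Fraction

# Enumerate the positive rationals by diagonals p + q = 2, 3, 4, ...
# of the p/q grid, skipping values already seen (e.g., 2/2 = 1/1).
def enumerate_rationals(count):
    seen, out = set(), []
    s = 2                       # current diagonal: p + q = s
    while len(out) < count:
        for p in range(1, s):
            q = s - p
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        s += 1
    return out

print(enumerate_rationals(6))   # 1, 1/2, 2, 1/3, 3, 1/4
```

Every positive rational eventually appears exactly once, which is exactly a 1-1 correspondence with the natural numbers.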

Is the set of reals countably infinite? No, and there is a short, clever argument, the Cantor diagonal argument, that shows this. So, the cardinality of the set of reals is larger (after we handle some details) than that of the set of natural numbers.

Is there a set D with cardinality greater than the natural numbers but smaller than the cardinality of the set of real numbers? Difficult question. That there cannot be such a set D is the continuum hypothesis (CH). As we can see at Wikipedia, this question was settled by work of Kurt Gödel in 1940 and Paul Cohen in 1963: The result is that the CH is independent of Zermelo–Fraenkel set theory with the axiom of choice. By independent we mean that we can assume that the CH is true or false and cannot get a contradiction. This work has been a bit amazing, even philosophical.

Understanding infinity is important, and we have done a good job at that. Unless we are down in the basement of foundations, we at most rarely consider the continuum hypothesis.

There is a theorem that for any natural number n, we have

1^3 + 2^3 + ... + n^3 = n^2(n + 1)^2 / 4

We can prove this by mathematical induction. We want a proof that works for any natural number n, and for this we use the definition (above) of the natural numbers.

So, first we check if the equation is true for n = 1:

1^3 = 1

and when n = 1

n^2(n + 1)^2 / 4 = 1 (2)^2 / 4 = 1

So, the equation holds for n = 1.

Suppose the equation holds for some n and check the equation for n + 1:

1^3 + 2^3 + ... + n^3 + (n + 1)^3 =

n^2(n + 1)^2 / 4 + (n + 1)^3 =

n^2(n + 1)^2 / 4 + 4(n + 1)(n + 1)^2 / 4 =

(n^2 + 4(n + 1))(n + 1)^2 / 4 =

(n + 1)^2 (n^2 + 4(n + 1)) / 4 =

But

n^2 + 4(n + 1) =

n^2 + 2n + 1 + 2(n + 1) + 1 =

((n + 1) + 1)^2

So

(n + 1)^2 (n^2 + 4(n + 1)) / 4 =

(n + 1)^2((n + 1) + 1)^2 / 4

and the equation also holds for n + 1.

Then by the definition of the natural numbers, the equation holds for all natural numbers.

Done.

So, that's an example of proof by mathematical induction.
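
The identity is also easy to check numerically; a small Python sketch (the induction proves it for all n, the loop just spot-checks a finite range):

```python
# Spot-check the sum-of-cubes identity 1^3 + ... + n^3 = n^2 (n+1)^2 / 4.
for n in range(1, 301):
    lhs = sum(k**3 for k in range(1, n + 1))
    rhs = n**2 * (n + 1)**2 // 4   # exact: n(n+1) is always even
    assert lhs == rhs
print("identity checks out for n = 1..300")
```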

(5) Fundamental Theorem of Arithmetic

A prime number is a natural number greater than 1, like 2, 3, 5, 7, 11, 13, 17, 19, 23, ..., that is evenly divisible by only itself and 1.

Prime numbers have been studied since the ancient Greeks, and there are many difficult questions easy to state; now we have answers to some of the questions.

Consider 45:

45 = 3 * 3 * 5

So, we have written 45 as a product of prime numbers. It is easy enough to see that any natural number greater than 1 can be written as a product of prime numbers.

The Fundamental Theorem of Arithmetic is that each natural number greater than 1 can be written as a product of prime numbers in just one way, where we neglect the order of the prime numbers in the product.

So,

45 = 3 * 3 * 5

is the only way to write 45 as a product of prime numbers.
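
A minimal Python sketch of finding that factorization by trial division (fine for small numbers; serious number-theory work uses far better algorithms):

```python
# Factor a natural number n > 1 into primes by trial division.
def prime_factors(n):
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:     # pull out each factor of d completely
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                 # whatever remains is itself prime
        factors.append(n)
    return factors

print(prime_factors(45))      # [3, 3, 5]
```

Multiplying the returned primes back together recovers the original number, and by the theorem no other multiset of primes will do.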

(6) High School Algebra

A quadratic is an algebraic expression of the form

ax^2 + bx + c

A polynomial is the same except can have higher natural number powers of x, e.g.,

7x^5 - 4x^3 + 2x - 1

Polynomials play an important role in linear algebra -- e.g., via determinants and eigenvalues.

Polynomials are also used in error correcting codes and related work.

There have been attempts to use polynomials in curve fitting, but for a large natural number n the x^n term means that, far from 0 on the real line, the value of the polynomial commonly is also very far from 0, so far from 0 that it is usually unrealistic for applications. That is, polynomials don't provide good, versatile means of curve fitting.

Also interesting and useful is the binomial, for natural number n and numbers x and y

(x + y)^n

Presumably the name binomial is from having the two numbers, x and y.

Curiously, surprisingly, this binomial is important in various approximations and in combinations, permutations, probability, and statistics.

(7) Second Year High School Algebra

A root of a polynomial is a value of x that makes the value of the polynomial 0.

The fundamental theorem of algebra is that each polynomial of degree n >= 1 can be written as

a(x - r_1)(x - r_2) ... (x - r_n)

where a is the leading coefficient and r_1, r_2, ..., r_n are the roots of the polynomial. The theorem holds for polynomials with either real or complex coefficients. The roots, even with only real coefficients, can be complex.

I've omitted some details due to the challenge of typing math.

Mathematicians worked for centuries to get good proofs of this result.

Also might be covered are the algorithms for least common multiple and greatest common divisor. These were known to the ancient Greeks and are still used today.

We can define a function f by, for any number x,

f(x) = ax^2 + bx + c

So, function f is a look-up table -- plug in a value for x and get back the value of

ax^2 + bx + c

Since the x is a number, we say that the domain of the function f is the set of numbers, usually the set of real numbers. In this case, the range of function f is also the set of real numbers.

We can have look-up tables for telephone numbers -- the domain a set of names of people and the range a set of telephone numbers. We can have a function for Internet addresses, domain URLs and range, say, the 32 bit IPv4 internet addresses.

So, functions have enormous generality: given any two non-empty sets A and B, we can have a function with domain A and range B, and to indicate that f is a function with domain A and range B we write

f: A --> B
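
In Python the look-up-table view is literal: a dict is a finite function, and a def computes values on demand. The names and numbers below are made up for illustration:

```python
# A dict as a finite function: domain = names, range = phone numbers.
phone = {"Alice": "555-0100", "Bob": "555-0199"}   # made-up entries

# A computed function f(x) = a x^2 + b x + c on the real numbers.
a, b, c = 1, -3, 2
def f(x):
    return a * x**2 + b * x + c

print(phone["Alice"], f(2.0))   # f(2) = 4 - 6 + 2 = 0
```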



Part III

(8) High School Plane Geometry

High school plane geometry contains some nice, useful math and an excellent first lesson in mathematical proofs.

(9) Abstract Algebra

In grade school, we learned that

2 + 3 = 5

Abstract algebra calls the +, addition, an operation. Multiplication is also an operation.

We learned that for real numbers a and b

a + b = b + a

So, addition is a commutative operation. It is also associative: For real numbers a, b, c we have

a + (b + c) = (a + b) + c

Multiplication is also commutative and associative.

For addition, 0 is the identity element since for any number a we have

a + 0 = 0 + a = a

And -a is the inverse of a since

a + -a = 0.

Etc. with the rest of the usual properties of the operations of addition and multiplication we knew in grade school.

Well, abstract algebra studies operations on sets more general than the numbers we have considered so far. So, there are groups, rings, fields, etc. The rationals, reals, and complex numbers are all examples of fields.

E.g., a group is a nonempty set with just one operation. The operation is associative and there is an identity element and inverses.

Some groups are commutative, and some are not. Some groups are on a set that is infinite and some are on a set that is finite.
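
A small Python sketch checking the group axioms for a familiar finite example, addition modulo 5 on {0, 1, 2, 3, 4}:

```python
from itertools import product

# The set {0, 1, 2, 3, 4} with the single operation (a + b) mod 5.
G = range(5)
op = lambda a, b: (a + b) % 5

# associativity
assert all(op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(G, G, G))
# identity element: 0
assert all(op(a, 0) == a == op(0, a) for a in G)
# inverses: for each a there is some b with a + b = 0 (mod 5)
assert all(any(op(a, b) == 0 for b in G) for a in G)
# this particular group also happens to be commutative
assert all(op(a, b) == op(b, a) for a, b in product(G, G))
print("addition mod 5 is a commutative group")
```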

Given a common definition of vectors, with vector addition the set of all the vectors forms a commutative group.

Similarly for the rings, fields, etc. -- that is, we have sets with operations and properties.

There are some important applications, e.g., in error correcting coding.

(10) Calculus

Calculus was invented largely by Newton and Leibniz, in Newton's case mostly for understanding the motions of the planets etc.

The first part of calculus has to do with derivatives. So, suppose we have an object that for each time t is at position t^2. Then the speed of the object at time t is 2t and the acceleration is 2. The acceleration is directly proportional to the force on the object.

We got 2 and 2t from t^2 by taking calculus derivatives.

We write

d/dt t^2 = 2t

d/dt 2t = 2

So at time t, consider increment in time h. Then at time t + h, the object is at position (t + h)^2. So in time h, the object has moved from t^2 to (t + h)^2, that is, has moved distance

(t + h)^2 - t^2

= t^2 + 2th + h^2 - t^2

= 2th + h^2

Since speed is distance divided by time, the object has moved at speed

(2th + h^2) / h

= 2t + h

Then as h is made small and moves to 0, the speed moves to just

2t

So for very small h, the speed is approximately just 2t, and as h moves to 0 the speed is exactly 2t.

So, 2t is the instantaneous speed at time t.

So, 2t is the calculus derivative of t^2.
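
The shrinking-h argument is easy to watch numerically; a Python sketch:

```python
# The average speed ((t + h)^2 - t^2) / h over a time increment h.
def avg_speed(t, h):
    return ((t + h)**2 - t**2) / h

t = 3.0
for h in (1.0, 0.1, 0.001, 1e-6):
    print(h, avg_speed(t, h))   # approaches the instantaneous speed 2t = 6
```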

To reverse differentiation we can do the second half of calculus, do integration, start with the 2t and get back to t^2.

Calculus is the most important math in physics.

Later chapters in a calculus book show how to find various areas, arc lengths, and surface areas.

Calculus with vectors is crucial for heat flow, fluid flow, electricity and magnetism, and much more.

Einstein's E = mc^2 can be derived from just some thought experiments and the first parts of calculus. There is an amazing video at

https://www.youtube.com/watch?v=KZ8G4VKoSpQ

Calculus is a pillar of civilization.

(11) Linear Algebra

This topic starts with systems of linear equations in several variables, e.g., for two equations in three variables:

     2x + 5y - 2z = 21

     3x - 2y + 2z = 12
The system has none, one, or infinitely many solutions. To see this, we can use Gauss elimination. To justify that, multiplying one of the equations by a non-zero number does not change the set of solutions. Neither does adding a multiple of one of the equations to another. So, with these two tools, we just rewrite the equations so that the set of solutions is obvious. To save space here, I will leave the rest of the details to the Internet.
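
Here is a sketch of those two row operations applied to the system above, in exact rational arithmetic so nothing is lost to rounding:

```python
from fractions import Fraction

# Augmented rows for 2x + 5y - 2z = 21 and 3x - 2y + 2z = 12.
rows = [[Fraction(2), Fraction(5), Fraction(-2), Fraction(21)],
        [Fraction(3), Fraction(-2), Fraction(2), Fraction(12)]]

# scale row 0 so its leading coefficient is 1
rows[0] = [v / rows[0][0] for v in rows[0]]
# subtract 3 * row 0 from row 1 to eliminate x
rows[1] = [b - rows[1][0] * a for a, b in zip(rows[0], rows[1])]
# scale row 1 so its leading coefficient is 1
rows[1] = [v / rows[1][1] for v in rows[1]]

print(rows)
```

The last row says y - (10/19)z = 39/19, so z is free: pick any z, read off y, then x from the first row, giving the infinitely many solutions.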

A solution can be regarded as a vector, e.g.,

(x, y, z)

A function f is linear provided for x and y in the domain of f and numbers a and b we have

f(ax + by) = af(x) + bf(y)

It can be claimed that the two pillars of the analysis part of math are continuity and linearity.

Linearity is enormously important, essentially because it is such a powerful simplification.

The linear equations of linear algebra are linear in this sense.

Differentiation and integration in calculus are linear.

Linearity is a key assumption in physics and engineering, especially signals and signal processing.

In linear algebra we take the two equations in three unknowns and define matrices so that we can write the equations as

     / 2  5 -2 \ / x \   / 21 \
     |         | |   | = |    |
     \ 3 -2  2 / | y |   \ 12 /
                 |   |
                 \ z /
So, left to right, we have three matrices. The first one has 2 rows and 3 columns and, thus, is a 2 x 3 (read, 2 by 3) matrix. The second one is 3 x 1. The third one is 2 x 1.

We define matrix multiplication so that multiplying the first and second matrix gives the same results as in the linear equations:

     2x + 5y - 2z = 21

     3x - 2y + 2z = 12
In principle we could do all of linear algebra with just such linear equations and without matrices, but the effort would be very clumsy. Matrix notation, multiplication, etc. are terrific simplifications.

The 3 x 1 and 2 x 1 matrices are examples of vectors.

Matrix multiplication is linear.

The known properties of matrices and their multiplication and addition and of associated vectors are amazing, profound, and enormously powerful.
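
A Python sketch of that definition of matrix multiplication, checked against the system above (the particular solution (6, 1, -2) was found by hand):

```python
# Row-by-column matrix multiplication on lists of lists.
def matmul(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

A = [[2, 5, -2],
     [3, -2, 2]]          # the 2 x 3 coefficient matrix
x = [[6], [1], [-2]]      # a 3 x 1 vector: one particular solution
print(matmul(A, x))       # [[21], [12]] -- the right-hand sides
```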

There is a joke, surprisingly close to true, that all math, once applied, becomes linear algebra.

(12) Differential Equations

In the subject differential equations, we are given some equations satisfied by some function and its derivatives and asked to find the function.

The classic application was for the motion of a planet: Newton's laws of motion specified the derivatives of the position of the planet as a function of time, and the goal was to find the function, that is, predict where the planet would be.

Science and engineering have many applications of differential equations.

Some of the models for the spread of a disease are differential equations.

We have seen that

d/dt t^2 = 2t

Well, we may have a function of two variables

f(s, t) = st^2

We can still write

d/dt f(s, t) = d/dt st^2 = 2st

but usually we say that we are taking the partial derivative of f(s, t) with respect to t, holding s fixed.
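
A quick numerical check in Python, holding s fixed and nudging only t:

```python
# Check that the partial derivative of f(s, t) = s t^2 in t is 2 s t.
def f(s, t):
    return s * t**2

s, t, h = 3.0, 2.0, 1e-6
numeric = (f(s, t + h) - f(s, t)) / h   # difference quotient in t only
print(numeric, 2 * s * t)               # both close to 12
```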

(13) Mathematical Analysis

This topic is really part of calculus and has the fine details on the assumptions that guarantee that calculus works. The most important assumption is continuity. The next most important assumption is that the function has domain a closed interval or a generalization to what is called a compact set; a closed interval is, for real numbers a and b,

{x| a <= x <= b}

Also important are infinite sequences and series that make approximations as close as we please to something we want. So, in this way we define sine, cosine, etc.
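
For example, the usual series for sine, x - x^3/3! + x^5/5! - ..., converges fast; a Python sketch comparing partial sums with the library value:

```python
import math

# Partial sums of the Taylor series for sin(x), built term by term:
# each term is the previous one times -x^2 / ((2k+2)(2k+3)).
def sin_series(x, terms):
    total, term = 0.0, x
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

x = 1.0
print(sin_series(x, 8), math.sin(x))   # the partial sum is already very close
```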


Part IV

(14) Advanced Calculus

This topic is also part of calculus but concentrates on surfaces, volumes, and maybe the math of flows of heat, water, etc.

Commonly this part of calculus makes heavy use of vectors and matrices.

Also covered can be Fourier series and integrals.

A pure tone in music, say, from an organ, has a fundamental frequency, say, 440 cycles per second, and also overtones at natural number multiples of that fundamental frequency. Each of the overtone frequencies works like one axis of an orthogonal coordinate system, and this is a terrific way to analyze or synthesize such a tone.

The Fourier transform is closely related and has an orthogonal coordinate axis for each real number.

Engineering is awash in linear systems that do not change their properties over time -- time invariant linear systems. Such systems merely adjust some Fourier coefficients. This is a powerful result.

(15) Measure Theory

This topic develops integral calculus with a lot more generality.

(16) Probability Theory

In probability we imagine that we do some experimental trials and observe the results. We have a sample space, our set of trials.

We regard the set of all trials like a region with area, and we say that the area is 1.

Maybe we flip a coin and get Heads. The set of all trials that yield Heads is an event, the event Heads.

The event Heads has area, probability

P(Heads)

With a fair coin we have

P(Heads) = 1/2

A random variable X is some outcome of a trial. Usually X is a number. In this case we can write

P(X >= 2)

for the probability that random variable X is >= 2.

Events A and B are independent provided

P(A and B) = P(A)P(B)
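
With two fair coin flips the sample space has just 4 equally likely outcomes, so independence can be checked exactly; a Python sketch:

```python
from fractions import Fraction
from itertools import product

# The sample space of two fair coin flips: 4 equally likely outcomes.
space = list(product("HT", "HT"))

def P(event):
    # probability as (favorable outcomes) / (total outcomes), exactly
    return Fraction(sum(1 for w in space if event(w)), len(space))

A = lambda w: w[0] == "H"   # event: first coin is Heads
B = lambda w: w[1] == "H"   # event: second coin is Heads

assert P(A) == Fraction(1, 2)
assert P(lambda w: A(w) and B(w)) == P(A) * P(B)
print("the two events are independent")
```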

In probability we can have algebraic expressions with random variables, have a distance between random variables, have a sequence of random variables converge to a random variable, etc.

Given a real valued random variable X, for real number x we can let

F_X(x) = P(X <= x)

Then F_X is the cumulative distribution of X, and

f_X(x) = d/dx F_X(x)

is the probability density of X.

One of the major results is the central limit theorem that shows a distribution converging to the Gaussian bell curve.

Two more results are the weak and strong laws of large numbers that show convergence to average values (means, expectations).

One more result, amazing, is the martingale convergence theorem.

The Radon-Nikodym theorem of measure theory provides an amazing general theory of conditional expectation which is one way to look at the value of information.

In statistics we are given the values of some random variables and try to infer the values of something related.

The more serious work in probability and statistics is done on the foundation of measure theory.

(17) Optimization

We have a furniture factory and a new shipment of wood. We have some freedom in what we do with this wood. An optimization problem is how to use the freedom to make the greatest revenue from the wood.

More generally we have some work to do with some freedom in how we do the work, and an optimization problem is how best to exploit the freedom to have the best outcome, profit, for the work.

If the profit is linear and if the work and freedom are described by linear equations, then we have the topic of linear programming.

Linear programming has played a role in the Nobel prizes in economics.

The main means of solving linear programming problems is the simplex algorithm which is a not very large modification of Gauss elimination.

In practice the simplex algorithm is shockingly fast; in theory its worst case performance is exponential.

In the furniture making, we can't sell a fraction, say, 1/3rd, of a dining room chair. So, we want our solution to consist of integers. The simplex algorithm does not promise us integers. So we have a problem in integer linear programming.

In practice from the simplex algorithm we may get to make 601 + 1/3rd dining room chairs and round that to 601 or just 600. No biggie.

If we insist on integers, then, surprisingly, even shockingly, we have one of the early examples of an NP-complete problem.

In practice there are still ways to proceed; often we can get the optimal integer solutions we want easily; sometimes such solutions are challenging; sometimes we have to settle for approximations; often the approximations can be quite good, like forgetting about 1/3rd of a chair; in theory the ways have worst case performance that is exponential and nearly useless.
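
For tiny instances, brute force enumeration is one of those ways. A toy Python sketch with made-up revenue and wood numbers (nothing like the real problems, which are vastly larger):

```python
from itertools import product

# Toy integer "furniture" problem, numbers invented for illustration:
# maximize revenue 40 per chair + 90 per table
# subject to wood  3 per chair + 7 per table <= 50 units, counts integer.
best = None
for chairs, tables in product(range(17), range(8)):
    if 3 * chairs + 7 * tables <= 50:
        revenue = 40 * chairs + 90 * tables
        if best is None or revenue > best[0]:
            best = (revenue, chairs, tables)

print(best)   # (best revenue, chairs, tables)
```

Here the enumeration is exact; the exponential blow-up only bites as the number of variables grows.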

One integer linear programming problem I attacked has 600,000 variables and 40,000 constraints. I wrote some software that in 600 seconds on a slow computer found an integer solution within 0.025% of optimality. The solution made heavy use of the simplex algorithm, non-linear duality theory, and something called Lagrangian relaxation.

In addition to optimization, the simplex algorithm and related math make nice contributions to geometry, e.g., prove theorems of the alternative used in proving the famous Kuhn-Tucker conditions necessary for optimality.

Since it is possible that the work we are trying to optimize is complicated, its mathematical description may also be complicated and the optimization effort, difficult.

Optimization problems are common in animal feeding, cracking crude oil, transportation problems, and much more.

Game theory can be regarded as a special case. One proof of the famous Nash result (featured in the movie about Nash, A Beautiful Mind) is via linear programming.

Some problems are to find curves, say, how to climb, cruise, and descend an airplane to minimize fuel burned or the best curve for a missile to attack an enemy fighter jet. Newton, answering a challenge from Johann Bernoulli, found the brachistochrone, the frictionless curve that lets a bead slide down in minimum time.

Relevant fields are calculus of variations and deterministic optimal control. One famous result is the Pontryagin maximum principle.

Some problems have uncertainty; so the optimization may be to find the solution with the best expected outcome. For such problems there is the field of stochastic optimal control.



