Poisson Distribution

Here is one account of where the Poisson Distribution comes from.

First Principles

Consider first the irrational number e ~ 2.71828. It is deeply involved in equations where growth is being described; it is one of the numbers which occur at the very foundations of the universe. This number (and not the culturally natural 10) is the basis of "natural" logarithms. The number e itself can be rather naturally derived as a function of the successive whole numbers. Probably the most elegant form of that derivation is the following:

e = 1 / 0! + 1 / 1! + 1 / 2! + 1 / 3! + . . . . . (1)

out to infinity (remember that 2! is 2 factorial, or 2 x 1, and that 0! is defined to be 1). The first two terms both reduce to 1, and it is customary to spoil the symmetry of the formula and write instead

e = 1 + 1 + 1 / 2! + 1 / 3! + . . . . . (1a)

but we will prefer the more beautiful and intelligible equation (1), which better expresses the heart of the matter. The whole thing may be collapsed as

e = Σ (1 / n!) . . . . . (1b)

where it is understood that n starts at 0 and runs through all the whole numbers. Sums like that in equation (1) are open-ended: no finite number of terms gives the exact value, but the sum can be calculated to any desired degree of accuracy. The terms of this series happen to decrease very rapidly, so that the calculation is easy. After summing only nine terms (n = 0 through 8), we have determined e to the degree of accuracy usually cited: 2.71828.
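The rapid convergence can be checked with a few lines of Python. This is only a minimal sketch; the function name e_partial_sum is our own and is not part of the original account.

```python
from math import factorial

def e_partial_sum(num_terms: int) -> float:
    """Sum the first num_terms terms of equation (1): 1/0! + 1/1! + 1/2! + ..."""
    return sum(1 / factorial(n) for n in range(num_terms))

# Print the partial sum, rounded to five decimals, for a few term counts.
for num_terms in (3, 6, 9):
    print(num_terms, round(e_partial_sum(num_terms), 5))
```

With nine terms the rounded value agrees with the 2.71828 cited above; with eight terms it does not yet.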

There is also a series, due to Newton, for e raised to any power x (that is, e^x), namely:

e^x = x^0 / 0! + x^1 / 1! + x^2 / 2! + x^3 / 3! + . . . . . (2)

which is usually simplified as

e^x = 1 + x + x^2 / 2! + x^3 / 3! + . . . . . (2a)

but again we will ignore the simplification, because it conceals the structure.
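Equation (2) can be tried out numerically in the same way. The sketch below is an illustration only: the name exp_series and the choice of 20 terms are our own assumptions, and the result is compared against the standard library's math.exp.

```python
from math import exp, factorial

def exp_series(x: float, num_terms: int = 20) -> float:
    """Sum the first num_terms terms of equation (2): x^n / n! for n = 0, 1, 2, ..."""
    return sum(x**n / factorial(n) for n in range(num_terms))

# Compare the truncated series with the library value for a few powers.
for x in (0.5, 1.0, 2.0):
    print(x, exp_series(x), exp(x))
```

For small x, twenty terms are far more than enough; larger values of x need more terms before the series settles down.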

The Frequency Distribution

Now, in a probability situation, the probabilities of all the possible outcomes must sum to 1, or certainty (the options must exhaust all the possibilities). Then, taking it from the other end, any sum of non-negative terms whose value is 1 can in theory be interpreted as a probability distribution. Newton's equation (2), above, sums to e^x, not 1. But if we divide both sides by e^x, we will get a series which does sum to 1, and whose terms are all positive whenever x is positive, so that it can be a probability distribution. Dividing the right side of equation (2) by e^x amounts to multiplying each denominator of that expression by e^x, so that we have

1 = x^0 / (0! e^x) + x^1 / (1! e^x) + x^2 / (2! e^x) + x^3 / (3! e^x) + . . . . . (3)

We have 1 on the left, so the thing on the right is a possible frequency distribution. It remains to attach some meaning to its successive terms. One limitation is that the successive terms must represent all possible outcomes of a particular situation. We notice that everything in each term is a constant except for the single variable x.


Suppose that x is the rate of occurrence of a random event per some unit or module of observation: an hour of time, or a yard of length, or whatever. Then the first or "0" term of the series gives the probability that 0 events will occur during one such observation module, the second or "1" term of the series gives the probability of 1 event, and so with the "2," "3," and higher terms. We may thus restate our equation, substituting r (rate) for the previous x:

1 = r^0 / (0! e^r) + r^1 / (1! e^r) + r^2 / (2! e^r) + r^3 / (3! e^r) + . . . . . (4)
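As a quick numerical check that the terms of equation (4) really do add up to 1, here is a short Python sketch. The function name poisson_terms, the rate r = 2.0, and the cutoff of 30 terms are all assumptions chosen for illustration.

```python
from math import exp, factorial

def poisson_terms(r: float, num_terms: int) -> list[float]:
    """Return the first num_terms terms of equation (4): r^k / (k! e^r)."""
    return [r**k / (factorial(k) * exp(r)) for k in range(num_terms)]

# Hypothetical rate of 2 events per observation unit.
terms = poisson_terms(r=2.0, num_terms=30)
print(terms[:4])    # the "0", "1", "2" and "3" terms
print(sum(terms))   # should be very close to 1
```

The sum of a truncated series can only approach 1, but the terms shrink so quickly that the shortfall is negligible here.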

If we want to know, given a particular average rate of occurrence (r), in how many of a set of n time or space units exactly k events will most likely be observed, we need only calculate the "k" term of the above series (counting from the "0" term) to get the probability of k, or p(k), and then multiply by the number (n) of units observed to get the expected number of units containing exactly k events.

We may write the kth term of this series as

p(k) = r^k / (k! e^r) . . . . . (5)
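Equation (5), together with the multiplication by n described above, can be sketched in Python as follows. The function name poisson_p and the values r = 2.0 and n = 100 are hypothetical, chosen only to make the example concrete.

```python
from math import exp, factorial

def poisson_p(k: int, r: float) -> float:
    """Equation (5): probability of exactly k events when the average rate is r."""
    return r**k / (factorial(k) * exp(r))

r = 2.0   # hypothetical average rate: 2 events per unit
n = 100   # hypothetical number of units observed

# For each k, print p(k) and the expected number of units (out of n)
# in which exactly k events occur.
for k in range(6):
    p = poisson_p(k, r)
    print(k, round(p, 4), round(n * p, 1))
```

Each line of output gives k, the probability p(k), and n times p(k).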

Equation (5) is not at all hard to work out, and it is the form we will use for practical calculations. Having reached it, we may now rejoin the main Lesson.


24 Aug 2007