|


Summing Activity Leads to a Mean of eDate: 04/01/2005 at 11:08:24 From: Steve Subject: I keep getting e as the result of a class activity, why? I asked my students to keep adding random integers from 1 to 100 until the sum exceeded 100. We then found the average number of terms added. The answer seems to be e. Why? The more we do it, the closer we get. I can't find a way to derive this result, although it seems plausible. I tried using the expansion for e as a sum of reciprocals of factorials. I guess this is related to the probability, but I'm not quite sure.
Date: 04/04/2005 at 12:27:51
From: Doctor Rick
Subject: Re: I keep getting e as the result of a class activity, why?
Hi, Steve.
This is an interesting question! My working hypothesis is that the
theoretical mean may be (1 + 1/N)^N, or close to this, for the
generalized experiment in which you add random numbers from 1 through
N until the sum is greater than N. In the limit as N increases
infinitely, this value equals e. For N = 100, it is 2.7048.
As a start to understanding this problem, I considered a much simpler
case. I let N = 4, so that we are choosing numbers from 1 to 4 and
stopping when the sum exceeds 4. There are few enough possibilities
that I can enumerate them.
There is no way that one number can exceed a sum of 4. There are 10
ways that the first two numbers chosen can have a sum greater than 4:
14, 23, 24, 32, 33, 34, 41, 42, 43, 44
Each has a probability of 1/4^2 = 1/16, since there are 4^2 ways to
choose two numbers and each has equal probability. Thus P(2), the
probability that after two numbers the total exceeds 4, is 10/16.
There are 20 ways to choose three numbers such that the first two sum
to 4 or less and the third takes the sum above 4:
113, 114, 122, 123, 124, 131, 132, 133, 134, 212,
213, 214, 221, 222, 223, 224, 311, 312, 313, 314
The probability of each is (1/4)^3 = 1/64, thus P(3) = 20/64.
There are 15 ways to choose four numbers such that the first three
sum to 4 or less and the fourth takes the sum above 4:
1112, 1113, 1114, 1121, 1122, 1123, 1124, 1211,
1212, 1213, 1214, 2111, 2112, 2113, 2114
The probability of each is (1/4)^4 = 1/256, so P(4) = 15/256.
The only possibility left is 1111, in which case any choice for the
fifth number will put the sum over 4. The probability of getting 1111
is 1/256, so that is P(5), the probability that it will take 5
numbers to exceed a total of 4.
The mean number of selections required to exceed a total of 4 is
2*P(2) + 3*P(3) + 4*P(4) + 5*P(5) =
2*10/16 + 3*20/64 + 4*15/256 + 5*1/256 =
(2*10*4^2 + 3*20*4 + 4*15 + 5*1)/256 =
625/256 =
2.44140625
That's exactly 5^4/4^4 = (1 + 1/4)^4, which is what I was hoping I'd
find. Perhaps I'm on the right track. However, another Math Doctor,
Dr. George, has let me know that he did an experiment (like yours, I
assume) where N = 10, and he did not get a mean of (1 + 1/10)^10.
Dr. George had some other very helpful information, though. It is a
known result that in the limit as N increases infinitely, the mean
number of terms needed to exceed N is e. You can read about it here:
MathWorld: Uniform Sum Distribution
http://mathworld.wolfram.com/UniformSumDistribution.html
I need to study that further myself.
- Doctor Rick, The Math Forum
http://mathforum.org/dr.math/
Date: 04/07/2005 at 09:17:14 From: Doctor George Subject: Re: I keep getting e as the result of a class activity, why? Hi Steve, You can see from Dr. Rick's work that doing an analytical experiment for N=10 would be a lot of work, so what I did was a computer simulation instead. I noticed from Dr. Rick's work that he set up his discrete experiment a little differently than mine. I then changed my assumptions to match his and repeated my simulation. For N=10 I now get a mean of (1 + 1/10)^10 as Dr. Rick's intuition expected. - Doctor George, The Math Forum http://mathforum.org/dr.math/
Date: 04/11/2005 at 21:41:18
From: Doctor Rick
Subject: Re: I keep getting e as the result of a class activity, why?
Hi, Steve, I'm back!
I haven't had much time to give to this problem, but I think now I can
convince myself (at least) that it's true: If we select random digits
from 1 through N until the sum exceeds N, the mean number of
selections is (1+1/N)^N. In case you're still interested, here's the
proof.
Let's go back to the case of N=4 and construct a chart. The number in
the n-th column of the m-th row is the probability that after m
selections, the sum is n. I will only show the columns 1 through N.
1 2 3 4
------------------------
1 1/4 1/4 1/4 1/4
2 0 1/16 2/16 3/16
3 0 0 1/64 3/64
4 0 0 0 1/256
How did I generate this table? On the first selection, there is an
equal probabilty of selecting any of the numbers 1 through 4. On the
second selection, the sum cannot be 1, because the lowest possible
sum is 1+1=2. The probability that the sum will be 2 is the
probability that the sum after 1 selection is less than 2, times 1/4,
the probability that the second selection will be the number that
makes the total 2. The same follows for each succeeding entry in the
table. In general, the probability P(m,n) that the total after m
selections is n, is
P(m,n) = 0 , n < m
(1/N)Sum (k=1 to n-1) P(k,n-1) , n >= m
Now notice the difference between P(m,n-1) and P(m,n): if n >= m, then
N*P(m,n-1) = Sum(k=1 to n-2) P(k,n-1)
so that
N*P(m,n) = N*P(m,n) + P(m-1,n-1)
and
N^m P(m,n) = N^m P(m,n) + N^(m-1) P(m-1,n-1)
The denominator of each P(m,n) is N^m. The numerator of P(m,n) is
found by adding the numerators of the cell to its left, P(m,n-1) and
the cell above that, P(m-1,n-1). This is the rule for Pascal's
triangle, and indeed you can see that the columns in my table are the
rows of Pascal's triangle: 1; 1,1; 1,2,1; 1,3,3,1. Now the numbers
in Pascal's triangle are combinations; we thus get the formula
(writing C(a,b) for the combinations of "a" things taken "b" at a
time):
P(m,n) = C(n-1,m-1)/N^m , m <= n ; 0, m > n
(n-1)!
= ----------------- , m <= n ; 0, m > n
(m-1)! (n-m)! N^m
How do we use these probabilities to compute the mean number of
selections to make the sum exceed N? First, the sum of each row of
the table (up through column N) is the probability that the sum has
not yet exceeded N. The probability that the sum has exceeded N is 1
minus this sum. The probability that the m-th selection is the first
to put the sum over N, is
P(m) = P(m-th sum exceeds N) - P((m-1)-th sum exceeds N)
= (1 - Sum(n=1 to N)P(m,n)) - (1 - Sum(k=1 to N)P(m-1,n))
= Sum(n=1 to N)P(m-1,n) - Sum(n=1 to N)P(m,n)
= Sum(n=1 to N)(P(m-1,n) - P(m,n))
The probability that the first selection (m=1) puts the sum over N is
zero. Applying the formula to the table above for N=4, we get
m P(sum<N) P(m selections needed to exceed N)
-------------------------------------------------------
1 4/4 0
2 6/16 4/4 - 6/16 = 10/16
3 4/64 6/16 - 4/64 = 20/64
4 1/256 4/64 - 1/256 = 15/256
5 0 1/256 - 0 = 1/256
This agrees with my previous work based on enumerating the
possibilities. As I did there, I calculate the mean number of terms
needed to exceed N by multiplying each probability in the last column
by m and summing:
1*P(1) + 2*P(2) + 3*P(3) + 4*P(4) + 5*P(5) = 20/16 + 60/64 + 20/256
+ 5/256
= 625/256
In general, the formula is
<m> = Sum(m=2 to N+1) m*Sum(n=1 to N)(P(m-1,n) - P(m,n))
The sums can be interchanged, so that instead of summing across the
rows of the table, then add the row sums, we sum first down the
columns, then add the column sums. We have
<m> = Sum(n=1 to N) Sum(m=2 to N+1) m*(P(m-1,n) - P(m,n))
Examine the inner sum: in the case N = 4, it is
2(P(1,n)-P(2,n)) + 3(P(2,n)-P(3,n)) + 4(P(3,n)-P(4,n)) + 5(P(4,n) -
P(5,n)) =
2P(1,n) - 2P(2,n)
+ 3P(2,n) - 3P(3,n)
+ 4P(3,n) - 4P(4,n)
+ 5P(4,n) - 5P(5,n)
-----------------------------------------------
2P(1,n) + P(2,n) + P(3,n) + P(4,n) - 5P(5,n)
In general, with P(1,n) = 1/N and P(N+1,n) = 0, this is
1/N + Sum(m=1 to N)P(m,n)
The full sum is
<m> = Sum(n=1 to N)(1/N + Sum(m=1 to N)P(m,n))
= 1 + Sum(n=1 to N) Sum(m=1 to N) P(m,n)
= 1 + Sum(n=1 to N) Sum(m=1 to n) C(n-1,m-1)/N^m
I changed the summation limit in the last step because P(m,n) = 0 for
m > n.
Recall that the expansion of (1 + 1/N)^n is
(1 + a)^n = C(n,0) + C(n,1)a + C(n,2)a^2 + ... + C(n,n)a^n
Comparing the following expansion with the sum over m, we see they
are equal:
(1 + 1/N)^(n-1)/N = (C(n-1,0) + C(n-1,1)/N + ... + C(n-1,n-1)/N^(n-
1))/N
= C(n-1,0)/N + C(n-1,1)/N^2 + ... + C(n-1,n-1)/N^n
Using this result, our formula for the mean is now
<m> = 1 + (1/N)Sum(n=1 to N)(1 + 1/N)^(n-1)
= 1 + (1/N)Sum(n=0 to N-1)(1 + 1/N)^n
The remaining sum is a geometric series, and we can apply the formula
Sum(k=0 to n) a^k = (a^(n+1)-1)/(a-1)
to get
<m> = 1 + (1/N)((1 + 1/N)^N - 1)/((1 + 1/N) - 1)
= 1 + (1/N)((1 + 1/N)^N - 1)/(1/N)
= 1 + (1 + 1/N)^N - 1
= (1 + 1/N)^N
That's what I've been trying to show! It's not the tightest proof in
the world, but I can rest content. Thanks for bringing up an
interesting problem.
- Doctor Rick, The Math Forum
http://mathforum.org/dr.math/
Date: 04/12/2005 at 07:59:06 From: Steve Subject: Thank you (I keep getting e as the result of a class activity, why?) Wow. Thanks for all your effort. I'll sift through your proof this weekend. I'm glad to see that this actually works. Now I have to try to explain it to my students. Thanks again for all your work. Steve |
Search the Dr. Math Library: |
[Privacy Policy] [Terms of Use]


Ask Dr. MathTM
© 1994-2011 The Math Forum
http://mathforum.org/dr.math/