Where an event has a probability p, what is the probability of a run of at least k consecutive occurrences in n trials? u(k,n)This problem is to be found in DeMoivre's Doctrine of Chances (1718), however an exact solution has proven elusive to this very day. This page presents some new results developed in the newsgroup sci.math in a thread initiated by the editor on November 1, 2003.
It is possible to express successive values of u in the form of a recurrence relation
u(m + 1) = u(m) + (1 - u(m - n)) (1 - p) pn where u(m) = Pr that a run of n or more has happened by the m'th trial.
The initial values are u(n) = pn and u(m - n) = 0 if m < 2n
Here is the logic:
What is the probability that an event shall occur at least n times in a row in m trials?
u(m) = Pr that a run of n or more has happened by the m'th trial
v(m + 1) = Pr that a run of n or more has been completed for the first time at the (m + 1)th trial
then in order for the run to be completed at the m + 1 trial either it has been completed by the m'th trial or it is completed for the first time at the m + 1 trial:
u(m + 1) = u(m) + v(m + 1)
In order that the (m + 1)th trial may complete the first run the following conditions must be satisfied:
1. There must not have been a run of n or more up to and including the (m - n)th trial.Pr = 1 - u(m - n)
2. It must not happen at the (m - n + 1)th trial. Pr = 1 - p
3. It must happen at each trial from the (m - n + 2)th to the (m + 1)th. Pr = pn so that
u(m + 1) = u(m) + (1 - u(m - n)) (1 - p) pnThis expression resists solution as a difference equation because of its high degree, however it is possible, beginning with
u(n) = pn to calculate successive values of u until k is reached. This amounts to solving the difference equation numerically.The following VBA code (Visual Basic for Applications) was submitted to the newsgroup by Ian Smith:
Public Function Probnmp(n As Long, m As Long, p As Double) Dim C() As Double, result As Double, pn As Double, q As Double Dim i As Long, j As Long, next_j As Long If n <= 0 Or p <= 0 Or p >= 1 Then Probnmp = [#VALUE!] ElseIf n > m Then Probnmp = 0 Else ReDim C(n + 1) For i = 1 To n + 1 C(i) = 0 Next i pn = p ^ n q = 1 - p j = n For i = n To m If j > n Then next_j = 1 Else next_j = j + 1 End If result = pn * (1 + q * ((i - n) - C(next_j))) C(next_j) = C(j) + result j = next_j Next i Probnmp = result End If End FunctionWho explains his logic as follows:
As far as I can see, if q is the probabilty of a loss and P(n,m) is the probability of at least 1 run of at least length n in m trials thenP(n,m) = 0 if m < n, = q^n.(1 + (1-q).( (m-n) - sum(i=n..m-n-1, P(n,i) ) ).The logic is simple. Either the first n trials are losses or the first trial is a success and then we have n losses or we have no runs of n or more in i trials followed by a success and then n losses (for i = 1.. m-n-1). I think this enumerates all the possibile combinations without duplication.
Your formula
u(m + 1) = u(m) + (1 - u(m - n)) (1 - p) p^n
is deducible from mine, for m > n. Just calculate P(n,m+1)-P(n,m) in my notation, swap P(n,m) to u(m) and q to p and you'll get the same.
Here is some simple but inefficient code for a language that supports recursive function calls, based on the logic previously given:
function run(m, n, q); /* Returns the probability of a run of at least n */ /* consecutive failures in m independent trials where */ /* the probability of failure at any one trial is q */ integer m, n; float q; if m < n then return 0; else if m = n then return q^n; else return run(m - 1, n, q) + (1 - run(m - n - 1, n, q)) * (1 - q) * q^n; end if; end run;
Problem
Hawkeye hits the bullseye 90% of the time. What is the probability that he will score at least five successive hits in ten shots?Here the calculation is simplified by the fact that the number of trials is not greater than twice the length of the run. Either he can score five successive hits at the outset or else he can score a miss followed by five hits. This can happen starting with either shot one, two, three, or four, so that the probability is
(1 + 4(0.1))(0.9)5 = 0.8267 One notes also that in implementing a program to solve the above difference equation one can use
u(2n) = (1 + (n - 1) (1 - p)) pn as a starting value.
Bibliography
Burnside, William
The Theory of Probability (1928), Chapter 3
Dover Publications, 1959Uspensky, J. V.
Introduction to Mathematical Probability, Chapter V, Section 3
McGraw Hill, 1937Villarino, M. B., The Probability of a Run (2006)