In each Crackerjack box, in addition to the crackerjack, there is a charm. If there are 36 individual charm designs and each box is equally likely to contain any of them, how many boxes must I buy to be 90% sure of getting a complete set of charms?This sometimes is knows as "The Coupon Collector's Problem" or "The Classical Occupancy Problem" and formerly it has encountered difficulties in computation and required methods of approximation when the population or the number of items to be categorized is large. Here we use a computing environment that invokes exact rational arithmetic for precise answers that put the approximation methods to a good test.
Solution by Henry of Rotherhithe
The function T(m,n) is an enumerator for the number of ways m numbered objects can be distributed to n numbered boxes such that no box is left empty. It can be calculated as
T(m,n) = n!S2(m,n), where S2(m,n)are the Stirling numbers of the second kind and can be found in the larger collections of mathematics tables. Here we shall use it not to enumerate the ways of putting the charms into the boxes but rather to enumerate the ways of assigning the boxes to categories according to the types of charm they contain. If a collection of boxes contains a complete set of charms, then no category will be left empty.If there are b boxes and d charm designs, then there will be T(b,d) possible collections that contain all of the charms and bd possible collections formed irrespective of whether they contain all of the charms or not. The probability, therefore, that a collection of b boxes contains all of the charms is T(b,d) / bd. We can set this equal to 0.9 and solve numerically to get the value of b for 90% confidence. The results are, for 90% confidence, 208 boxes; for 95% confidence, 233 boxes; and for 99% confidence, 291 boxes. Details by Henry
Below is a link to a Maple 6 worksheet that was used to perform the calculations. The zipped file contains both the worksheet and a rich text format version that can be read with any word processor.
WorksheetSolution by K. L. Q. Read
K. L. Q. Read gave a more general solution, namely, pn(r,s), representing the probability that if there are n charm designs, that after buying r boxes we have exactly s of them:![]()
Diego Kuonen (The American Statistician, Vol. 90, No. 429, March, 1995) cites a problem by Feller:
In a village of 1900 people what is the probability of finding no days of the year that are not birthdays?While P(1900, 365) may be calculated with an alternating series, doing so with ordinary floating point arithmetic would lead to complete loss of significance due to cancellation error. Kuonen cites Feller's Poisson approximation, 0.1353 and a saddle point approximation of 0.1339. Maple gives P(1900,365) as 0.1322813307 and that figure can be relied on not only for that many decimal places but for a great many more due to Maple's use of exact rational arithmetic. With numbers this large the Dawkins approximation does fairly well at 0.13499. With ordinary floating point arithmetic, full precision could be obtained by first creating a table of S2(m,n) using a recurrence relation that uses only the operations of addition and multiplication. We expect to give details in the near future.Which could be stated a little more clearly as:
What is the probability that every day is somebody's birthday?
The Classical Occupancy Problem by William Feller, from his An Introduction to Probability Theory and Its Applications, 2nd Edition, Volume I, New York: John Wiley & Sons, London, Chapman & Hall, Ltd. 1957, pp. 91 - 95
Calculated and written up by Sam Allen