The Secretary Problem

The Secretary Problem, also known as The Sultan's Dowry Problem, is believed to have been first stated in Martin Gardner's Mathematical Recreations column in the February 1960 issue of The Scientific American and because of its practical implications has created a subject of study all its own in the field of Management Science. It concerns the best strategy to use in order to select the best candidate from a non-reversible sequence of applicants. In the Arabian Nights version a materialistic suitor wishes to select the sultan's daughter with the largest dowry. The setting here is a college job fair where representatives from industry come to interview those about to graduate.

At a school job fair a recruiter will be given the opportunity to see a group of students. His instructions are to hire the one with the highest grade point average. Once he sees a student he must accept or reject him on the spot. He cannot go back and change his mind because another recruiter will get him. What strategy should he use in order to maximize his probability of getting the student with the highest GPA? He knows the number of students who will come to see him but he does not know the range or distribution of scores.

The consensus among the many who have worked on this problem is that his best strategy is to let a certain fraction of the students pass and take the next one who has a score better than any of the ones seen thus far. Let us first address this problem by direct enumeration. The higher the number the better.

Three Students

                1 2 3   1 3 2   2 1 3   2 3 1   3 1 2   3 2 1
Inflexibly taking the first student, the last student, or any other particular student leads to a probability of success of 1/3. Letting the first student pass and taking the next one with a better score than his leads to success in the second, third, and fourth permutations, for a probability of success of 1/2, Letting two go by leads to success only in the first and third permutations, so the best strategy for three students is to let one go by and the probability of getting the best one is 1/2.

Four Students

               A         B         C         D         E         F

    1       1 2 3 4   1 2 4 3   1 3 2 4   1 3 4 2   1 4 2 3   1 4 3 2

    2       2 1 3 4   2 1 4 3   2 3 1 4   2 3 4 1   2 4 1 3   2 4 3 1

    3       3 1 2 4   3 1 4 2   3 2 1 4   3 2 4 1   3 4 1 2   3 4 2 1

    4       4 1 2 3   4 1 3 2   4 2 1 3   4 2 3 1   4 3 1 2   4 3 2 1
Letting the first student go by and taking the next one with a score better than his works for permutations 1 E and F, 2 B, E and F; and 3 A, B, C, D, E and F for eleven wins out of 24 permutations and a probability of success of 0.458. Letting the first two pass and taking the next student with a better score than either of those succeeds in 1 B, C, and D; 2 B, C, and D; 3 A, B, C, D, for ten wins out of 24 permutations and a probability of success of 0.416. For a group of four students the first strategy still is better.

We can't carry this approach much farther because of the factorial growth of the number of permutations. For this reason we try to derive an algebraic approach.

Let us say that n students in total will be seen and that the interviewer will let m pass in order to test the market. There are two ways the interviewer can fail, 1) The best student is among the first m, or 2) The best student is not among the first m but is preceded by a student with a higher score than any in the first m.

The Probability of Winning with Student m+1

With the interviewer's poor state of information the probability that any particular student out of n students has the highest score is 1/n. If student m+1 has the highest score he is certain to be selected, so that the probability of winning with student m+1 is 1/n.

The Probability of Winning with Student m+2

The probability that student m+2 has the highest score is 1/n, but he will be selected only if the student with the highest score in the range 1 to m+1 is among the first m. The probability of that is m/(m+1), so the probability of winning with student m+2 is m/(n(m+1)).

The Probability of Winning with Student m+3

The probability that student m+3 has the highest score is 1/n, but he will be selected only if the student with the highest score in the range 1 to m+2 is among the first m. The probability of that is m/(m+2), so the probability of winning with student m+3 is m/(n(m+2)).

The Probability of Winning with the Last Student to be Seen

The probability that student n has the highest score is 1/n, but he will be selected only if the student with the highest score in the range 1 to n-1 is among the first m. The probability of that is m/(n-1), so the probability of winning with student n is m/(n(n-1)).

If the interviewer wins at all, he can do so with only one of these students, so these events are exclusive and their probabilities may be added. The probability of winning with one of the students m+1 through n, therefore, is the following expression. We have taken out the factor m/n.

             m          ( 1    1     1           1  )
        P =  - S,   S = ( - + --- + --- + ... + --- )
             n          ( m   m+1   m+2         n-1 )

The optimal value for m is the one which maximizes P. We can find such a value by searching in the area of m = n/3 and we find:

n = 3, m = 1, S = 1.500, p = 0.500* n = 3, m = 2, S = 0.500, p = 0.333 n = 4, m = 1, S = 1.833, p = 0.458* n = 4, m = 2, S = 0.833, p = 0.416 n = 5, m = 1, S = 1.917, p = 0.416 n = 5, m = 2, S = 0.917, p = 0.433* n = 5, m = 3, S = 0.583, p = 0.350 n = 6, m = 1, S = 2.283, p = 0.380 n = 6, m = 2, S = 1.283, p = 0.428* n = 6, m = 3, S = 0.783, p = 0.392

In searching for an optimal value of m starting with m = 1 we notice that the probability of success increases up until the optimal value is reached and then declines. Notice also that the value of the series P passes through the value 1 in the vicinity of the optimal value of m. Now we would like to develop an approximation that will reduce the necessary calculation for very large n.

For very large n the series sum(1/i, i = 1..n) may be replaced by the approximation E + ln(n), where ln is the logarithm to the base e and E is Euler's constant. We subtract the approximation for sum(1/i, i = 1..m-1) and for large n we can approximate S = ln((n-1)/(m-1)) and set it equal to 1 for optimal m. From this we conclude that (n-1)/(m-1) = e and that for large n the optimal value of m is (n-1)/e + 1. For n = 100 this gives m = 37, which is the same as the computed value. A slight modification permits the approximation to be used for smaller values of n. Using m = (n-1)/e gives m = 3 for n = 10, which is the same as the computed value. For large n using m = (n-1)/e + 1 and S = 1 we conclude that P = 1/e = 0.368, approximately. The probability of success using the optimal strategy goes from 0.500 for small n to 0.368 for large n.

The above web site offers a forum for discussing this problem.


References

Gilbert, J. and F. Mosteller, "Recognizing the Maximum of a Sequence" J. American Statistical Association 61, pp. 35-73 (1966)

Ferguson, T. S., "Who solved the secretary problem? Statistical Science Vol.4, pp. 282-296 (1989)


Sam Allen, with the assistance of
Amnon Rapoport and Darryl A. Seale