Bayesian decision riddle
The current puzzle on The Riddler is a version of the secretary problem with an interesting (?) Bayesian solution.
Given four positive numbers x¹, x², x³, x⁴, observed sequentially, the associated utility is the value of x at the stopping time. What is the optimal stopping rule?
While nothing is mentioned about the distribution of the x’s, I made the assumption that they were iid and uniformly distributed over (0,M), with M unknown and tried a Bayesian resolution with the non-informative prior π(M)=1/M. And failed. The reason for this failure is that the expected utility is infinite at the first step: while the posterior expected utility is finite with three and two observations, meaning I can compare stopping and continuing at the second and third steps, the predicted expected reward for continuing after observing x¹ does not exist because the expected value of max(x¹,x²) given x¹ does not exist. As the predictive density of x² is max(x¹,x²)⁻²… Several alternatives are possible to bypass this impossible resolution, from changing the utility function to picking another reference prior.
For instance, using a prior like π(M)=1/M² l(and the same monetary return utility) leads to a proper optimal solution, namely
- always wait for the second observation x²
- stop at x² if x²>11x¹/12, else wait for x³
- stop at x³ if x³>23 max(x¹,x²)/24, else observe x⁴
obtained analytically on a bar table in Rouen (and checked numerically later).
Another approach is to try to optimise the probability to pick the largest amount of the four x’s, but this is not leading to an interesting solution, since it corresponds to picking the first maximum after x¹, while picking the largest among remaining ones leads to a somewhat convoluted solution I have no patience to produce here! Plus this is not a really pertinent loss function as it does not discriminate enough against waiting…
Leave a Reply