Finding the perfect baking time for that soufflé (or: optimal search with test penalty)

TL;DR

How to find a value in a range of integers when every test carries a cost. Pretty graphs at the end.

Don’t ruin that soufflé

You know the problem. Take it out of the oven too soon and you get a raw egg; leave it in too long and you’ll end up with a fat, flat omelet (or is it omelette?). A soufflé is like Schrödinger’s cat: if you open the oven to observe it, it collapses from the good-bad superposition into a slobbery glob of goo. So what to do?

The real answer is to get a good cookbook or tips from someone who knows. But let’s not bother with reality here: there’s an interesting problem hiding here that’s worth exploring. Let’s assume you can only make one soufflé at a time, and that if you open the oven to check, you can’t put it back in. It will either be underdone, exactly ready, or overdone. Also let’s assume that the ideal time to bake a soufflé is a constant, integer number of minutes. You need to find out quickly what’s the best baking time because you are hosting a party in a few hours and want to be prepared. Finally, let’s assume that we tried to make one and after baking it for 30 minutes found that it was overdone.

So what’s the fastest way to find out the perfect baking time for a soufflé?

So here’s the actual question

To state the problem more generally: we have a test, a boolean function of integers f(t), that tells us whether the desired value s is at most the tested value t. Or, in Haskell:

f :: Integer -> Bool
f t = t >= s  -- True exactly when s <= t (s is the hidden value we're searching for)

Every time we do the test (evaluate f(t)) we pay a penalty – in the case of a soufflé, we waste time: we have to scrap the attempt and start baking a new one. Let’s call this penalty the cost function, and let it depend on the value we are testing. For our simple purposes, let the function be from integers to integers.

cost :: Integer -> Integer
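
For example (the names here are just for illustration), the two cost functions examined later in this post might be written as:

constCost, linearCost :: Integer -> Integer
constCost _ = 1   -- every test costs the same
linearCost t = t  -- testing t means baking (and waiting) for t minutes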

Finally, we point out that s, the value we’re trying to find, is a random variable; pf x y gives the probability of finding s in the range [x,y]. This function satisfies:

pf :: Integer -> Integer -> Double
-- the rule is:
pf x y = sum . map pmf $ [x..y]

where pmf (a probability mass function) is a distribution on the total range [a,b] in which s is known to reside.

pmf :: Integer -> Double
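
For example, a uniform distribution over the full range (anticipating the simplifying assumption below; a and b are assumed to be in scope, as elsewhere in this post) could be written as:

uniformPmf :: Integer -> Double
uniformPmf _ = 1 / fromIntegral (b - a + 1)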

Being a probability mass function, pmf must satisfy:

sum (map pmf [a..b]) == 1
-- or, equivalently:
pf a b == 1

The problem, then, is to find an algorithm that finds s with minimal expected cost. The expected total cost of searching a range [x,y], when testing a value t in that range, can be calculated recursively. Let’s say our algorithm decides which value to test next using a function guessVal(x,y). First we need to define the conditional probability cpf of finding s in a range [x’,y’] given the fact that it is known to lie in an enclosing range [x,y]:

cpf :: Integer -> Integer -> Integer -> Integer -> Double
cpf x y x' y' = pf x' y' / pf x y  -- conditional probability; assumes [x',y'] lies within [x,y]

Then we can write:

avgCost :: Integer -> Integer -> Double
avgCost x y =
    if x == y
    then 0
    else fromIntegral (cost t)
         + cpf' x t     * avgCost x t
         + cpf' (t+1) y * avgCost (t+1) y
    where t    = guessVal x y
          cpf' = cpf x y

This assumes that t is actually in the range [x,y) and that s is known to lie in [x,y]. If the range is a single integer, there is nothing to check and thus zero cost. Otherwise, the cost is the cost of the test itself plus the probability-weighted average of the two branches (the test on the guessed value came out False or True). Note that for any distribution, the two cpf values above always sum to 1, since together they cover the entire range on which the condition applies. We need to minimize the expected cost, so the general problem is:

Choose guessVal that minimizes avgCost for a given range [a,b].

A minor subtlety to note is that we never need to test the highest value of a range because it will always test positive. This can be expressed as the rule:

x <= (guessVal x y) < y
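
In code, the search policy is just a function of the current range; its implementation is exactly what we are trying to find:

guessVal :: Integer -> Integer -> Integer
-- required: x <= guessVal x y < y, whenever x < y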

Simplifying assumption

Let’s assume that our probability distribution is uniform. In that case the probability function depends only on the size of the range being tested and a constant that depends on the complete range size:

pf x y = fromIntegral (y - x + 1) / fromIntegral (b - a + 1)
-- and the conditional probability:
cpf x y x' y' = fromIntegral (y' - x' + 1) / fromIntegral (y - x + 1)

Under these conditions the avgCost function may be rewritten as:

avgCost x y =
    if x == y
    then 0
    else fromIntegral (cost t)
         + c * (fromIntegral (t - x + 1) * avgCost x t
              + fromIntegral (y - t)     * avgCost (t+1) y)
    where t = guessVal x y
          c = 1 / fromIntegral (y - x + 1)

In many cases a uniform distribution is not a good pick – for example, when we have a pretty good guess about where s lies in the range, a normal distribution would be more appropriate.

Cost functions

Constant cost

In the simplest case, the cost function is a constant – most trivially, it is always 1. Our problem then boils down to minimizing the number of tests, which is the same as finding a value in a sorted array. The well-known solution is binary search. Our average cost is then:

avgCost x y =
    if x == y
    then 0
    else 1 + avgCost x t  -- the two halves have (nearly) equal expected cost
    where t = x + (y - x) `div` 2

This works out to an expected cost of roughly log2(n) (rounded up), where n is the size of the complete range.
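
For completeness, here is a sketch of the search procedure itself under the binary-search policy, using the test f from above:

search :: Integer -> Integer -> Integer
search x y
    | x == y    = x
    | f t       = search x t        -- s <= t: continue in the lower half
    | otherwise = search (t+1) y    -- s > t: continue in the upper half
    where t = x + (y - x) `div` 2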

Linear cost

Unfortunately, when baking a soufflé every attempt has more than a constant cost – you have to wait a certain amount of time to find out if a given duration for baking is too short or too long. If time is of the essence, an appropriate cost function would be linear and increasing, such as cost(t) = t. In that case the equation becomes:

avgCost x y =
    if x == y
    then 0
    else fromIntegral t
         + c * (fromIntegral (t - x + 1) * avgCost x t
              + fromIntegral (y - t)     * avgCost (t+1) y)
    where t = guessVal x y
          c = 1 / fromIntegral (y - x + 1)

Just a reminder: the problem is to minimize the average cost by picking the right guessVal function – picking the right t every time. If anyone solves this, let me know.
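
Short of a closed-form answer, the minimization can at least be written down directly: instead of fixing guessVal, take the minimum over all admissible guesses. Here is a sketch under the uniform assumption with cost(t) = t; it is exponential as written, and memoizing on (x,y) would turn it into a dynamic program:

bestCost :: Integer -> Integer -> Double
bestCost x y
    | x == y    = 0
    | otherwise = minimum
        [ fromIntegral t
            + p (t - x + 1) * bestCost x t
            + p (y - t)     * bestCost (t+1) y
        | t <- [x .. y-1] ]
    where p k = fromIntegral k / fromIntegral (y - x + 1)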

Visualization

As suggested by this answer1, any sensible algorithm is equivalent to a binary decision tree (given a range [a,b]). At each node, the algorithm tests a value x, spending cost(x), to determine which branch to take. If we assign a cost to each node in the tree, the expected cost of the tree is the weighted average of root-to-leaf total costs (where the weights depend on the distribution pmf; for the uniform distribution the weights are identical and can be ignored), and the goal is to find the tree that minimizes it. The leaves themselves have zero cost because they signify that the result is known.

I wrote a small program that finds the best algorithm (decision tree) for a given range and cost function, and prints out the results in graphviz-compatible .dot format. Here they are for constant cost (= binary search), linear, and quadratic costs for the range [0,14]. Since I don’t (yet) have a solution for the linear cost case, I was surprised to find that the best algorithm for linear cost is still binary search (update: that’s not true – it was a side effect of testing only small trees; I’ll write a new post about the updated analysis), despite linear cost having what is intuitively a considerable bias toward the beginning of the range. For quadratic cost (cost(t) ~= t^2), the best algorithm is quite lopsided, as you can see below. It’s possible that there’s a slight bias effect for linear cost too, one that only becomes noticeable for larger intervals.
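
The program itself isn’t reproduced here, but a minimal sketch of a decision-tree type and its .dot rendering (hypothetical names, not the original code) might look like this:

data Tree = Leaf Integer | Node Integer Tree Tree

toDot :: Tree -> String
toDot t = "digraph G {\n" ++ go "r" t ++ "}\n"
  where
    go p (Leaf n)     = node p ("s = " ++ show n)
    go p (Node v l r) = node p ("test " ++ show v)
                     ++ edge p (p ++ "0") ++ edge p (p ++ "1")
                     ++ go (p ++ "0") l ++ go (p ++ "1") r
    node p lbl = "  " ++ p ++ " [label=\"" ++ lbl ++ "\"];\n"
    edge a b   = "  " ++ a ++ " -> " ++ b ++ ";\n"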

[Figure: linear cost yields the same decision tree as constant cost for the range [0,14]]

[Figure: quadratic cost yields a lopsided decision tree]

The dreaded Space Leak

The range is that small because the number of trees to test grows very fast – I could only get up to 14 before running out of memory, which is another story. The implementation is naive: the program builds all the trees before calculating the average cost of each. The right way would be to calculate the cost of each tree as it is generated and then discard it, so that the GC can reclaim it when memory gets tight.
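
A sketch of that leaner approach (again hypothetical, reusing the Tree type from the sketch above, the uniform distribution, and the cost function from earlier): produce the trees lazily, score each one as it streams by, and keep only the minimum, so consumed trees can be garbage-collected.

trees :: Integer -> Integer -> [Tree]
trees x y
    | x == y    = [Leaf x]
    | otherwise = [ Node t l r | t <- [x .. y-1]
                               , l <- trees x t
                               , r <- trees (t+1) y ]

-- expected cost of a single tree over [x,y]
treeCost :: Integer -> Integer -> Tree -> Double
treeCost _ _ (Leaf _)     = 0
treeCost x y (Node t l r) =
    fromIntegral (cost t)
        + p (t - x + 1) * treeCost x t     l
        + p (y - t)     * treeCost (t+1) y r
    where p k = fromIntegral k / fromIntegral (y - x + 1)

bestTreeCost :: Integer -> Integer -> Double
bestTreeCost x y = minimum (map (treeCost x y) (trees x y))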

So what about that soufflé?

As visualized in the graph for linear cost above, when time is the issue you should still use binary search (update: the next post will explain why this is wrong). If the cost is quadratic, give higher priority to lower values.

Open questions

  1. Solve the minimization problem for the recursive equation for avgCost when the cost function is linear: cost(t) = t.
  2. Take into consideration other probability distributions. For example, solve for linearly increasing distribution (where larger values are more likely to equal s) and normal distribution around the middle of the range.
  3. How do I improve the implementation of the brute force algorithm to not explode in RAM?

Let me know when you’re done :)

If you have any comments or insights, I’d be glad to hear them! My family can take a few more soufflé attempts…


1. Since I couldn’t find any information about this problem, I asked this question on cstheory.stackexchange.com. I got some backlash from one person about this not being a “research-level question” as dictated by the site’s FAQ… You can check out the other questions on the site to decide for yourself if this question exceeds the maximum dumbness level.


8 thoughts on “Finding the perfect baking time for that soufflé (or: optimal search with test penalty)”

  1. It would also be interesting if, say, overcooking and undercooking had different costs.
    I’m thinking of that interview question where you have to find out what floor a dropped egg will break from in the fewest number of steps, but you only have n eggs (n = 2 most of the time, but it’s variable).

    • In that case cost(t) becomes a random variable with the rule:
      cost t = if t < s then 0 else 1
      In the avgCost function we’d have to plug in a probability: P(t >= s | s in [x,y])

    • Giving it a little more thought: for a uniform distribution that conditional probability is linear in t, and the problem is equivalent to linear cost. Wait… no, it isn’t linear in t, because it depends on the subrange [x,y] we are calculating (where we are in the recursion). I’ll have to think about it more…

    • Done, it solves the memory issue completely and runs faster. I’ll post an update about that – with larger trees you see that linear cost is not equivalent to constant cost (binary search).

  2. Ultimately, aren’t you in all cases doing binary search, not in feature space but in *cost* space? That is, at each decision point, you choose two subtrees such that the total cost of exploring each subtree is equal? This should lead to a closed-form solution, but I’ll have to think about it more.

    • Noah, counterexample: in a range [a,b], cost(a) = 0, and for any other value = 1. In that case your “balanced cost subtrees” would lead to picking some value other than a itself, but a must be the right choice for minimizing cost overall (we can test it for free).

      I also think there’s a way to show equivalence of this problem to the regular searching one. For integer cost, one way would be to map the range [a,b] to a “dilated” range where each value x in the original range takes up cost(x) cells in a target array, with order retained. (actually that’s also wrong)

  3. Pingback: Penalty-search problem revisited: Dynamic Programming & Control.Parallel to the rescue | Enough Blogging!
