by:Mike McCourt, Ph.D.
In an earlier post, we introduced the topic of multicriteria optimization and some mechanisms for solving multicriteria problems (also called multiobjective problems) using theSigOpt optimization tool; SigOpt provides aRESTful API to solve scalar black-box optimization problems for customers from fields such asindustrial design andimage processing. The earlier introduction discussed a somewhat simple problem of choosing the driving speed on a trip subject to concerns about minimizing both time in transit and cost of fuel. In this post we want to analyze a more complex situation in which the parameters of a given model produce a random output, and our multicriteria problem involves maximizing the mean while minimizing the variance of that random variable.
The standard example in literature of simultaneously considering the mean behavior of a model/process and the variance is in financial portfolio management. Generally, investors want to maximize expected gain while minimizing risk (which exists in the form of variance) which motivated the development of the Nobel prize winning theory of mean-variance analysis . The original work and a 2014 retrospective by the same author (as well as modern texts such as here and here ) serve as helpful introductions to the theory of defining (even implicitly) a utility function and, potentially, an associated quadratic (mean-variance) approximation. Under some conditions (discussed in those references) an expert’s selection of a portfolio from the mean-variance Pareto frontier should approximately optimize the utility function.
Our example today is less practical and more fun (1) . We start with a mouse that has some ability to intelligently traverse a maze and study how long it takes the mouse to solve a randomly generated maze. Then we try to tune parameters of the random maze generation algorithm to create mazes which are very difficult for the mouse to solve. This will be phrased as a multicriteria optimization problem, with the goal of maximizing the mean time to solve while minimizing its variance.
Variations on a theme
The motivating goal of mean-variance analysis involves both wanting good performance (high portfolio return) and wanting to avoid bad performance (high risk and potential for loss). These are certainly related, but are not necessarily the same thing. For example, gamblers who play slot machines are told on the machine exactly how much they will win/lose on average by playing the slot machine. If the slot machine paid out $0.90 for every $1.00 put into it, with no variance whatsoever, I doubt anyone would pla (2) . But, gamblers may be willing to accept a low mean payout in exchange for a high variance because the higher variance implies a greater chance for a million dollar payout. In this setting, the gambler embracing the high likelihood of a bad performance in exchange for a potentially exceptional performance.
Other circumstances where our goal would not be to maximize the mean and minimize variance include many industrial processes: in the formation of a tool such as a nail, the goal is not to maximize the mean length of the nail but rather to minimize deviation from a desired length. This goal was formalized with the development of the Taguchi loss function (3) , which served as a scalarization of the multicriteria optimization problem involving minimizing both the variance and the deviation from the desired outcome.
Maze solver description
Our maze solver strategy is the very simple wall follower strategy , which will, admittedly, only work on simply connected mazes (no loops, no levels). Given that, it is quite easy to implement and has predictable (non-random) behavior which helps push the entirety of the randomness to the maze generation algorithm. For those of you interested, you can see the implementation of the maze construction and solution in our examples Github repo .
The strategy involves the mouse continually following the wall on the right all the way through the maze until reaching the end. If the mouse can turn right, it does so, otherwise it moves forward; if both those paths are blocked it attempts to turn left. If all three directions are blocked, the mouse has reached a dead end and reverses course, continuing to hug the right wall to guide it forward. There is an outstanding explanation and graphic of this wall follower strategy available here which I will not try to supercede. The figure below, instead, shows why the wall follower strategy succeeds for simply connected mazes, but may fail for mazes which have their exit inside a loop.
Figure 1: Two mice moving through slightly different mazes (see if you can spot the difference) using the wall follower strategy. (left) This maze is solvable because all walls are attached to the boundary (right) This maze is not solvable because the solution is next to a wall which is not attached to the boundary; eventually, the mouse simply returns to the start. Maze generator description
It is quite tempting for myself, as a research-oriented person, to see the problem of designing a maze as an opportunity to spend several days devising diabolical labyrinths under the cover of “developing content marketing”. I restrained, however, if only because so many great resources are already available on the internet. In particular, Wikipedia already had available a perfectly acceptable starting point involving a random depth-first search . Furthermore, the resulting maze structure guarantees that the failure to solve observed in Figure 1 cannot be encountered.
The only shortcoming with the code already available is that there were no tunable parameters. We needed to introduce a mechanism for making the mazes harder or easier for the mouse to solve; we did this by varying the relative probability of the depth-first search choosing each of the possible directions with which it could move, after eliminating those directions which it has already traversed. Essentially, at each step of the maze generation, there is a relative probability of punching through a wall. The figure below shows one such randomly generated maze, which has no preference for any direction.
Figure 2: A sample maze creation process with equal probability of moving in each direction.
By allowing the creation algorithm the possibility of different probabilities, mazes will start to gain a more identifiable structure, as demonstrated in the figure below. The probability of moving left, up, right and down are the parameters that we tune using Bayesian optimization. (4)