In the Part 1 of this series, I showed how to turn chessboard recognition into an optimization problem. In Part 2, I discussed different ways of solving this optimization problem, deciding on a custom evolutionary algorithm with tailored mutation operators described in Part 3. In this post, I discuss my solution. A video of the final solution is shown to the right.
Basic Solution
The last three posts developed the chessboard recognition problem as an optimization problem to be solved by an evolutionary algorithm using specialized mutation operators. So given the score function that assigns a score to possible quadrilaterals as chessboard boundaries, this simple algorithm works as follows. First, a population of 100 quadrilaterals is generated by placing a virtual camera at a random positions around an imaginary chessboard and projecting the chessboard into the camera's frustrum. Then the following steps are repeated one hundred times:- Each quadrilateral in the population is scored.
- The quadrilateral with the highest score is chosen; for this purpose the best quadrilateral from the last population is included, so that selection is elitist.
- The best quadrilateral is mutated using the tailored mutation operations to generate a new population of 10 quadrilaterals.
An Elitist Island Model
Although the basic solution works well, it sometimes gets stuck. This problem can be alleviated by running the basic algorithm several times in a distributed fashion. So when the chessboard detection system first starts, it runs the basic algorithm eight times in eight separate processes. Then each one of the eight instance of the algorithm returns a proposed solution. These eight solutions are compared using the score function, but with the addition of a score component that checks the color of the different board positions, which is too expensive to check for all 8,000 possible solutions. The best out of these eight solutions is chosen as the current best guess at the chessboard location for one time step. It takes 10-20 seconds to compute this guess.
Then, for subsequent time steps, four smaller evolutionary algorithms are run. Rather than generating new chessboards from random camera positions, each algorithm is seeded with the initial guess, and generates a new population of ten quadrilaterals just as in Step 3 above. The algorithm is run for just ten more generations, and then the four best quadrilaterals for each evolutionary algorithm are compared again. Once again, the color information is also checked, and the best quadrilateral is published as the current chessboard guess. These smaller evolutionary runs are run in parallel and complete within 250 milliseconds. So after initialization, the entire model runs at 4 Hz, publishing four successive guesses at the chessboard per second.
An algorithm of this form is called an island model. Each distributed evolutionary algorithm is called an island, and when information travels between islands it is called migration. Island models can be very effective because they maintain some diversity across all islands while allowing a focused search on each island. In this case, our island model is elitist, because when the islands exchange information, it wipes out the past entirely.
Adapting to Change
One of the nice things about an evolutionary algorithm is that it adapts easily to change. If the board is moved while the board detector is running, it only takes a few cycles (5-10 seconds) for the best chessboard guess to adapt to the new reality. While this latency is not fast enough to allow the robot to move its eyes while tracking the board, it is good enough to adjust reasonably if someone accidentally (or intentionally) bumps into the board.
Also, in order to promote stability when the board does not change, it makes sense to keep old detected intersection points, since the lines and intersections detected at each frame may vary. In my approach, old intersection points are rolled off every 5 frames. This is enough to make the board detection robust, for example when an hand is placed between the camera and the board. You can see an example of this in the video above.
Overall Results
The overall solution is very robust and relatively fast. The image to the right shows what a converged example looks like, and the video above shows how the detection adjusts when the board is obscured or moved.
No comments:
Post a Comment