Wednesday, September 3, 2014

Robust Chessboard Recognition, Part 3: Tailored Mutation Operators


A correctly detected chessboard
In the last two posts, I discussed the chessboard recognition problem and the types of algorithms we could use to solve it. In this post I will describe the solution I developed. As described in the first post, I use an image of the chessboard taken from the iCub's eye, extract lines and intersections, and create a chessboard scoring function that takes a proposed quadrilateral, warps the image to undo the perspective, and then checks to see how many lines and intersections match what is expected from a chessboard. An example of a good quadrilateral is shown to the right.

In the second post, I argued the case for using a simple evolutionary algorithm with custom mutation operators tailored to look for chessboards. In this post, I will describe the mutation operators. In the next post, I'll describe an elitist distributed island model used to select the best chessboard candidates.

Rectangle Mutation

A randomly generated quadrilateral
The general idea is that we start with a quadrilateral with a decent score, and we want to change in a way that might make it better. At the very beginning, the quadrilaterals are chosen by using a projective model from a camera whose position and orientation are somewhat random above and around the chessboard. This projection generates some random convex quadrilaterals. An example is shown to the left. So let's think about what could be wrong with our quadrilateral, and how we could adjust to quadrilateral to fix it.



Snapping Corners to Points


Leftmost corner snapped to point
Looking at the chessboard in the real image above, we notice that the generated quadrilateral has no relationship the to the lines (in red) and intersections (white circles) that we computed for the scoring function. Ideally, the right quadrilateral should have corners that are near intersections on the chessboard. So one of the first mutations we might consider is to snap a corner of the quadrilateral to some intersection that we detected. An example of what happens after this is done is shown to the right. It looks like we have now have nearly matched near the upper left corner of the real chessboard. The image on the right was generated by just ten trial points. 
Top right snapped to point
This snapping operation works because there is a concentration of intersection points inside or near the chessboard, so with high probability we'll pick a point right around the chessboard, and there is a decent probability that we will get a snap point closer to one of the corners. If we do it again, the we might get a second corner a bit closer. This is shown to the right. The snap point actually came from the keyboard. But it doesn't matter, since it got us closer to the right answer.  Again, I only had to sample 10 snap points to get the image in the right.

But now we're stuck. The snapping operation works by randomly choosing an intersection and then moving the closest corner of the quadrilateral to that point, as long as the corner lies within a threshold from the snap point (50 pixels from a 480 x 640 image here). As can be seen, our two earlier snap points are now closer to most of the new snap points we might choose than the other two corners are. Despite generating 100 new samples, I could not get a third corner in place. We need something more.

Elongating and Shortening

After six shortening steps
After three shortening steps
We need to move the corners that are below and to the right of the visible area. Specifically, we need then to come close enough to the other corners so that we can snap them into place. One way to do this is to take lines on opposite sides of the quadrilateral and make them shorter, pulling the lower corners in towards the top two points.

After six more snaps
The actual mutation operation chooses a random side of the quadrilateral, and then picks a percentage to shorten or lengthen that side and its opposite. The percentage is chosen from a random normal variable with a variance of 12.5%. In our case, the quadrilateral's corners are off at 1255 pixels on the right, while the image ends at 640 pixels. It will take several operations to pull the right side in towards the image.

The images to the right show the progress after three and six steps. Notice that we've also gotten further away from the corners. But now the corners at the right and bottom are close enough that we can hope to pull them in by snapping more. The image above is what we get after six snaps.

Widening and Narrowing

After six narrowing steps
Our quadrilateral is looking better; we're getting close on three out of the four corners. But that fourth corner is still at 881 pixels out of 480. It needs to be pulled in. This time, we'll use an operation to widen and narrow the quadrilateral. This operation randomly picks a single side, and then pulls both the corners it is attached to towards the center of the line. We might imagine that while doing so we are moving the camera up or down with respect to the plane of the board. Again we use a standard normal with variation 12.5% to decide the percentage of widening/narrowing. 

Three elongations and a snap
After six narrowing steps, we get the picture to the right. We've pulled the lowest coordinate up from 881 to 632 pixels, but we're stuck again. The side on which most of the narrowing has been done pulled down the corner on the right, and we're not going to get much more out of narrowing that side. But we can try to lengthen it back a bit via the elongation operation. And then we can fix the corners by snapping. The image to the left shows the progress after three elongations and a single snap. We're getting a bit closer, but we still need more.

Clipping to Lines

Second clip to right
First clip to botton
Up to now, I haven't mentioned our most effective mutation operator. We used the intersections points right at the outset, but we didn't use the lines. And yet, if all works well, the outline of the chessboard is right there in our list of lines! So what we do now is to choose a random line, and then move the closest side of the rectangle inward to match the lines. And look below at what happens after just three steps of clipping the quadrilateral to lines. We now have three of the four sides. But because of all the pieces on the board, it will be hard to get the fourth line, since less of that line was extracted. No matter; we can snap again.

After another snap
Third clip to left
After snapping, we are now in a position where the elongation/shortening operator should work perfectly. After three elongations and two clips, we are almost done.




Three elongations and a clip

Sliding into Place

Since we can't clip the final line into place, we have to randomly shift the corners. Our final operator picks one corner of the rectangle and shifts it a random amount. I chose to use a poisson random variable with \(\lambda = 5\). A few random shifts that affect the top corner, and then we can clip the left side back into place. A few more random shifts on the right, and then the right side gets clipped in to place. We can stop here, since we now have a good match. Once the solution is this close to optimum, the scoring functions can quickly guide an evolutionary algorithm to the correct solution.
Shift top left corner

Clip top left into place

Shift top right into place

Final clip to refine.

Summary of Mutations

It took 40 mutations to achieve the picture above. While that might seem like a lot, a more general-purpose solution can take far longer. For example, differential evolution needs thousands of generations and still does not find matches this good. The clip and snap mutations are very effective at moving the quadrilateral close to a good solution, and the other three mutations are useful to shift everything into the right place using the fitness function.

In the end, all of the mutations mentioned above are useful. Our final evolutionary algorithm mutates a quadrilateral by randomly selecting one of the mutations and applying it. The list of mutations is listed below, together with the probability of choosing each one.

  1. Clip to Line (40% probability)
  2. Snap to Point (20% probability)
  3. Elongate / Shorten (15% probability)
  4. Widen / Narrow (15% probability)
  5. Shift One Corner (10% probability)
We're counting on clipping and snapping to get us near the right solution, and on elongate and widen to bring us into situations where clipping and snapping work. Finally, small local shifts optimize the solution in the end.

In the next post, we'll look at how the evolutionary algorithm operates in real-time to track the chessboard and adjust when the chessboard is moved.









No comments:

Post a Comment