Decoding Chess Boards: Ten Years of Computer Vision & AI | Part 1

Sam Does Leetcode
Hey Folks, I've built a series of chessboard detection algorithms over the last decade, and I think it's a good way to introduce several computer vision and AI concepts. I show several approaches to chessboard detection, from simpler contour detection to more advanced deep learning models, and explain the pros and cons of each approach.  
If you have any questions about the algorithms, or ideas for other approaches, let me know in the comments! For those who are seeing this type of content for the first time, please enjoy!

00:00 - Intro
00:46 - The Plan
01:18 - Chessboard Outline Contour
02:57 - Tile Based Growing
03:56 - Hough Transform Lines
05:30 - Optimizing Hough with C++
06:26 - X-Corner
07:39 - Fit Grid to X-Corners
08:38 - Deep Neural Networks
10:21 - Convolutional Neural Networks
11:01 - Optical Flow
11:51 - YOLO & More

Outline:

- In general, we use classic computer vision to load images and videos, then build gradients and edges from them with techniques like the Sobel operator and Canny edge detection.
- From those we identify contours (grouping binary pixels) using OpenCV's libraries, and try to find the largest quadrilateral, which should be the chessboard. This works for the simplest images, but fails on real images with occlusions.
- To improve on this, we instead look for individual tiles and try to grow them outward, warping the image (homographies) and scoring the warped image for 'chessboard-ness'. This works, but it is computationally expensive and slow: minutes per image.
- Then we look at Hough transforms, a technique for finding lines in images. Several variations of increasing speed are developed, including an informed version and OpenCV's probabilistic one, as well as C++ optimized versions. This has potential, but Hough resolution and cost make it tough to rely on alone for most images.
- Next, a mathematical oddity of the chessboard pattern makes identifying its x-corner intersections (saddle points) a relatively fast computation. We use those to filter tiles before growing, with random sampling (RANSAC) to identify good groups. We're down to seconds per image now.
- Machine learning, along with deep and convolutional neural networks (DNNs and CNNs), is taking off at this point (around 2014), so I look into building a labeled dataset of x-corners to train a small model for further filtering. This works; we're at tenths of a second per image now.
- There are some C++ optimizations with Halide, an interesting tool for parallelizing matrix operations.
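
To make the Hough step in the outline a bit more concrete, here is a minimal pure-Python sketch of the basic voting idea: every edge pixel votes for all lines `rho = x*cos(theta) + y*sin(theta)` that could pass through it, and the bins with the most votes are line candidates. This is not the code from the video; the function names `hough_accumulate` and `strongest_line` and the parameters `n_theta` and `rho_step` are made up for illustration, and it makes no attempt at the informed, probabilistic, or C++ speedups mentioned above.

```python
import math

def hough_accumulate(edge_points, n_theta=180, rho_step=1.0):
    """Vote for lines rho = x*cos(theta) + y*sin(theta) through each edge point.

    edge_points: iterable of (x, y) pixel coordinates (e.g. from an edge map).
    Returns a dict mapping (rho_bin, theta_bin) -> vote count.
    """
    thetas = [math.pi * i / n_theta for i in range(n_theta)]  # angles in [0, pi)
    votes = {}
    for x, y in edge_points:
        for t_idx, theta in enumerate(thetas):
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (round(rho / rho_step), t_idx)
            votes[key] = votes.get(key, 0) + 1
    return votes

def strongest_line(votes, n_theta=180, rho_step=1.0):
    """Return (rho, theta in degrees, vote count) for the most-voted bin."""
    (r_idx, t_idx), count = max(votes.items(), key=lambda kv: kv[1])
    return r_idx * rho_step, 180.0 * t_idx / n_theta, count

# Hypothetical input: 100 edge pixels along the horizontal line y = 7.
points = [(x, 7) for x in range(100)]
rho, theta_deg, count = strongest_line(hough_accumulate(points))
```

The cost is one vote per edge pixel per angle bin, and finer `n_theta` or `rho_step` values increase both the work and the number of bins, which is the resolution-versus-cost trade-off the outline refers to.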

For those interested, all the (very rough and decade-old) code is available at https://github.com/Elucidation/Chessb...
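
Separately from that repository, here is a small NumPy sketch of the saddle-point idea behind x-corner detection. It assumes one common formulation, which may differ from what the video uses: a pixel looks like a saddle when the determinant of its Hessian, `fxx*fyy - fxy^2`, is negative, so the negated determinant serves as a score. The function name `saddle_response` and the synthetic test pattern are invented for illustration, and a real image would normally be blurred first to suppress noise.

```python
import numpy as np

def saddle_response(img):
    """Score each pixel for 'saddle-ness' using second derivatives.

    Returns an array the same shape as img; larger values are more saddle-like,
    and non-saddle pixels (peaks, valleys, flat areas) are clipped to 0.
    """
    img = np.asarray(img, dtype=float)
    gy, gx = np.gradient(img)      # first derivatives along rows (y) and columns (x)
    gyy, _ = np.gradient(gy)       # second derivative in y
    gxy, gxx = np.gradient(gx)     # mixed derivative and second derivative in x
    # Negated Hessian determinant: positive where the surface is a saddle.
    response = gxy ** 2 - gxx * gyy
    return np.maximum(response, 0.0)

# Synthetic smooth "checkerboard": sign changes every 10 pixels in x and y,
# so x-corners sit where both coordinates are multiples of 10.
ys, xs = np.mgrid[0:41, 0:41]
board = np.sin(np.pi * xs / 10) * np.sin(np.pi * ys / 10)
response = saddle_response(board)
```

On this pattern the response is strongest at the tile intersections (for example at row 20, column 20) and zero at tile centers, which is the property that makes x-corners cheap to pick out before any grid fitting.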

Looking back, I could have saved myself a lot of time, but making mistakes is part of the journey, and it's quite fun to explore computer vision and AI.
Published 2 weeks ago, on 1403/04/10 (Solar Hijri calendar).