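Image processing and character extraction

I'm trying to figure out which technologies I would need to process images for characters.

Specifically, in this example, I need to extract the circled hashtag from the image attached to the question.

Any implementations would be of great assistance.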

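It is possible to solve this problem with OpenCV + Tesseract, although I think there might be easier ways. OpenCV is an open source library for building computer vision applications, and Tesseract is an open source OCR engine.

Before we start, let me clarify something: that is not a circle, it is a rounded rectangle.

I'm sharing the source code of the application I wrote to demonstrate how the problem can be solved, along with some tips on what is going on. This answer is not meant to teach anybody digital image processing, and the reader is expected to have at least a minimal understanding of the field.

I will describe very briefly what the larger sections of the code do. Most of the next chunk of code comes from squares.cpp, a sample application shipped with OpenCV that detects squares in images.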

    #include <cmath>
    #include <iostream>
    #include <vector>
    #include <opencv2/opencv.hpp>

    // angle: helper function.
    // Finds the cosine of the angle between the vectors pt0->pt1 and pt0->pt2.
    double angle(cv::Point pt1, cv::Point pt2, cv::Point pt0)
    {
        double dx1 = pt1.x - pt0.x;
        double dy1 = pt1.y - pt0.y;
        double dx2 = pt2.x - pt0.x;
        double dy2 = pt2.y - pt0.y;
        return (dx1*dx2 + dy1*dy2) / sqrt((dx1*dx1 + dy1*dy1)*(dx2*dx2 + dy2*dy2) + 1e-10);
    }

    // findSquares: returns the sequence of squares detected in the image.
    // The squares are appended to the given vector.
    void findSquares(const cv::Mat& image, std::vector<std::vector<cv::Point> >& squares)
    {
        cv::Mat pyr, timg;

        // Down-scale and up-scale the image to filter out small noise
        cv::pyrDown(image, pyr, cv::Size(image.cols/2, image.rows/2));
        cv::pyrUp(pyr, timg, image.size());

        // Apply Canny with a threshold of 50
        cv::Canny(timg, timg, 0, 50, 5);

        // Dilate Canny output to remove potential holes between edge segments
        cv::dilate(timg, timg, cv::Mat(), cv::Point(-1,-1));

        // Find contours and store them all as a list
        // (cv::RETR_LIST / cv::CHAIN_APPROX_SIMPLE are the C++ names of CV_RETR_LIST / CV_CHAIN_APPROX_SIMPLE)
        std::vector<std::vector<cv::Point> > contours;
        cv::findContours(timg, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);

        for (size_t i = 0; i < contours.size(); i++) // Test each contour
        {
            // Approximate contour with accuracy proportional to the contour perimeter
            std::vector<cv::Point> approx;
            cv::approxPolyDP(cv::Mat(contours[i]), approx, cv::arcLength(cv::Mat(contours[i]), true)*0.02, true);

            // Square contours should have 4 vertices after approximation,
            // a relatively large area (to filter out noisy contours)
            // and be convex.
            // Note: the absolute value of the area is used because
            // the area may be positive or negative, in accordance with the
            // contour orientation
            if (approx.size() == 4 &&
                fabs(cv::contourArea(cv::Mat(approx))) > 1000 &&
                cv::isContourConvex(cv::Mat(approx)))
            {
                double maxCosine = 0;
                for (int j = 2; j < 5; j++)
                {
                    // Find the maximum cosine of the angle between joint edges
                    double cosine = fabs(angle(approx[j%4], approx[j-2], approx[j-1]));
                    maxCosine = MAX(maxCosine, cosine);
                }

                // If the cosines of all angles are small
                // (all angles are ~90 degrees) then write the quadrangle
                // vertices to the resultant sequence
                if (maxCosine < 0.3)
                    squares.push_back(approx);
            }
        }
    }

    // drawSquares: draws all the squares found in the image
    void drawSquares(cv::Mat& image, const std::vector<std::vector<cv::Point> >& squares)
    {
        for (size_t i = 0; i < squares.size(); i++)
        {
            const cv::Point* p = &squares[i][0];
            int n = (int)squares[i].size();
            // cv::LINE_AA is the OpenCV 3+ name of CV_AA
            cv::polylines(image, &p, &n, 1, true, cv::Scalar(0,255,0), 2, cv::LINE_AA);
        }

        cv::imshow("drawSquares", image);
    }
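The corner test in findSquares can be tried without OpenCV. The following is only a minimal sketch: `Pt`, `cosineAt`, `maxCornerCosine` and `looksRectangular` are hypothetical names introduced for illustration, and `Pt` stands in for `cv::Point`. It uses the same indexing and the same 0.3 limit as the code above; since acos(0.3) is roughly 72.5 degrees, corners between about 72.5 and 107.5 degrees pass.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

struct Pt { double x, y; };  // hypothetical stand-in for cv::Point

// Cosine of the angle at p0 between the vectors p0->p1 and p0->p2.
double cosineAt(Pt p1, Pt p2, Pt p0) {
    double dx1 = p1.x - p0.x, dy1 = p1.y - p0.y;
    double dx2 = p2.x - p0.x, dy2 = p2.y - p0.y;
    return (dx1 * dx2 + dy1 * dy2) /
           std::sqrt((dx1 * dx1 + dy1 * dy1) * (dx2 * dx2 + dy2 * dy2) + 1e-10);
}

// Largest |cosine| over the corners q[1], q[2], q[3],
// visited with the same j = 2..4 indexing as findSquares.
double maxCornerCosine(const std::array<Pt, 4>& q) {
    double maxCosine = 0.0;
    for (int j = 2; j < 5; ++j) {
        double c = std::fabs(cosineAt(q[j % 4], q[j - 2], q[j - 1]));
        maxCosine = std::max(maxCosine, c);
    }
    return maxCosine;
}

// True when every tested corner is close enough to 90 degrees.
bool looksRectangular(const std::array<Pt, 4>& q) {
    return maxCornerCosine(q) < 0.3;
}
```

With the vertices (0,0), (4,0), (4,2), (0,2) every cosine is 0, so the shape is accepted. A parallelogram such as (0,0), (4,0), (6,2), (2,2) has 45 and 135 degree corners, so its maximum cosine is about 0.71 and it is rejected.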

OK, so our program begins at:

    int main(int argc, char* argv[])
    {
        // Make sure an image path was given before touching argv[1]
        if (argc < 2)
        {
            std::cout << "Usage: " << argv[0] << " <image>" << std::endl;
            return -1;
        }

        // Load input image (colored, 3-channel)
        cv::Mat input = cv::imread(argv[1]);
        if (input.empty())
        {
            std::cout << "!!! failed imread()" << std::endl;
            return -1;
        }

        // Convert input image to grayscale (1-channel)
        cv::Mat grayscale;
        cv::cvtColor(input, grayscale, cv::COLOR_BGR2GRAY);
        //cv::imwrite("gray.png", grayscale);
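For readers wondering what the grayscale conversion computes per pixel, here is a small sketch of the usual luma weighting (0.299 R + 0.587 G + 0.114 B). `bgrToGray` is a hypothetical helper, not an OpenCV function, and OpenCV's own fixed-point arithmetic may round a few values differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Gray value of one pixel given in B, G, R order (the channel order OpenCV loads).
// Weights: 0.114 for blue, 0.587 for green, 0.299 for red.
std::uint8_t bgrToGray(std::uint8_t b, std::uint8_t g, std::uint8_t r) {
    double y = 0.114 * b + 0.587 * g + 0.299 * r;
    return static_cast<std::uint8_t>(std::lround(y));
}
```

Green contributes the most to the result and blue the least, which is why a pure blue pixel ends up much darker in the grayscale image than a pure green one.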

[Image: what `grayscale` looks like]

        // Threshold to binarize the image and get rid of the shoe
        cv::Mat binary;
        cv::threshold(grayscale, binary, 225, 255, cv::THRESH_BINARY_INV);
        cv::imshow("Binary image", binary);
        //cv::imwrite("binary.png", binary);
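The inverted binary threshold used above maps every pixel brighter than the threshold to 0 and every other pixel to the maximum value. A minimal sketch of that rule on a flat pixel buffer follows; `thresholdBinaryInv` is a hypothetical helper written for illustration, not part of OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Inverted binary threshold: dst = 0 where src > thresh, maxval otherwise.
std::vector<std::uint8_t> thresholdBinaryInv(const std::vector<std::uint8_t>& src,
                                             std::uint8_t thresh,
                                             std::uint8_t maxval) {
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = (src[i] > thresh) ? 0 : maxval;
    }
    return dst;
}
```

With a threshold of 225, the near-white background turns black and everything darker turns white, so the objects of interest become the white foreground that the contour search works on.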

[Image: what `binary` looks like]

        // Find the contours in the thresholded image
        std::vector<std::vector<cv::Point> > contours;
        cv::findContours(binary, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);

        // Fill the areas of the contours with BLUE (hoping to erase everything inside a rectangular shape)
        cv::Mat blue = input.clone();
        for (size_t i = 0; i < contours.size(); i++)
        {
            std::vector<cv::Point> cnt = contours[i];
            double area = cv::contourArea(cv::Mat(cnt));
            //std::cout << "* Area: " << area << std::endl;

            // cv::FILLED is the OpenCV 3+ name of CV_FILLED
            cv::drawContours(blue, contours, static_cast<int>(i), cv::Scalar(255, 0, 0),
                             cv::FILLED, 8, std::vector<cv::Vec4i>(), 0, cv::Point());
        }

        cv::imshow("Contours Filled", blue);
        //cv::imwrite("contours.png", blue);

[Image: what `blue` looks like]

        // Convert the blue colored image to binary (again), and we will have a good rectangular shape to detect
        cv::Mat gray;
        cv::cvtColor(blue, gray, cv::COLOR_BGR2GRAY);
        cv::threshold(gray, binary, 225, 255, cv::THRESH_BINARY_INV);
        cv::imshow("binary2", binary);
        //cv::imwrite("binary2.png", binary);

[Image: what `binary` looks like at this point]

        // Erode & Dilate to isolate segments connected to nearby areas
        int erosion_type = cv::MORPH_RECT;
        int erosion_size = 5;
        cv::Mat element = cv::getStructuringElement(erosion_type,
                                                    cv::Size(2 * erosion_size + 1, 2 * erosion_size + 1),
                                                    cv::Point(erosion_size, erosion_size));
        cv::erode(binary, binary, element);
        cv::dilate(binary, binary, element);
        cv::imshow("Morphologic Op", binary);
        //cv::imwrite("morpho.png", binary);
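An erosion followed by a dilation with the same structuring element is a morphological opening: shapes thinner than the element vanish while larger blobs come back at roughly their original size. The sketch below shows this on a tiny 0/1 grid with a square element of radius `r`. `Grid`, `morph`, `erodeSquare`, `dilateSquare` and `openSquare` are hypothetical names, and skipping out-of-bounds cells is an assumption meant to mimic OpenCV's default border handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<int>>;  // 1 = foreground, 0 = background

// Scans a (2r+1) x (2r+1) square around every cell and keeps the minimum
// (erosion) or maximum (dilation) value. Cells outside the grid are skipped.
static Grid morph(const Grid& src, int r, bool erode) {
    const int rows = static_cast<int>(src.size());
    const int cols = rows ? static_cast<int>(src[0].size()) : 0;
    Grid dst(rows, std::vector<int>(cols, 0));
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            int v = erode ? 1 : 0;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int ny = y + dy, nx = x + dx;
                    if (ny < 0 || ny >= rows || nx < 0 || nx >= cols) continue;
                    v = erode ? std::min(v, src[ny][nx]) : std::max(v, src[ny][nx]);
                }
            }
            dst[y][x] = v;
        }
    }
    return dst;
}

Grid erodeSquare(const Grid& src, int r)  { return morph(src, r, true); }
Grid dilateSquare(const Grid& src, int r) { return morph(src, r, false); }

// Opening = erosion followed by dilation with the same element.
Grid openSquare(const Grid& src, int r) {
    return dilateSquare(erodeSquare(src, r), r);
}
```

Applied with `r = 1` to a 3x3 block that has a one-pixel-wide tail attached, the opening keeps the block and removes the tail. The program above does the same thing with `erosion_size = 5`, that is, an 11x11 element.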

[Image: what `binary` looks like at this point]

        // Ok, let's go ahead and try to detect all rectangular shapes
        std::vector<std::vector<cv::Point> > squares;
        findSquares(binary, squares);
        std::cout << "* Rectangular shapes found: " << squares.size() << std::endl;

        // Draw all rectangular shapes found
        cv::Mat output = input.clone();
        drawSquares(output, squares);
        //cv::imwrite("output.png", output);

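[Image: what `output` looks like]

Alright! We have solved the first part of the problem, which was finding the rounded rectangle. In the image above, the rectangular shape has been detected and green lines have been drawn over the original image for educational purposes.

The second part is much easier. It begins by creating a ROI (Region Of Interest) in the original image so that we can crop the image to the area inside the rounded rectangle. Once this is done, the cropped image is saved to disk as a TIFF file, which is then fed to Tesseract to work its magic:

        // Crop the rectangular shape
        if (squares.size() == 1)
        {
            cv::Rect box = cv::boundingRect(cv::Mat(squares[0]));
            std::cout << "* The location of the box is x:" << box.x << " y:" << box.y
                      << " " << box.width << "x" << box.height << std::endl;

            // Crop the original image to the defined ROI
            cv::Mat crop = input(box);
            cv::imshow("crop", crop);

            // Save the crop so that Tesseract can read it from disk
            cv::imwrite("cropped.tiff", crop);
        }
        else
        {
            std::cout << "* Abort! Expected exactly one rectangle, found "
                      << squares.size() << "." << std::endl;
        }

        // Wait until user presses key
        cv::waitKey(0);

        return 0;
    }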
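The cropping step boils down to taking the smallest axis-aligned box around the four detected vertices and copying that region out of the image. The sketch below illustrates the idea without OpenCV. `Pt`, `Box`, `boundingBox` and `cropRegion` are hypothetical names, the inclusive `+ 1` in the width and height is an assumption about how integer points are boxed, and unlike `input(box)`, which returns a view into the original image, this sketch copies the pixels.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Pt  { int x, y; };                 // hypothetical stand-in for cv::Point
struct Box { int x, y, width, height; };  // hypothetical stand-in for cv::Rect

// Smallest axis-aligned box that contains every point (inclusive bounds).
Box boundingBox(const std::vector<Pt>& pts) {
    assert(!pts.empty());
    int minX = pts[0].x, maxX = pts[0].x;
    int minY = pts[0].y, maxY = pts[0].y;
    for (const Pt& p : pts) {
        minX = std::min(minX, p.x);
        maxX = std::max(maxX, p.x);
        minY = std::min(minY, p.y);
        maxY = std::max(maxY, p.y);
    }
    return Box{minX, minY, maxX - minX + 1, maxY - minY + 1};
}

// Copies the pixels inside the box out of a row-major image.
// The box is assumed to lie entirely inside the image.
std::vector<std::vector<int>> cropRegion(const std::vector<std::vector<int>>& img,
                                         const Box& b) {
    std::vector<std::vector<int>> out;
    for (int y = b.y; y < b.y + b.height; ++y) {
        out.emplace_back(img[y].begin() + b.x, img[y].begin() + b.x + b.width);
    }
    return out;
}
```

For the vertices (2,1), (5,1), (5,3), (2,3) the box starts at x = 2, y = 1 and measures 4x3 pixels, so the crop has three rows of four pixels each.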

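[Image: what `crop` looks like]

When this application finishes its job, it creates a file named cropped.tiff on the disk. Go to the command line and invoke Tesseract to detect the text present in the cropped image:

    tesseract cropped.tiff out

This command creates a file named out.txt containing the detected text:

[Image: the detected text in out.txt]

Tesseract has an API that you can use to add the OCR feature to your own application.

This solution is not robust, and you will probably have to make some changes here and there to get it working for other test cases.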

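There are a few options; see the question "Java OCR implementation".

The following tools are mentioned there:

JavaOCR http://sourceforge.net/projects/javaocr/
Asprise http://asprise.com/home/
Java Object Oriented Neural Engine http://www.jooneworld.com/
Ron Cemer Java OCR http://www.roncemer.com/software-development/java-ocr

And several others.

This list of links may also be useful: http://www.javawhat.com/showCategory.do?id=2138003

Generally, this kind of task requires a lot of trial and testing. The best tool probably depends more on the profile of your input data than on anything else.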

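You can take a look at this article: http://www.codeproject.com/Articles/196168/Contour-Analysis-for-Image-Recognition-in-C

[Image: contour analysis demo]

It comes with the mathematical theory and an implementation in C# + OpenCV (unfortunately not Java, although there is not that much to rewrite if you decide to port it). So if you want to test it, you will have to use Visual Studio and rebuild it against your OpenCV version, but it is worth it.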

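OCR works well on scanned documents. What you are referring to is text detection in general images, which requires other techniques (sometimes OCR is used as one part of the flow).

I am not aware of any "production ready" implementations.

For general information, try Google Scholar with: "text detection in images".

A specific method that worked well for me is the "stroke width transform" (SWT). It is not hard to implement, and I believe there are also some implementations available online.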
