Installing tesseract + pytesseract on CentOS for image CAPTCHA recognition

Please credit the source when reposting: http://www.cnblogs.com/blazer/p/7131202.html

Environment: CentOS 6.7

tesseract-3.05

pytesseract-0.1.7

Imaging-1.1.7

Ubuntu

If they are not already installed, you need the following libraries (Ubuntu 16.04/14.04):

sudo apt-get install g++ # or clang++ (presumably)
sudo apt-get install autoconf automake libtool
sudo apt-get install autoconf-archive
sudo apt-get install pkg-config
sudo apt-get install libpng12-dev
sudo apt-get install libjpeg8-dev
sudo apt-get install libtiff5-dev
sudo apt-get install zlib1g-dev

If you plan to install the training tools, you also need the following libraries:

sudo apt-get install libicu-dev
sudo apt-get install libpango1.0-dev
sudo apt-get install libcairo2-dev

The above are the dependency packages that the official documentation asks you to install.

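If you install with yum instead, some of the package names are different, so you have to look them up and install them one by one.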
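As a starting point for the yum side, here is a minimal sketch that translates the Ubuntu package names above into a yum command. The CentOS names in the table are my assumptions, not something the official instructions list, so check each one with `yum search` first. Some of them (autoconf-archive, for example) may need an extra repository.

```python
from __future__ import print_function

# Assumed Ubuntu -> CentOS package-name mapping; verify each with `yum search`.
APT_TO_YUM = {
    "g++": "gcc-c++",
    "autoconf": "autoconf",
    "automake": "automake",
    "libtool": "libtool",
    "autoconf-archive": "autoconf-archive",
    "pkg-config": "pkgconfig",
    "libpng12-dev": "libpng-devel",
    "libjpeg8-dev": "libjpeg-turbo-devel",
    "libtiff5-dev": "libtiff-devel",
    "zlib1g-dev": "zlib-devel",
    "libicu-dev": "libicu-devel",
    "libpango1.0-dev": "pango-devel",
    "libcairo2-dev": "cairo-devel",
}


def yum_command(apt_names):
    """Build a `yum install` command line from Ubuntu package names."""
    # Names missing from the table are passed through unchanged so they stand out.
    yum_names = [APT_TO_YUM.get(name, name) for name in apt_names]
    return "sudo yum install -y " + " ".join(yum_names)


if __name__ == "__main__":
    print(yum_command(["g++", "libpng12-dev", "libjpeg8-dev", "zlib1g-dev"]))
```

The function only prints the command rather than running it, so you can review the package names before installing anything.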

Once everything is installed, run the following Python code:

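from PIL import Image  # provided by Imaging-1.1.7 (PIL)
import pytesseract
image = Image.open('yzm.jpeg')  # the CAPTCHA image
vcode = pytesseract.image_to_string(image)  # recognized text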

You may get the following error:

IOError: decoder jpeg not available

In that case, reinstall Imaging-1.1.7.

You may run into a problem during the installation.

python selftest.py

Running this script shows which image formats are supported.

My CentOS system already had the libjpeg-turbo package installed.
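A quick way to double-check that the shared libraries are visible to the system is the standard library's `ctypes.util.find_library`. The sketch below is only an illustration: the short library names (`jpeg`, `z`, `tiff`, `freetype`, `lcms`) are my assumptions about what the Imaging build looks for. It also says nothing about the development headers, which a build may need as well.

```python
from __future__ import print_function

from ctypes.util import find_library

# Assumed short library names for the features reported by selftest.py.
LIB_NAMES = {
    "JPEG": "jpeg",
    "ZLIB": "z",
    "TIFF": "tiff",
    "FREETYPE": "freetype",
    "LCMS": "lcms",
}


def visible_libraries(names=LIB_NAMES):
    """Map each feature label to the library the system finds, or None."""
    return dict((label, find_library(short)) for label, short in names.items())


if __name__ == "__main__":
    for label, found in sorted(visible_libraries().items()):
        print("%-9s %s" % (label, found or "not found"))
```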

Even so, the output of the script still contained the following lines:

*** TKINTER support not installed
*** JPEG support not installed
*** ZLIB (PNG/ZIP) support not installed
*** FREETYPE2 support not installed
*** LITTLECMS support not installed 
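If you want to check this output from a script rather than by eye, a small parser is enough. This is a minimal sketch that assumes the "*** ... support not installed" line format shown above. The function name `missing_features` is made up for this example.

```python
from __future__ import print_function

import re

# Matches lines such as "*** JPEG support not installed".
NOT_INSTALLED = re.compile(r"^\*\*\* (.+) support not installed$")

SAMPLE_OUTPUT = """\
*** TKINTER support not installed
*** JPEG support not installed
*** ZLIB (PNG/ZIP) support not installed
*** FREETYPE2 support not installed
*** LITTLECMS support not installed
"""


def missing_features(output):
    """Return the feature names that selftest.py reports as not installed."""
    missing = []
    for line in output.splitlines():
        match = NOT_INSTALLED.match(line.strip())
        if match:
            missing.append(match.group(1))
    return missing


if __name__ == "__main__":
    print(missing_features(SAMPLE_OUTPUT))
```

In practice you would feed it the captured output of `python selftest.py` instead of the sample string.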

In that case, edit setup.py in the Imaging-1.1.7 source directory and change

TCL_ROOT = None
JPEG_ROOT = None
ZLIB_ROOT = None
TIFF_ROOT = None
FREETYPE_ROOT = None
LCMS_ROOT = None 

to

TCL_ROOT = "/usr/lib64/"
JPEG_ROOT = "/usr/lib64/"
ZLIB_ROOT = "/usr/lib64/"
TIFF_ROOT = "/usr/lib64/"
FREETYPE_ROOT = "/usr/lib64/"
LCMS_ROOT = "/usr/lib64/"
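The same edit can be scripted instead of done by hand. The sketch below rewrites the `*_ROOT = None` lines in a string holding the contents of setup.py. The helper name `set_roots` is made up for this example, and `/usr/lib64/` is the path used on my machine, so adjust it if your libraries live elsewhere.

```python
from __future__ import print_function

import re

ROOT_NAMES = ("TCL", "JPEG", "ZLIB", "TIFF", "FREETYPE", "LCMS")

# Matches a whole line such as "JPEG_ROOT = None" (trailing blanks allowed).
ROOT_LINE = re.compile(
    r"^(%s)_ROOT[ \t]*=[ \t]*None[ \t]*$" % "|".join(ROOT_NAMES),
    re.MULTILINE,
)


def set_roots(source, lib_dir="/usr/lib64/"):
    """Return `source` with every `X_ROOT = None` line pointing at lib_dir."""
    def replace(match):
        return '%s_ROOT = "%s"' % (match.group(1), lib_dir)
    return ROOT_LINE.sub(replace, source)


if __name__ == "__main__":
    sample = "TCL_ROOT = None\nJPEG_ROOT = None\nLCMS_ROOT = None \n"
    print(set_roots(sample), end="")
```

To apply it for real, read setup.py into a string, pass it through `set_roots`, and write the result back, after making a backup copy of the file.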

Then rebuild and reinstall:

python2.7 setup.py clean
python2.7 setup.py build_ext
python2.7 setup.py build
python2.7 setup.py install