Calling the iFlytek (科大讯飞) OCR API from .NET to recognize printed text in images

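The OCR services I know of on the market are Google, iFlytek, and Baidu. This post shows how to call the iFlytek API from .NET to recognize text in an image: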

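1. Register an account. After completing real-name verification you can claim a free quota of recognition calls:

Once you create a project, you get the corresponding ID and key (the x_appid and api_key values used in the code below).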

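Because I call the service through its web API, I only need to add the required parameters and mimic an HTTP request; no DLL reference is needed. You can also use the SDK for recognition, which is what I do later with Baidu's OCR: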

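        // Namespaces needed by the code in this post (place them at the top of the file):
        // using System;
        // using System.Collections.Generic;
        // using System.IO;
        // using System.Net;
        // using System.Security.Cryptography;
        // using System.Text;
        // using Newtonsoft.Json;

        // Returns the MD5 digest of a UTF-8 string as 32 lowercase hex characters.
        public static string Md5(string s)
        {
            using (MD5 md5 = MD5.Create())
            {
                byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(s));
                StringBuilder sb = new StringBuilder(hash.Length * 2);
                foreach (byte b in hash)
                {
                    sb.Append(b.ToString("x2"));
                }
                return sb.ToString();
            }
        }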

        public static void Headers()
        {
            string x_appid = "*****";
            string api_key = "********";
            string path = @"E:imgFile15.jpg";
            string param = @"{""language"":""en"",""location"": ""false""}";

            System.Text.Encoding encode = System.Text.Encoding.ASCII;
            byte[] bytedata = encode.GetBytes(param);
            string x_param = Convert.ToBase64String(bytedata);

            TimeSpan ts = DateTime.UtcNow - new DateTime(1970, 1, 1, 0, 0, 0, 0);
            string curTime = Convert.ToInt64(ts.TotalSeconds).ToString();

            // Checksum = MD5(api_key + curTime + X-Param), as lowercase hex.
            string result = string.Format("{0}{1}{2}", api_key, curTime, x_param);
            string X_checksum = Program.Md5(result);

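            // Base64-encode the image, then URL-encode it: Base64 can contain '+', '/' and '=',
            // which are not safe as-is in an application/x-www-form-urlencoded body.
            byte[] arr = File.ReadAllBytes(path);
            string cc = Convert.ToBase64String(arr);
            string data = "image=" + WebUtility.UrlEncode(cc);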

            string Url = @"https://webapi.xfyun.cn/v1/service/v1/ocr/general";

            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);
            request.Method = "POST";
            request.ContentType = "application/x-www-form-urlencoded; charset=utf-8";
            request.Headers["X-Appid"] = x_appid;
            request.Headers["X-CurTime"] = curTime;
            request.Headers["X-Param"] = x_param;
            request.Headers["X-CheckSum"] = X_checksum;

            // Encode the body once so that ContentLength matches the bytes actually sent.
            byte[] body = Encoding.UTF8.GetBytes(data);
            request.ContentLength = body.Length;
            using (Stream requestStream = request.GetRequestStream())
            {
                requestStream.Write(body, 0, body.Length);
            }

            // Read the whole response body as UTF-8; the using blocks release the connection.
            string htmlStr = string.Empty;
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (Stream responseStream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(responseStream, Encoding.UTF8))
            {
                htmlStr = reader.ReadToEnd();
            }

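            // Deserialize the JSON response into the Root model defined below.
            var json = JsonConvert.DeserializeObject<Root>(htmlStr);
            if (json == null || json.data == null || json.data.block == null)
            {
                // No recognition data came back; show the service's code and description instead.
                Console.WriteLine(json == null ? htmlStr : json.code + " " + json.desc);
                Console.ReadLine();
                return;
            }

            // Append each word, followed by the confidence of the line it belongs to.
            StringBuilder sb = new StringBuilder();
            foreach (var block in json.data.block)
            {
                foreach (var line in block.line)
                {
                    foreach (var word in line.word)
                    {
                        sb.Append(word.content).Append(line.confidence);
                    }
                }
            }
            Console.WriteLine(sb.ToString());
            Console.ReadLine();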

        }


        static void Main(string[] args)
        {
            Headers();
        }
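
To make the authentication step easier to see in isolation, here is a minimal sketch that builds the four X- headers the same way Headers() does: X-Param is the Base64 of the JSON parameters, and X-CheckSum is MD5(api_key + curTime + X-Param). The class name XfyunHeaders and the methods Build and Md5Hex are hypothetical names for this illustration, not part of any iFlytek SDK, and passing the clock in as a parameter is my own choice so the result is repeatable.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration; the formulas mirror Headers() above.
public static class XfyunHeaders
{
    private static readonly DateTime UnixEpoch =
        new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    public static Dictionary<string, string> Build(
        string appId, string apiKey, string paramJson, DateTime utcNow)
    {
        // X-Param: the JSON parameter string, Base64-encoded.
        string xParam = Convert.ToBase64String(Encoding.UTF8.GetBytes(paramJson));

        // X-CurTime: whole seconds since 1970-01-01 UTC (truncated here,
        // whereas Convert.ToInt64 in Headers() rounds).
        string curTime = ((long)(utcNow - UnixEpoch).TotalSeconds).ToString();

        // X-CheckSum: MD5(apiKey + curTime + xParam) as lowercase hex.
        string checkSum = Md5Hex(apiKey + curTime + xParam);

        return new Dictionary<string, string>
        {
            { "X-Appid", appId },
            { "X-CurTime", curTime },
            { "X-Param", xParam },
            { "X-CheckSum", checkSum }
        };
    }

    private static string Md5Hex(string s)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(s));
            StringBuilder sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }
    }
}
```

Each entry of the returned dictionary can then be copied onto the request with request.Headers[key] = value, as Headers() does by hand.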

The recognition result comes back as JSON:

Deserializing it with Newtonsoft.Json gives you the data:

Model classes:

    // Top-level response object.
    public class Root
    {
        /// <summary>Result code returned by the service.</summary>
        public string code { get; set; }
        /// <summary>Recognition payload.</summary>
        public Data data { get; set; }
        /// <summary>Description accompanying the result code.</summary>
        public string desc { get; set; }
        /// <summary>Identifier (sid) returned by the service.</summary>
        public string sid { get; set; }
    }

    public class Data
    {
        /// <summary>Recognized blocks.</summary>
        public List<BlockItem> block { get; set; }
    }

    public class BlockItem
    {
        /// <summary>Block type.</summary>
        public string type { get; set; }
        /// <summary>Lines contained in this block.</summary>
        public List<LineItem> line { get; set; }
    }

    public class LineItem
    {
        /// <summary>Confidence value reported for this line.</summary>
        public int confidence { get; set; }
        /// <summary>Words contained in this line.</summary>
        public List<WordItem> word { get; set; }
    }

    public class WordItem
    {
        /// <summary>Recognized text of this word.</summary>
        public string content { get; set; }
    }
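
Once the response is deserialized, the nested block / line / word structure usually needs to be flattened into plain text. The following is a minimal sketch of one way to do that, producing one string per recognized line. OcrText and Lines are hypothetical names, the model classes are repeated in compact form only so the sketch compiles on its own, and joining words with a space is an assumption that suits the "en" language setting used earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Compact copies of the model classes, repeated so this sketch is self-contained.
public class WordItem { public string content { get; set; } }
public class LineItem { public int confidence { get; set; } public List<WordItem> word { get; set; } }
public class BlockItem { public string type { get; set; } public List<LineItem> line { get; set; } }
public class Data { public List<BlockItem> block { get; set; } }
public class Root
{
    public string code { get; set; }
    public Data data { get; set; }
    public string desc { get; set; }
    public string sid { get; set; }
}

// Hypothetical helper for illustration.
public static class OcrText
{
    // Returns one string per recognized line, with that line's words joined by spaces.
    // Missing (null) levels are skipped, so an error response yields an empty list.
    public static List<string> Lines(Root root)
    {
        var lines = new List<string>();
        if (root == null || root.data == null || root.data.block == null)
        {
            return lines;
        }
        foreach (BlockItem block in root.data.block)
        {
            if (block.line == null) continue;
            foreach (LineItem line in block.line)
            {
                if (line.word == null) continue;
                lines.Add(string.Join(" ", line.word.Select(w => w.content)));
            }
        }
        return lines;
    }
}
```

Keeping the text per line, rather than concatenating everything into one string, makes later steps such as pattern matching easier to reason about.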

The demo above is adapted from the official demo.

Also attached are helper methods that download an image from a URL and Base64-encode it:

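        // Requires: using System; using System.Drawing; using System.IO; using System.Net;
        #region Image to Base64
        /// <summary>
        /// Downloads an image from a URL.
        /// </summary>
        public static Image UrlToImage(string url)
        {
            using (WebClient client = new WebClient())
            {
                byte[] bytes = client.DownloadData(url);
                using (MemoryStream ms = new MemoryStream(bytes))
                using (Image loaded = Image.FromStream(ms))
                {
                    // Copy into a new Bitmap: an Image created by FromStream needs its
                    // stream to stay open, and the stream is disposed when this block exits.
                    return new Bitmap(loaded);
                }
            }
        }

        /// <summary>
        /// Converts an Image to a Base64 string (JPEG-encoded).
        /// </summary>
        /// <param name="img">The image to encode.</param>
        /// <returns>The Base64 string, or null if encoding fails.</returns>
        public static string ImageToBase64(Image img)
        {
            try
            {
                using (Bitmap bmp = new Bitmap(img))
                using (MemoryStream ms = new MemoryStream())
                {
                    bmp.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg);
                    return Convert.ToBase64String(ms.ToArray());
                }
            }
            catch (Exception)
            {
                return null;
            }
        }

        /// <summary>
        /// Downloads the image at the given URL and returns it as a Base64 string.
        /// </summary>
        public static string ImageToBase64(string url)
        {
            using (Image img = UrlToImage(url))
            {
                return ImageToBase64(img);
            }
        }
        #endregion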

The recognized text can then be parsed with regular expressions to extract the data you need.
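
As an illustration of that last step, here is a minimal sketch that pulls dates out of recognized text with System.Text.RegularExpressions. The yyyy-MM-dd pattern, the sample sentence in the usage note, and the names OcrRegex and ExtractDates are all assumptions made for this example; the pattern you need depends entirely on the documents you scan.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration.
public static class OcrRegex
{
    // Matches dates written as four digits, two digits, two digits,
    // separated by hyphens (for example 2021-09-08).
    private static readonly Regex DatePattern = new Regex(@"\b\d{4}-\d{2}-\d{2}\b");

    // Returns every substring of the recognized text that looks like a date,
    // in the order it appears.
    public static List<string> ExtractDates(string recognizedText)
    {
        var dates = new List<string>();
        foreach (Match m in DatePattern.Matches(recognizedText))
        {
            dates.Add(m.Value);
        }
        return dates;
    }
}
```

For example, OcrRegex.ExtractDates("Invoice date 2021-09-08, due 2021-10-08") returns the two strings "2021-09-08" and "2021-10-08". OCR output often contains misread characters, so patterns used on real scans may need to be more tolerant than this one.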

Original article: https://www.cnblogs.com/jf-ace/p/15243152.html