[Other] Tesseract.js – Pure JavaScript OCR for 60 Languages

咬字拆开念 posted on 2016-10-13 02:07:43

Tesseract.js

   Tesseract.js is a JavaScript library that gets words in almost any language out of images. (Demo)
   

   Tesseract.js works with script tags, webpack/browserify, and Node. After you install it, using it is as simple as
  [code]Tesseract.recognize(myImage)
    .progress(function (p) { console.log('progress', p) })
    .then(function (result) { console.log('result', result) })[/code]
   Check out the docs for a full treatment of the API.
  Installation

   Tesseract.js works with a script tag, with webpack/browserify, and with Node. After including your scripts, the Tesseract variable should be defined! You can head to the docs for a full treatment of the API.
  npm

  First:
  [code]> npm install tesseract.js --save[/code]  Then
  [code]var Tesseract = require('tesseract.js')[/code]  or
  [code]import Tesseract from 'tesseract.js'[/code]
   You can head to the docs for a full treatment of the API.
  Docs

  
       
  • Tesseract.recognize(image: ImageLike[, options]) -> TesseractJob
    • Simple Example
    • More Complicated Example
  • Tesseract.detect(image: ImageLike) -> TesseractJob
  • ImageLike
  • TesseractJob
    • TesseractJob.progress(callback: function) -> TesseractJob
    • TesseractJob.then(callback: function) -> TesseractJob
    • TesseractJob.catch(callback: function) -> TesseractJob
  • Local Installation
    • corePath
    • workerPath
    • langPath
  • Contributing
    • Development
    • Building Static Files
    • Send us a Pull Request!
       
   Tesseract.recognize(image: ImageLike[, options]) -> TesseractJob

   Figures out what words are in image, where the words are in image, etc.
  
       
  • image is any ImageLike object.
  • options is either absent (in which case it is interpreted as 'eng'), a string specifying a language short code from the language list, or a flat JSON object that may:
    • include properties that override some subset of the default tesseract parameters
    • include a lang property with a value from the list of lang parameters

   Returns a TesseractJob whose then, progress, and catch methods can be used to act on the result.
  Simple Example:

  [code]Tesseract.recognize(myImage)
.then(function(result){
    console.log(result)
})[/code]
  More Complicated Example:

  [code]// If we know our image is of Spanish words without the letter 'e':
Tesseract.recognize(myImage, {
    lang: 'spa',
    tessedit_char_blacklist: 'e'
})
.then(function(result){
    console.log(result)
})[/code]
   Tesseract.detect(image: ImageLike) -> TesseractJob

  Figures out what script (e.g. 'Latin', 'Chinese') the words in image are written in.
  
       
  • image is any ImageLike object.

   Returns a TesseractJob whose then, progress, and catch methods can be used to act on the result of the script detection.
  [code]Tesseract.detect(myImage)
.then(function(result){
    console.log(result)
})[/code]
  ImageLike

   The main Tesseract.js functions take an image parameter, which should be something that is like an image. What's considered "image-like" differs depending on whether it is being run from the browser or through NodeJS.
  On a browser, an image can be:
  
       
  • an img, video, or canvas element
  • a CanvasRenderingContext2D (returned by canvas.getContext('2d'))
  • a File object (from a file input or drag-drop event)
  • a Blob object
  • an ImageData instance (an object containing width, height and data properties)
  • a path or URL to an accessible image (the image must either be hosted locally or accessible by CORS)
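  As an illustration of the File case in the list above, here is a minimal sketch that wires a file input to recognize. The helper name recognizeOnChange and its onResult callback are hypothetical, not part of Tesseract.js; the only library call used is Tesseract.recognize(...).then(...).

```javascript
// Hypothetical helper (not part of Tesseract.js): runs OCR whenever the user
// picks a file. `tesseract` is the global Tesseract object in a real page.
function recognizeOnChange(input, tesseract, onResult) {
  input.addEventListener('change', function () {
    var file = input.files[0]   // a File object is ImageLike in the browser
    if (!file) return           // nothing was selected
    tesseract.recognize(file)
      .then(function (result) { onResult(result) })
  })
}

// In a page, usage might look like:
// recognizeOnChange(document.querySelector('input[type=file]'), Tesseract, function (r) { console.log(r) })
```

  Passing the Tesseract object in as a parameter keeps the helper easy to exercise with a stub outside the browser.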
  In NodeJS, an image can be:

  • a path to a local image
  • a Buffer instance containing a PNG or JPEG image
  • an ImageData instance (an object containing width, height and data properties)
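  For the Buffer case in NodeJS, a rough sketch might look like the following. The helper recognizeBuffer is a made-up name for illustration; it assumes `tesseract` is the object returned by require('tesseract.js') and relies only on recognize accepting a Buffer and a language short code string, as described above.

```javascript
var fs = require('fs')

// Hypothetical helper (not part of Tesseract.js): reads an image file into a
// Buffer and passes it to recognize with a language short code.
function recognizeBuffer(tesseract, imagePath, lang) {
  var buffer = fs.readFileSync(imagePath)           // Buffer with the PNG or JPEG bytes
  return tesseract.recognize(buffer, lang || 'eng') // falls back to 'eng' when lang is omitted
}

// Usage might look like:
// recognizeBuffer(require('tesseract.js'), './scan.png', 'spa')
//   .then(function (result) { console.log(result) })
```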
  TesseractJob

   A TesseractJob is an object returned by a call to recognize or detect. It's inspired by the ES6 Promise interface and provides then and catch methods. One important difference is that these methods return the job itself (to enable chaining) rather than a new promise.
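   Since then and catch return the job rather than a promise, code that expects a real Promise may need an adapter. The sketch below is one possible way to do that; jobToPromise is a hypothetical helper, not something Tesseract.js provides, and it assumes only the then and catch methods described here.

```javascript
// Hypothetical adapter (not part of Tesseract.js): wraps a TesseractJob-like
// object in a native Promise so it can be combined with other promises.
function jobToPromise(job) {
  return new Promise(function (resolve, reject) {
    job.then(resolve)   // called if and when the job completes successfully
    job.catch(reject)   // called if the job fails
  })
}

// Usage might look like:
// jobToPromise(Tesseract.recognize(myImage)).then(function (result) { console.log(result) })
```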
  Typical use is:
  [code]Tesseract.recognize(myImage)
    .progress(function(message){console.log(message)})
    .catch(function(err){console.error(err)})
    .then(function(result){console.log(result)})[/code]
  Which is equivalent to:
  [code]var job1 = Tesseract.recognize(myImage);

job1.progress(function(message){console.log(message)});

job1.catch(function(err){console.error(err)});

job1.then(function(result){console.log(result)})[/code]
  TesseractJob.progress(callback: function) -> TesseractJob

   Sets callback as the function that will be called every time the job progresses.
  
       
  • callback is a function with the signature callback(progress) where progress is a json object.  
  For example:
  [code]0[/code]
  The console will show something like:
  [code]1[/code]
  TesseractJob.then(callback: function) -> TesseractJob

   Sets callback as the function that will be called if and when the job successfully completes.
  
       
  • callback is a function with the signature callback(result) where result is a json object.  
  For example:
  [code]2[/code]
  The console will show something like:
  [code]3[/code]
  TesseractJob.catch(callback: function) -> TesseractJob

   Sets callback as the function that will be called if the job fails.
  
       
  • callback is a function with the signature callback(error) where error is a JSON object.
  Local Installation

   In the browser, tesseract.js simply provides the API layer. Internally, it opens a WebWorker to handle requests. That worker itself loads code from the Emscripten-built tesseract.js-core which itself is hosted on a CDN. Then it dynamically loads language files hosted on another CDN.
   Because of this we recommend loading tesseract.js from a CDN. But if you really need to have all your files local, you can use the Tesseract.create function which allows you to specify custom paths for workers, languages, and core.
  [code]4[/code]
  corePath

   A string specifying the location of the tesseract.js-core library, with default value 'https://cdn.rawgit.com/naptha/tesseract.js-core/master/index.js'. Set this string before calling Tesseract.recognize and Tesseract.detect if you want Tesseract.js to use a different file.
  workerPath

   A string specifying the location of the tesseract.worker.js file, with default value 'https://cdn.rawgit.com/naptha/tesseract.js/8b915dc/dist/tesseract.worker.js'. Set this string before calling Tesseract.recognize and Tesseract.detect if you want Tesseract.js to use a different file.
  langPath

   A string specifying the location of the tesseract language files, with default value 'https://cdn.rawgit.com/naptha/tessdata/gh-pages/3.02/'. Language file urls are calculated according to the formula langPath + langCode + '.traineddata.gz'. Set this string before calling Tesseract.recognize and Tesseract.detect if you want Tesseract.js to use different language files.
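   To make the three path options concrete, here is a small sketch. The langFileUrl helper is hypothetical and simply applies the langPath formula quoted above; the /vendor/... paths are placeholders for wherever the files are hosted, and the exact call shape of Tesseract.create (an options object with these three keys) is an assumption based on the description in this section.

```javascript
// Hypothetical helper: applies the formula langPath + langCode + '.traineddata.gz'.
function langFileUrl(langPath, langCode) {
  return langPath + langCode + '.traineddata.gz'
}

// Placeholder local layout; these paths are illustrative, not defaults.
var localPaths = {
  corePath: '/vendor/tesseract/index.js',
  workerPath: '/vendor/tesseract/tesseract.worker.js',
  langPath: '/vendor/tesseract/lang/'
}

console.log(langFileUrl(localPaths.langPath, 'eng'))
// prints "/vendor/tesseract/lang/eng.traineddata.gz"

// Assumed call shape, to be checked against the docs:
// var LocalTesseract = Tesseract.create(localPaths)
```

   Note that langPath needs its trailing slash, since the formula concatenates the strings directly.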
  Contributing

  Development

  To run a development copy of tesseract.js, first clone this repo.
  [code]5[/code]
   Then cd into the folder, run npm install, and run npm start
  [code]6[/code]
   Then open http://localhost:7355/examples/file-input/demo.html in your favorite browser. The devServer automatically rebuilds tesseract.js and tesseract.worker.js when you change files in the src folder.
  Building Static Files

   After you've cloned the repo and run npm install as described in the Development section, you can build static library files in the dist folder with
  [code]7[/code]
  Send us a Pull Request!

  Thanks :)

刘述平 posted on 2016-10-13 06:38:56
Heh, keep it low-key!

顺丰单水电费 posted on 2016-10-13 08:30:08
Quit smoking, go to bed early, and die well.

御龙在天 posted on 2016-10-16 09:25:04
Read the thread, reply to the thread: it keeps the network thriving and benefits everyone.

黄建 posted on 2016-10-18 10:57:43
Not replying would be too willful.
