Nuxeo Marketplace

Nuxeo OCR

By Oceane Consulting DM

The Oceane Consulting OCR Platform addon provides a wrapper for OCR services. It currently supports the OCR platform services provided by Oceane Consulting DM, which are based on the Tesseract provider. The addon provides automation operations for full-text extraction from image content. Extraction can be launched automatically on document creation or update, or manually by an end user through the Web UI. The OCR services provided by Oceane Consulting are free for testing purposes.

Partner Certified Addon

This addon is provided by Oceane Consulting and validated as a Nuxeo Partner Premier Certified Addon. Any bugs or improvement requests should be reported to the partner (please do not file Nuxeo Jira tickets related to this addon).

Usage

Once the addon is installed and configured, a listener automatically updates every image whose MIME type matches the configured types, storing the full-text content extracted from the document through the remote service API of Oceane Consulting OCR Services.

Installation

Oceane Consulting OCR Platform requires the installation of the Oceane Consulting OCR Platform Package. For testing purposes, access to the OCR API is free. For production use, please contact Oceane Consulting DM (contact information is available through the Read Documentation link). Network access must be available between the Nuxeo instance and the Oceane Consulting OCR services. Oceane Consulting also provides a sample package that includes a sample document type, automation chains, and Web UI integration.

Configuration

In nuxeo.conf, you need to reference the fields in which the extracted full text will be stored (simple and complex). A predefined access key is provided for testing purposes; for any production environment, please contact Oceane Consulting DM.

The following properties can be set in nuxeo.conf:

  • ocdm.ocr.simple.fulltext.field: the field in which the simple full-text value is saved (for example ocr:fulltext)

  • ocdm.ocr.complex.fulltext.field: the field in which the advanced full-text value is saved as a complex type (for example ocr:advanced)

  • ocdm.ocr.endpoint: the OCR service URL (https://dm-ocr.oceaneconsulting.com by default)

  • ocdm.ocr.default.language: the default language used for OCR (English by default)

  • ocdm.ocr.default.provider: the provider used by the service for OCR (Tesseract by default)

  • ocdm.ocr.trustStorePath: the truststore path for the SSL connection to the Oceane Consulting DM OCR service (see our documentation on how to define the truststore configuration)

  • ocdm.ocr.trustStorePassword: the truststore password (see our documentation on how to define the truststore configuration)

  • ocdm.ocr.trustStoreType: the truststore type (JKS, for example) for the SSL connection to the Oceane Consulting DM OCR service (see our documentation on how to define the truststore configuration)

A sample project, nuxeo-ocr-connector-test, is also available. It presents the features and shows how to use them in Studio (custom button and Web UI, operations, …).
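
As an illustration, the properties listed above might be combined in nuxeo.conf roughly as follows. This is only a sketch: the field names and endpoint are the examples given above, while the truststore path and password are hypothetical placeholders. The exact values accepted for the language and provider properties are not stated here, so those two lines are left commented out and the service defaults would apply.

```properties
# Fields receiving the extracted text (example field names from the list above)
ocdm.ocr.simple.fulltext.field=ocr:fulltext
ocdm.ocr.complex.fulltext.field=ocr:advanced

# OCR service URL (this is the documented default)
ocdm.ocr.endpoint=https://dm-ocr.oceaneconsulting.com

# Left unset so the defaults (English, Tesseract) apply;
# check the configuration guide for the exact accepted values.
#ocdm.ocr.default.language=
#ocdm.ocr.default.provider=

# Truststore for the SSL connection.
# Path and password below are hypothetical placeholders.
ocdm.ocr.trustStorePath=/path/to/truststore.jks
ocdm.ocr.trustStorePassword=changeit
ocdm.ocr.trustStoreType=JKS
```

Changes to nuxeo.conf are read at startup, so the Nuxeo server would typically need a restart for these settings to take effect.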

N.B. Use of the Oceane Consulting DM OCR Service for testing purposes implies no responsibility on the part of Oceane Consulting DM under any circumstances.

For more information about this addon, please contact Oceane Consulting DM.

Downloads
Video of the Nuxeo OCR Addon
(01-Nuxeo-OCR-WebUI-Demo.mp4, 4,791,201b)
Install and configuration guide
(Nuxeo OCR addon - Install and configuration guide - 1.0.2.pdf, 458,433b)
User guide
(Nuxeo OCR addon - User guide - 1.0.0.pdf, 1,226,941b)