Extracting text and structure information from documents is a core enabling technology for robotic process automation and workflow automation. Since its preview release in May 2019, Azure Form Recognizer has attracted thousands of customers to extract text, key and value pairs, and tables from documents to accelerate their business processes.
Today, we're sharing the new Form Recognizer features that are now available.
Updates for Azure Form Recognizer
The Form Recognizer March release is a major update that includes many new features our customers have asked for:
- Customization: The service now supports training with and without labels, which makes it easier for customers to reliably extract valuable information from their forms. The APIs have also been redesigned as long-running operations to improve support for larger customer data sets. Automatic detection of key-value pairs and table extraction have both been improved. A new sample labeling tool UX container will help customers label data more efficiently and extract the values of interest.
Form Recognizer Custom: Train with Labels, Form Recognizer Sample Labeling Tool.
In addition, Form Recognizer Sample Labeling Tool is now available as an open source project located here. You can integrate it within your solutions and make customer-specific changes to meet your needs.
- Layout: We released a new Layout API that extracts text and tables from documents, with high-accuracy optical character recognition (OCR) results even on small text. Table extraction works on arbitrary documents, which enables a very popular application scenario for document extraction.
Layout text and table extraction: Table extracted with 5 columns and 30 rows.
- Pre-Built Receipt: The new version features major accuracy improvements. Error rates for certain fields like merchant name, phone number, transaction time, and subtotal have been reduced by more than 30 percent. We also added support for recognizing tips, receipt type, and line items, as well as providing confidence values.
Pre-built Receipt: Key fields extracted from itemized sales receipt.
Learn more about what’s new in Form Recognizer here.
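Because the redesigned APIs are long-running operations, a client typically submits a job and then checks back until it finishes. The following is a minimal sketch of that polling pattern, not the service's official client: `wait_for_result` and `get_status` are hypothetical names, and the status strings (`notStarted`, `running`, `succeeded`, `failed`) and the shape of the response are assumptions made for illustration. The demo uses canned responses instead of real HTTP calls.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    get_status: Callable[[], Dict[str, Any]],
    interval_seconds: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a long-running operation until it reaches a terminal state.

    `get_status` is a hypothetical callable that fetches the current
    operation document, for example a GET on the URL the service
    returned when the job was submitted.
    """
    for _ in range(max_attempts):
        body = get_status()
        status = body.get("status")  # assumed field name
        if status == "succeeded":
            return body
        if status == "failed":
            raise RuntimeError(f"operation failed: {body}")
        # Any other status is treated as "still in progress".
        sleep(interval_seconds)
    raise TimeoutError(f"operation still pending after {max_attempts} attempts")


# Demo with canned responses standing in for the service (made-up data).
_responses = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {"pageResults": []}},
])
result = wait_for_result(lambda: next(_responses), sleep=lambda s: None)
print(result["status"])
```

In a real integration, `get_status` would wrap an authenticated HTTP request, and the interval and attempt limit would be tuned to the size of the documents being processed.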
Acumatica and Zelros are customers using Azure Form Recognizer and have shared their experiences with Microsoft.
“By automating expense reporting with Form Recognizer, we can eliminate almost all human errors, which really helps accounting teams streamline approvals and reimbursement.” Ajoy Krishnamoorthy, Vice President of Platform and Technology, Acumatica.
Learn more in our case study with Acumatica here.
“Zelros Documents2Insights leverages Form Recognizer to speed up the insurers’ and bancassurers’ underwriting process. Identity card, proof of residence, vehicle registration document, driving license, and more. Speeding up and simplifying this business process is key to improving the experience of policyholders. Zelros Documents2Insights automates the underwriting processes. Based on the Cognitive Services Computer Vision API and built on top of the Form Recognizer feature, the solution automatically reads and analyzes documents. It also cross-references information in order to correct and lower the error rate, while complying with regulatory requirements. With this, we are able to process documents and subscriptions faster.” Fabien Vauchelles, CTO of Zelros
To get started, please log in to the Azure Portal to create a Form Recognizer resource. Once your resource is created, you can extract data from your forms by following one of our Quickstart templates:
Custom: Train a custom model for your forms to extract text, key-value pairs, and tables.
- Train without labels:
- Train with labels:
Prebuilt receipts: Extract data from US sales receipts.
- Quickstart: Extract receipt data using the REST API with cURL.
- Quickstart: Extract receipt data using the REST API with Python.
Layout: Extract text and table structure (row and column numbers) from your documents.
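Since Layout returns table structure as row and column numbers, a small helper can turn the extracted cells back into a grid. The sketch below assumes each table carries `rows`, `columns`, and a list of `cells` with `rowIndex`, `columnIndex`, and `text`. These field names, the `cells_to_grid` helper, and the sample data are illustrative assumptions, so check them against the actual response you receive.

```python
from typing import Any, Dict, List


def cells_to_grid(table: Dict[str, Any]) -> List[List[str]]:
    """Rebuild a 2D grid of strings from a flat list of table cells.

    Positions with no extracted cell are left as empty strings.
    """
    grid = [["" for _ in range(table["columns"])] for _ in range(table["rows"])]
    for cell in table["cells"]:
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell["text"]
    return grid


# Made-up sample in the assumed shape; not real service output.
sample_table = {
    "rows": 2,
    "columns": 3,
    "cells": [
        {"rowIndex": 0, "columnIndex": 0, "text": "Item"},
        {"rowIndex": 0, "columnIndex": 1, "text": "Qty"},
        {"rowIndex": 0, "columnIndex": 2, "text": "Price"},
        {"rowIndex": 1, "columnIndex": 0, "text": "Coffee"},
        {"rowIndex": 1, "columnIndex": 2, "text": "3.50"},
    ],
}

grid = cells_to_grid(sample_table)
for row in grid:
    print(" | ".join(row))
```

Filling the grid with empty strings first keeps every row the same length even when a cell is missing, which makes the result easy to pass on to a CSV writer or a data frame.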