This content applies to: v4.0 (preview) | Previous versions: v3.1 (GA) v3.0 (GA) v2.1 (GA)
This content applies to: v3.1 (GA) | Latest version: v4.0 (preview) | Previous versions: v3.0 v2.1
This content applies to: v3.0 (GA) | Latest versions: v4.0 (preview) v3.1 | Previous version: v2.1
This content applies to: v2.1 | Latest version: v4.0 (preview)
The Document Intelligence invoice model uses powerful Optical Character Recognition (OCR) capabilities to analyze and extract key fields and line items from sales invoices, utility bills, and purchase orders. Invoices can vary in format and quality, and include phone-captured images, scanned documents, and digital PDFs. The API analyzes invoice text; extracts key information such as customer name, billing address, due date, and amount due; and returns a structured JSON representation of the data. The model currently supports invoices in 27 languages.
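To make the "structured JSON" idea concrete, here is a minimal sketch of reading a few fields from an analysis result. The payload is hand-written for illustration, with invented values. The nesting (`analyzeResult` → `documents` → `fields`) and the field names `CustomerName`, `BillingAddress`, `DueDate`, and `AmountDue` are assumptions about the response shape, and `summarize_invoice` is a hypothetical helper, not part of any SDK.

```python
import json

# Illustrative payload only: values are invented and the structure is an
# assumption about what the service returns, not a captured response.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "succeeded",
  "analyzeResult": {
    "documents": [
      {
        "docType": "invoice",
        "fields": {
          "CustomerName": {"type": "string", "content": "Contoso Ltd.", "confidence": 0.95},
          "BillingAddress": {"type": "address", "content": "123 Main St, Redmond, WA", "confidence": 0.91},
          "DueDate": {"type": "date", "content": "2024-08-15", "confidence": 0.97},
          "AmountDue": {"type": "currency", "content": "$610.00", "confidence": 0.88}
        }
      }
    ]
  }
}
""")


def summarize_invoice(response, min_confidence=0.0):
    """Return {field name: extracted text} for the first analyzed document.

    Fields whose confidence is below min_confidence are left out.
    """
    documents = response.get("analyzeResult", {}).get("documents", [])
    if not documents:
        return {}
    summary = {}
    for name, field in documents[0].get("fields", {}).items():
        if field.get("confidence", 0.0) >= min_confidence:
            summary[name] = field.get("content")
    return summary


if __name__ == "__main__":
    # Print only the fields extracted with at least 90% confidence.
    for name, text in summarize_invoice(SAMPLE_RESPONSE, min_confidence=0.9).items():
        print(f"{name}: {text}")
```

Filtering on confidence is one possible way to route low-confidence fields to manual review instead of posting them straight into an accounts payable system.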
Supported document types:
Automated invoice processing extracts key accounts payable (AP) fields from billing account documents. The extracted data, including invoice line items, can be integrated with your AP workflows for reviews and payments. Historically, the accounts payable process has been performed manually and is therefore very time consuming. Accurate extraction of key data from invoices is typically the first, and one of the most critical, steps in the invoice automation process.
Sample invoice processed with Document Intelligence Studio:
Document Intelligence v4.0 (2024-07-31-preview) supports the following tools, applications, and libraries:
Feature | Resources | Model ID |
---|---|---|
Invoice model | • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK • JavaScript SDK | prebuilt-invoice |
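As a rough sketch of how the model ID and API version fit together in a REST call, the snippet below assembles an analyze request without sending it. The URL path (`/documentintelligence/documentModels/{modelId}:analyze`), the `Ocp-Apim-Subscription-Key` header, and the `urlSource` body property are assumptions about the v4.0 preview REST API; check them against the REST reference before relying on them. The endpoint, key, and document URL are placeholders, and both function names are hypothetical.

```python
from urllib.parse import quote, urlencode

API_VERSION = "2024-07-31-preview"  # v4.0 preview version named above
MODEL_ID = "prebuilt-invoice"       # model ID from the table above


def build_analyze_url(endpoint, model_id=MODEL_ID, api_version=API_VERSION):
    """Build the analyze URL for a resource endpoint (path layout is assumed)."""
    base = endpoint.rstrip("/")
    path = f"/documentintelligence/documentModels/{quote(model_id, safe='')}:analyze"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"


def build_analyze_request(endpoint, key, document_url):
    """Describe the POST request as a plain dict; nothing is sent over the network."""
    return {
        "method": "POST",
        "url": build_analyze_url(endpoint),
        "headers": {
            "Ocp-Apim-Subscription-Key": key,  # assumed auth header name
            "Content-Type": "application/json",
        },
        "json": {"urlSource": document_url},   # assumed request body property
    }


if __name__ == "__main__":
    request = build_analyze_request(
        "https://example.cognitiveservices.azure.com/",  # placeholder endpoint
        "<your-key>",                                    # placeholder key
        "https://example.com/sample-invoice.pdf",        # placeholder document
    )
    print(request["method"], request["url"])
```

Keeping request construction separate from sending makes the URL and headers easy to inspect, and the resulting dict can be handed to whichever HTTP client you prefer.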
Document Intelligence v3.1 supports the following tools, applications, and libraries:
Feature | Resources | Model ID |
---|---|---|
Invoice model | • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK • JavaScript SDK | prebuilt-invoice |
Document Intelligence v3.0 supports the following tools, applications, and libraries:
Feature | Resources | Model ID |
---|---|---|
Invoice model | • Document Intelligence Studio • REST API • C# SDK • Python SDK • Java SDK • JavaScript SDK | prebuilt-invoice |
Document Intelligence v2.1 supports the following tools, applications, and libraries:
Feature | Resources |
---|---|
Invoice model | • Document Intelligence labeling tool • REST API • Client-library SDK • Document Intelligence Docker container |
Model | PDF | Image: JPEG/JPG, PNG, BMP, TIFF, HEIF | Microsoft Office: Word (DOCX), Excel (XLSX), PowerPoint (PPTX), HTML |
---|---|---|---|
Read | ✔ | ✔ | ✔ |
Layout | ✔ | ✔ | ✔ (2024-07-31-preview, 2024-02-29-preview, 2023-10-31-preview) |
General Document | ✔ | ✔ | |
Prebuilt | ✔ | ✔ | |
Custom extraction | ✔ | ✔ | |
Custom classification | ✔ | ✔ | ✔ (2024-07-31-preview, 2024-02-29-preview) |
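The support matrix above can be encoded as a small pre-flight check before uploading a file. This is a sketch under stated assumptions: the mapping from format names to file extensions is mine, PDF is treated as supported for every model (consistent with the mention of digital PDFs in the overview), and the API-version conditions on Office support for Layout and Custom classification are ignored. `is_supported` is a hypothetical helper.

```python
from pathlib import Path

# Extension sets are an assumed mapping from the format names in the table.
PDF_AND_IMAGE_EXTENSIONS = {".pdf", ".jpeg", ".jpg", ".png", ".bmp", ".tiff", ".heif"}
OFFICE_EXTENSIONS = {".docx", ".xlsx", ".pptx", ".html"}

# Models with a check mark in the Microsoft Office column. For Layout and
# Custom classification this holds only on the preview API versions listed
# in the table; this sketch does not model API versions.
OFFICE_CAPABLE_MODELS = {"Read", "Layout", "Custom classification"}
ALL_MODELS = OFFICE_CAPABLE_MODELS | {"General Document", "Prebuilt", "Custom extraction"}


def is_supported(model, filename):
    """Return True if the file's extension is listed as supported for the model."""
    if model not in ALL_MODELS:
        raise ValueError(f"Unknown model: {model!r}")
    extension = Path(filename).suffix.lower()
    if extension in PDF_AND_IMAGE_EXTENSIONS:
        return True
    return extension in OFFICE_EXTENSIONS and model in OFFICE_CAPABLE_MODELS


if __name__ == "__main__":
    # The invoice model is a prebuilt model, so Office files are rejected.
    print(is_supported("Prebuilt", "invoice.png"))
    print(is_supported("Prebuilt", "invoice.docx"))
```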
See how data, including customer information, vendor details, and line items, is extracted from invoices. You need the following resources:
The Sample Labeling tool doesn't support the BMP file format. This is a limitation of the tool, not of the Document Intelligence service.
See our Language Support—prebuilt models page for a complete list of supported languages.
The Document Intelligence invoice model (prebuilt-invoice) extracts the following fields.