Document Intelligence invoice model

This content applies to: v4.0 (preview) | Previous versions: v3.1 (GA), v3.0 (GA), v2.1 (GA)

The Document Intelligence invoice model uses Optical Character Recognition (OCR) to analyze and extract key fields and line items from sales invoices, utility bills, and purchase orders. Invoices can vary in format and quality, and include phone-captured images, scanned documents, and digital PDFs. The API analyzes the invoice text, extracts key information such as customer name, billing address, due date, and amount due, and returns a structured JSON representation of the data. The model currently supports invoices in 27 languages.
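
As a rough illustration of the structured JSON mentioned above, the following sketch reads a few fields out of an analyze response. The `SAMPLE_RESPONSE` dictionary is a hypothetical, heavily trimmed stand-in for a real response, and the helper names `field_value` and `summarize_invoice` are invented for this example. It assumes the v3.x-style layout in which each field carries a typed `value...` property and a `content` fallback.

```python
from typing import Any

# Hypothetical, trimmed-down response. Real responses carry many more
# properties (pages, bounding regions, spans, and so on).
SAMPLE_RESPONSE: dict[str, Any] = {
    "status": "succeeded",
    "analyzeResult": {
        "modelId": "prebuilt-invoice",
        "documents": [
            {
                "docType": "invoice",
                "fields": {
                    "CustomerName": {"type": "string", "valueString": "Contoso Ltd.", "confidence": 0.95},
                    "BillingAddress": {"type": "address", "content": "123 Bill St, Redmond WA 98052", "confidence": 0.91},
                    "DueDate": {"type": "date", "valueDate": "2019-12-15", "confidence": 0.97},
                    "AmountDue": {
                        "type": "currency",
                        "valueCurrency": {"amount": 610.0, "currencyCode": "USD"},
                        "confidence": 0.88,
                    },
                },
            }
        ],
    },
}


def field_value(field: dict[str, Any]) -> Any:
    """Return the typed value of a field, falling back to its raw text."""
    for key, value in field.items():
        if key.startswith("value"):
            return value
    return field.get("content")


def summarize_invoice(response: dict[str, Any]) -> dict[str, Any]:
    """Map each field name of the first analyzed document to its value."""
    documents = response.get("analyzeResult", {}).get("documents", [])
    if not documents:
        return {}
    fields = documents[0].get("fields", {})
    return {name: field_value(field) for name, field in fields.items()}


summary = summarize_invoice(SAMPLE_RESPONSE)
print(summary["CustomerName"])
print(summary["AmountDue"]["amount"])
```

Reading the typed value first and falling back to `content` keeps the helper usable for field types it does not know about.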

Supported document types:

Automated invoice processing

Automated invoice processing is the process of extracting key accounts payable fields from billing account documents. The extracted data, including invoice line items, can be integrated with your accounts payable (AP) workflows for review and payment. Historically, the accounts payable process has been performed manually and is therefore very time-consuming. Accurate extraction of key data from invoices is typically the first, and one of the most critical, steps in the invoice automation process.
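
One way extracted line items might feed a review step in an AP workflow is sketched below. The `SAMPLE_FIELDS` data is hypothetical, the `REVIEW_THRESHOLD` value is an arbitrary assumption rather than a recommended setting, and `line_items` and `needs_review` are names invented for this example. The sketch assumes the v3.x-style `Items` array, where each entry is an object of sub-fields that each report a confidence score.

```python
from typing import Any

REVIEW_THRESHOLD = 0.80  # hypothetical cut-off; tune it for your own workflow

# Hypothetical, trimmed-down "fields" object of one analyzed invoice.
SAMPLE_FIELDS: dict[str, Any] = {
    "Items": {
        "type": "array",
        "valueArray": [
            {"type": "object", "valueObject": {
                "Description": {"type": "string", "valueString": "Consulting services", "confidence": 0.93},
                "Amount": {"type": "currency", "valueCurrency": {"amount": 60.0}, "confidence": 0.90},
            }},
            {"type": "object", "valueObject": {
                "Description": {"type": "string", "valueString": "Document fee", "confidence": 0.62},
                "Amount": {"type": "currency", "valueCurrency": {"amount": 30.0}, "confidence": 0.85},
            }},
        ],
    }
}


def line_items(invoice_fields: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten the Items array into rows of description, amount, and confidence."""
    rows = []
    for item in invoice_fields.get("Items", {}).get("valueArray", []):
        obj = item.get("valueObject", {})
        rows.append({
            "description": obj.get("Description", {}).get("valueString"),
            "amount": obj.get("Amount", {}).get("valueCurrency", {}).get("amount"),
            # Use the weakest sub-field confidence as the confidence of the row.
            "confidence": min((sub.get("confidence", 0.0) for sub in obj.values()), default=0.0),
        })
    return rows


def needs_review(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the rows whose confidence falls below the review threshold."""
    return [row for row in rows if row["confidence"] < REVIEW_THRESHOLD]


rows = line_items(SAMPLE_FIELDS)
flagged = needs_review(rows)
print([row["description"] for row in flagged])
```

Rows that clear the threshold could go straight to payment, while flagged rows are queued for a person to check.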

Sample invoice processed with Document Intelligence Studio:

Screenshot of a sample invoice analyzed in the Document Intelligence Studio.

Development options

Document Intelligence v4.0 (2024-07-31-preview) supports the following tools, applications, and libraries:

Feature: Invoice model
Resources: Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK, JavaScript SDK
Model ID: prebuilt-invoice
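
To give a feel for the REST API option, the sketch below builds (but does not send) an analyze request for the `prebuilt-invoice` model with the `2024-07-31-preview` API version. The endpoint, key, and document URL are placeholders, and `build_analyze_request` is a name invented for this example. The URL path, the `Ocp-Apim-Subscription-Key` header, and the `urlSource` body property reflect the v4.0 preview REST conventions as best understood here, so confirm them against the REST API reference before relying on them.

```python
import json
import urllib.request

API_VERSION = "2024-07-31-preview"
MODEL_ID = "prebuilt-invoice"


def build_analyze_request(endpoint: str, key: str, document_url: str) -> urllib.request.Request:
    """Build the POST request that starts an invoice analysis (nothing is sent)."""
    url = (
        f"{endpoint.rstrip('/')}/documentintelligence/documentModels/"
        f"{MODEL_ID}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values; substitute your own resource endpoint, key, and file URL.
request = build_analyze_request(
    "https://example-resource.cognitiveservices.azure.com/",
    "<your-key>",
    "https://example.com/sample-invoice.pdf",
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it. The service is expected to reply with an `Operation-Location` header that you poll for the result.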

Document Intelligence v3.1 supports the following tools, applications, and libraries:

Feature: Invoice model
Resources: Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK, JavaScript SDK
Model ID: prebuilt-invoice

Document Intelligence v3.0 supports the following tools, applications, and libraries:

Feature: Invoice model
Resources: Document Intelligence Studio, REST API, C# SDK, Python SDK, Java SDK, JavaScript SDK
Model ID: prebuilt-invoice

Document Intelligence v2.1 supports the following tools, applications, and libraries:

Feature: Invoice model
Resources: Document Intelligence labeling tool, REST API, Client-library SDK, Document Intelligence Docker container

Input requirements

Model PDF Image:
JPEG/JPG, PNG, BMP, TIFF, HEIF
Microsoft Office:
Word (DOCX), Excel (XLSX), PowerPoint (PPTX), HTML
Read
Layout ✔ (2024-07-31-preview, 2024-02-29-preview, 2023-10-31-preview)
General Document
Prebuilt
Custom extraction
Custom classification ✔ (2024-07-31-preview, 2024-02-29-preview)
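
A client-side check of file types before upload could look like the sketch below. It assumes that prebuilt models such as the invoice model accept PDF and the image formats listed in the table above, but not the Microsoft Office or HTML formats. The extension set, including the extra `.tif` spelling, and the function name `is_supported_for_prebuilt` are illustrative choices, not part of the service.

```python
from pathlib import Path

# Assumed from the PDF and Image columns above; Office and HTML formats are
# deliberately left out for prebuilt models. ".tif" is an added alias.
PREBUILT_EXTENSIONS = {".pdf", ".jpeg", ".jpg", ".png", ".bmp", ".tiff", ".tif", ".heif"}


def is_supported_for_prebuilt(filename: str) -> bool:
    """Return True when the file extension is in the assumed supported set."""
    return Path(filename).suffix.lower() in PREBUILT_EXTENSIONS


for name in ("invoice.PDF", "scan.heif", "invoice.docx"):
    print(name, is_supported_for_prebuilt(name))
```

A check like this only inspects the file name. The service still validates the actual content when the document is submitted.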

Invoice model data extraction

See how data, including customer information, vendor details, and line items, is extracted from invoices. You need the following resources:

Screenshot of keys and endpoint location in the Azure portal.

Screenshot of Run analysis and Analyze options buttons in the Document Intelligence Studio.

  1. On the Document Intelligence Studio home page, select Invoices.
  2. You can analyze the sample invoice or upload your own files.
  3. Select the Run analysis button and, if necessary, configure the Analyze options.

Document Intelligence Sample Labeling tool

  1. Navigate to the Document Intelligence Sample Tool.
  2. On the sample tool home page, select the Use prebuilt model to get data tile.
  3. Select the Form Type to analyze from the dropdown menu.
  4. Choose a URL for the file you want to analyze from the following options:
  5. In the Source field, select URL from the dropdown menu, paste the selected URL, and select the Fetch button.
  6. In the Document Intelligence service endpoint field, paste the endpoint that you obtained with your Document Intelligence subscription.
  7. In the key field, paste the key you obtained from your Document Intelligence resource.
  8. Select Run analysis. The Document Intelligence Sample Labeling tool calls the Analyze Prebuilt API and analyzes the document.
  9. View the results: the extracted key-value pairs, line items, highlighted text, and detected tables.

The Sample Labeling tool doesn't support the BMP file format. This is a limitation of the tool, not of the Document Intelligence service.
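
When you call the analyze API yourself rather than through the tool, the analysis runs asynchronously: you submit the document, then poll until the operation finishes. The sketch below shows one possible polling loop. The `get_status` callable is a hypothetical stand-in for whatever function fetches the operation status, the helper name `wait_for_result` is invented here, and the status values (`notStarted`, `running`, `succeeded`, `failed`) are assumed from the service's long-running operation pattern. Replies are simulated so that the example runs without network access.

```python
import time
from typing import Any, Callable


def wait_for_result(
    get_status: Callable[[], dict[str, Any]],
    interval_seconds: float = 1.0,
    max_attempts: int = 30,
) -> dict[str, Any]:
    """Poll get_status until the operation succeeds, fails, or times out."""
    for _ in range(max_attempts):
        payload = get_status()
        status = payload.get("status")
        if status == "succeeded":
            return payload
        if status == "failed":
            raise RuntimeError("Analysis failed")
        time.sleep(interval_seconds)
    raise TimeoutError("Analysis did not finish in time")


# Simulated service replies, used in place of real HTTP calls.
replies = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {}},
])
final = wait_for_result(lambda: next(replies), interval_seconds=0)
print(final["status"])
```

In real use, `get_status` would issue a GET request to the operation URL returned when the analysis was submitted, and the polling interval would be longer than zero.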

Supported languages and locales

See our Language Support—prebuilt models page for a complete list of supported languages.

Field extraction

The Document Intelligence invoice model prebuilt-invoice extracts the following fields.