Document Information Extraction: Public 2024-05-13
3 Concepts [page 76]
4 Service Plans [page 77]
8 Initial Setup [page 96]
8.1 Enabling the Service in the Cloud Foundry Environment [page 97]
8.2 Enabling the Service in the Kyma Environment [page 97]
11 Tutorials [page 101]
12 Development [page 102]
12.1 API Reference [page 102]
Get Access Token [page 103]
18 Security [page 288]
18.1 Data Protection and Privacy [page 288]
18.2 Auditing and Logging Information [page 291]
18.3 Front-End Security [page 293]
Document Information Extraction helps you to process large amounts of business documents that have
content in headers and tables. You can use the extracted information, for example, to automatically process
payables, invoices, or payment notes while making sure that invoices and payables match. After you upload a
document file to the service, it returns the extraction results from header fields and line items.
The service performs the following steps to extract information from the uploaded document file:
Tip
You can also use the Document Information Extraction UI to consume the service. See Using the Document
Information Extraction UI [page 234] to find out how to subscribe, access, and use the user interface
application for the service.
Features
Automate information extraction: Automate the extraction of relevant information from business documents. The Document API takes document files as input and returns header fields and line items as structured data.
Automate data enrichment: Match a business document to enrichment data records based on the information extracted from the document. The Enrichment Data API takes document files as input and returns the ID of the matching enrichment data records.
Benefit from multitenancy support: Use this service in tenant-aware (multitenant) applications. Run them on a shared compute unit that can be used by multiple consumers (tenants).
SAP may continuously improve the above listed core features and their functionalities provided as part
of the Document Information Extraction cloud service including automation, transaction processing, and
machine learning on behalf of the customer.
Tip
Use the data feedback collection feature to allow confirmed documents to be used to improve the
Document Information Extraction service.
SAP uses the identity and position of the document-specific fields (see Extracted Header Fields [page 278]
and Extracted Line Items [page 286]) as a feedback signal to continuously retrain the machine learning
models of the service. With this approach, SAP is able to reduce errors over time when predicting field
values from documents.
This is a platform functionality reused by other applications. SAP reserves the right to reject documents
submitted for retraining.
For more information, see Create Configuration [page 115], Confirm Document [page 157] and Data
Protection and Privacy [page 288].
Environment
Multitenancy Support
For information on multitenancy support, see Run the Service in a Multitenant Application [page 100].
Prerequisites
Technical Constraints
Trial Scope
Document Information Extraction is available for trial use. A trial account lets you try out SAP Business
Technology Platform (SAP BTP) for free and is open to everyone. Trial accounts are intended for personal
exploration, and not for productive use or team development. They allow restricted use of the platform
resources and services.
Note
See also the following information: Trial Accounts and Free Tier.
In the Cloud Foundry environment, you get a free trial account for Document Information Extraction with the
following constraints: Free Tier Option and Trial Account Technical Constraints [page 276].
Technical Component | Environment | Title | Description | Action | Lifecycle | Type | Line of Business | Modular Business Process | Product | Latest Revision | Available as of
Title: Enrichment Data API
Description: The Enrichment Data API [page 166] endpoint Delete Enrichment Data (Synchronous) - Deprecated [page 181] is now deprecated and scheduled for decommissioning in November 2024. After that, the endpoint will no longer be available.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Recommended. Lifecycle: Deprecated. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-05-13. Available as of: 2024-05-13.
Title: Better Models for the Extraction of Standard Document Types
Description: The machine learning models for the extraction of invoice, paymentAdvice, and purchaseOrder documents have been improved.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-05-13. Available as of: 2024-05-13.
Title: Better Extraction of rawValue for Standard Document Types and Fields
Description: The extraction of the rawValue response field has been improved for the standard document types and fields. See Get Result [page 138].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-05-13. Available as of: 2024-05-13.
Title: Extracted Line Items
Description: You can now extract purchase order numbers that are available at line item level from invoice documents. See Extracted Line Items [page 286].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-03-11. Available as of: 2024-03-11.
Title: Configuration API
Description: You can now use the client scope configuration for the dataFeedbackCollection configuration key. See Configuration Keys [page 117].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-03-11. Available as of: 2024-03-11.
Title: Post Catalog
Description: You can now filter documents based on schemaId. See Post Catalog [page 134].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-03-11. Available as of: 2024-03-11.
Title: New Invoice Supported Language - Japanese
Description: The Document Information Extraction service now supports the Japanese language for invoice documents. See Invoice: Languages and Countries/Regions [page 87].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-03-11. Available as of: 2024-03-11.
Title: Better Models for the Extraction of Standard Document Types
Description: The machine learning models for the extraction of invoice, paymentAdvice, and purchaseOrder documents have been improved.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-03-11. Available as of: 2024-03-11.
Title: Better Extraction of Line Items from Multipage Documents with Template
Description: The template algorithm has been enhanced. Document Information Extraction now delivers better results when extracting line items from multipage documents with a table header that appears only on the first page.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-20. Available as of: 2024-02-20.
Title: Combine Different Setup Types When Adding Data Fields to Schemas
Description: You can now combine header fields with different setup types in the same schema.
You can add header fields with the following setup types to a schema created for a standard document type:
• auto (with and without a default extractor)
• manual
Restriction
The setup type auto is available without default extractor (extraction using generative AI) for schemas with the service plan Document Information Extraction, premium edition (premium_edition) only. See Service Plans [page 77] and Metering and Pricing [page 79].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-20. Available as of: 2024-02-20.
Title: Invoices - Conversion of Country-Specific Unit of Measure Values to ISO Format
Description: The conversion of country-specific unit of measure values to ISO format for invoice documents has been improved.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-20. Available as of: 2024-02-20.
Title: Support for businessCard Documents in AWS region Australia (Sydney)
Description: The businessCard documents are now supported in the AWS region Australia (Sydney). See Supported Document Types and File Formats [page 84].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-20. Available as of: 2024-02-20.
Title: Download Troubleshooting Data for Documents
Description: You can now download data about documents added to the Document Information Extraction UI for use in troubleshooting any issues. See Download Troubleshooting Data [page 241].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-05. Available as of: 2024-02-05.
Title: Model Used for Extraction
Description: The Document API now includes information about the model used for extraction. As a result, you can see whether Document Information Extraction used a template or AI to extract information from a particular field. See Get Result [page 138].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-05. Available as of: 2024-02-05.
Title: New Location for Schema Configuration Feature on UI
Description: You can now open the Schema Configuration feature of the Document Information Extraction UI directly from the navigation bar on the left of the screen. See Create Schema [page 246].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-05. Available as of: 2024-02-05.
Title: Extraction of Descriptions from Columns
Description: We’ve fixed an issue with extracting description values from columns. Document Information Extraction now extracts the complete content of large column cells containing descriptions of numbers or quantities, for example.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-05. Available as of: 2024-02-05.
Title: Extraction of Line Items
Description: We’ve fixed an issue with extracting line items. If the template returns the extraction result invalid, but the AI returns the extraction result valid for the same line item, the final result is now valid when Document Information Extraction merges the two results.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-05. Available as of: 2024-02-05.
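The merge rule from the line item fix can be sketched roughly as follows. This is an illustration only: the function name and the string values "valid" and "invalid" are assumptions, not the service's response format. The source confirms only the case where the template result is invalid and the AI result is valid; treating any valid source as sufficient is a further assumption.

```python
def merge_validity(template_result: str, ai_result: str) -> str:
    """Hypothetical merge of two validity flags for the same line item."""
    # Confirmed case: template "invalid" + AI "valid" -> merged result "valid".
    # Assumption: a "valid" flag from either source is enough.
    if "valid" in (template_result, ai_result):
        return "valid"
    return "invalid"


if __name__ == "__main__":
    print(merge_validity("invalid", "valid"))
```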
Title: Get Templates Endpoint
Description: The limit parameter of the Get Templates endpoint is now independent of the order parameter. To apply the limit parameter, you no longer need to specify a value for order.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-05. Available as of: 2024-02-05.
Title: Display Description for Fields in Extraction Results
Description: You can now display the description text for fields in the Extraction Results pane on the Document Information Extraction UI. To view the description, open the Extraction Results pane and hover over the name of a header field or line item. A tooltip appears, which includes the description text.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-05. Available as of: 2024-02-05.
Title: Extracted Line Items - materialNumber and senderMaterialNumber Deprecation in SAP_purchaseOrder_schema
Description: The line items materialNumber and senderMaterialNumber were replaced by supplierMaterialNumber and customerMaterialNumber respectively in the list of fields that you can extract from purchaseOrder documents when using the SAP_purchaseOrder_schema. The legacy line items materialNumber and senderMaterialNumber are now deprecated and no longer available for purchaseOrder documents.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: Deprecated. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-05. Available as of: 2024-02-05.
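If a client application keeps its own list of purchaseOrder line item names, the renaming could be applied with a small lookup like the one below. The two replacements come from the entry above; the helper name and the idea of a client-side list of field names are assumptions made for illustration.

```python
# Replacements named in the release note for SAP_purchaseOrder_schema line items.
RENAMED_LINE_ITEMS = {
    "materialNumber": "supplierMaterialNumber",
    "senderMaterialNumber": "customerMaterialNumber",
}


def migrate_line_item_names(names):
    """Return a new list with deprecated names swapped for their replacements.

    Hypothetical helper; names that are not deprecated pass through unchanged.
    """
    return [RENAMED_LINE_ITEMS.get(name, name) for name in names]


if __name__ == "__main__":
    print(migrate_line_item_names(["description", "materialNumber"]))
```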
Title: Extracted Line Items - currencyCode Deprecation
Description: We updated the list of line items that you can extract from purchaseOrder documents. The currencyCode line item is now deprecated and no longer available for extraction. See Extracted Line Items [page 286].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: Deprecated. Type: Changed. Line of Business: Intelligent Technologies. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-02-05. Available as of: 2024-02-05.
Title: Prefilled Setup Types for Schema Fields
Description: When you add data fields to schemas, the service now prefills the Setup Type field with default values.
Depending on whether you use Document Information Extraction, premium edition or base edition, the default values are as follows:
• Premium edition
  • Schemas for standard and custom document types: auto
• Base edition
  • Schemas for standard document types: auto
  • Schemas for custom document types: manual
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-12-11. Available as of: 2023-12-11.
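The default values listed above amount to a small decision table, sketched here for reference. The function name and the argument values ("premium", "base", "standard", "custom") are labels chosen for this illustration and are not identifiers taken from the service.

```python
def default_setup_type(edition: str, document_type_kind: str) -> str:
    """Return the prefilled setup type described in the release note.

    Hypothetical helper: edition is "premium" or "base",
    document_type_kind is "standard" or "custom".
    """
    if edition not in ("premium", "base"):
        raise ValueError(f"unknown edition: {edition!r}")
    if document_type_kind not in ("standard", "custom"):
        raise ValueError(f"unknown document type kind: {document_type_kind!r}")
    # Only base edition schemas for custom document types default to manual.
    if edition == "base" and document_type_kind == "custom":
        return "manual"
    return "auto"


if __name__ == "__main__":
    print(default_setup_type("base", "custom"))
```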
Title: Support for X.509 Authentication
Description: The Document Information Extraction APIs now support X.509 authentication. See Enable X.509 Authentication [page 98].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-12-11. Available as of: 2023-12-11.
Title: Auditing and Logging Information
Description: New client-related events have been created. See Auditing and Logging Information [page 291].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-12-11. Available as of: 2023-12-11.
Title: Template API
Description: From now on, you can't download documents that are part of the template export package but haven't been malware-scanned during upload. You can download malware-scanned documents only. See Export Template [page 223].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-12-11. Available as of: 2023-12-11.
Title: Document Information Extraction UI
Description: There have been several security improvements on the Document Information Extraction UI. See Using the Document Information Extraction UI [page 234].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-12-11. Available as of: 2023-12-11.
Title: New Service Plan: Document Information Extraction, premium edition (premium_edition)
Description: The service plan Document Information Extraction, premium edition (premium_edition) is now generally available. The premium_edition service plan allows you to use generative AI to automate use cases for business document processing with large language models (LLMs). Use generative AI to process business documents in more than 40 languages, and implement new business document use cases with shorter time to value.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-12-06. Available as of: 2023-12-06.
Title: Template API
Description: The Template API [page 211] is now generally available. You can now use the Template API endpoints to create, reuse, edit, and delete templates based on schemas and document types.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-27. Available as of: 2023-11-27.
Title: Machine Translation available for the Document Information Extraction SAP Help Portal Documentation
Description: For your convenience, machine translation from the original and official English language is now available for the Document Information Extraction documentation on SAP Help Portal in the following languages:
• Chinese Simplified
• French
• German
• Italian
• Japanese
• Korean
• Portuguese
• Spanish
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-27. Available as of: 2023-11-27.
Title: Configuration API and Notifications
Description: In addition to the already available instance and tenant scopes, you can now also use the activateDocumentNotifications configuration key on client scope level to enable the Notifications [page 227] functionality and get notifications about the status of your processed documents.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2024-01-08. Available as of: 2023-11-27.
Title: Better Model for the Extraction of invoice Documents
Description: The machine learning model for the extraction of invoice documents has been improved. The improvements include better extraction results for currency, country and date fields. Additionally, the service now supports the following countries/regions (and their corresponding languages) for invoice documents, see Invoice: Languages and Countries/Regions [page 87]:
• Hungary (Hungarian)
• Romania (Romanian)
• Türkiye (Turkish)
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-27. Available as of: 2023-11-27.
Title: Better Model for the Extraction of paymentAdvice Documents
Description: The machine learning model for the extraction of paymentAdvice documents has been improved. The improvements include better extraction results for currency and country fields.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-27. Available as of: 2023-11-27.
Title: Better Model for the Extraction of purchaseOrder Documents
Description: The machine learning model for the extraction of purchaseOrder documents has been improved. The improvements include better extraction results for currency, country and date fields. Additionally, the service now supports the extraction of quantities with multipliers, for example, "2x5".
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-27. Available as of: 2023-11-27.
Title: Enrichment Data API
Description: The orderby parameter was replaced by order in December 2022. The legacy orderby parameter is now deprecated and no longer available. See List Data-Persistence Jobs [page 174].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: Deprecated. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-27. Available as of: 2023-11-27.
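Client code that still builds query parameters with the legacy name could be adapted with a helper along these lines. The helper and the sample values are hypothetical, and the sketch assumes the value itself can be carried over unchanged, which the source does not state.

```python
def migrate_query_params(params: dict) -> dict:
    """Rename the legacy 'orderby' query parameter to 'order'.

    Hypothetical client-side helper. Assumption: the parameter value keeps
    the same format under the new name. An explicit 'order' entry wins if
    both names are present.
    """
    migrated = dict(params)  # copy so the caller's dict is not modified
    if "orderby" in migrated:
        legacy_value = migrated.pop("orderby")
        migrated.setdefault("order", legacy_value)
    return migrated


if __name__ == "__main__":
    print(migrate_query_params({"orderby": "someField", "limit": 10}))
```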
Title: New Generative AI Tutorial
Description: The tutorial Use Trial to Extract Information from Custom Documents with Generative AI and Document Information Extraction is now available. Learn how to use Document Information Extraction with generative AI to automate the extraction of information from custom document types using large language models (LLMs).
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-10. Available as of: 2023-11-10.
Title: Data Feedback Collection for Model Improvement
Description: You can now use the feedback collection feature in the Document Information Extraction UI to consent to the use of confirmed documents to retrain the service’s machine learning models. See Confirm Documents [page 244].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-05. Available as of: 2023-11-05.
Title: Document Information Extraction UI
Description: The look and feel of the Document Information Extraction UI has been updated to provide the latest SAP Fiori user experience.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-29. Available as of: 2023-11-05.
Title: Edit Template
Description: You can now edit templates. In addition to changing the name and description, you can choose a different schema for the template. Changing the schema makes a new set of extraction fields available for the template. If you’ve already edited extraction results for sample documents associated with your template, these edits are preserved following the change of schema if the relevant fields appear in both the old and the new schema.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-29. Available as of: 2023-10-23.
Title: Field Label
Description: In Schema Configuration, you can now optionally enter a field label in the Add Data Field dialog. These labels enable you to give user-friendly names to some or all of the header fields and line item fields that you add to schemas.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-23. Available as of: 2023-10-23.
Title: Better Model for the Extraction of invoice Documents
Description: The machine learning model for the extraction of invoice documents has been improved. The improvements include better extraction results for date fields.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-23. Available as of: 2023-10-23.
Title: Better Model for the Extraction of paymentAdvice Documents
Description: The machine learning model for the extraction of paymentAdvice documents has been improved. The improvements include better extraction results for date fields, and amount fields in line items.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-23. Available as of: 2023-10-23.
Title: Built-In Support
Description: You can now use the integrated Built-In Support tool to quickly find answers to your support-related questions. Built-In Support is an embedded digital assistant that allows you to search for support-related information without leaving the UI.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-09. Available as of: 2023-10-09.
Title: Configuration API
Description: The enrichmentConfidenceThreshold configuration key is now available. You can now adjust the similarity confidence threshold for the enrichment. See Create Configuration [page 115], Configuration Keys [page 117], and Enrichment Data API [page 166].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-09. Available as of: 2023-10-09.
Title: New Autosave Feature for Editing Extraction Results
Description: You can now have the Document Information Extraction UI save your edits to extraction results. When you choose Autosave on the Edit Extraction Results pane in the Documents feature, the service saves your work automatically at 10-second intervals.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-09. Available as of: 2023-10-09.
Title: New Schema Field Setup Types
Description: The setup types auto and manual are now available when you add data fields to new schemas. See Add Fields to Schema Version [page 199] and Add Data Fields [page 247].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-09. Available as of: 2023-10-09.
Title: Technical Constraints
Description: You can now associate a maximum of 5 documents with a template. See Technical Constraints [page 275], Free Tier Option and Trial Account Technical Constraints [page 276] and Add Documents and Activate/Deactivate Template [page 253].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-09. Available as of: 2023-10-09.
Title: Associated Confirmed Documents with Templates
Description: You can now associate documents that have the status “CONFIRMED” with templates. If you edit the extraction results for a document and then confirm the document, you can use the Add to Document feature to associate the document with a template.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-09. Available as of: 2023-10-09.
Title: Better Model for the Extraction of purchaseOrder Documents
Description: The machine learning model for the extraction of purchaseOrder documents has been improved. The improvements include better extraction results for date fields and better formatting of amounts.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-09. Available as of: 2023-10-09.
Title: Role Collections
Description: The role collection Document_Information_Extraction_UI_Admin_User has been deprecated. To create or delete schemas and templates, use the role collection Document_Information_Extraction_UI_Templates_Admin.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: Deprecated. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-10-09. Available as of: 2023-10-09.
Title: Use Generative AI to Extract Information from Standard and Custom Document Types
Description: You now have the option of using generative AI to extract information from standard and custom document types. To use generative AI, select the setup type auto without a default extractor when adding data fields to a schema for a standard or custom document type.
Restriction
This option is currently available in SAP BTP trial accounts only.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: Restricted Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-11-10. Available as of: 2023-10-05.
Title: Schema API - Add Schema Fields
Description: You can now optionally use the label property to enter field labels. These labels enable you to give user-friendly names to some or all of the headerFields and lineItemFields that you include in the Add Fields to Schema Version [page 199] payload.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-09-04. Available as of: 2023-09-04.
Title: Free Tier Option and Trial Account Technical Constraints
Description: Free tier and trial account users can now:
• Upload up to 50 document pages per tenant in a rolling period of 30 days.
• Create up to 1000 schemas per client.
See Free Tier Option and Trial Account Technical Constraints [page 276].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-09-04. Available as of: 2023-09-04.
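To make the rolling 30-day page limit concrete, the following sketch tracks uploads on the client side and checks whether another upload would stay within 50 pages. It is an illustration under stated assumptions: the function names, the (date, pages) record format, and the exact window boundaries are hypothetical, and the service's own accounting may differ.

```python
from datetime import date, timedelta

PAGE_LIMIT = 50    # pages per tenant, from the constraint above
WINDOW_DAYS = 30   # rolling period, from the constraint above


def pages_used(uploads, today):
    """Sum the pages uploaded within the last WINDOW_DAYS days.

    uploads is a list of (upload_date, page_count) tuples. Assumption:
    an upload exactly WINDOW_DAYS days old no longer counts.
    """
    window_start = today - timedelta(days=WINDOW_DAYS)
    return sum(pages for day, pages in uploads if window_start < day <= today)


def can_upload(uploads, today, new_pages):
    """Return True if new_pages more pages fit within the rolling limit."""
    return pages_used(uploads, today) + new_pages <= PAGE_LIMIT


if __name__ == "__main__":
    history = [(date(2023, 9, 1), 30), (date(2023, 9, 20), 15)]
    print(can_upload(history, date(2023, 9, 25), 5))
```

Because the period is rolling rather than tied to calendar months, capacity frees up day by day as older uploads leave the window.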
Title: Extraction Results Saved Automatically when Documents Associated with Templates
Description: You no longer need to save extraction results manually before associating documents with templates. The Document Information Extraction UI now saves these results automatically.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-08-18. Available as of: 2023-08-18.
Title: Schema API
Description: The Schema API [page 184] is now generally available. You can now use the Schema API endpoints to create, list, update, and delete schemas and schema versions.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-08-18. Available as of: 2023-08-18.
Title: Technical Constraints
Description: The maximum total number of header fields and line items you can add per schema is now 500. See Technical Constraints [page 275].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-08-18. Available as of: 2023-08-18.
Title: Better Model for the Extraction of invoice Documents
Description: The machine learning model for the extraction of invoice documents has been improved. The improvements include better extraction results for bank account numbers, amounts with non-standard formats and numerical dates with whitespaces.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-08-18. Available as of: 2023-08-18.
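Title: Overall Improvements
Description: There have been several code and stability improvements.
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: Changed. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-08-18. Available as of: 2023-08-18.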
Title: Delete Bounding Boxes
Description: When editing extraction results with the Document Information Extraction UI, you can now delete bounding boxes together with their coordinates. See View and Edit Extraction Results [page 242].
Technical Component: Document Information Extraction. Environment: Cloud Foundry. Action: Info only. Lifecycle: General Availability. Type: New. Line of Business: Technology. Modular Business Process: Not applicable. Product: SAP Business Technology Platform. Latest Revision: 2023-07-26. Available as of: 2023-07-26.
Document Information Extraction | Cloud Foundry | Display and Edit Bounding Boxes | When editing extraction results with the Document Information Extraction UI, you can now open the Assign Field dialog for bounding boxes by choosing the relevant tooltip in the page preview pane. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | | 2023-07-26
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-07-26 | 2023-07-26
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-07-17 | 2023-07-17
Document Information Extraction | Cloud Foundry | Technical Constraints | The maximum number of templates you can create has been increased from 1000 templates per tenant to 1000 templates per schema. See Technical Constraints [page 275]. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-06-30 | 2023-06-30
Document Information Extraction | Cloud Foundry | Support for Country Code Conversion in Template | The Template [page 252] feature now supports country code conversion. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-06-22 | 2023-06-22
Document Information Extraction | Cloud Foundry | New Data Type country/region for Schema Fields | The new data type country/region is now available for schema fields. See Add Data Fields [page 247]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-06-22 | 2023-06-22
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-06-13 | 2023-06-13
Document Information Extraction | Cloud Foundry | Issues with Units of Measure in purchaseOrder Documents Corrected | Some issues with codes for units of measure in purchaseOrder documents have now been resolved. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-06-13 | 2023-06-13
Document Information Extraction | Cloud Foundry | Support for Bounding Boxes around Parts of Fields | When you edit extraction results, you can now draw bounding boxes around parts of header field entries, instead of around the entire entry. As a result, you can eliminate unwanted or irrelevant elements, such as punctuation, from strings and ensure that they include only the values that you need. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-06-30 | 2023-06-13
Document Information Extraction | Cloud Foundry | Better Model for the Extraction of paymentAdvice Documents | The machine learning model for the extraction of paymentAdvice documents has been improved. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-06-13 | 2023-06-13
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-05-23 | 2023-05-23
Document Information Extraction | Cloud Foundry | Setup Type Field on Add Data Field Dialog for Schemas | The Add Data Field dialog for schema configuration now includes a new field: Setup Type. See the updated procedure in Add Data Fields [page 247]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-05-08 | 2023-05-08
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-05-08 | 2023-05-08
Document Information Extraction | Cloud Foundry | Response Field clientId in Get Result Endpoint | The Document API endpoint Get Result [page 138] includes a new response field: clientId. You can now identify the client that submitted the extraction request using the Upload Document [page 127] endpoint. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-05-08 | 2023-05-08
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-04-20 | 2023-04-20
Document Information Extraction | Cloud Foundry | Fixed Values in Template Extraction Fields | You can now include fixed values for selected extraction fields in a template. If you intend to use a template with documents from only one supplier, for example, you can define the supplier’s name as the fixed value for the senderName field. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-04-04 | 2023-04-04
Document Information Extraction | Cloud Foundry | Scene Text Recognition Schema | You can now extract text from images using the OCR engine for scene text recognition. When you create a schema with the document type Custom, you can choose between two types of OCR engine (Document or Scene Text), depending on whether the text you wish to extract is in an image or not. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-04-04 | 2023-04-04
Document Information Extraction | Cloud Foundry | Filtering, Ordering, and Pagination | The new Document API endpoint Post Catalog [page 134] is now available. You can use the following catalog options to get a list with all document processing jobs in a JSON file: filtering, ordering, and pagination. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-04-04 | 2023-04-04
Document Information Extraction | Cloud Foundry | Configuration API and Notifications | The activateDocumentNotifications configuration key is now available. You can now enable the Notifications [page 227] functionality to get notifications about the status of your processed documents. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-04-04 | 2023-04-04
Document Information Extraction | Cloud Foundry | New Procedure for Associating Documents with Templates | There’s now a new procedure for adding documents to templates on the Document Information Extraction UI. In the past, you selected these documents when creating the template or added them later using the Template feature. Now, you select documents using the new Add to Template function in the Document feature. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-04-04 | 2023-04-04
Document Information Extraction | Cloud Foundry | Better Model for the Extraction of invoice Documents | The machine learning model for the extraction of invoice documents has been improved. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-04-04 | 2023-04-04
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-04-04 | 2023-04-04
Document Information Extraction | Cloud Foundry | Get Templates Endpoint | The new Document API endpoint Get Templates Associated with Document [page 164] is now available. You can get all the templates associated with the specified document ID. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-03-14 | 2023-03-14
Document Information Extraction | Cloud Foundry | New Template Feature Supported Language - Greek | The Document Information Extraction UI Template [page 252] feature now supports the Greek language. See Extraction Using Template: Languages [page 92]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-03-14 | 2023-03-14
Document Information Extraction | Cloud Foundry | Better Model for the Extraction of purchaseOrder Documents | The machine learning model for the extraction of purchaseOrder documents has been improved. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-03-14 | 2023-03-14
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-03-14 | 2023-03-14
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. The performance of the Template [page 252] feature has been improved. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-03-01 | 2023-03-01
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-02-17 | 2023-02-17
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code, security, and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2023-02-06 | 2023-02-06
Document Information Extraction | Cloud Foundry | Barcode Header Field Symbology | You can now see in the response from Get Result [page 138], in the symbology response field, the type of the extracted barcode header fields. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2023-01-30 | 2023-01-30
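The 2023-08-18 Technical Constraints entry above caps the total number of header fields and line items per schema at 500. As a rough client-side illustration, the following Python sketch checks a planned schema against that cap before any Schema API call is made. It assumes the limit applies to the combined count of both field kinds; the function names and the list-of-names representation are hypothetical and are not part of the service API.

```python
# Limit taken from the 2023-08-18 Technical Constraints entry.
MAX_FIELDS_PER_SCHEMA = 500


def count_schema_fields(header_fields, line_item_fields):
    """Return the combined number of header fields and line item fields."""
    return len(header_fields) + len(line_item_fields)


def check_schema_field_limit(header_fields, line_item_fields,
                             limit=MAX_FIELDS_PER_SCHEMA):
    """Return the combined field count, or raise ValueError if it exceeds the limit.

    Assumption: the documented limit covers header fields and line items together.
    """
    total = count_schema_fields(header_fields, line_item_fields)
    if total > limit:
        raise ValueError(f"schema defines {total} fields; the limit is {limit}")
    return total
```

Calling `check_schema_field_limit(["senderName"], ["quantity", "unitPrice"])` returns 3, while a schema that would reach 501 fields raises `ValueError` before any request is sent.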
Technical Component | Environment | Title | Description | Action | Lifecycle | Type | Line of Business | Modular Business Process | Product | Latest Revision | Available as of
Document Information Extraction | Cloud Foundry | Configuration API | The coordinateFormat configuration key is now available. You can now choose the format of the bounding box coordinates in the extraction results. See Create Configuration [page 115]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-12-19 | 2022-12-19
Document Information Extraction | Cloud Foundry | Enrichment Data API | The orderby parameter has been replaced by order. Note: The legacy orderby parameter will still be supported for a limited amount of time. Please start using the new parameter (order) as soon as possible. | Recommended | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-12-19 | 2022-12-19
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-12-19 | 2022-12-19
Document Information Extraction | Cloud Foundry | Document Information Extraction UI | The Document Information Extraction UI and associated in-app help are now available in the following new languages: Chinese Simplified, Chinese Traditional, French, Italian, Japanese, Korean, Portuguese, Russian, and Spanish. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-12-07 | 2022-12-07
Document Information Extraction | Cloud Foundry | Enrichment Data Method | You can now see in the response from Get Result [page 138], in the method response field, the match strategy used for each matched enrichment data record. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-12-07 | 2022-12-07
Document Information Extraction | Cloud Foundry | Change Service Instance by Name | You can now change instances on the Document Information Extraction UI by entering the service instance name. See Subscribing to the Document Information Extraction UI [page 234]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-12-07 | 2022-12-07
Document Information Extraction | Cloud Foundry | Better Model for the Extraction of purchaseOrder Documents | The machine learning model for the extraction of purchaseOrder documents has been improved. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-12-07 | 2022-12-07
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-12-07 | 2022-12-07
Document Information Extraction | Cloud Foundry | Document Information Extraction UI | The Document Information Extraction UI and associated in-app help are now available in German. See Set Screen Language [page 237]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-11-15 | 2022-11-15
Document Information Extraction | Cloud Foundry | SAP Schemas | The preconfigured SAP schema SAP_OCROnly_schema is now available for custom documents and OCR (Optical Character Recognition) output only. See Upload Document [page 127], Get Result [page 138], and Add Document [page 240]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-11-09 | 2022-11-09
Document Information Extraction | Cloud Foundry | Configuration API | You can now use the client scope configuration for the documentRetentionTimeDays configuration key. You can now use the optional parameters clientId and ... | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-11-09 | 2022-11-09
Document Information Extraction | Cloud Foundry | Free Service Plan | The Template [page 252] feature is now also available to Free service plan users. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-11-09 | 2022-11-09
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-11-09 | 2022-11-09
Document Information Extraction | Cloud Foundry | Role Collections | The role collection Document_Information_Extraction_UI_Templates_Admin now includes permissions for reading and writing documents. See Role Collections [page 236]. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-10-04 | 2022-10-04
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-10-04 | 2022-10-04
Document Information Extraction | Cloud Foundry | Enrichment Data API | The following paymentAdvice fields now support enrichment: taxId, senderAddress, and senderName. See Extracted Header Fields [page 278]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-09-13 | 2022-09-13
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-09-13 | 2022-09-13
Document Information Extraction | Cloud Foundry | Role Collections | The role collection Document_Information_Extraction_UI_Document_Viewer is now available. This new collection allows users to read documents in the UI application. See Role Collections [page 236]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-08-30 | 2022-08-30
Document Information Extraction | Cloud Foundry | Client Segregation | You can now restrict user access to specified clients. See Create Configuration [page 115] and Add Document [page 240]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-08-30 | 2022-08-30
Document Information Extraction | Cloud Foundry | Free Service Plan | The Free service plan is now available for Document Information Extraction. See Service Plans [page 77], Tutorials [page 101], and Free Tier Option and Trial Account Technical Constraints [page 276]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-08-30 | 2022-08-30
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-08-30 | 2022-08-30
Document Information Extraction | Cloud Foundry | Extracted Header Fields | You can now extract the following header fields from paymentAdvice documents: senderAddress, taxId | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-08-04 | 2022-08-04
Document Information Extraction | Cloud Foundry | New Business Card Supported Language: Hebrew | Document Information Extraction now supports businessCard documents in Hebrew. See Business Card: Languages [page 86]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-08-04 | 2022-08-04
Document Information Extraction | Cloud Foundry | Accessibility Features | Documentation on Accessibility Features in Document Information Extraction [page 294] is now available. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-08-04 | 2022-08-04
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-08-04 | 2022-08-04
Document Information Extraction | Cloud Foundry | Technical Constraints | The maximum number of clients you can create in one API call has increased from 10 to 5000. The maximum number of schemas per client and templates per tenant has increased from 100 to 1000. See Technical Constraints [page 275]. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-06-23 | 2022-06-23
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-06-23 | 2022-06-23
Document Information Extraction | Cloud Foundry | Handwriting Detection | The handwriting detection feature is now available. For now, it detects handwriting in English only. See Optical Character Recognition (OCR): Best Practices [page 258]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-06-23 | 2022-06-23
Document Information Extraction | Cloud Foundry | Barcode Supported Countries/Regions and Extracted Fields for Invoice Documents | The list of supported countries/regions and extracted fields for barcodes in invoice documents is now available. See Invoice: Languages and Countries/Regions [page 87]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-06-03 | 2022-06-03
Document Information Extraction | Cloud Foundry | New Supported Countries/Regions for Invoice Documents | Document Information Extraction now supports the following new countries/regions for invoice documents (see Invoice: Languages and Countries/Regions [page 87]): Austria, Belgium, Czech Republic, Denmark, Finland, Norway, Poland, Portugal, Slovakia, Slovenia, and Sweden. Note: To support the new languages, the machine learning models have been extended. Consequently, predictions (field extractions and corresponding confidence scores) may differ from previous releases. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-06-03 | 2022-06-03
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-06-03 | 2022-06-03
Document Information Extraction | Cloud Foundry | Document API | You can now see all matched enrichment data records in the Get Result [page 138] response. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-05-06 | 2022-05-06
Document Information Extraction | Cloud Foundry | Enrichment Data API | The Create Data Activation [page 179] endpoint now includes the optional parameters type and subtype. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-05-06 | 2022-05-06
Document Information Extraction | Cloud Foundry | Deskew | The service now automatically rotates document images to compensate for skewing. See Supported Document Types and File Formats [page 84]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-05-06 | 2022-05-06
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-05-06 | 2022-05-06
Document Information Extraction | Cloud Foundry | Document API | The Upload Document [page 127] endpoint now includes a schemaId parameter. This parameter is required in payloads that include templateId. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-04-22 | 2022-04-22
Document Information Extraction | Cloud Foundry | Enrichment Data API | You can now use variants to create multiple versions of the same data record. See Create Enrichment Data [page 167], Data Variants [page 172], and Data Duplicates [page 173]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-04-22 | 2022-04-22
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-04-22 | 2022-04-22
Document Information Extraction | Cloud Foundry | Template | You can now use templates to extract multiple tables from the same page, provided the tables all have a standard structure and the same table headers. See General Recommendations and Limitations [page 266]. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-03-31 | 2022-03-31
Document Information Extraction | Cloud Foundry | Global Accounts | You can now move subaccounts between your global accounts. See Initial Setup [page 96]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-03-31 | 2022-03-31
Document Information Extraction | Cloud Foundry | Trial Account Technical Constraints | The Free Tier Option and Trial Account Technical Constraints [page 276] documentation has been updated. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-03-31 | 2022-03-31
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-03-31 | 2022-03-31
Document Information Extraction | Cloud Foundry | Support for Multiple Service Instances | If you create more than one service instance, the Document Information Extraction UI now allows you to change between instances. See Subscribing to the Document Information Extraction UI [page 234]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-03-17 | 2022-03-17
Document Information Extraction | Cloud Foundry | Document Feature | You can now select folders containing multiple documents for upload. The Document Information Extraction UI now displays thumbnails of documents and allows you to rename documents before uploading them. See Add Document [page 240]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-03-17 | 2022-03-17
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. Metering and pricing details for the Compute Hours for Base Edition [page 80] have been updated. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-03-17 | 2022-03-17
Document Information Extraction | Cloud Foundry | Document Extraction Results | You can now download extraction values before and after you edit and save them. See View and Edit Extraction Results [page 242]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-02-03 | 2022-02-03
Document Information Extraction | Cloud Foundry | Document Extraction Results | You can now view the raw values for extraction results. Raw values are the original field values before postprocessing, which can differ from the corresponding extraction results. See View and Edit Extraction Results [page 242]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-02-03 | 2022-02-03
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code and stability improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-02-03 | 2022-02-03
Document Information Extraction | Cloud Foundry | SAP Schemas | The SAP schemas for standard document types now have the status ACTIVE. As a result, you no longer need to create copies of these schemas before using them to upload documents or create templates. See Schema Configuration [page 245]. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-01-18 | 2022-01-18
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-01-18 | 2022-01-18
Document Information Extraction | Cloud Foundry | Enrichment Data API | The new Enrichment Data API endpoint List Data-Persistence Jobs [page 174] is now available. The new enrichment data entity type Product [page 172] is now available. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-01-10 | 2022-01-10
Document Information Extraction | Cloud Foundry | Configuration API | The performPIICheck subconfiguration is now available. See Create Configuration [page 115]. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-01-10 | 2022-01-10
Document Information Extraction | Cloud Foundry | Mass Deletion of Documents | The Document [page 239] feature has been enhanced: you can now select multiple documents for simultaneous deletion. | Info only | General Availability | New | Technology | Not applicable | SAP Business Technology Platform | 2022-01-10 | 2022-01-10
Document Information Extraction | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | General Availability | Changed | Technology | Not applicable | SAP Business Technology Platform | 2022-01-10 | 2022-01-10
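The 2022-04-22 Document API entry above states that schemaId is required in Upload Document payloads that include templateId. The following Python sketch shows one way a client might enforce that rule before sending a request. The flat dictionary used for the options and the function name are assumptions made for illustration; only the parameter names schemaId and templateId come from the entry itself.

```python
def validate_upload_options(options):
    """Check the schemaId/templateId rule on a dict of upload options.

    Assumption: options is a flat dict keyed by parameter name.
    Returns the options unchanged when valid; raises ValueError otherwise.
    """
    has_template = bool(options.get("templateId"))
    has_schema = bool(options.get("schemaId"))
    if has_template and not has_schema:
        # Rule from the release note: templateId may not appear without schemaId.
        raise ValueError("schemaId is required when templateId is set")
    return options
```

Options without templateId pass through untouched, so the check can run on every upload regardless of whether a template is used.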
Technical Component | Capability | Environment | Title | Description | Action | Type | Available as of
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-12-06
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-11-23
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Template Feature | Template extraction results for header fields in multipage documents have been improved. See Template [page 252]. | Info only | Changed | 2021-11-23
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Using the Document Information Extraction UI | The documentation has been updated: it now includes the requirement to use a schema when creating templates based on document extraction results. See Document [page 239] and Template [page 252]. | Info only | Changed | 2021-11-23
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Tutorials | The following tutorial missions are now available for Document Information Extraction: "Shape Machine Learning to Process Standard Business Documents" and "Shape Machine Learning to Process Custom Business Documents". See Tutorials [page 101]. | Info only | New | 2021-11-23
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-11-05
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Enrichment Data API | The matching accuracy for the bankAccount businessEntity key has been improved. See BusinessEntity [page 170] and Data Enrichment: Best Practices [page 271]. | Info only | Changed | 2021-11-05
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Technical Constraints | The maximum limit of 3510 x 3510 pixels for single-page JPEG, PNG, and TIFF documents has been removed. You can now upload documents with any resolution to the service, as long as the file size does not exceed 50 MB. See Optical Character Recognition (OCR): Best Practices [page 258] and Technical Constraints [page 275]. | Info only | Deleted | 2021-11-05
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-10-15
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New AWS Region | Document Information Extraction is now available in the AWS region Australia (Sydney). | Info only | New | 2021-10-15
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Support to Business Card Documents | Document Information Extraction now supports, at API level only, businessCard as one of the standard document types. See Supported Document Types and File Formats [page 84], Supported Languages and Countries/Regions [page 86], and Extracted Header Fields [page 278]. | Info only | New | 2021-10-15
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-09-30
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Role Collections | The role collection Document_Information_Extraction_UI_Admin_User is now available. This new collection provides access to all the features of the UI application. See Role Collections [page 236]. | Info only | New | 2021-09-30
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Best Practices | Best practices covering all stages of processing documents in the Document Information Extraction UI are now available. See Document: Best Practices [page 270], Template: Best Practices [page 265], and Schema Configuration: Best Practices [page 259]. | Info only | New | 2021-09-30
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-09-10
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Template Feature | You can now: • Import and export templates • Create templates from extracted documents | Info only | New | 2021-09-10
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Supported File Formats | Single-page document files in TIFF format are now supported. See Supported Document Types and File Formats [page 84] and Technical Constraints [page 275]. | Info only | New | 2021-09-10
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-08-31
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Technical Constraints | The technical constraints for the number of schemas are now available. See Technical Constraints [page 275] and Free Tier Option and Trial Account Technical Constraints [page 276]. | Info only | Changed | 2021-08-31
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-08-12
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Document API | The Get Result [page 138] endpoint now returns two new response fields: • languageCodes • pageCount | Info only | New | 2021-08-12
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Configuration API | All Configuration API [page 115] keys now have tenant scope by default. Service instance scope is now also available for the dataFeedbackCollection and documentRetentionTimeDays keys. The documentRetentionTimeDays configuration key is now available. See Create Configuration [page 115]. | Info only | New | 2021-08-12
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Python Client Library | A Python client library is now available for Document Information Extraction. It provides easy access to the REST API and the UI application, and facilitates the service onboarding process. Go to Python Client Library. | Info only | New | 2021-07-26
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Barcode Header Field | Decoded information is now available for barcode fields from India invoices. See Extracted Header Fields [page 278] and Barcode Header Field in Invoice Documents [page 285]. | Info only | New | 2021-07-26
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Template Feature | The Template [page 252] feature is now also available to all SAP BTP Trial users. See Free Tier Option and Trial Account Technical Constraints [page 276]. | Info only | Changed | 2021-07-26
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. The Service Guide documentation has been updated: • Capabilities API [page 104] • Save Ground Truth [page 154] • Extracted Header Fields [page 278] • Extracted Line Items [page 286] | Info only | Changed | 2021-07-26
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Template Feature | The Template [page 252] feature is now generally available to all Document Information Extraction UI application users. | Info only | New | 2021-07-20
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Extension Capabilities Service Plan | The new Compute Hours for Base Edition [page 80] service plan is now available. | Info only | New | 2021-07-20
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | New | 2021-07-07
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Document API | You can now use the Get Document File [page 159] endpoint to get the original document file you uploaded to the service. | Info only | New | 2021-07-07
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Security Guide | Auditing and logging information is now available in the Security [page 288] documentation. | Info only | New | 2021-07-07
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Schema Feature and Support for Custom Documents and Fields | The Schema Configuration [page 245] feature is now available in the Document Information Extraction UI application. Document Information Extraction now supports custom documents and fields. See Supported Document Types and File Formats [page 84]. | Info only | New | 2021-06-28
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Support for Purchase Order Documents | Document Information Extraction now supports purchaseOrder documents for all users. The list of line items you can extract from purchaseOrder documents has been updated. See Extracted Line Items [page 286]. See also Supported Document Types and File Formats [page 84] and Extracted Header Fields [page 278]. | Info only | Changed | 2021-06-28
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Configuration API | The dataFeedbackCollection Configuration API [page 115] key is now available. | Info only | Changed | 2021-06-28
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Template API (Beta) | The Template API (Beta) and its endpoints are no longer exposed to users at API level. The Template [page 252] feature remains available from the Document Information Extraction UI application. | Info only | Changed | 2021-06-28
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Security Guide | The Security [page 288] documentation has been updated. | Info only | Changed | 2021-06-28
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-05-24
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Barcode Header Field | The barcode header field can now be extracted from TicketBAI invoices for the three Basque provincial councils (Álava, Vizcaya and Guipúzcoa) and the Basque government. See Extracted Header Fields [page 278] and Barcode Header Field in Invoice Documents [page 285]. | Info only | New | 2021-05-24
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-05-05
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Barcode Header Field | The barcode header field can now be extracted from: • Brazil PIX (instant payments) • Argentina, Colombia and Uruguay invoices. See Extracted Header Fields [page 278] and Barcode Header Field in Invoice Documents [page 285]. | Info only | New | 2021-05-05
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-03-29
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Support for Factur-X and ZUGFeRD Standards | Document Information Extraction now supports the Factur-X and ZUGFeRD standards (all versions) for e-invoice document files in PDF and XML hybrid format. See Supported Document Types and File Formats [page 84]. | Info only | New | 2021-03-29
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Document Information Extraction UI | The Document Information Extraction UI application now features: • Activation and deactivation of templates. See Template [page 252]. • Field level confidence visualization. See Document [page 239]. • Web Assistant | Info only | New | 2021-03-29
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Template API (Beta) | The following Template API (Beta) endpoints are now available: • Activate Template (Beta) • Deactivate Template (Beta) | Info only | New | 2021-03-29
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Data Feedback Collection for Model Improvement | The data feedback collection feature is now available. See Get Result [page 138] and Confirm Document [page 157]. | Info only | New | 2021-03-29
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-03-22
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Supported Languages and Countries/Regions | The list of supported countries/regions for purchaseOrder (controlled availability) documents, and the list of supported languages for the Template API (Beta) and the Document Information Extraction UI Template (Beta) feature are now available. See Supported Languages and Countries/Regions [page 86]. | Info only | Changed | 2021-03-22
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Barcode Header Field | Barcode header field extraction has been improved. See Extracted Header Fields [page 278] and Barcode Header Field in Invoice Documents [page 285]. | Info only | Changed | 2021-03-22
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-03-01
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Enrichment Data API | You can now set data activation to manual, instead of using the default automatic refresh of enrichment data, which takes place every 4 hours. See Create Data Activation [page 179] and Get Data Activation Details [page 180]. | Info only | Changed | 2021-03-01
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Configuration API | The Configuration API [page 115] is now available. | Info only | New | 2021-03-01
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Identifier API | The Identifier API [page 110] is now available. | Info only | New | 2021-03-01
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Supported Document Types and File Formats | Document Information Extraction now supports paymentAdvice document files in Excel format. See Supported Document Types and File Formats [page 84]. | Info only | New | 2021-03-01
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Document API | The rawValue response field is now available for the Get Result [page 138] endpoint. | Info only | Changed | 2021-02-15
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | SAP API Business Hub | Document Information Extraction is now available in the SAP API Business Hub. See Document Information Extraction. | Info only | New | 2021-02-15
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Enrichment Data API | You can now delete large numbers of data records for all clients per data type (employee or businessEntity). See Delete Enrichment Data (Asynchronous) [page 182]. | Info only | Changed | 2021-02-01
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-02-01
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Template (Beta) Feature | The Document Information Extraction UI Template [page 252] feature has been updated. See Add Documents and Activate/Deactivate Template [page 253]. The role collection Document_Information_Extraction_UI_Templates_Admin is now available. See Subscribing to the Document Information Extraction UI [page 234]. | Info only | Changed | 2021-01-18
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Info only | Changed | 2021-01-18
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Extracted Header Fields | The list of header fields you can extract from purchaseOrder documents has been updated. See Extracted Header Fields [page 278]. | Info only | Changed | 2021-01-04
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Changed | 2020-12-21
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Template (Beta) Feature | The Document Information Extraction UI Template [page 252] feature now supports purchaseOrder documents (only for a previously selected group of beta customers). See Using the Document Information Extraction UI [page 234]. | New | 2020-12-21
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Changed | 2020-12-03
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New SAP Cloud Platform Cockpit Booster | You can now use the Set up account for Document Information Extraction booster to automate the onboarding steps on the SAP Cloud Platform cockpit, and quickly consume the service and its UI application. See Initial Setup [page 96] and Subscribing to the Document Information Extraction UI [page 234]. | New | 2020-11-20
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New Beta Features | • Document Information Extraction now supports purchaseOrder documents (only for a previously selected group of beta customers). See Supported Document Types and File Formats [page 84], Extracted Header Fields [page 278] and Extracted Line Items [page 286]. • The Template [page 252] feature is now available (only for a previously selected group of beta customers) in the Document Information Extraction UI. See Using the Document Information Extraction UI [page 234]. | New | 2020-11-20
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | • There have been several code improvements. • The Feature Scope Description for Document Information Extraction has been updated. • The Technical Constraints [page 275] have been updated. • The Document Information Extraction Tutorials [page 101] have been updated. | Changed | 2020-11-20
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New AWS Region | Document Information Extraction is now available in the AWS region Europe (Frankfurt) EU-ONLY (access from Europe only). | New | 2020-10-27
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | • There have been several code improvements. • The discount and dueDate header fields can now be extracted from invoices. See Extracted Header Fields [page 278]. • To get better extraction and enrichment results with Document Information Extraction, see Optical Character Recognition (OCR): Best Practices [page 258]. | Changed | 2020-10-27
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Metering and Pricing | A new service plan is available for Document Information Extraction. See Metering and Pricing [page 79]. | New | 2020-10-21
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements: • The barcode header field can now be extracted from India invoices. See Extracted Header Fields [page 278]. • The new returnNullValues request parameter is now available for the Get Result endpoint. See Get Result [page 138]. | Changed | 2020-10-16
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Extracted Line Items | The unitOfMeasure line item can now be extracted from invoices. See Extracted Line Items [page 286]. | Changed | 2020-10-05
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | UI Application | The Document Information Extraction UI is now generally available to all SAP Cloud Platform customers. See Using the Document Information Extraction UI [page 234]. | New | 2020-10-05
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Supported Document Types and File Formats | The Service Guide documentation has been updated with a new section: Supported Document Types and File Formats [page 84]. | New | 2020-09-16
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements: • The barcode header field can now be extracted from invoices. See Extracted Header Fields [page 278]. • The fileType response field is now available for the Get Result [page 138] endpoint. | Changed | 2020-09-16
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Using the Document Information Extraction UI (Beta) | A new version of the Document Information Extraction UI is now available (only for a previously selected group of beta customers). See details on the possible document statuses and the Confirm document functionality in Using the Key Features of the Document Information Extraction UI [page 237]. | Changed | 2020-08-28
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Changed | 2020-08-28
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Document API | The clientId request parameter is no longer needed to send a request to the following Document API [page 127] endpoints: • Get Result [page 138] • Save Ground Truth [page 154] | Changed | 2020-08-17
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Changed | 2020-08-17
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New SAP Cloud Platform Trial Cockpit Booster | You can now use the Set up account for Document Information Extraction booster to automatically create your Document Information Extraction service key on SAP Cloud Platform Trial. Follow the steps described in the tutorial Set Up Account for Document Information Extraction. | New | 2020-08-17
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New AWS Region | Document Information Extraction is now available in the AWS region US East (VA). | New | 2020-07-31
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New document file formats for paymentAdvice | Single-page PNG and JPEG paymentAdvice files are now supported. See Upload Document [page 127] and Technical Constraints [page 275]. | New | 2020-07-31
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Using the Document Information Extraction UI (Beta) | A new version of the Document Information Extraction UI is now available (only for a previously selected group of beta customers). See Using the Document Information Extraction UI [page 234]. | Changed | 2020-07-31
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements. | Changed | 2020-07-31
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code and usability improvements: • Enrichment data upload performance. See Create Enrichment Data [page 167]. • Document confirmation feature. See the new Document API endpoint Confirm Document [page 157]. | Changed | 2020-07-14
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code improvements: • The deliveryDate, paymentTerms and senderBankAccount header fields can now be extracted from invoices. See Extracted Header Fields [page 278]. • The list of supported character types for the IDs of clients, enrichment data records, system and company codes has been updated. See Technical Constraints [page 275]. | Changed | 2020-06-15
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code and usability improvements: • Single-page PNG and JPEG invoice files are now supported. See Upload Document [page 127] and Technical Constraints [page 275]. • New Document API [page 127] endpoints are now available. • The Enrichment Data API [page 166] endpoints have also been updated. Delete Enrichment Data (Asynchronous) [page 182] is now available. • The deliveryNoteNumber header field can now be extracted from invoices. See Extracted Header Fields [page 278]. • You can now use the Capabilities API [page 104] to get the list of document fields and enrichment data you can process by document type. | Changed | 2020-06-02
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New Beta Features | The following beta features are now available (only for a previously selected group of beta customers): • Template-based information extraction. See Template API (Beta) and Technical Constraints [page 275]. • Document Information Extraction UI. See Using the Document Information Extraction UI [page 234]. | New | 2020-06-02
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several code and usability improvements: • Higher model accuracy • The Supported Languages and Countries/Regions [page 86] list has been updated • The tutorial mission Use Machine Learning to Enrich Data Extracted from Documents is now available. See Tutorials [page 101]. | Changed | 2020-05-18
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New Notifications Functionality | The Notifications [page 227] functionality is now available. | New | 2020-05-18
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several stability and usability improvements, including the model accuracy. The Service Guide documentation has been updated: • Technical Constraints [page 275] • Free Tier Option and Trial Account Technical Constraints [page 276] | Changed | 2020-04-20
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Overall Improvements | There have been several stability and usability improvements: • Some field value types have been updated. See Capabilities API [page 104]. • The enrichment parameter top property now has a maximum possible value of 50. See Enrichment Parameter [page 132]. • If no value is detected for fields in header or line items, they no longer appear in the response JSON file. See Get Result [page 138]. | Changed | 2020-03-30
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | API Reference | The API Reference [page 102] documentation has been updated with the following new sections: • Get Access Token [page 103] • Capabilities API [page 104] | Changed | 2020-03-30
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Tutorials | A new tutorial mission is now available for Document Information Extraction. See Use Machine Learning to Process Business Documents. | New | 2020-03-02
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Client API | The new clientIdStartsWith request parameter is now available for the Get Client endpoint. See Get Client [page 108]. | New | 2020-03-02
Technical Component | Capability | Environment | Title | Description | Type | Available as of
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | New AWS Region | Document Information Extraction is now available in the AWS region Japan (Tokyo). | New | 2019-12-19
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Trial Account | You can now try out Document Information Extraction on SAP Cloud Platform Trial. See Get a Trial Account. | New | 2019-12-05
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | API Reference | • Enrichment Data API documentation is now available. See Enrichment Data API [page 166]. • Document API documentation has also been updated. See Document API [page 127]. • The documentNumber, documentDate, discountAmount, deductionAmount, and grossAmount fields can now be extracted from line items. See Extracted Line Items [page 286]. | Changed | 2019-11-04
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Getting Support | CA-ML-BDP is now the BCP component for Document Information Extraction. See Getting Support [page 295]. | Changed | 2019-11-04
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Security Guide | The Security Guide has been updated with Enrichment Data API details. See Security [page 288]. | Changed | 2019-11-04
Document Information Extraction | Extension Suite - Development Efficiency | Cloud Foundry | Troubleshooting | The Troubleshooting section is now available. See Troubleshooting [page 296]. | New | 2019-11-04
See a glossary of definitions for artificial intelligence (AI) and machine learning (ML), and Document
Information Extraction concepts in AI & ML Glossary. In the third column Filter, select Document Information
Extraction.
Learn more about the different types of service plans for Document Information Extraction.
Document Information Extraction provides different types of service plans. The type you choose determines
pricing, conditions of use, resources, available services, and hosts.
Whether you choose a free or a paid service plan depends on your use case. If you plan to use your global
account in productive mode, you must purchase a paid enterprise account. It's important that you're aware of
the differences when you're planning and setting up your account model. See Initial Setup [page 96].
For more details about the available service plans, see the following table:
Base Edition (blocks_of_100) | • Base Edition service plan that includes all core features but doesn't include document information extraction using generative AI. • Service plan intended for productive usage. • Inference requests in blocks of 100 documents and compute hours. • You can upload to the service up to 2000 documents per hour per tenant (each document can have up to 100 pages). | Enterprise
Remember
• If you first activated the Free service plan, you can update the same service instance to switch to Base
Edition or Premium Edition for enterprise accounts.
• Both metadata and transaction data are transferred to Base Edition or Premium Edition for enterprise
accounts when you switch from Free to Base Edition or Premium Edition.
• If you don't want Free and Base Edition or Premium Edition data to be combined, you can
keep them separate by subscribing to the service plans in separate subaccounts.
Learn more about the different types of metering and pricing for Document Information Extraction by service
plan.
Tip
The metering and pricing details listed here are relevant only to users of the service plans Base Edition
(blocks_of_100) and Premium Edition (premium_edition) for enterprise accounts. See Service Plans
[page 77].
The service plan Document Information Extraction, base edition (blocks_of_100) is metered based on the
following metrics:
The service plan Document Information Extraction, premium edition (premium_edition) is metered based
on the following metrics:
Tip
Related Information
Usage Metric
The service plan Document Information Extraction, base edition (blocks_of_100) is metered based on the
usage of documents defined as unique records processed by the cloud service. One document can consist of a
maximum of 3 pages. If a document consists of more than 3 pages, each additional 3 pages are charged as an
additional document.
1 block = 100 documents. The final price is based on the total number of documents uploaded to the service.
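The document-based metric can be sketched as a short calculation. The following Python snippet is an illustration only, not an official billing formula: the function names are hypothetical, and it assumes that a partial group of additional pages counts as a full additional document and that usage is rounded up to whole blocks.

```python
import math

PAGES_PER_DOCUMENT = 3     # base edition: one billable document covers up to 3 pages
DOCUMENTS_PER_BLOCK = 100  # 1 block = 100 documents


def billable_documents(page_count: int) -> int:
    """Billable documents for one uploaded file (hypothetical helper).

    Assumption: a started group of 3 additional pages counts as a full document.
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return math.ceil(page_count / PAGES_PER_DOCUMENT)


def blocks_needed(page_counts: list[int]) -> int:
    """Blocks of 100 documents needed for a batch of files (hypothetical helper).

    Assumption: usage is rounded up to whole blocks.
    """
    total = sum(billable_documents(pages) for pages in page_counts)
    return math.ceil(total / DOCUMENTS_PER_BLOCK)


# 1000 single-page files correspond to 10 blocks of 100 documents.
print(blocks_needed([1] * 1000))
```

Under these assumptions, a 7-page file counts as 3 billable documents, so 100 such files would consume 3 blocks.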
Basic Service
Caution
The price rates listed below might be outdated. Find updated price rates in the Pricing tab of the SAP
Discovery Center .
Document Information Extraction does not allow users to train and deploy customizable models. For this
service, the number of inference requests is relevant for the charged amount.
Example
Usage Metric
The service plan Document Information Extraction, base edition (blocks_of_100) is also metered based on
consumed compute hours defined as one hour, or portion thereof, consumed by the cloud service to process
one or several documents with a custom model.
Caution
The price rate listed below might be outdated. Find updated price rates in the Pricing tab of the SAP
Discovery Center .
The costs are associated with the usage of templates. See Template API [page 211] and the Template [page
252] UI feature.
Example
Note
The following calculation examples are based on current experiments. In productive use, the exact usage
numbers can vary slightly.
Document Upload: 1000 documents (10 blocks of 100 documents), EUR 200 per month
Template usage: 1000 (5 templates x 200 transactions) x 1 second = 0.3 compute hour (rounded up to 1 compute hour), EUR 1.00 per month
In this example, the total cost is EUR 201.00 per month, plus a one-time charge of EUR 25.00 when the five
templates are activated.
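The compute hour rounding in the template usage example can be reproduced with a short sketch. The function name is hypothetical, and the sketch assumes that processing time is summed over the billing period before it is rounded up to full hours ("one hour, or portion thereof").

```python
import math

SECONDS_PER_HOUR = 3600


def compute_hours(transactions: int, seconds_per_transaction: float) -> int:
    """Hypothetical helper: compute hours charged for template usage."""
    if transactions < 0 or seconds_per_transaction < 0:
        raise ValueError("inputs must not be negative")
    total_seconds = transactions * seconds_per_transaction
    # Any started hour is charged as a full compute hour.
    return math.ceil(total_seconds / SECONDS_PER_HOUR)


if __name__ == "__main__":
    # 5 templates x 200 transactions, 1 second each (about 0.3 hours)
    print(compute_hours(5 * 200, 1))
```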
Usage Metric
The service plan Document Information Extraction, premium edition (premium_edition) is metered based
on the usage of documents defined as unique records processed by the cloud service.
One document can consist of a maximum of 1 page. If a document consists of more than 1 page, each additional
page is charged as an additional document.
You can extract a maximum of 50 fields per document. If you extract more than 50 fields per document, every
additional 50 fields are charged as an additional document. As a technical limit, you can add up to 500 header
fields and line items per schema. For more information, see Technical Constraints [page 275].
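As a rough illustration of the premium edition counting rules, the sketch below combines the page rule and the field rule. It rests on an assumption that is not confirmed by this guide: that the page-based count and the field-based surcharge are simply added together. The function name is hypothetical.

```python
import math

FIELDS_PER_DOCUMENT = 50  # from the text: up to 50 fields per document


def premium_billable_documents(page_count: int, field_count: int) -> int:
    """Hypothetical helper: documents charged for one file in premium_edition."""
    if page_count < 1 or field_count < 0:
        raise ValueError("page_count must be >= 1 and field_count >= 0")
    # Each page counts as one document.
    page_documents = page_count
    # Every additional (started) group of 50 fields adds one document.
    extra_field_documents = max(0, math.ceil(field_count / FIELDS_PER_DOCUMENT) - 1)
    # Assumption: both surcharges are additive.
    return page_documents + extra_field_documents


if __name__ == "__main__":
    print(premium_billable_documents(page_count=2, field_count=120))
```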
Block Size
1 block = 100 documents. The final price is a sum of the number of documents uploaded to the service.
Basic Service
Caution
The price rates listed below might be outdated. Find updated price rates in the Pricing tab of the SAP
Discovery Center .
Document Information Extraction does not allow users to train and deploy customizable models. For this
service, the number of inference requests is relevant for the charged amount.
Example
Document Types
• Standard document types: refer to document types for which SAP provides pre-trained machine learning
models that allow out-of-the-box (without prior training) extraction of information based on default
extractors, which are managed directly by SAP.
• businessCard
• invoice
• paymentAdvice
• purchaseOrder
• Custom document types: refer to document types for which there are no pre-trained machine learning
models that are managed by SAP. Use the Template [page 252] and Schema Configuration [page 245]
features to extract information from custom documents that are different from the standard document
types listed above. See also Schema API [page 184] and Template API [page 211].
File Formats
Document Information Extraction supports the following document file formats as input:
• The endpoint Upload Document [page 127] accepts only multipart-encoded files with a file name and a
content type.
• The file name should contain a file extension. For example: “invoice” only, without a file extension, is
not a valid file name.
• The file name cannot be empty even if a file extension is provided. For example: “.pdf” is not a valid file
name.
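The file name rules above can be checked on the client side before calling the upload endpoint. The following is a minimal sketch with a hypothetical function name; it only mirrors the two rules stated here and is not the validation performed by the service.

```python
def is_valid_upload_file_name(file_name: str) -> bool:
    """Hypothetical pre-check: require a non-empty name and a file extension."""
    # Split at the last dot: "invoice.pdf" -> ("invoice", ".", "pdf")
    stem, dot, extension = file_name.rpartition(".")
    return bool(dot) and bool(stem) and bool(extension)


if __name__ == "__main__":
    for name in ("invoice.pdf", "invoice", ".pdf"):
        print(repr(name), is_valid_upload_file_name(name))
```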
Tip
The Document Information Extraction service handles distorted and asymmetrical images with a rotation
of multiples of 90 degrees. In addition, small rotations of up to 15 degrees are also handled by the service.
In both cases, the images are deskewed automatically.
Explore the Document Information Extraction supported languages and countries/regions by document type
and extraction approach.
The supported languages and countries/regions have been validated with Document Information Extraction.
It is also possible to get similar accuracy results with documents in other languages and from other countries/
regions that use Latin-1 (ISO-8859-1) character sets.
If you want to try out Document Information Extraction to check if it fulfills your business needs, you can use
a trial account to upload to the service a document in any language and from any country/region, and get the
results following the tutorial mission Use Machine Learning to Process Business Documents .
Language
Document Information Extraction supports the following languages for businessCard documents:
Dutch nl
English en
French fr
German de
Hebrew he
Italian it
Japanese ja
Korean ko
Polish pl
Portuguese pt
Russian ru
Spanish es
See the list of supported languages and countries/regions for invoice documents. See also the supported
countries/regions, and extracted fields for barcodes in invoice documents.
Language
Document Information Extraction supports the following languages for invoice documents:
Czech cs
Danish da
Dutch nl
English en
Finnish fi
French fr
German de
Hungarian hu
Italian it
Japanese ja
Norwegian no
Polish pl
Portuguese pt
Romanian ro
Slovak sk
Slovenian sl
Spanish es
Swedish sv
Turkish tr
Country/Region
Document Information Extraction supports the following countries/regions for invoice documents:
• Australia
• Austria
• Belgium
• Canada
• Czech Republic
• Denmark
• Finland
• France
• Germany
• Hungary
• Italy
• Japan
• Mexico
• Netherlands
• New Zealand
• Norway
• Poland
• Portugal
• Romania
• Slovakia
• Slovenia
• Spain
• Sweden
• Switzerland
Note
Document Information Extraction does not guarantee support for all fields specific to the countries/
regions listed above, even if those fields are legally required in a country/region.
Document Information Extraction supports the following countries/regions, and extracted fields for barcodes
in invoice documents:
Argentina • currencyCode
• documentDate
• documentNumber
• grossAmount
Basque • documentNumber
• grossAmount
Brazil • currencyCode
• grossAmount
• senderName
China • documentDate
• documentNumber
• netAmount
Colombia • documentDate
• documentNumber
• grossAmount
• netAmount
• receiverTaxId
• taxAmount
India • documentDate
• documentNumber
• grossAmount
• receiverTaxId
• taxId
Mexico • grossAmount
• taxId
Switzerland • currencyCode
• documentNumber
• grossAmount
• senderAddress
• senderBankAccount
• senderName
• receiverAddress
• receiverName
Uruguay • documentNumber
• grossAmount
Note
The barcode supported countries/regions have been validated with Document Information Extraction. It is
also possible to get similar accuracy results with documents from other countries/regions that contain the
most common types of 1D and 2D barcodes as described in Barcode Header Field in Invoice Documents
[page 285].
See the list of supported languages and countries/regions for paymentAdvice documents.
Language
Document Information Extraction supports the following languages for paymentAdvice documents:
English en
German de
Country/Region
Document Information Extraction supports the following countries/regions for paymentAdvice documents:
• Germany
• United Kingdom
See the list of supported languages and countries/regions for purchaseOrder documents.
Language
Document Information Extraction supports the following languages for purchaseOrder documents:
English en
German de
Country/Region
Document Information Extraction supports the following countries/regions for purchaseOrder documents:
• Germany
• United Kingdom
• United States
See the list of languages supported when using a template to extract information from custom and standard
documents.
Note
When using templates to extract information from standard documents, the accuracy results are usually
higher when you take into account the supported languages and countries/regions listed for Business
Card: Languages [page 86], Invoice: Languages and Countries/Regions [page 87], Payment Advice:
Languages and Countries/Regions [page 90], and Purchase Order: Languages and Countries/Regions
[page 91] documents.
Language
Albanian sq
Bosnian bs
Catalan ca
Croatian hr
Czech cs
Danish da
Dutch nl
English en
Estonian et
Finnish fi
French fr
Galician gl
German de
Hungarian hu
Icelandic is
Indonesian id
Italian it
Irish ga
Latvian lv
Lithuanian lt
Malaysian ms
Montenegrin cnr
Norwegian no
Polish pl
Portuguese pt
Serbian sr
Slovak sk
Slovenian sl
Spanish es
Swedish sv
Turkish tr
Welsh cy
Arabic ar
Greek el
Hebrew he
Japanese ja
Korean ko
Russian ru
Thai th
See the list of languages supported when using generative AI to extract information from custom and
standard documents.
Restriction
Extraction using generative AI is available with the service plan Document Information Extraction,
premium edition (premium_edition) only. See Service Plans [page 77] and Metering and Pricing [page
79].
You can also use an SAP BTP trial account to try out extraction using generative AI. Follow the tutorial:
Use Trial to Extract Information from Custom Documents with Generative AI and Document Information
Extraction .
Language
Albanian sq
Bosnian bs
Catalan ca
Croatian hr
Czech cs
Danish da
Dutch nl
English en
Estonian et
Finnish fi
French fr
Galician gl
German de
Hungarian hu
Icelandic is
Indonesian id
Italian it
Irish ga
Latvian lv
Lithuanian lt
Malaysian ms
Montenegrin cnr
Norwegian no
Polish pl
Portuguese pt
Serbian sr
Slovak sk
Slovenian sl
Spanish es
Swedish sv
Turkish tr
Welsh cy
Arabic ar
Greek el
Hebrew he
Japanese ja
Korean ko
Russian ru
Thai th
Get started with Document Information Extraction using the standard procedures for SAP BTP Cloud Foundry
environment or Kyma environment.
Tip
See Tutorials [page 101] to find out how to use a trial account or the free tier option for Document
Information Extraction to try out the service.
Prerequisites
You have set up your global account and at least one subaccount on SAP BTP. For an overview of the required
steps, see Getting Started in the Cloud Foundry Environment or Getting Started in the Kyma Environment.
Note
Document Information Extraction allows you to move subaccounts between your global accounts. For more
information, see Relationship Between Global Accounts, Subaccounts, and Directories [Feature Set B].
Enable Document Information Extraction using the standard procedures for SAP BTP Cloud Foundry
environment.
Context
Tip
You can also use the booster Set up account for Document Information Extraction to automate the steps
described below on the SAP BTP cockpit. See Boosters and the tutorials:
• Use Free Tier to Set Up Account for Document Information Extraction and Get Service Key
• Use Free Tier to Set Up Account for Document Information Extraction and Go to Application
Procedure
1. Create a service instance in the Cloud Foundry environment. See Creating Service Instances.
2. You can then bind the service instance to your application, or you can create a service key to communicate
directly with the service instance. See Binding Service Instances to Applications and Creating Service Keys.
Enable Document Information Extraction using the standard procedures for Kyma environment.
Procedure
Find out how to enable your service instance for authentication with an X.509 client certificate.
The Document Information Extraction service supports X.509 authentication with certificates that are either
managed by the SAP Authorization and Trust Management service or self-managed.
Restriction
The X.509 authentication is currently available for Document Information Extraction at API level only.
To enable your service instance for authentication with an X.509 client certificate, in the New Instance or
Subscription wizard, enter in Parameters the following instance parameters in JSON format:
{
"xs-security":{
"xsappname":"<app-name>",
"oauth2-configuration":{
"credential-types":[
"x509"
]
}
}
}
Note
To use X.509 secrets, you need to set additional parameters when you create your service key or service
binding. We support the following two scenarios:
• The SAP Authorization and Trust Management service generates certificates for you. In this case, the
parameters you need to set when you create your service key or service binding are the following in JSON
format:
{
"xsuaa":{
"credential-type":"x509",
"x509":{
"key-length":2048,
"validity":8,
"validity-type":"DAYS"
}
}
}
For a detailed description of the parameters, see Parameters for X.509 Certificates Managed by SAP
Authorization and Trust Management Service.
• You already have your own public key infrastructure (PKI), with certificates issued from one of the trusted
Certificate Authorities (CAs). In this case, the parameters you need to set when you create your service key
or service binding are the following in JSON format:
{
"xsuaa":{
"credential-type":"x509",
"x509":{
"certificate":"-----BEGIN CERTIFICATE-----...-----END CERTIFICATE-----",
"ensure-uniqueness":false,
"certificate-pinning":true,
"hide-certificate":true
}
}
}
For a detailed description of the parameters, see Parameters for Self-Managed X.509 Certificates. See also
Trusted Certificate Authentication.
To get an authorization token using an X.509 certificate, use “certurl”. In the scenario of already generated
certificates, also use “key” and “certificate” from the service key.
See also the blog post: X.509 certificate-based authentication(mTLS) – Generating X.509 certificates of BTP
managed services .
In the Cloud Foundry environment, you can develop and run multitenant applications, and share them with
multiple consumers simultaneously on SAP BTP.
Document Information Extraction supports this scenario and can be declared as a dependency of a multitenant
application. This means that Document Information Extraction gets provisioned automatically for every
consumer that subscribes to the multitenant application. Different consumers are independently provisioned
and data from these consumers is isolated inside Document Information Extraction.
Tip
See Developing Multitenant Applications in the Cloud Foundry Environment for more details on how to
declare Document Information Extraction as a dependency of a multitenant application using the SAP
SaaS Provisioning service.
Follow our tutorials to get familiar with the Document Information Extraction UI application, APIs, and
functionalities.
Use Generative AI to Process Business Documents: Find out how to use the SAP Business Technology Platform service Document Information Extraction with generative AI to automate the extraction of information from any type of document using large language models (LLMs).
Use Machine Learning to Process Business Documents: Try out the Document Information Extraction Trial UI to process business documents that have content in headers and tables.
Use Machine Learning to Extract Information from Business Documents and Enrich Data: Process business documents that have content in headers and tables, and enrich the information extracted with your own master data records, using machine learning and Swagger UI.
Shape Machine Learning to Process Standard Business Documents: Create your own header and line item fields, and edit extraction results for documents associated with templates to automate the extraction of information from standard business documents such as invoices and purchase orders.
Shape Machine Learning to Process Custom Documents: Create your own header and line item fields, and edit extraction results for documents associated with templates to automate the extraction of information from custom documents (not supported out of the box) such as résumés and power of attorney.
Tip
See also the following onboarding tutorials that use the free tier option for Document Information
Extraction:
• Use Free Tier to Set Up Account for Document Information Extraction and Get Service Key
• Use Free Tier to Set Up Account for Document Information Extraction and Go to Application
Related Information
Tutorial Navigator
Explore the sections listed below to get started with the Document Information Extraction APIs and the
Notifications feature.
Before using the Document Information Extraction APIs listed below, you need to retrieve your OAuth access
token as described in Get Access Token [page 103].
To display the comprehensive specification of the Document Information Extraction APIs in Swagger UI, add
the URL path extension /document-information-extraction/v1 to the Document Information Extraction
base URL (that is, the url value from outside the uaa section of your service key).
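The two URL values in the service key are easy to mix up. The sketch below shows how the Swagger UI address could be derived from a parsed service key. It assumes the key is available as a Python dictionary with a top-level url entry and a nested uaa section; the host names are placeholders and the function name is hypothetical.

```python
API_PATH = "/document-information-extraction/v1"  # path extension from the text


def swagger_ui_url(service_key: dict) -> str:
    """Hypothetical helper: build the Swagger UI URL from a service key."""
    # Use the url value from OUTSIDE the uaa section (the service base URL).
    base_url = service_key["url"].rstrip("/")
    return base_url + API_PATH


if __name__ == "__main__":
    example_key = {
        "url": "https://dox.example.com/",           # placeholder service URL
        "uaa": {"url": "https://uaa.example.com"},   # placeholder token URL
    }
    print(swagger_ui_url(example_key))
```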
Retrieve your OAuth access token, which will grant you access to the Document Information Extraction APIs.
Note
The token is valid for 12 hours. After that, you need to generate a new one.
Tip
Alternatively, you can follow the steps in this tutorial to Get OAuth Access Token for Document Information
Extraction via Web Browser .
Request
Base URL: url value from inside the uaa section of the service key
Request Headers
Request Parameters
client_id Yes String The clientid value from the service key.
client_secret Yes String The clientsecret value from the service key.
Response
The response is given as a status (200 or 401). See Common Status and Error Codes [page 226].
Response Example
200 “Success”
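A token request could be prepared as sketched below. Several details are assumptions that this section does not state: the /oauth/token path, the client_credentials grant type, sending the credentials as form parameters, and the location of clientid and clientsecret inside the uaa section of the service key. The function name is hypothetical, and the sketch only builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(service_key: dict) -> Request:
    """Hypothetical helper: prepare (but do not send) an OAuth token request."""
    uaa = service_key["uaa"]
    body = urlencode({
        "grant_type": "client_credentials",  # assumption, not stated in the text
        "client_id": uaa["clientid"],
        "client_secret": uaa["clientsecret"],
    }).encode("ascii")
    # Base URL: url value from INSIDE the uaa section; path is an assumption.
    token_url = uaa["url"].rstrip("/") + "/oauth/token"
    return Request(
        token_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    key = {"uaa": {"url": "https://uaa.example.com",
                   "clientid": "my-client", "clientsecret": "my-secret"}}
    request = build_token_request(key)
    print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and read the token from the JSON response. Because the token is valid for 12 hours, callers would typically cache it and request a new one after it expires.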
See the list of document fields and enrichment data for each document type you can process with Document
Information Extraction.
Request
Base URL: url value from outside the uaa section of the service key
Response
Response Fields
The response is given as a status (200, 401, or 500) and JSON file. See Common Status and Error Codes [page
226].
{
"extraction":{
"headerFields":[
{
"name":"documentNumber",
"type":"string",
"category":"document",
"supportedDocumentTypes":[
"invoice",
"paymentAdvice",
"purchaseOrder"
]
},
{
"name":"taxId",
"type":"string",
"category":"amounts",
"supportedDocumentTypes":[
"invoice",
"purchaseOrder"
]
},
{
"name":"taxName",
"type":"string",
"category":"amounts",
"supportedDocumentTypes":[
"invoice"
]
},
{
"name":"purchaseOrderNumber",
"type":"string",
"category":"details",
"supportedDocumentTypes":[
"invoice"
]
},
{
"name":"shippingAmount",
"type":"number",
"category":"amounts",
"supportedDocumentTypes":[
"invoice"
]
},
"..."
],
"lineItemFields":[
{
"name":"description",
"type":"string",
"category":"details",
"supportedDocumentTypes":[
"invoice",
"purchaseOrder"
]
},
{
"name":"netAmount",
"type":"number",
"category":"amounts",
"supportedDocumentTypes":[
"invoice",
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Request Example
Single client:
{
"value":[
{
"clientId":"c_00",
"clientName":"client 00"
}
]
}
Multiple clients:
{
"value":[
{
"clientId":"c_00",
"clientName":"tyler"
},
{
"clientId":"c_01",
"clientName":"jlaix"
}
]
}
Response Fields
The response is given as a status (201, 400, 401, 429, or 500) and JSON file. See Common Status and Error
Codes [page 226] and Technical Constraints [page 275].
Response Example
201 “Created”
{
"inserted":1,
"modified":2
}
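The request payloads shown above share one shape: a value list with one object per client. A small sketch for building that payload follows; the function name is hypothetical and only the JSON structure is taken from the examples.

```python
import json
from typing import Iterable, Tuple


def build_clients_payload(clients: Iterable[Tuple[str, str]]) -> dict:
    """Hypothetical helper: wrap (clientId, clientName) pairs in a 'value' list."""
    return {
        "value": [
            {"clientId": client_id, "clientName": client_name}
            for client_id, client_name in clients
        ]
    }


if __name__ == "__main__":
    payload = build_clients_payload([("c_00", "client 00"), ("c_01", "client 01")])
    print(json.dumps(payload, indent=2))
```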
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientIdStartsWith No String Filters the list of clients by the characters the clientId starts with. For example: c
limit Yes Integer Number of clients to process. For example: 10. See Technical Constraints [page 275].
Response Fields
id Tenant ID
payload List of all clients, including their zoneId, clientId, and clientName
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"id": "1234",
"payload": [
{
"clientId": "c_00",
"clientName": "client 00"
},
{
"clientId": "c_01",
"clientName": "client 01"
},
{
"clientId": "c_02",
"clientName": "client 02"
},
{
"clientId": "c_03",
"clientName": "client 03"
},
{
"clientId": "c_04",
"clientName": "client 04"
}
]
}
Request
Base URL: url value from outside the uaa section of the service key
Response
Response Fields
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"message": "Successfully deleted 1 client(s)."
}
Restriction
The Identifier API is only available for paymentAdvice documents in Excel format.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Note
In single POST calls, you can create aliases for only one documentType and fileType.
options Yes JSON Object Options for processing the document. See the
Options Payload table below.
Options Payload
Option Required Data Type Description
{
"documentType":"paymentAdvice",
"fileType":"Excel",
"headerFields":[
{
"language":"en",
"capabilities":{
"documentNumber":[
"Payment Number"
],
"documentDate":[
"Payment Date"
],
"currencyCode":[
"Invoice Currency"
],
"grossAmount":[
"Amount in Invoice Currency",
"Document currency"
]
}
},
{
"language":"de",
"capabilities":{
"documentNumber":[
"Beleg-Nr."
],
"documentDate":[
"RE-Datum"
]
}
}
],
"lineItemFields":[
{
"language":"en",
"capabilities":{
"documentNumber":[
"Invoice Number",
"Document Number"
],
"documentDate":[
"Invoice Date",
"Document Date"
],
"discountAmount":[
"Cash disc. amt LC"
],
"netAmount":[
"Amount Paid",
Response
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Retrieve all identifiers for client mappings by fileType, documentType, and clientId.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
documentType Yes String Type of the document submitted. For now, only
paymentAdvice is supported.
fileType Yes String Type of the file submitted. For now, only Excel is
supported.
Response
Response Fields
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
fileType No String Type of the file submitted. For now, only Excel is
supported.
Note
If you want to delete aliases for a specific documentType and fileType, all parameter fields are required.
If the documentType and fileType are not provided, all aliases are deleted.
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Create, update, list, and delete configurations on tenant scope by default, or optionally, on instance or client
scope.
Request
Base URL: url value from outside the uaa section of the service key
clientId No String The ID of the client you want to set the configuration for. For example: c_00. This parameter is only used for client scope configurations.
payload Yes JSON Object List of configuration key-value pairs. For more information, see Configuration Keys [page 117].
Available scopes:
• client
• instance
• tenant
Tip
If you leave this parameter empty, the tenant scope is used.
tenantId No String The ID of the tenant you want to set the configuration for.
Tip
If you leave this parameter empty, the tenantId sending the request is used.
Response
Response Fields
The response is given as a status (201, 401, or 500) and JSON file. See Common Status and Error Codes [page
226].
Response Example
201 “Success”
{
"inserted":1,
"modified":0
}
Explore the available configuration keys for the Document Information Extraction service.
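Each configuration key is listed with its default value, possible values, scope, description, and a Create Configuration [page 115] request payload example.
Payload example for clientSegregation:
{
  "value":{
    "clientSegregation":"true"
  }
}
Payload example for coordinateFormat:
{
  "value":{
    "coordinateFormat":"normalized"
  }
}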
Remember
As Document Information Extraction learns from data, enabling data feedback collection may help the
service to become more accurate in extracting information from your documents. Conversely, deletion of
data may result in extraction results becoming less accurate. Deletion of data is irreversible.
documentRetentionTimeDays
Default value: 7 days
Possible values: 1 to 30 days
Scope: client, instance, tenant
Description: Use this configuration key to set the retention period for inference documents uploaded to the service.
Payload example:
{
  "value":{
    "documentRetentionTimeDays":"10"
  }
}
enrichmentConfidenceThreshold
Default value: low
Possible values: low, medium, or high
Scope: client, instance, tenant
Description: Use this configuration key to adjust the similarity confidence threshold for the enrichment. The low value results in more matches with a higher possibility of false-positive matches. The high value returns only very confident matches and has lower tolerance for differences between document content and master data.
Payload example:
{
  "value":{
    "enrichmentConfidenceThreshold":"medium"
  }
}
Note
Before setting the dataFeedbackCollection configuration key to true, and the performPIICheck
subconfiguration to false, review the subsection Deletion of Personal Data in Data Protection and Privacy
[page 288].
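The value ranges listed for documentRetentionTimeDays and enrichmentConfidenceThreshold can be checked before a configuration payload is sent. The following is a minimal sketch with hypothetical function names. It follows the payload examples in this section, where values are passed as strings inside a value object.

```python
ALLOWED_THRESHOLDS = ("low", "medium", "high")


def retention_payload(days: int) -> dict:
    """Hypothetical helper: payload for documentRetentionTimeDays (1 to 30 days)."""
    if not 1 <= days <= 30:
        raise ValueError("documentRetentionTimeDays must be between 1 and 30")
    # The payload examples pass the number of days as a string.
    return {"value": {"documentRetentionTimeDays": str(days)}}


def threshold_payload(level: str) -> dict:
    """Hypothetical helper: payload for enrichmentConfidenceThreshold."""
    if level not in ALLOWED_THRESHOLDS:
        raise ValueError("level must be one of: " + ", ".join(ALLOWED_THRESHOLDS))
    return {"value": {"enrichmentConfidenceThreshold": level}}


if __name__ == "__main__":
    print(retention_payload(10))
    print(threshold_payload("medium"))
```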
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId No String The ID of the client you want to get the configuration for. For example: c_00. This parameter is only used for client scope configurations.
Tip
If you leave this parameter empty, the active scope is used.
tenantId No String The ID of the tenant you want to get the configuration for.
Tip
If you leave this parameter empty, the tenantId sending the request is used.
Response
Response Fields
The response is given as a status (200, 401, or 500) and JSON file. See Common Status and Error Codes [page
226].
{
"results":{
"documentRetentionTimeDays": "10",
"manualDataActivation": "true",
"dataFeedbackCollection": "true",
"performPIICheck": "true"
}
}
Retrieve all configurations already created for a given key for the requested scope.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId No String The ID of the client you want to get the configuration for. For example: c_00. This parameter is only used for client scope configurations.
key Yes String One of the available Configuration Keys [page 117].
Tip
If you leave this parameter empty, the active scope is used.
tenantId No String The ID of the tenant you want to get the configuration for.
Tip
If you leave this parameter empty, the tenantId sending the request is used.
Response
Response Fields
The response is given as a status (200, 400, 401, 404, or 500) and JSON file. See Common Status and Error
Codes [page 226].
Response Example
200 “Success”
{
"results":{
"documentRetentionTimeDays": "10"
}
}
{
"results":{
"manualDataActivation": "true"
}
}
{
"results":{
"dataFeedbackCollection": "true"
}
}
{
"results":{
"performPIICheck": "true"
}
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId No String The ID of the client you want to delete the configuration for. For example: c_00. This parameter is only used for client scope configurations.
payload Yes JSON Object List of configuration keys. All configurations are deleted if payload is empty. Possible configuration key and subconfiguration values:
• activateDocumentNotifications
• clientSegregation
• coordinateFormat
• dataFeedbackCollection
Note
After sending the DELETE request using the dataFeedbackCollection configuration key, all documents already uploaded to the service for retraining by this tenant (or service instance) are deleted, and all documents uploaded from that moment onwards are no longer used to retrain the service's machine learning models. See also Configuration Keys [page 117] (if parameter is set to false).
• documentRetentionTimeDays
Note
After sending the DELETE request using the documentRetentionTimeDays configuration key, the default retention period of 7 days is used again.
• enrichmentConfidenceThreshold
• manualDataActivation
• performPIICheck
Available scopes:
• client
• instance
• tenant
Tip
If you leave this parameter empty, the tenant scope is used.
tenantId No String The ID of the tenant you want to delete the configuration for.
Tip
If you leave this parameter empty, the tenantId sending the request is used.
Request Examples
{
"value":[
"documentRetentionTimeDays"
]
}
{
"value":[
"manualDataActivation"
]
}
{
"value":[
"dataFeedbackCollection"
]
}
{
"value":[
"performPIICheck"
]
}
{
"value":[
"documentRetentionTimeDays",
"manualDataActivation",
"dataFeedbackCollection",
"performPIICheck"
]
}
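Because an empty payload deletes all configurations, it can be useful to build the DELETE payload through a helper that rejects empty or misspelled key lists. The sketch below is one possible safeguard. The function name is hypothetical, and the set of keys is copied from the list of possible values above.

```python
from typing import Iterable

KNOWN_CONFIGURATION_KEYS = {
    "activateDocumentNotifications",
    "clientSegregation",
    "coordinateFormat",
    "dataFeedbackCollection",
    "documentRetentionTimeDays",
    "enrichmentConfidenceThreshold",
    "manualDataActivation",
    "performPIICheck",
}


def delete_configuration_payload(keys: Iterable[str]) -> dict:
    """Hypothetical helper: build a DELETE payload for specific keys only."""
    key_list = list(keys)
    if not key_list:
        # Design choice of this sketch: never build a delete-everything payload.
        raise ValueError("refusing to build an empty payload")
    unknown = [key for key in key_list if key not in KNOWN_CONFIGURATION_KEYS]
    if unknown:
        raise ValueError("unknown configuration keys: " + ", ".join(unknown))
    return {"value": key_list}


if __name__ == "__main__":
    print(delete_configuration_payload(["documentRetentionTimeDays", "performPIICheck"]))
```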
Response
Response Fields
Response Example
200 “Success”
{
"deleted": 1
}
The core functionality of Document Information Extraction is extracting structured information from
documents automatically using machine learning. The Document API provides endpoints to upload documents
for processing and also to get the results.
Upload a document file to the service to get the extraction results from header fields and line items in JSON
format.
Base URL: url value from outside the uaa section of the service key
Request Parameters
file Yes File Document file you want to process. See Supported
Document Types and File Formats [page 84].
options Yes JSON Object Options for processing the document. See the
Options Payload table below.
Options Payload
Option Required Data Type Description
Caution
schemaId isn’t always a required option. However, if your payload includes templateId, it must also include schemaId. In such cases, don’t include headerFields or lineItemFields in the payload to avoid conflicts.
• SAP_OCROnly_schema: "schemaId":"09e6c9e4-d7b0-414f-bd85-cfee6fbb2add" for custom documents
• SAP_invoice_schema: "schemaId":"cf8cc8a9-1eee-42d9-9a3e-507a61baac23" for invoice documents
• SAP_purchaseOrder_schema: "schemaId":"fbab052e-6f9b-4a5f-b42f-29a8162eb1bf" for purchaseOrder documents
• SAP_paymentAdvice_schema: "schemaId":"b7fdcfac-7853-42bb-89d2-ede2ba1ce803" for paymentAdvice documents
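The schemaId and templateId rule from the caution above can be verified before upload. The following is a minimal sketch under two assumptions: the check is triggered by the presence of templateId, and headerFields and lineItemFields may appear either at the top level of the options or inside an extraction object, as in the payload examples. The function name is hypothetical.

```python
def check_upload_options(options: dict) -> list:
    """Hypothetical pre-check: return a list of problems found in the options."""
    problems = []
    if "templateId" in options:
        if "schemaId" not in options:
            problems.append("templateId requires schemaId")
        extraction = options.get("extraction", {})
        for field_list in ("headerFields", "lineItemFields"):
            # Field lists conflict with a template-based payload.
            if field_list in options or field_list in extraction:
                problems.append(field_list + " must not be combined with templateId")
    return problems


if __name__ == "__main__":
    options = {
        "clientId": "c_00",
        "documentType": "invoice",
        "templateId": "0ebcd5c4-7843-4e6e-867a-1e5c997e4e4c",
    }
    print(check_upload_options(options))
```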
{
"clientId":"c_00",
"documentType":"invoice",
"receivedDate":"2020-02-17",
"schemaId":"10c10bd2-082b-47c8-851d-e58827828637",
"enrichment":{
}
}
{
"clientId":"c_00",
"documentType":"invoice",
"receivedDate":"2020-02-17",
"schemaId":"10c10bd2-082b-47c8-851d-e58827828637",
"templateId":"0ebcd5c4-7843-4e6e-867a-1e5c997e4e4c",
"enrichment":{
}
}
{
"extraction":{
"headerFields":[
"documentNumber",
"taxId",
"purchaseOrderNumber",
"shippingAmount",
"netAmount",
"senderAddress",
"senderName",
"grossAmount",
"currencyCode",
"receiverContact",
"documentDate",
"taxAmount",
"taxRate",
"receiverName",
"receiverAddress"
],
"lineItemFields":[
"description",
"netAmount",
"quantity",
"unitPrice",
"materialNumber"
]
},
"clientId": "c_00",
"documentType": "invoice",
"receivedDate": "2020-02-17",
"enrichment": {}
}
{
"schemaId":"09e6c9e4-d7b0-414f-bd85-cfee6fbb2add",
"clientId":"c_10",
Response
Response Fields
id Request ID
The response is given as a status (201, 400, 401, 415, 429, 500, or 503). See Common Status and Error Codes
[page 226].
Response Example
201 “Created”
{
"id":"484b6e1c-501c-4a07-85cb-84554656a175",
"status":"PENDING",
"processedTime":"2020-03-26T17:00:00.000000+00:00"
}
The enrichment parameter can be used to retrieve a matching of enrichment data to extracted header fields.
See Create Enrichment Data [page 167]. The property should be a JSON object which can contain properties,
as listed in the table below, depending on the enrichment data you want to match.
Example
"enrichment":{
"sender":{
"top":5,
"type":"businessEntity",
"subtype":"supplier"
},
"employee":{
"type":"employee"
},
"product":{
"type":"product"
}
}
product No To match the product line items found on the document to enrichment
data, the product property should be present in enrichment.
receiver No To match the extracted visual information from the document to the
receiver enrichment data, the receiver property should be present in
enrichment.
sender No To match the extracted visual information from the document to the
sender enrichment data, the sender property should be present in
enrichment.
type Yes The type of enrichment data entities used for matching. Available values:
businessEntity, employee, and product. See Entities [page
170] for details about the available enrichment data entity types.
subtype No The subtype of enrichment data entities used for matching with
type businessEntity. Available values: supplier, customer, and
companyCode.
top No The top property specifies the maximum number of matched enrichment data entities
returned.
Note
If the top property is not defined, the default value is 1. The maximum possible value of the
property is 50. If you enter a value higher than 50, you will get an error message with the
maximum possible value.
Note
The following properties are optional, but, in case you want to match enrichment data, at least one of them
is required:
• sender
• receiver
• employee
• product
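The rules above can be checked on the client before a document is uploaded. The following is a minimal sketch, not part of the service: the function names enrichment_entry and build_enrichment are hypothetical, and the lower bound of 1 for top is an assumption (the text only states the default of 1 and the maximum of 50).

```python
ALLOWED_TYPES = {"businessEntity", "employee", "product"}
ALLOWED_SUBTYPES = {"supplier", "customer", "companyCode"}
ALLOWED_KEYS = {"sender", "receiver", "employee", "product"}
MAX_TOP = 50


def enrichment_entry(type_, subtype=None, top=None):
    """Build one entry of the enrichment object (hypothetical helper)."""
    if type_ not in ALLOWED_TYPES:
        raise ValueError(f"unknown type: {type_!r}")
    entry = {"type": type_}
    if subtype is not None:
        # subtype is only documented for type businessEntity
        if type_ != "businessEntity" or subtype not in ALLOWED_SUBTYPES:
            raise ValueError(f"invalid subtype {subtype!r} for type {type_!r}")
        entry["subtype"] = subtype
    if top is not None:
        # Lower bound of 1 is an assumption; the maximum of 50 is documented.
        if not 1 <= top <= MAX_TOP:
            raise ValueError(f"top must be between 1 and {MAX_TOP}")
        entry["top"] = top
    return entry


def build_enrichment(**entries):
    """Assemble the enrichment object from sender, receiver, employee, product."""
    unknown = set(entries) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown enrichment properties: {sorted(unknown)}")
    if not entries:
        raise ValueError("at least one of sender, receiver, employee, product is required")
    return dict(entries)


enrichment = build_enrichment(
    sender=enrichment_entry("businessEntity", subtype="supplier", top=5),
    employee=enrichment_entry("employee"),
    product=enrichment_entry("product"),
)
```

The resulting dictionary has the same shape as the enrichment example shown earlier and can be serialized with json.dumps.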
Post a search or filter request to get the current status of document processing jobs. Returns a list with all
document processing jobs in a JSON file.
Optionally, the jobs can be filtered based on the client ID and a filter query. You have the following catalog
options:
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
options Yes JSON Object Catalog options used when searching for docu-
ments. See the Options Payload table below.
Options Payload
Option Required Data Type Description
{
"clientId":"c_00",
"limit":10,
"offset":2,
"order":"created desc",
"likeFilter":"fileName like \"test receipt\"",
"filter":"status eq failed or documentType eq invoice"
}
Response Fields
totalDocumentCount Total number of document processing jobs returned by the request options
usedOptions Options used in the filtering and/or ordering of document processing jobs
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"results":[
[
{
"status":"DONE",
"id":"c4f25368-d3e6-43f7-a0b4-55adf7f54e95",
"fileName":"test receipt_invoice1.pdf",
"documentType":"invoice",
"created":"2020-03-26 17:00:00.000000+00:00",
"clientId":"c_00",
"finished":"2020-03-26 17:01:30.000000+00:00"
},
{
"status":"PENDING",
"id":"50199d80-c742-453b-830d-8e6ce14568e2",
"fileName":"test receipt invoice2.pdf",
"documentType":"invoice",
"created":"2020-03-26 18:00:00.000000+00:00",
"clientId":"c_00"
},
{
"status":"FAILED",
"id":"50199d80-c742-453b-830d-8e6ce14568e2",
"fileName":"test receipt pa.pdf",
"documentType":"paymentAdvice",
"created":"2020-03-26 19:00:00.000000+00:00",
"clientId":"c_00",
"finished":"2020-03-26 19:01:30.000000+00:00"
}
]
],
"usedOptions":{
"clientId":"c_00",
"limit":10,
"offset":2,
"order":"created desc",
"likeFilter":"fileName like \"test receipt\"",
"filter":"status eq failed or documentType eq invoice"
},
"totalDocumentCount":5
}
Tip
Use the endpoint Post Catalog [page 134] to page through lists of more than 200 documents in a JSON file.
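Paging with limit and offset could be sketched as follows. This is an illustration under assumptions: post_catalog is a hypothetical callable that sends the options payload and returns the parsed JSON response, offset is assumed to count jobs (which is consistent with the response example, where offset 2 and three results give a totalDocumentCount of 5), and the nested results list is flattened because the example shows a list inside a list.

```python
def flatten_results(results):
    """Flatten the results value, which the example shows as a list of lists."""
    flat = []
    for item in results:
        if isinstance(item, list):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


def iter_catalog_jobs(post_catalog, options, page_size=200):
    """Yield every job by advancing offset until totalDocumentCount is reached.

    post_catalog is a hypothetical callable: options dict in, response dict out.
    """
    offset = options.get("offset", 0)
    while True:
        page_options = dict(options, limit=page_size, offset=offset)
        response = post_catalog(page_options)
        jobs = flatten_results(response.get("results", []))
        yield from jobs
        offset += len(jobs)
        # Stop on an empty page or once all matching jobs have been seen.
        if not jobs or offset >= response.get("totalDocumentCount", 0):
            break
```

Because the HTTP call is injected, the loop can be exercised with a stub before it is pointed at the real endpoint.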
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
id Document ID
status Processing status of the document. Possible values: “PENDING”, “DONE”, “CONFIRMED”,
or “FAILED”
The response is given as a status (200, 401, or 500) and JSON file. See Common Status and Error Codes [page
226].
Response Example
200 “Success”
{
"results":[
The Document Information Extraction service takes document files as input and returns a JSON file that
contains the information that has been extracted from the header fields and line items of the specified
document. See Supported Document Types and File Formats [page 84].
Remember
Document Information Extraction generally provides extraction results for an average document in 30
seconds.
However, processing can take longer if the task involved is more complex – for example, if the documents
processed are large.
Before you use the service for important or time-sensitive tasks, we strongly recommend running mass
tests to assess the performance of the service and make sure it meets your requirements.
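Because processing time varies, a client typically polls for the result instead of waiting a fixed time. The sketch below shows one possible polling loop with a growing delay. The get_result callable, the timeout, and the delay values are assumptions chosen for illustration, not values recommended by the service.

```python
import time


def wait_for_result(get_result, job_id, timeout=300.0,
                    first_delay=2.0, max_delay=30.0, sleep=time.sleep):
    """Poll until the job status is no longer PENDING or the timeout elapses.

    get_result is a hypothetical callable that returns the parsed JSON
    response of the result endpoint for the given job ID.
    """
    delay = first_delay
    waited = 0.0
    while True:
        result = get_result(job_id)
        if result.get("status") != "PENDING":
            return result  # DONE or FAILED
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still PENDING after {waited:.0f} s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # double the delay up to max_delay
```

Passing sleep as a parameter keeps the loop testable without real waiting.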
Request
Base URL: url value from outside the uaa section of the service key
returnNullValues No Boolean Set to true to get all requested fields in the document results,
even if they could not be extracted. For fields that could not be extracted, for example,
because they are not available in the document or because the service was not able to
identify the field, the value is null.
Response
Response Fields
attributes Dictionary containing the method of the matched enrichment data record, or the
symbology of the extracted barcode header field.
bocrVersion The version number of the Optical Character Recognition (OCR) service.
clientId Identifies the client that submitted the extraction request using the Upload Document
[page 127] endpoint.
confidence Prediction confidence score for a field or enrichment data. The possible values are between
0.0 and 1.0.
coordinates Bounding box coordinates for this field (not present if value is null).
extraction Dictionary containing all the extracted header fields and line items.
fileType File format of the document submitted. For example: PDF, PNG, JPEG.
label User-friendly names for header fields and line items. See Add Fields to Schema Version
[page 199].
languageCodes Array containing strings of language codes. For example: "en" for English and "de" for
German.
method Match strategy for each matched enrichment data record. Possible values: “exactTaxId”,
“exactBankAccount”, “exactMaterialNumber”, or “similarity”.
model The model used to extract information from the specified field. Possible values: “ai” or
“template”. “ai” denotes the machine learning models of the Document Information Extraction
service.
page Page number of the document where the field was found (not present if value is null).
rawValue Value extracted for this field by the Document Information Extraction service as displayed in
the document.
schemaId The ID of the schema used when you uploaded the document.
schemaVersion The version number of the schema used when you uploaded the document.
sender Sender enrichment data. For example: sender name and sender address.
status Processing status of the document. Possible values: “PENDING”, “DONE”, or “FAILED”.
type Data type of the extracted header fields and line items.
value Value extracted for this field by the Document Information Extraction service in
standardized format.
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success” with SAP_OCROnly_schema ("schemaId":"09e6c9e4-d7b0-414f-bd85-cfee6fbb2add")
{
"status":"DONE",
"id":"2acc2040-f956-4178-9cf4-d02f020626a6",
"fileName":"sample-power_of_attorney-3.pdf",
"documentType":"custom",
"created":"2022-10-04T07:46:03.412498+00:00",
"finished":"2022-10-04T07:46:56.834313+00:00",
"clientId":"c_00",
"languageCodes":[
"en"
],
"pageCount":1,
"schemaId":"09e6c9e4-d7b0-414f-bd85-cfee6fbb2add",
"country":null,
"extraction":{
"headerFields":[
],
"lineItems":[
]
},
"bocrVersion":"2.7.1",
"doxVersion":"local",
"fileType":"pdf",
"dataForRetrainingStatus":"notUsedForTraining"
}
Response Example
200 “Success” without schemaId
{
"status":"DONE",
"id":"a712375f-0b6d-4550-83fb-2271a2301aad",
"fileName":"demo_taxid.pdf",
"documentType":"invoice",
"created":"2022-04-27T09:46:20.090953+00:00",
"finished":"2022-04-27T09:46:45.151654+00:00",
"clientId":"c_00",
"languageCodes":[
"xx"
],
"pageCount":1,
],
"product":[
]
},
"dataForRetrainingStatus":"notUsedForTraining"
}
If the document is processed successfully, Document Information Extraction provides the predictions for the
requested fields. The requested fields are those which were requested in Upload Document [page 127]. When
no value can be detected for fields in header or line items, they do not appear in the response JSON file.
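Since undetected fields are simply absent from the response, client code should not index into the result by name without a fallback. A minimal sketch, assuming the response has already been parsed into a dictionary; header_values is a hypothetical helper name.

```python
def header_values(result, requested):
    """Map each requested header field name to its value, or None if absent.

    If a name occurs more than once (for example, several barcodes),
    only the first occurrence is kept.
    """
    found = {}
    for field in result.get("extraction", {}).get("headerFields", []):
        found.setdefault(field["name"], field.get("value"))
    return {name: found.get(name) for name in requested}
```

For example, requesting documentNumber and taxId from a result that only contains documentNumber yields None for taxId instead of raising a KeyError.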
Response Example
200 “Success” with barcode header field extraction
{
"status":"DONE",
"id":"2853a32c-9cf9-415f-9585-82c63c2fa699",
"fileName":"qr_three_codes.pdf",
"documentType":"invoice",
],
"employee":[
],
"product":[
]
},
"dataForRetrainingStatus":"notUsedForTraining",
"extraction":{
"headerFields":[
{
"name":"barcode",
"category":"details",
"value":"https://verificacfdi.facturaelectronica.sat.gob.mx/
default.aspx?
id=706220d0-3b0b-4801-82b8-5f771f8af9c1&re=CSA080218TQ8&rr=NME140730ME0&tt=000009
9576.720000&fe=ZI/I4A==",
"rawValue":"https://verificacfdi.facturaelectronica.sat.gob.mx/
default.aspx?
id=706220d0-3b0b-4801-82b8-5f771f8af9c1&re=CSA080218TQ8&rr=NME140730ME0&tt=000009
9576.720000&fe=ZI/I4A==",
"type":"string",
"page":1,
"confidence":1.0,
"coordinates":{
"x":0.14717741935483872,
"y":0.262617621899059,
"w":0.07782258064516129,
"h":0.05503279155973767
},
"model":"ai",
"group":1,
"attributes":{
"symbology":"QR"
},
"label":"barcode"
},
{
"name":"barcode",
"category":"details",
"value":"https://verificacfdi.facturaelectronica.sat.gob.mx/
default.aspx?
id=706220d0-3b0b-4801-82b8-5f771f8af9c1&re=CSA080218TQ8&rr=NME140730ME0&tt=000009
9576.720000&fe=ZI/I4A==",
"rawValue":"https://verificacfdi.facturaelectronica.sat.gob.mx/
default.aspx?
id=706220d0-3b0b-4801-82b8-5f771f8af9c1&re=CSA080218TQ8&rr=NME140730ME0&tt=000009
9576.720000&fe=ZI/I4A==",
"type":"string",
"page":1,
"confidence":1.0,
]
}
}
Fields can belong to a category. This is indicated by the category property of a field in the response JSON.
An example is a tax with multiple fields. Taxes are returned in the form of a category with the fields taxName,
taxRate, and taxAmount. See all field categories in Extracted Header Fields [page 278] and Extracted Line
Items [page 286].
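To work with categories on the client side, the header fields can be grouped by their category property. The following sketch assumes a parsed headerFields list; fields_by_category is a hypothetical helper, and fields without a category are collected under the key None.

```python
from collections import defaultdict


def fields_by_category(header_fields):
    """Group header fields by their category property (None if missing)."""
    grouped = defaultdict(list)
    for field in header_fields:
        grouped[field.get("category")].append(field)
    return dict(grouped)
```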
Response Example
{
"code": "E93",
"message": "Required parameters not provided.",
"details": "string"
}
Response Example
401 “Unauthorized”
{
"message": "No Authorization given in the request header"
}
{
"message": "Internal server error"
}
Save the ground truth (correct values for document fields) for the specified document job ID.
This endpoint takes the job ID of a document submitted previously and returns the corresponding processing
results, or an error, if the given ID isn't found.
Add to the payload extraction (list of all the extracted header fields and line items), and enrichment (list of
the matched enrichment data).
For the fields, the following attributes are part of the ground truth:
• name (required)
• value (required)
• rawValue (optional)
• page (optional)
• coordinates (optional)
For enrichment data, the following attribute is part of the ground truth: id (required).
Caution
It's technically possible to add other attributes to the ground truth payload (for example, confidence), but
they have no impact on the stored values and are ignored.
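One way to prepare a ground truth payload is to start from a reviewed extraction result and keep only the attributes listed above. This is a sketch under assumptions: the function names are hypothetical, and lineItems is treated as a list of lists of fields, as in the request example of this endpoint.

```python
GROUND_TRUTH_KEYS = ("name", "value", "rawValue", "page", "coordinates")


def to_ground_truth_field(field):
    """Keep only ground truth attributes; name and value are required."""
    if "name" not in field or "value" not in field:
        raise ValueError("ground truth fields require name and value")
    return {key: field[key] for key in GROUND_TRUTH_KEYS if key in field}


def to_ground_truth_extraction(extraction):
    """Reduce an extraction dictionary to its ground truth attributes."""
    return {
        "headerFields": [
            to_ground_truth_field(f) for f in extraction.get("headerFields", [])
        ],
        # Each line item is itself a list of fields.
        "lineItems": [
            [to_ground_truth_field(f) for f in item]
            for item in extraction.get("lineItems", [])
        ],
    }
```

Dropping attributes such as confidence or model is optional, because the service ignores them, but it keeps the payload small and makes the intent explicit.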
Note
After saving the ground truth of a document, the prediction confidence score of all header fields and line
items is automatically set to 1.0 (100%). The service assumes that all field values are correct or have been
manually corrected. Only save the ground truth of documents that have been reviewed and don't contain
incorrect extraction results.
Caution
It isn't possible to save ground truth if you used the SAP_OCROnly_schema for the document extraction.
See second “Bad Request” error message in the Response section below.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
payload Yes JSON Object Fields of the document (header fields and line
items) and enrichment data
Note
The structure of the payload is the same as the response returned by the Get Result
[page 138] endpoint. However, while Get Result [page 138] returns the top-N enrichment
matches, for the Save Ground Truth endpoint the enrichment list must not contain more
than one (ground truth) match for each sender and employee.
Request Example
{
"extraction":{
"headerFields":[
{
"name":"documentDate",
"value":"2019-02-18"
},
{
"name":"grossAmount",
"value":200
}
],
"lineItems":[
[
{
"name":"description",
"value":"Professional Services"
},
{
"name":"netAmount",
"value":200
},
{
"name":"unitPrice",
"value":200
},
{
"name":"materialNumber",
"value":"007"
}
]
Response
Response Fields
status Status of the ground truth upload. Possible values: “PENDING”, “DONE”, or “FAILED”
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
201 “Created”
{
"status": "DONE",
"message": "Ground truth / corrected values uploaded successfully"
}
Response Example
400 “Bad Request”
{
"code": "E93",
"message": "Required parameters not provided.",
"details": "string"
}
Response Example
400 “Bad Request” (with SAP_OCROnly_schema)
{
"error":{
"code":"ES068",
"message":"Posting ground truth is not allowed for SAP_OCROnly_schema.",
"details":[
Response Example
401 “Unauthorized”
{
"message": "No Authorization given in the request header"
}
Response Example
500 “Internal server error”
{
"message":"Internal server error"
}
Change the status of a document from “DONE” to “CONFIRMED”. After that, the document status is
permanent and cannot be changed anymore. The document extraction values cannot be changed anymore
either. Also use this endpoint to enable the data feedback collection feature to allow documents to be used for
retraining.
Note
SAP reserves the right to use confirmed documents in the reporting of accuracy values and for analytics.
If you set the parameter dataForRetraining to true, you allow the use of confirmed documents to
retrain the machine learning models and improve the service.
Submitting retraining data and documents to SAP does not guarantee that SAP will use the data to
improve the service, or that potential errors will be fixed in future versions of the service.
The prediction confidence score of all header fields and line items is automatically set to 1.0 (100%)
for confirmed documents. The service assumes that all field values are correct or have been manually
corrected. Only confirm documents that have been reviewed and don't contain incorrect extraction results.
Request
Base URL: url value from outside the uaa section of the service key
Note
The data feedback collection feature is only available on production environments for
enterprise accounts. This feature is not available for trial account users.
Response
Response Fields
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"status": "CONFIRMED",
"message": "Document confirmed successfully."
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
The response is given as a status (200, 400, 401, 404, or 500) and document file in the format previously
uploaded using the Upload Document [page 127] endpoint. See Common Status and Error Codes [page 226].
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
results List containing the text and the corresponding bounding boxes (specified by the returned
coordinates) of all pages of a document
The response is given as a status (200, 400, 401, 404, or 500) and JSON file. See Common Status and Error
Codes [page 226].
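The text of a page can be rebuilt from the word boxes in the response. The sketch below makes two assumptions that the reference does not state explicitly: each entry in a page list groups the words of one text line, and each bbox holds the top-left and bottom-right corner as [x, y] pairs. The function names are hypothetical.

```python
def page_lines(page_entries):
    """Join the content of the word boxes of each entry into one string."""
    return [
        " ".join(box["content"] for box in entry.get("word_boxes", []))
        for entry in page_entries
    ]


def box_size(bbox):
    """Return (width, height), assuming [[x1, y1], [x2, y2]] corner order."""
    (x1, y1), (x2, y2) = bbox
    return x2 - x1, y2 - y1
```

With the first two word boxes of the response example for this endpoint, page_lines returns one line containing "Rocket Enterprises".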
Response Example
200 “Success”
{
"results":{
"1":[
{
"word_boxes":[
{
"bbox":[
[
890,
141
],
[
1028,
174
]
],
"content":"Rocket"
},
{
"bbox":[
[
1049,
141
],
[
1275,
182
]
],
"content":"Enterprises"
},
{
"bbox":[
Request
Base URL: url value from outside the uaa section of the service key
Response
Response Fields
value List containing the text and the corresponding bounding boxes (specified by the returned
coordinates) of a single page of a document
The response is given as a status (200, 400, 401, 404, or 500) and JSON file. See Common Status and Error
Codes [page 226].
Response Example
200 “Success”
{
"value":[
{
"word_boxes":[
{
"bbox":[
[
890,
141
],
[
1028,
174
]
],
"content":"Rocket"
},
{
"bbox":[
[
1049,
141
],
[
1275,
182
]
],
"content":"Enterprises"
},
{
"bbox":[
[
1297,
Request
Base URL: url value from outside the uaa section of the service key
Response
Response Fields
extraction Dictionary containing all the extracted header fields and line items
receivedDate The date when the document was received, for example, 2020-02-17.
The response is given as a status (200, 400, 401, 404, or 500) and JSON file. See Common Status and Error
Codes [page 226].
Response Example
200 “Success”
{
"extraction": "...",
"documentType": "invoice",
"receivedDate": "2020-02-17"
}
Get all the templates associated with the specified document ID.
Request
Base URL: url value from outside the uaa section of the service key
Response
Response Fields
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"templateId":[
"4476cc01-72f3-4b64-9eb0-cdd9c1cb27ff"
]
}
Request
Base URL: url value from outside the uaa section of the service key
Response
Response Fields
status Deletion status of the document. Possible values: “PENDING”, “DONE”, or “FAILED”
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"status": "DONE",
"message": "Documents deleted successfully.",
"processedTime": "2020-03-26T17:00:00.000000+00:00"
}
Document Information Extraction can also enrich the information extracted from documents with your existing
structured data (typically master data records).
In this context, enrichment means adding information that is not directly contained on the document
but can be inferred by combining information that is contained on the document with other external
data.
You can, for example, infer the proprietary ID of a customer from another SAP system based on the sender
address contained on an invoice document. Even though the customer ID is not explicitly contained on the
invoice, the ID from the SAP system can be inferred by using the address data contained on the invoice by
matching it against the relevant master data.
The service matches enrichment data entities with the Extracted Header Fields [page 278] and Extracted Line
Items [page 286] from processed documents.
The Enrichment Data API provides the functionality to create, update, get, and delete enrichment data.
After you have maintained enrichment data entities, use the enrichment property in Upload
Document [page 127] to match the enrichment data to extracted fields.
Related Information
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
payload Yes JSON Object List containing enrichment data entities in value
property. The entities can be:
type Yes String The type of enrichment data entities used for
matching specified in the JSON string of the
payload. Available values: businessEntity,
employee, and product.
Request Examples
Create BusinessEntity [page 170] entities:
payload:
{
"value":[
{
"id":"BE0001",
"name":"Emma Dowerg",
"accountNumber":"SK2421",
"address1":"Amalie-Klemm-Platz 0/9, 48581, Geithain",
"address2":"Near city church",
"city":"Geithain",
"countryCode":"DE",
"postalCode":"48581",
"state":"Schleswig-Holstein",
"email":"e.dowerf@mustermail.com",
"phone":"+49(0) 909979463",
"bankAccount":"DE345982837402",
"taxId":"DE435531312"
},
{
"id":"BE0002",
"name":"Ioannis Kruschwitz",
"accountNumber":"393H292",
"address1":"Alina-Reichmann-Allee 73, 63228, Staßfurt",
"city":"Staßfurt",
"countryCode":"DE",
"postalCode":"63228",
"state":"Hessen",
"email":"Ioannis.Kruschwitz@mustermail.com",
"phone":"+49(0) 818172710",
"bankAccount":"DE1093628093743",
"taxId":"DE593029048"
}
]
}
type: businessEntity
clientId: c_00
subtype: supplier
payload:
{
"value":[
{
"id":"E0001",
"email":"john.will.doe@mustermail.com",
"firstName":"John",
"middleName":"William",
"lastName":"Doe"
},
{
"id":"E0002",
"email":"m.gierschner@mustermail.com",
"firstName":"Maren",
"middleName":"Volkhard",
"lastName":"Gierschner"
}
]
}
type: employee
clientId: c_00
payload:
{
"value": [
{
"id": "12342",
"description": "Glycerin Retinol 80 ML",
"materialNumber": "B676817",
"unitPrice": "1000,0 €",
"unitOfMeasure": "LTR"
}
]
}
type: product
clientId: c_00
Response
Response Fields
id Request ID
The response is given as a status (201, 400, 401, 422, 429, or 500) and JSON file. See Common Status and
Error Codes [page 226].
Response Example
201 “Success”
Related Information
12.1.7.1.1 Entities
Entities are the actors that a business document can refer to. A business entity can be, for
example, a customer or a supplier. The employee entity represents an employee in the company. The product
entity represents a specific good or service available in a catalog or system.
Related Information
12.1.7.1.1.1 BusinessEntity
A businessEntity can represent different kinds of organizations with which you deal as a company. It can
represent, for example, suppliers and customers.
Key Type Length (maximum length of the string) Description Example
12.1.7.1.1.2 Employee
12.1.7.1.1.3 Product
Key Type Length (maximum length of the string) Description Example
unitOfMeasure String 100 The unit of measure UN/CEFACT code. LTR for liter and KGM for kilogram.
Use variants to create multiple versions of the same data record, which all point to the same record ID.
To create a data record variant, add the variant key to the Create Enrichment Data [page 167] payload:
payload:
{
All the variants are used for the enrichment. If a data record match is associated with a variant ID, the matched
variant ID is returned by Get Result [page 138] alongside the usual enrichment result information. For example:
enrichment: {
"id":"BE0001",
"confidence":98.647,
"variant":2
}
The variant ID is an optional parameter. If absent, the data record is not associated with any variant. If used,
the variant ID must be a number in the inclusive range 1 - 9. Any other variant ID is invalid and will result in an error.
Creating another master data record with the same ID and variant ID will not result in an error. Instead, the
behavior is the same as creating a data record with an already existing ID, but both without variant IDs. See
Data Duplicates [page 173].
Note
A single invalid variant ID value (for example, a variant that is not a number in the inclusive range 1 - 9) will
cause the whole batch (API request) to fail.
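Because one invalid variant ID fails the whole request, it can be worth checking a batch locally before sending it. A minimal sketch, assuming the records are the dictionaries placed in the value list of the payload and that the variant is stored under the variant key; invalid_variants is a hypothetical helper.

```python
def invalid_variants(records):
    """Return (index, variant) for every record whose variant is not 1..9.

    Records without a variant key are valid, since the variant is optional.
    """
    bad = []
    for index, record in enumerate(records):
        if "variant" not in record:
            continue
        variant = record["variant"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(variant, int) and not isinstance(variant, bool)
        if not (is_int and 1 <= variant <= 9):
            bad.append((index, variant))
    return bad
```

An empty return value means the batch passes this particular check; it does not replace the validation done by the service.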
Tip
You can create multiple variants of the same data record (all sharing the same ID) but in different
languages.
Find out how the Document Information Extraction service handles the upload of duplicated master data
records.
A master data record “X” is considered a duplicate by the Document Information Extraction service if there is
another existing record “Y” which fulfills all of the following conditions:
The service filters out duplicate records as part of the automatic or manual data activation. If one or more
duplicates are identified, the following update rule is applied to all of them: the most recently created record
replaces all previously created versions of that record.
This process optimizes the service experience and results for most common use cases in which duplicated
records are not intended. If duplicated records are required as part of an individual use case, this can be
achieved using variant IDs.
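The update rule (the most recently created record replaces earlier versions) can be illustrated with a small sketch. The exact duplicate conditions used by the service are not reproduced here. The key of record ID plus variant ID is an assumption made only for this illustration, and the input list is assumed to be in creation order.

```python
def latest_wins(records):
    """Keep only the most recently created record per (id, variant) key.

    The key is an assumption for illustration; records are assumed to be
    ordered from oldest to newest.
    """
    latest = {}
    for record in records:
        latest[(record["id"], record.get("variant"))] = record
    return list(latest.values())
```

Records with the same ID but different variant IDs are kept side by side, which matches the advice to use variants when duplicates are intended.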
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
limit No Integer Items per page. Defines a maximum limit. For example: 10. See Technical
Constraints [page 275].
Response
Response Fields
id Data-persistence job ID
status The status of this data-persistence job. Possible values: “PENDING”, “SUCCESS”, or
“FAILED”
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"value":{
"id":"c4f25368-d3e6-43f7-a0b4-55adf7f54e95",
"status":"PENDING",
"clientId":"c_00",
"created":"2020-05-08T10:39:59.916359+00:00"
}
}
Note
Enrichment data is refreshed automatically every 4 hours. It might take up to 4 hours until the enrichment
data prediction is available in the Get Result [page 138] response. Manual data activation is also available
and is the recommended process. You can set data activation to manual using the following endpoints:
Base URL: url value from outside the uaa section of the service key
Request Parameters
limit No Integer Items per page. Defines a maximum limit. For example: 10. See Technical
Constraints [page 275].
type Yes String The type of enrichment data entities used for
matching. Available values: businessEntity,
employee, and product.
Response
Response Fields
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"value": [
{
"id": "BE0001",
"name": "A",
"accountNumber": "12345",
"address1": "A street 5",
"address2": "",
"city": "Heidelberg",
"countryCode": "DE",
"postalCode": "69117",
"state": "BW",
"email": "a@a.com",
"phone": "",
"bankAccount": "000001",
"taxId": "999",
"companyCode": "4711",
"system": "System A"
}
]
}
Give a data persistence job ID to check the database and receive information on this data persistence job.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
id Request ID.
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"value":{
"id":"b89645b4-605b-45cd-bf69-1147875e75f5",
"status":"SUCCESS",
"processedTime":"0:00:00.063022",
"refreshedAt":"2021-01-16T13:36:29.453713+00:00"
}
}
Response Example
{
"code": "E5",
"message": "Failed to retrieve data.",
"details": "string"
}
{
"message": "No Authorization given in the request header"
}
Create a data activation job record to see new or updated enrichment data in the extraction results if you
are using the manual data activation process. Only activated enrichment data will be added to the extraction
results.
Remember
Before creating an enrichment data activation job record, you need to Create Configuration [page 115].
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response Fields
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
201 “Success”
{
"id": "484b6e1c-501c-4a07-85cb-84554656a175",
"status": "PENDING"
}
Give an enrichment data activation job record ID to check the database, and receive information on this data
activation job.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response Fields
created Time when the enrichment data was submitted for processing
finished Time when the enrichment data status changed to “DONE” or “FAILED”
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"value": {
"id": "484b6e1c-501c-4a07-85cb-84554656a175",
"status": "DONE",
"processedTime": "0:01:00",
"created": "2019-07-04T15:20:37.668873+00:00",
"finished": "2019-07-04T15:21:37.668873+00:00"
}
}
Caution
This endpoint has been deprecated and is scheduled for decommissioning in November 2024. Please use
the endpoint Delete Enrichment Data (Asynchronous) [page 182] to delete data records.
Note
To delete large numbers of data records, use only the endpoint Delete Enrichment Data (Asynchronous)
[page 182].
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
payload Yes JSON Object Comma-separated list of data record IDs that you
want to delete
type Yes String The type of enrichment data entities used for
matching specified in the JSON string of the
payload. Available values: businessEntity,
employee, and product.
Response
Response Fields
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"deleted": "2"
}
This endpoint accepts an array of data record IDs that you want to delete. If no array is entered in the payload,
all entries are deleted.
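Since an empty array deletes every record, a client-side guard against sending it by accident may be useful. The sketch below is an illustration only: build_delete_payload and its delete_all flag are hypothetical, and the payload shape for a non-empty ID list is assumed to mirror the empty "value" array shown in the request examples.

```python
def build_delete_payload(record_ids, delete_all=False):
    """Build the delete payload and refuse an accidental delete-all.

    delete_all is a hypothetical client-side safety flag, not an API parameter.
    """
    ids = list(record_ids)
    if not ids and not delete_all:
        raise ValueError(
            "an empty ID list deletes all records; pass delete_all=True to confirm"
        )
    if ids and delete_all:
        raise ValueError("pass either record IDs or delete_all=True, not both")
    return {"value": ids}
```

This keeps the intent explicit at the call site: deleting everything requires delete_all=True, while a filter that happens to return no IDs raises an error instead of wiping the data.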
Tip
Delete outdated and no longer used data records frequently to improve the performance of the data
enrichment feature when matching a business document to an enrichment data record based on the
information extracted from the document.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
payload Yes JSON Object Comma-separated list of data record IDs that you
want to delete. All data records are deleted if
payload is empty.
Request Examples
Delete all data records:
payload:
{
"value":[]
}
payload:
{
"value":[]
}
payload:
{
"value":[]
}
type: employee
payload:
{
"value":[]
}
type: product
Response
Response Fields
id Request ID
The response is given as a status (201, 400, 401, 422, or 500) and JSON file. See Common Status and Error
Codes [page 226].
Response Example
201 “Success”
{
"id": "484b6e1c-501c-4a07-85cb-84554656a175",
"status": "PENDING"
}
Create schemas containing data fields found in standard or custom document types. You can use these
schemas as a basis for creating templates. You can select schemas and associated templates when adding
documents. The Schema API provides endpoints to create, list, update, and delete schemas and schema
versions.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Request Example
{
"clientId":"c_00",
"name":"Custom_Payment_Advice_Schema",
"schemaDescription":"Schema For Accounts Department Payment Advices",
"documentType":"paymentAdvice",
"documentTypeDescription":"Payment Advice with Order Number"
}
Response Fields
id ID of the schema
The response is given as a status (201, 400, 401, 429, or 500) and JSON file. See Common Status and Error
Codes [page 226] and Technical Constraints [page 275].
Response Example
201 “Success”
{
"id":"484b6e1c-501c-4a07-85cb-84554656a175",
"created":"2020-03-26T17:00:00.000000+00:00"
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Response
Response Fields
id ID of the schema
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"schemas":[
[
{
"name":"Basic Involve FormatSchema",
"schemaDescription":"SAP Invoice Schema",
"documentType":"Invoice",
"documentTypeDescription":"Payment Advice with Order Number",
"id":"484b6e1c-501c-4a07-85cb-84554656a175",
"predefined":"True",
"created":"2020-03-26T17:00:00.000000+00:00",
"updated":"2020-04-26T17:00:00.000000+00:00",
"state":"draft"
},
{
Request
Base URL: url value from outside the uaa section of the service key
Response
Response Fields
The response is given as a status (200, 401, or 500) and JSON file. See Common Status and Error Codes [page
226].
Response Example
200 “Success”
{
"documentTypes":[
]
},
{
"name":"ml",
"properties":[
"x",
"y",
"w",
"z"
]
},
{
"name":"...",
"properties":"[]"
}
],
"formatting":[
{
"name":"string",
"properties":[
{
"name":"length",
"values":[
"number"
]
}
]
},
{
"name":"number",
"properties":[
{
"name":"length",
"values":[
"number"
]
},
{
"name":"thousandSeparator",
"values":[
".",
",",
" "
]
},
{
"name":"decimalSeparator",
"values":[
".",
",",
" "
]
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Request Example
{
"name":"Custom_Payment_Advice_Schema",
"schemaDescription":"Schema For Accounts Department Payment Advices",
"documentTypeDescription":"Payment Advice with Order Number"
}
Response Fields
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
201 “Success”
{
"message":"Schema has been updated successfully."
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Response Fields
id ID of the schema
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"name":"Basic Involve FormatSchema",
"schemaDescription":"SAP Invoice Schema",
"documentType":"Invoice",
"documentTypeDescription":"Payment Advice with Order Number",
"id":"484b6e1c-501c-4a07-85cb-84554656a175",
"created":"2020-03-26T17:00:00.000000+00:00",
"updated":"2020-04-26T17:00:00.000000+00:00",
"predefined":"FALSE",
"state":"draft",
"headerFields":[
{
"name":"GrossAmountValue",
"description":"TotalAmountValue",
"defaultExtractor":{
},
"setupType":"static",
"setupTypeVersion":"1.0.0",
"setup":{
"type":"default",
"priority":1,
"filter":[
{
"key":"language",
"value":"EN"
},
{
},
"formattingTypeVersion":"1.0.0"
}
],
"lineItemFields":[
{
"name":"Amount",
"description":"TotalAmountValue",
"defaultExtractor":{
},
"setupType":"static",
"setupTypeVersion":"1.0.0",
"setup":{
"type":"default",
"priority":1,
"filter":[
{
"key":"language",
"value":"EN"
},
{
"key":"language",
"value":"DE"
}
],
"properties":[
{
"key":"deploymentID",
"value":"123e4567-e89b-12d3-a456-426614174000."
},
{
"key":"fieldName",
"value":"DocumentDate"
}
]
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Request Example
{
"value":[
"4476cc01-72f3-4b64-9eb0-cdd9c1cb27ff"
]
}
Response Fields
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"message":"Schemas deleted successfully."
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Response Fields
id ID of the schema
The response is given as a status (201, 400, 401, 429, or 500) and JSON file. See Common Status and Error
Codes [page 226] and Technical Constraints [page 275].
Response Example
201 “Success”
{
"id":"484b6e1c-501c-4a07-85cb-84554656a175",
"version":"2",
"created":"2020-03-26T17:00:00.000000+00:00"
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Response
Response Fields
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226] and Technical Constraints [page 275].
Response Example
201 “Success”
{
"message":"Schema version activated successfully."
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Response
Response Fields
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226] and Technical Constraints [page 275].
Response Example
201 “Success”
{
"message":"Schema version deactivated successfully."
}
Request
Base URL: url value from outside the uaa section of the service key
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Remember
Each label can have a maximum length of
200 characters.
• auto
• manual
Restriction
The setup type auto is available without default
extractor for schemas with the service plan
Document Information Extraction, premium edition
(premium_edition) only. See Service Plans [page 77]
and Metering and Pricing [page 79].
Caution
Always validate information extracted using
generative AI before using it for critical
applications.
Note
To consume the setup types "auto" and
"manual", use the setupTypeVersion
2.0.0.
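The two rules stated here (labels of at most 200 characters, and setupTypeVersion 2.0.0 for the setup types auto and manual) could be checked on the client side before a payload is sent. The following is a minimal sketch under that assumption. The function name is hypothetical, and the check does not replace the validation performed by the service.

```python
MAX_LABEL_LENGTH = 200
V2_SETUP_TYPES = {"auto", "manual"}


def check_schema_fields(payload: dict) -> list[str]:
    """Return a list of problems found in a schema fields payload."""
    problems = []
    for group in ("headerFields", "lineItemFields"):
        for field in payload.get(group, []):
            name = field.get("name", "<unnamed>")
            # Rule 1: each label can have a maximum length of 200 characters.
            if len(field.get("label", "")) > MAX_LABEL_LENGTH:
                problems.append(
                    f"{group}.{name}: label exceeds {MAX_LABEL_LENGTH} characters"
                )
            # Rule 2: setup types auto and manual need setupTypeVersion 2.0.0.
            setup_type = field.get("setup", {}).get("type")
            if setup_type in V2_SETUP_TYPES and field.get("setupTypeVersion") != "2.0.0":
                problems.append(
                    f"{group}.{name}: setup type {setup_type} needs setupTypeVersion 2.0.0"
                )
    return problems
```

An empty list means that neither rule is violated. Other constraints, such as the service plan restriction for the setup type auto, are not covered by this sketch.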
{
"headerFields":[
{
"name":"documentDate",
"label":"Document Date",
"description":"Document Date",
"defaultExtractor":{
},
"setupType":"static",
"setupTypeVersion":"2.0.0",
"setup":{
"type":"manual",
"priority":1
},
"formattingType":"date",
"formatting":{
"dateformat":"dd/mm/yy"
},
"formattingTypeVersion":"1.0.0"
}
],
"lineItemFields":[
{
},
"formattingTypeVersion":"1.0.0"
},
{
"name":"discountAmount",
"label":"Discount Amount",
"description":"Discount Amount",
"defaultExtractor":{
},
"setupType":"static",
"setupTypeVersion":"2.0.0",
"setup":{
"type":"manual",
"priority":1
},
"formattingType":"number",
"formatting":{
},
"formattingTypeVersion":"1.0.0"
}
]
}
Request Example: Payload with label, setupType auto without defaultExtractor, and
setupTypeVersion 2.0.0
{
"headerFields":[
{
"name":"documentDate",
"label":"Document Date",
"description":"Document Date",
"defaultExtractor":{
},
"setupType":"static",
"setupTypeVersion":"2.0.0",
"setup":{
"type":"auto",
"priority":1
},
"formattingType":"date",
"formatting":{
"dateformat":"dd/mm/yy"
},
"formattingTypeVersion":"1.0.0"
},
{
"name":"documentNumber",
"label":"Document Number",
},
"setupType":"static",
"setupTypeVersion":"2.0.0",
"setup":{
"type":"auto",
"priority":1
},
"formattingType":"number",
"formatting":{
},
"formattingTypeVersion":"1.0.0"
}
],
"lineItemFields":[
{
"name":"netAmount",
"label":"Net Amount",
"description":"Net Amount",
"defaultExtractor":{
},
"setupType":"static",
"setupTypeVersion":"2.0.0",
"setup":{
"type":"auto",
"priority":1
},
"formattingType":"number",
"formatting":{
},
"formattingTypeVersion":"1.0.0"
},
{
"name":"discountAmount",
"label":"Discount Amount",
"description":"Discount Amount",
"defaultExtractor":{
},
"setupType":"static",
"setupTypeVersion":"2.0.0",
"setup":{
"type":"auto",
"priority":1
},
"formattingType":"number",
"formatting":{
},
"formattingTypeVersion":"1.0.0"
}
]
}
{
"headerFields":[
{
"name":"DocumentNumber",
"description":"",
"defaultExtractor":{
},
"formattingType":"string",
"formatting":{
},
"formattingTypeVersion":"1.0.0"
},
{
"name":"TaxId",
"description":"",
"defaultExtractor":{
},
"setupType":"static",
"setupTypeVersion":"1.0.0",
"setup":{
},
"formattingType":"string",
"formatting":{
},
"formattingTypeVersion":"1.0.0"
}
],
"lineItemFields":[
{
"name":"Quantity",
"description":"",
"defaultExtractor":{
"fieldName":"quantity"
},
"setupType":"static",
"setupTypeVersion":"1.0.0",
"setup":{
},
"formattingType":"number",
"formatting":{
},
"formattingTypeVersion":"1.0.0"
},
{
"name":"netAmount",
"description":"",
"defaultExtractor":{
},
"setupType":"static",
"setupTypeVersion":"1.0.0",
"setup":{
},
"formattingType":"number",
"formatting":{
},
"formattingTypeVersion":"1.0.0"
},
{
"name":"UnitPrice",
},
"formattingType":"number",
"formatting":{
},
"formattingTypeVersion":"1.0.0"
}
]
}
Response
Response Fields
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226] and Technical Constraints [page 275].
Response Example
201 “Success”
{
"message":"Schema fields have been uploaded successfully."
}
Request
Base URL: url value from outside the uaa section of the service key
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Response
Response Fields
id ID of the schema
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"schemas":[
[
{
"name":"Basic Involve FormatSchema",
"schemaDescription":"SAP Invoice Schema",
"documentType":"Invoice",
"documentTypeDescription":"Payment Advice with Order Number",
"id":"484b6e1c-501c-4a07-85cb-84554656a175",
"version":"1",
"predefined":"True",
"created":"2020-03-26T17:00:00.000000+00:00",
"updated":"2020-04-26T17:00:00.000000+00:00",
"state":"draft"
},
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
Response Fields
id ID of the schema
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"schemas":[
[
{
"name":"Basic Involve FormatSchema",
"schemaDescription":"SAP Invoice Schema",
"documentType":"Invoice",
"documentTypeDescription":"Payment Advice with Order Number",
"id":"484b6e1c-501c-4a07-85cb-84554656a175",
"predefined":"True",
"created":"2020-03-26T17:00:00.000000+00:00",
"updated":"2020-04-26T17:00:00.000000+00:00",
"state":"draft"
},
{
"name":"Daimier Payment Advice Schema",
"schemaDescription":"Payment Advice Schema",
"documentType":"Payment Advice",
"documentTypeDescription":"Payment Advice with Order Number",
"id":"484b6e1c-501c-4a07-85cb-84554656a189",
"predefined":"False",
"created":"2020-03-26T17:00:00.000000+00:00",
"updated":"2020-04-26T17:00:00.000000+00:00",
"state":"active"
}
]
]
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
clientId Yes String The ID of the client used when creating the
schema. Example: c_00
payload Yes JSON Object Comma-separated list of the schema versions you
want to delete. The schema and all its versions
are deleted if payload is empty. You can't delete
version "1".
Request Example
{
"version":[
"5"
]
}
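A payload like the one above could be built with a small helper. The following is a minimal sketch. The function name is hypothetical, and refusing an empty list is a design choice of this sketch (an empty payload deletes the schema and all its versions), not a behavior of the service.

```python
def build_version_delete_payload(versions) -> dict:
    """Build the request payload for deleting specific schema versions."""
    versions = [str(v) for v in versions]
    if "1" in versions:
        # The service does not allow version "1" to be deleted.
        raise ValueError('version "1" cannot be deleted')
    if not versions:
        # An empty payload would delete the schema and all its versions,
        # so this sketch asks the caller to do that explicitly elsewhere.
        raise ValueError("no versions given; refusing to build an empty payload")
    return {"version": versions}


print(build_version_delete_payload(["5"]))
```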
Response
Response Fields
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
{
"message":"Schema versions deleted successfully."
}
Create, reuse, edit, and delete templates based on schemas and document types. You can select templates
together with a corresponding schema to extract information from business documents of the appropriate
type and structure. The Template API provides endpoints to create, update, list, import, export, activate,
deactivate, and delete templates. You can also associate documents with a template and dissociate documents
from a template using the Template API endpoints.
Request
Base URL: url value from outside the uaa section of the service key
Note
If id is not provided, a template ID is generated
and returned.
Request Example
{
"id":"37c8a59b-b210-48c1-9002-19ec989066eb",
"name":"Test_Template",
"description":"Test description",
"clientId":"c_00",
"schemaId":"37c8a59b-b210-48c1-9002-19ec989066eb",
"schemaVersion":"1"
}
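Because the id field is optional, a client might build the request body so that id is included only when the caller supplies one. The following is a minimal sketch. The function name and its parameter names are hypothetical, and which of the other fields are mandatory is not assumed here.

```python
def build_template_payload(name, client_id, schema_id, schema_version,
                           description="", template_id=None) -> dict:
    """Build a request body for creating a template."""
    payload = {
        "name": name,
        "description": description,
        "clientId": client_id,
        "schemaId": schema_id,
        # The request example passes the schema version as a string.
        "schemaVersion": str(schema_version),
    }
    if template_id is not None:
        # Without an id, the service generates a template ID and returns it.
        payload["id"] = template_id
    return payload


print(build_template_payload(
    "Test_Template", "c_00", "37c8a59b-b210-48c1-9002-19ec989066eb", 1,
    description="Test description",
))
```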
Response
Response Fields
id Template ID
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226] and Technical Constraints [page 275].
Response Example
201 “Created”
{
"id":"31516520-b4c9-40a6-b9ba-94d1800d472d"
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
offset No Integer The offset of the query result start index to be
returned. Example: 0
Response
Response Fields
extraction Dictionary containing all the extracted header fields and line items
id Template ID
isActive Set to true if template has been activated. Set to false if template has not been activated, or
it has been deactivated
schemaId Schema ID
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"results":[
{
"id":"5fb6279a-1bb9-4e37-b3bc-95ffb0e3d220",
"schemaId":"3e048fac-7799-45dc-a360-ff921d8ef152",
"name":"Test Template",
"description":"Test Description",
"language":"en",
"documentType":"invoice",
"clientId":"c_00",
"status":"NO_SAMPLES",
"isActive":true,
"creationDate":"2023-11-14T07:39:23.536547+00:00",
"lastUpdatedDate":"2023-11-14T07:39:23.536547+00:00",
"documentAssociations":[
{
"id":"sample_id"
}
],
"extraction":{
"headerFields":[
{
"name":"documentNumber",
"label":"Document Number:",
"type":"number"
}
]
}
},
{
"id":"1213723c-bdff-4b2a-b821-93f051966b0c",
"schemaId":"0f68b9c8-1e10-467d-a01a-23ffae9b5e4e",
"name":"Test Template 2",
"description":"Test Description 2",
"language":"en",
"documentType":"invoice",
"clientId":"c_00",
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
The response is given as a status (201, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
201 “Created”
Get template details for a template ID. You can only get template details that belong to the same zone_id and
client_id.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
extraction Dictionary containing all the extracted header fields and line items
id Template ID
isActive Set to true if template has been activated. Set to false if template has not been activated, or
it has been deactivated
schemaId Schema ID
The response is given as a status (200, 400, 401, or 500) and JSON file. See Common Status and Error Codes
[page 226].
Response Example
200 “Success”
{
"id":"37c8a59b-b210-48c1-9002-19ec989066eb",
"schemaId":"608aa59c-4895-4308-bcae-905f8f343acc",
"name":"Test Template",
"description":"Test Template Description",
"language":"en",
"documentType":"invoice",
"clientId":"c_00",
"status":"NO_SAMPLES",
"isActive":true,
"creationDate":"2023-11-14",
"lastUpdatedDate":"2023-11-14T07:39:23.536547+00:00",
"schemaName":"SAP_Schema",
"documentAssociations":[
{
"id":"f58f7e0b-a1a8-449c-aa4b-6c71e256cd3e"
}
],
"extraction":{
"headerFields":[
{
"name":"string",
"label":"string",
"type":"string"
}
],
"lineItemFields":[
{
"name":"string",
"label":"string",
"type":"string"
}
]
}
}
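A response of this shape could be reduced to the values that a client typically needs. The following is a minimal sketch that works on the already parsed JSON body. The function name and the keys of the returned summary are hypothetical.

```python
def summarize_template(details: dict) -> dict:
    """Extract the main values from a parsed template details response."""
    extraction = details.get("extraction", {})
    return {
        "id": details["id"],
        "schemaId": details["schemaId"],
        # isActive is true only if the template has been activated
        # and not deactivated afterwards.
        "active": bool(details.get("isActive", False)),
        "headerFields": [f["name"] for f in extraction.get("headerFields", [])],
        "lineItemFields": [f["name"] for f in extraction.get("lineItemFields", [])],
        "documentIds": [d["id"] for d in details.get("documentAssociations", [])],
    }
```

Missing optional sections, such as documentAssociations, are treated as empty lists in this sketch.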
Delete a template and its links to the associated documents for a template ID.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
The response is given as a status (200, 400, 401, or 500). See Common Status and Error Codes [page 226].
Response Example
200 “Success”
{
"message":"Successfully deleted 1 template."
}
Activate a template.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
The response is given as a status (200, 400, 401, or 500). See Common Status and Error Codes [page 226].
Response Example
200 “Success”
{
"message":"Successfully activated the template"
}
Deactivate a template.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
The response is given as a status (200, 400, 401, or 500). See Common Status and Error Codes [page 226].
Response Example
200 “Success”
{
"message":"Successfully deactivated the template"
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
The response is given as a status (200, 400, 401, or 500). See Common Status and Error Codes [page 226].
Response Example
200 “Success”
{
"message":"Successfully added document to the template."
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
The response is given as a status (200, 400, 401, or 500). See Common Status and Error Codes [page 226].
Response Example
200 “Success”
{
"message":"Successfully removed document from the template."
}
Export a template.
Note
You can download malware-scanned documents only. You can't download documents that are part of the
template export package but haven't been malware-scanned during upload.
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
The response is given as a status (200, 400, 401, 410 or 500). See Common Status and Error Codes [page
226].
Response Example
200 “Success”
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
payload Yes JSON Object List containing all fixed-value fields of a template
Response
Response Fields
id Metadata ID
The response is given as a status (201, 400, 401, or 500). See Common Status and Error Codes [page 226].
Response Example
201 “Accepted”
{
"id":"b6e6ddaf-ceb0-4245-ab07-6ced50b18807"
}
Request
Base URL: url value from outside the uaa section of the service key
Request Parameters
Response
Response Fields
The response is given as a status (200, 400, 401, or 500). See Common Status and Error Codes [page 226].
Response Example
200 “Success”
{
"metadata":[
{
"name":"name",
"value":"value"
}
]
}
Code Reason
413 The request you are making is too large. Either you are
sending a file that is too large or trying to process too many
objects in a single request. See Technical Constraints [page
275].
12.2 Notifications
Use this functionality to get notifications about the status of your processed documents without having to
constantly poll the Document Information Extraction service. Through this functionality, Document Information
Extraction notifies an endpoint using a callback URL that you specify under the explicit name
document-information-extraction-callback. The notification callback request is sent only once document
processing has either completed or failed.
Note
Document Information Extraction sends only one notification per document without retry.
Restriction
The notifications functionality is available from 2020-05-18. Service instances created before this date
do not include this functionality. To use an existing instance with this functionality, existing customers
need to subscribe to the Document Information Extraction UI in SAP Business
Technology Platform, as described in Subscribing to the Document Information Extraction UI [page 234]
(procedure steps 1 to 6).
Related Information
Prerequisites
You have subscribed to the Document Information Extraction UI in SAP Business Technology Platform.
Tip
In Subscribing to the Document Information Extraction UI [page 234], observe the prerequisites and follow
procedure steps 1 to 4.
To use the notifications functionality, you need to enable the Cloud Foundry Destination Service at subaccount
level via the Entitlements. After that, Destinations will be visible in the left navigation pane.
Create a new destination configuration that includes the callback URL, and some additional information about
authentication credentials and the ProxyType.
Name the callback endpoint document-information-extraction-callback. You can only have one
callback endpoint with this name on subaccount level. This destination configuration callback URL must link to
an endpoint connected to the Internet.
Example
Example
• NoAuthentication
• BasicAuthentication
• OAuth2 Client Credentials
The Document Information Extraction callback sends a POST request to the URL specified in the destination
configuration with the name document-information-extraction-callback.
Example
Payload
The payload is sent with the POST request to the callback URL specified in the customer's destination
configuration.
The payload includes the ID of the uploaded document and its status. These two fields are in alignment with the
other Document Information Extraction API fields:
{
"id": "d7c08124-d852-408f-8d46-0466312f6007",
"status": "DONE"
}
Example
NoAuthentication
CURL representation of the POST request with no authentication to the callback URL of the customer:
Example
BasicAuthentication
CURL representation of the POST request with basic authentication to the callback URL of the customer:
Example
OAuth2 Client Credentials
CURL representation of the POST request with OAuth2 client credentials to the callback URL of the customer:
The status of the callback response should be 200 “OK”, as you can see in the curl response below. Status
codes below 400 are also accepted.
Request
Response
Note
The body of the callback response is not relevant to the Document Information Extraction service, only the
response status of 200.
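On the receiving side, a callback endpoint only has to accept the POST request, read the id and status fields from the payload, and answer with status 200. The following is a minimal sketch that uses the Python standard library. The handler class, the function names, and the port are hypothetical, and a real endpoint must be reachable from the Internet and handle the authentication type configured in the destination.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(body: bytes) -> tuple[str, str]:
    """Return the document ID and status from a notification payload."""
    data = json.loads(body)
    return data["id"], data["status"]


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            doc_id, status = parse_notification(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            # Payload is not the expected JSON object with id and status.
            self.send_response(400)
            self.end_headers()
            return
        print(f"document {doc_id} finished with status {status}")
        # The response body is not relevant to the service, only the status.
        self.send_response(200)
        self.end_headers()


def make_server(port: int = 8080) -> HTTPServer:
    """Create the server; call serve_forever() on the result to start it."""
    return HTTPServer(("", port), CallbackHandler)
```

Because only one notification is sent per document and there is no retry, a receiver of this kind would typically store the notification immediately and do any further processing afterwards.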
Find out how to subscribe to, access, and use the Document Information Extraction UI.
Related Information
To use the Document Information Extraction UI and other features, you need to subscribe to the service UI in
SAP Business Technology Platform (SAP BTP).
Prerequisites
• You have an SAP BTP global account and a Cloud Foundry subaccount.
• You’re a global account administrator.
• You’ve created a service instance for Document Information Extraction.
• You’ve created business users and user groups in your identity provider (IdP). SAP ID Service is the default
IdP, but you can also add your instance of the Identity Authentication service or a different IdP.
Note
If you use the Identity Authentication service, see Establish Trust and Federation Between UAA and
Identity Authentication.
If you use a different IdP, see Establish Trust and Federation with UAA Using Any SAML Identity
Provider.
Tip
You can also use the Set up account for Document Information Extraction booster in the SAP BTP cockpit
to automate the process. In this case, you don’t need to perform the steps for subscribing to the Document
Information Extraction UI described here. See Boosters and the tutorial Use Free Tier to Set Up Account for
Document Information Extraction and Go to Application.
Note
You can create multiple service instances for Document Information Extraction. However, we recommend
creating only one, unless there’s a compelling reason for having more.
If you do use more than one instance, you can change between instances by choosing Settings
(cogwheels icon) > Change Instance on the Document Information Extraction UI. You can specify the
instance by entering its name or its ID.
Procedure
Remember
Before proceeding, check whether you’ve created an instance for Document Information Extraction. If
you haven’t, create the service instance before continuing with the following steps. Creating a service
instance is a prerequisite for using the Document Information Extraction UI.
Note
You may not have to log on explicitly at this point if the following is true:
• You’ve configured your user to log in with a certificate.
• Your user already has an active session on your IdP.
Find out about the role collections you can use with the Document Information Extraction UI.
Document Information Extraction provides default role collections that you can assign to users. These role
collections determine which actions a user can carry out on the Document Information Extraction UI.
The default role collections grant users the following read/write permissions:
Role Collection Document Template/Schema
Document_Information_Extraction_UI_Document_Viewer
Document_Information_Extraction_UI_End_User
Document_Information_Extraction_UI_Templates_Admin
Remember
Find out how to use the Document Information Extraction UI features for documents, schemas, and templates.
Note
For recommendations on getting better extraction results, see Optical Character Recognition (OCR): Best
Practices [page 258].
For instructions on how to set the language of the Document Information Extraction UI, see Set Screen
Language [page 237].
For information about how to use the integrated digital assistant to find answers to support-related questions,
see Built-In Support [page 238].
Select the screen language for the Document Information Extraction UI.
Context
German de
English en
Spanish es
French fr
Italian it
Japanese ja
Korean ko
Portuguese pt
Russian ru
Note
The SAP Companion in-app help is also available in the language you select for the UI. Display this help by
choosing (question mark) in the top-right of the screen.
Procedure
1. Open the dropdown for your user name at the top-right of the screen.
2. Select Languages.
3. Select your preferred language.
4. Complete your entries by choosing Apply.
Use the integrated digital assistant on the Document Information Extraction UI to quickly find answers to your
support-related questions.
Context
The Document Information Extraction UI includes Built-In Support, an embedded digital assistant that allows
you to search for support-related information without leaving the UI.
If you have an s-user ID and the associated authorizations, Built-In Support also allows you to report issues,
review cases, and chat with an expert or a chatbot.
Procedure
The Built-In Support initial screen appears. This screen gives you access to the basic support functions
that are available to all users. Here, you can enter keywords in the intelligent search field to find
relevant information in the documentation for Document Information Extraction UI. You can also call up
recommended information about the service directly via the links provided.
2. Choose the Help Information (hint icon).
The Contextual Help screen appears. Here, you can access information, including tutorial videos, the
Built-In Support documentation, the privacy statement, and the terms of use.
3. Choose (person icon) to view system context information.
If you have an s-user ID, you can sign in to access more Built-In Support functions. These functions allow
you to report issues via case or by chatting with an expert. In addition, you can review your cases.
13.2.3 Document
Use this Document Information Extraction UI feature to upload documents to the service and get machine
learning predictions for the extracted header fields and line items.
Context
For additional information on working with documents, see the best practices under Document: Best Practices
[page 270].
Procedure
1. Open the Document Information Extraction UI, as described in Subscribing to the Document Information
Extraction UI [page 234].
2. Click the Document icon in the left navigation pane.
3. Click Upload a new document (add icon) at the top right of the screen.
The Select Document area appears. Here, you can upload a maximum of 50 files. Add files individually or
select a folder containing multiple files. Each file can have a maximum size of 50 MB and 100 pages. The
service supports the following document types: invoice, payment advice, purchase order, and custom in
PDF, JPG, PNG, and TIFF format.
4. Select the document type.
5. Choose a schema and a template, making sure that both match the document type you selected in the
preceding step. You can also use the Detect automatically function to get the service to search for the
correct template. These entries are optional.
The Document Information Extraction UI includes preconfigured SAP schemas for the following standard
document types: purchase order, payment advice, and invoice. In addition, there’s an SAP schema for
custom documents (SAP_OCROnly_schema). Templates are available only if your administrator has
created and activated them.
Tip
For best extraction results, we strongly recommend using a schema whenever you upload documents.
For further details, see the best practices for Schema Configuration: Best Practices [page 259].
Note
If you later want to create a template based on your document extraction results, you must choose a
schema here. See Create Template from Document Extraction Results [page 256].
Also, if you later want to add the document to a template, you must choose a schema here. Documents
and the templates they’re associated with must share the same schema. See Add Documents and
Activate/Deactivate Template [page 253].
6. Upload one or more document files by dragging and dropping them or by clicking (add icon).
7. Click Step 2.
The Select Header Fields area contains the header fields for extraction from the uploaded documents. If
you didn’t choose a schema in the Select Document step, you can select fields from the list. If you did
choose a schema, the fields are selected automatically and can’t be changed.
8. Click Step 3.
The Select Line Item Columns area shows the line items for extraction from the documents you uploaded.
Here, too, if you didn’t choose a schema in the Select Document step, you can select fields from the list. If
you did choose a schema, the fields are selected automatically and can’t be changed.
You now see the documents you’ve uploaded, with Document Name, Upload Date, and Status. When the
selected header fields and line items have been extracted, the document status changes from “PENDING”
to “READY”. You can now review the extraction results and make any corrections required. If an error
occurs during document processing, the status changes from “PENDING” to “FAILED”. In this case, you
must upload the document again.
11. In the top right of the screen, you see the clientId (c_00, for example) of the listed uploaded documents.
Click Change Client and select another clientId (c_01, for example) to see the list of uploaded
documents that have a different clientId.
Before you can change clients, there must be at least one client in addition to Default. You can’t create
clients on the Document Information Extraction UI. To add new clients, use Swagger UI and follow the
steps in Create Client [page 107].
Note
You can restrict user access to specified clients using the clientSegregation configuration key. For
more details and guidance, see Create Configuration [page 115] and Client Segregation in Document
Information Extraction: A Brief Guide.
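The upload limits named in step 3 of the procedure (a maximum of 50 files, a maximum of 50 MB per file, and the formats PDF, JPG, PNG, and TIFF) could be checked before files are selected for upload. The following is a minimal sketch. The function name is hypothetical, the interpretation of 1 MB as 1024 * 1024 bytes and the accepted file name suffixes are assumptions, and the limit of 100 pages per file is not checked.

```python
from pathlib import PurePath

MAX_FILES = 50
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
# Assumption: these suffixes stand for the formats PDF, JPG, PNG, and TIFF.
ALLOWED_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def check_upload_batch(files) -> list[str]:
    """Check (file_name, size_in_bytes) pairs against the upload limits."""
    files = list(files)
    problems = []
    if len(files) > MAX_FILES:
        problems.append(f"{len(files)} files selected, maximum is {MAX_FILES}")
    for name, size in files:
        if PurePath(name).suffix.lower() not in ALLOWED_SUFFIXES:
            problems.append(f"{name}: unsupported file format")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: larger than 50 MB")
    return problems


print(check_upload_batch([("invoice.pdf", 120_000), ("notes.docx", 8_000)]))
```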
Find out how to download data needed to troubleshoot issues with adding documents to the Document
Information Extraction UI.
Context
For each document that you add to the Document Information Extraction UI, you can download a zip folder
with files for troubleshooting.
Procedure
1. Choose the Document icon in the navigation on the left of the screen.
2. Now, choose a document to display its details.
The Document Information Extraction UI downloads a zip folder to your local machine. The files in the
folder include the document that you uploaded as well as details of the document, template, and schema.
Context
Remember
Document Information Extraction generally provides extraction results within 1 hour for documents
uploaded to the service. The actual processing time can be much shorter.
Procedure
Note
If your device has a small screen, and you have difficulty checking the fields in the page preview,
download the PDF document for full-screen display.
3. Click Extraction Results to see the results for header fields and line items. You can also see the
Extraction Confidence Range of the machine learning model, classified by color: red (confidence between 0% and
50%), yellow (confidence between 51% and 79%), and green (confidence between 80% and 100%). To view the
prediction confidence score for each extracted header field and line item, as well as the field name and
description, hover over a field name, for example Invoice Number.
Hovering over a field name also displays the raw value for that field – in other words, the value before
postprocessing. Raw values can differ from extraction results. For example, if the Delivery Date field of a
purchase order contains “ASAP”, Document Information Extraction can’t convert this text into a date and
therefore returns a null value. Viewing raw values enables you to identify the content of fields that couldn’t
be extracted.
Tip
If the label property is defined for schema fields, user-friendly names for header fields and line items
are displayed in the extraction results. For further information, see Add Fields to Schema Version [page
199].
4. If corrections are required, and the document status is “READY”, you can edit the extraction results under
Header Fields and Line Items.
To download the unedited results, click (download icon) and choose csv, json, or txt.
Tip
To avoid losing your work if there’s an outage, activate Autosave. The service then saves your edits
automatically every 10 seconds.
You can edit extracted values manually on the right of the screen. You can also select them from the
page preview in the middle of the screen. To do the latter, hover your mouse over the page preview. The
mouse pointer changes to a crosshair cursor. Position the cursor at the corner of the value that you wish to
select. Then, hold down the left mouse button. Move the cursor diagonally to the opposite corner to draw a
bounding box around the value you want to select. Select the appropriate header or line item field from the
Field dropdown in the Assign Field dialog. Add or change the value, as necessary. If you choose a line item,
set the number in the Row Index field. Make sure the number that you enter here matches the appropriate
line item in the Label column on the right of the screen. Click Apply in the Assign Field dialog to confirm
your edits.
Note
To prevent Document Information Extraction from extracting unwanted or irrelevant characters, you
can also draw bounding boxes around parts of the field values. In this case, you must edit the value so
that it includes only the values in the bounding box. If you associate documents edited in this way with
templates, the templates extract only those characters in the part of the field defined by the bounding
box. This approach can be useful if you want to exclude punctuation from the extraction, for example.
Tip
If you’ve uploaded your documents using a schema but without a template, you can create a template
here using the extraction values you’ve edited.
For instructions on how to do so, see Create Template from Document Extraction Results [page 256].
Note that this option is no longer available after you confirm the document.
Alternatively, you can associate the document with an existing template by choosing Add to Template.
Remember
If you associate a document with a template and then use that template to extract information from
the same document, the extraction values can differ from the ones you entered and confirmed during
editing.
The technical reason for differences of this kind is that the Document Information Extraction UI
extracts data based on heuristics and not on exact matching of bounding boxes.
6. Delete any bounding boxes that you don’t need. In Edit mode, hover over the tooltip for the relevant
bounding box in the page preview. Double-click the tooltip to display the Assign Field dialog and then
choose Delete to remove the bounding box and its coordinates.
7. Save your changes.
To download your edited results, click (download icon) and choose csv, json, or txt.
8. You can also confirm the document here. To do so, choose Edit again and then choose Confirm. When you
confirm documents, the prediction confidence score of all header and line item fields is set to 1.0 (100%).
Do not confirm documents that haven’t been reviewed and may have incorrect extraction results.
Once the document status changes from “READY” to “CONFIRMED”, you can no longer change the
extraction results.
For additional considerations when you confirm documents, see Confirm Documents [page 244].
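The confidence colors and the raw-value behavior described in the preceding steps can be illustrated with a minimal Python sketch. It is not the service's implementation: the function names are hypothetical, scores are assumed to lie between 0.0 and 1.0 and to be rounded to whole percentages, and the single ISO date format stands in for the service's own, more extensive postprocessing.

```python
from datetime import date, datetime
from typing import Optional


def confidence_color(confidence: float) -> str:
    """Map a confidence score (0.0 to 1.0) to the color ranges shown on the UI."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    percent = round(confidence * 100)  # assumption: whole percentages
    if percent <= 50:
        return "red"
    if percent <= 79:
        return "yellow"
    return "green"


def postprocess_date(raw_value: str) -> Optional[date]:
    """Return a date if the raw text can be converted, otherwise None (null)."""
    try:
        # Assumption: only YYYY-MM-DD is accepted in this sketch.
        return datetime.strptime(raw_value.strip(), "%Y-%m-%d").date()
    except ValueError:
        return None
```

In this sketch, `postprocess_date("ASAP")` returns `None`, mirroring the Delivery Date example, and a confirmed field with a score of 1.0 falls into the green range.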
There are a few points to bear in mind when you confirm documents.
• SAP reserves the right to use confirmed documents in the reporting of accuracy values and for analytics.
• By default, Document Information Extraction doesn’t use your documents to retrain the service’s
machine learning models. To allow SAP to use your documents for this purpose, set the
dataFeedbackCollection configuration key at API level to true. A checkbox appears on the UI
requesting your consent each time you confirm documents.
• If you allow SAP to use your documents for retraining, Document Information Extraction automatically
checks them for any Personally Identifiable Information (PII). If a document contains PII data, it isn’t used
for retraining. You can deactivate these checks by setting the performPIICheck subconfiguration at API
level to false.
For further details of API-level settings, see Create Configuration [page 115] and Configuration Keys [page 117].
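The rules above can be read as a simple decision. The following Python sketch is an illustration only: the function and its parameters are hypothetical names that mirror the dataFeedbackCollection and performPIICheck settings, and it assumes that a document is considered for retraining only if consent is given on the UI when the document is confirmed.

```python
def may_be_used_for_retraining(
    data_feedback_collection: bool,
    consent_given: bool,
    perform_pii_check: bool,
    contains_pii: bool,
) -> bool:
    """Decide whether a confirmed document could be used for retraining."""
    # By default (key not set to true), documents are not used for retraining.
    if not data_feedback_collection:
        return False
    # Assumption: the consent checkbox on the UI must be selected.
    if not consent_given:
        return False
    # With the PII check active, documents containing PII are excluded.
    if perform_pii_check and contains_pii:
        return False
    return True
```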
Procedure
To select all the documents in the list, choose the checkbox above the table.
3. Click Delete and then click OK to delete the documents you selected. These documents are then removed
from the Documents list.
You can also delete individual documents by choosing Delete on the document detail screen.
Remember
You can’t delete documents that are associated with templates. In such cases, you must first navigate
to the Template overview screen and dissociate the document from the template. For further details,
see Add Documents and Activate/Deactivate Template [page 253].
Use this Document Information Extraction UI feature to create schemas containing data fields found in
standard or custom document types. As an administrator, you can use these schemas as a basis for creating
templates. End users can select schemas and corresponding templates when adding documents.
Context
Note
This feature is available only to users with the administrator role (role collection
Document_Information_Extraction_UI_Templates_Admin).
For additional information on using schemas, see the best practices under Schema Configuration: Best
Practices [page 259].
A schema contains a list of header fields and line item fields representing the target information you want to
extract from a particular type of document.
Tip
The Document Information Extraction UI provides preconfigured SAP schemas for the following standard
document types: purchase order, payment advice, and invoice. You can use these schemas unchanged to
upload documents.
You can’t edit original SAP schemas. Always create a copy and then change the default fields, as required.
Note
To extract text from images captured by camera, create a schema for a custom document type and use the
OCR engine type Scene Text.
Extraction results for scene text appear in the API, not on the Document Information Extraction UI.
For details of extracted header fields and line items, see the following sections of the Document Information
Extraction documentation:
For information about limitations on extraction from tables, see Technical Constraints [page 275].
Procedure
1. Open the Document Information Extraction UI, as described in Subscribing to the Document Information
Extraction UI [page 234].
2. In the left navigation pane, choose Schema Configuration.
3. In the top right of the screen, click Create.
4. Enter a name and optionally a description for the new schema.
5. Select the appropriate type of document.
If you select Custom here, you must also select an OCR engine type. To extract text from images, select
Scene Text; otherwise, select Document.
Remember
Extraction results for scene text recognition appear in the API, not on the Document Information
Extraction UI.
6. Choose Create.
7. Choose the row containing your new schema to display the details pane. Here, you can add data fields and
also edit, copy, activate/deactivate, or delete the schema, as described in the following sections.
Restriction
You can’t add data fields to schemas created with document type Custom and OCR engine type Scene
Text.
In schemas created using document type Custom and OCR engine type Document, you can add data
fields. In this case, no default extractors are available.
Procedure
Restriction
If a schema is currently active, deactivate it before editing. When you deactivate a schema, its status on
the Configurations screen changes to “INACTIVE”.
You can’t deactivate schemas that provide the basis for templates. Otherwise, any changes to the
schema would affect the field definitions for the relevant templates.
Before deactivating a schema of this kind, first deactivate all templates based on it and then delete
them.
Use this feature to copy SAP or custom schemas. SAP schemas support standard document types. You can
use these preconfigured schemas unchanged to add documents and create templates. You can also copy and
edit SAP schemas as a basis for configuring schemas of your own.
Procedure
The copy you’ve created now appears in the Schemas list, with the status “INACTIVE”.
Procedure
Restriction
You can’t deactivate schemas that provide the basis for templates. Otherwise, any changes to the
schema would affect the field definitions for the relevant templates.
Before deactivating a schema of this kind, first deactivate all templates based on it and then delete
them.
4. To add a header field to the schema, click the Add button for Header Fields.
5. In the Add Data Field dialog, enter the name of the header field you want to extract, an optional field label,
and an optional description. Next, select one of the following data types: country/region, currency,
discount, date, number, or string.
Tip
Use the Field Label option to define user-friendly names for header and line item fields. Any field labels
that you enter here replace the technical field names under Extraction Results in the Documents feature
of the Document Information Extraction UI.
Remember
The data type country/region extracts values as two-letter country codes in ISO 3166-1 alpha-2 format. For
example, DE for Germany, FR for France, GB for United Kingdom, and US for United States.
6. In the Setup Type dropdown, use the prefilled value (auto or manual) or change it in line with your needs.
Note
Which setup type you select here depends on a number of factors, including document type, preferred
extraction method, and which service plan you’re using.
For details of setup types and associated factors, see Setup Types [page 249].
7. Click Add.
On the Configurations panel on the left of the screen, the status of the schema changes to “DRAFT”.
8. If you want to edit the data field, click (edit icon) in the Action column for the field on the right of the
screen.
9. To add line item fields to the schema, click the Add button for Line Item Fields.
10. Enter the data for the new line item field in the same way as you did for the header field.
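To summarize what you enter in the Add Data Field dialog, here is a minimal Python sketch of a data field. The class `DataField` and its attribute names are hypothetical, and the data type and setup type strings mirror the UI labels in this procedure rather than any API values.

```python
from dataclasses import dataclass
from typing import Optional

# Values as listed in the dialog description above (UI labels, not API values).
DATA_TYPES = {"country/region", "currency", "discount", "date", "number", "string"}
SETUP_TYPES = {"auto", "manual"}


@dataclass
class DataField:
    name: str
    data_type: str
    setup_type: str
    label: Optional[str] = None        # optional user-friendly name
    description: Optional[str] = None  # optional description

    def __post_init__(self) -> None:
        if self.data_type not in DATA_TYPES:
            raise ValueError(f"unsupported data type: {self.data_type}")
        if self.setup_type not in SETUP_TYPES:
            raise ValueError(f"unsupported setup type: {self.setup_type}")

    def display_name(self) -> str:
        """Return the label if one is defined, otherwise the technical name."""
        return self.label or self.name
```

For example, `DataField("deliveryDate", "date", "auto", label="Delivery Date").display_name()` returns the label, which corresponds to the behavior described in the Tip for the Field Label option.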
Related Information
Learn about the setup types available when you add data fields to schemas. Find out how these setup types
relate to document types, extraction methods, and default extractors.
When you add data fields to a schema on the Document Information Extraction UI, you can select one of the
following setup types:
• auto
• manual
These setup types support extraction using different methods, depending on whether the schema was created
for a standard or for a custom document type.
Default Values
When you first call up the Add Data Fields dialog, the service prefills the Setup Type field. The default values
depend on the document type and which edition of Document Information Extraction you use:
• Premium edition
• Schemas for standard and custom document types: auto
• Base edition
• Schemas for standard document types: auto
• Schemas for custom document types: manual
You can change these prefilled values in line with your needs.
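The prefilled defaults listed above can be expressed as a small lookup. This Python sketch uses a hypothetical function name and the hypothetical string values "premium", "base", "standard", and "custom".

```python
def default_setup_type(edition: str, document_type: str) -> str:
    """Return the setup type that is prefilled for a new data field."""
    if edition not in {"premium", "base"}:
        raise ValueError(f"unknown edition: {edition}")
    if document_type not in {"standard", "custom"}:
        raise ValueError(f"unknown document type: {document_type}")
    # Only the base edition with a custom document type defaults to manual.
    if edition == "base" and document_type == "custom":
        return "manual"
    return "auto"
```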
The following table shows the various combinations of document type and setup type and how they relate to
the extraction method and the use of default extractors:
Document Type for Schema | Setup Type | Extraction Method | Select Default Extractor?
Restriction
The setup type auto without default extractor (extraction method: generative AI) is available only with
the service plan Document Information Extraction, premium edition (premium_edition). See Service
Plans [page 77] and Metering and Pricing [page 79].
However, if you want to try out extraction using generative AI, you can do so with an SAP BTP trial account.
Simply follow the steps in the tutorial: Use Trial to Extract Information from Custom Documents with
Generative AI and Document Information Extraction
Remember
You can use different extraction types for header fields in the same schema. However, you can’t combine
different extraction types for line items in the same schema.
For example, if you use the setup type auto without a default extractor for one line item field, you must use
it for all the other line item fields that you add to your schema.
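A schema could be checked against this rule as in the following Python sketch. It assumes that an extraction type can be represented as a pair of setup type and default-extractor flag. This representation and the function name are hypothetical.

```python
from typing import Iterable, Tuple

# (setup type, uses default extractor), for example ("auto", False)
ExtractionType = Tuple[str, bool]


def line_items_consistent(line_item_types: Iterable[ExtractionType]) -> bool:
    """Return True if all line item fields share one extraction type."""
    return len(set(line_item_types)) <= 1
```

Header fields need no such check, because they can use different extraction types in the same schema.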
Caution
Always validate information extracted using generative AI before using it for critical applications.
If you prefer not to use generative AI to extract information from documents, select the setup type auto
with a default extractor (standard document types only). Alternatively, select the setup type manual
(standard and custom document types) when adding data fields to your schema.
Note
As of October 9, 2023, the setup type default is no longer available for new schemas. If an existing schema
includes fields added before this date with the setup type default, you can use only this setup type when
adding new fields. Schemas created before this date that don’t yet include any fields offer you the choice of
auto or manual as setup type.
Because SAP schemas include fields added before October 9, 2023, when you copy these schemas, the
only setup type available is default.
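The effect of this change on the choices offered for a new field can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the presence of a field with the setup type default is what restricts the choice.

```python
from typing import List, Sequence


def available_setup_types(existing_field_setup_types: Sequence[str]) -> List[str]:
    """Return the setup types offered when adding a new field to a schema."""
    # Schemas that already contain fields with the setup type default
    # (including copies of SAP schemas) offer only this setup type.
    if "default" in existing_field_setup_types:
        return ["default"]
    # New schemas and older schemas without any fields offer auto or manual.
    return ["auto", "manual"]
```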
Related Information
Procedure
If a schema doesn’t yet have any data fields, the Activate button is grayed out.
4. When a schema has the status “ACTIVE”, the Deactivate button replaces the Activate button.
Note
If you wish to change or delete a schema that is active, you must first click Deactivate. When you
deactivate a schema, its status on the Configurations screen changes to “INACTIVE”. To enter your
changes, choose Edit (pen icon). Once you’ve completed your changes, activate the schema again.
Restriction
You can’t deactivate schemas that provide the basis for templates. Otherwise, any changes to the
schema would affect the field definitions for the relevant templates.
Before deactivating a schema of this kind, first deactivate all templates based on it and then delete
them.
Procedure
You can’t delete a schema that has the value “YES” in the SAP Schema column.
3. If the schema has the status “ACTIVE”, you must deactivate it before you can delete it. In this case, click
Deactivate.
You can’t deactivate schemas that provide the basis for templates. Otherwise, any changes to the
schema would affect the field definitions for the relevant templates.
Before deactivating a schema of this kind, first deactivate all templates based on it and then delete
them.
4. Click Delete and then Yes to delete the selected schema. The schema is removed from the Schemas list.
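The activation, editing, and deletion rules for schemas described in the preceding procedures can be summarized in a short Python sketch. The `Schema` class and the helper functions are hypothetical and model only the rules stated in this guide, not the behavior of the UI in every situation.

```python
from dataclasses import dataclass


@dataclass
class Schema:
    status: str = "INACTIVE"     # "DRAFT", "ACTIVE", or "INACTIVE"
    field_count: int = 0         # number of data fields in the schema
    is_sap_schema: bool = False  # "YES" in the SAP Schema column
    template_count: int = 0      # templates based on this schema


def can_activate(schema: Schema) -> bool:
    # The Activate button is grayed out until the schema has data fields.
    return schema.status != "ACTIVE" and schema.field_count > 0


def can_deactivate(schema: Schema) -> bool:
    # Schemas that provide the basis for templates can't be deactivated.
    return schema.status == "ACTIVE" and schema.template_count == 0


def can_edit(schema: Schema) -> bool:
    # Active schemas must be deactivated first; SAP schemas can't be edited.
    return schema.status != "ACTIVE" and not schema.is_sap_schema


def can_delete(schema: Schema) -> bool:
    # Active schemas must be deactivated first; SAP schemas can't be deleted.
    return schema.status != "ACTIVE" and not schema.is_sap_schema
```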
13.2.5 Template
Use this Document Information Extraction UI feature to create, reuse, edit, and delete templates based on
schemas and document types. End users can select templates together with a corresponding schema to
extract information from business documents of the appropriate type and structure.
Context
Note
This feature is available only to users with the following administrator role:
• Document_Information_Extraction_UI_Templates_Admin
For additional information on using templates, see the best practices under Template [page 265].
Templates are based on schemas and enable you to show the position of extraction fields in a particular
document layout. After creating a template, you use the Document feature to associate one or more
documents with it. You then edit the extraction results for these documents, indicating the location of fields
and their values.
Templates are essential for processing custom document types. However, you can also use them with standard
document types to fine-tune extraction results.
Tip
If you follow the guidance in General Recommendations and Limitations [page 266], you only have to edit
the extraction results for one document that you associate with your template.
Procedure
1. Open the Document Information Extraction UI, as described in Subscribing to the Document Information
Extraction UI [page 234].
2. Click the Template icon in the left navigation pane.
3. Click Create a new template (add icon) at the top right.
4. Enter a name and optionally a description for the new template. Select the appropriate document type
(Invoice, Payment Advice, Purchase Order, or Custom). Choose the schema you wish to use as a
basis for the new template. Click Create.
5. Choose OK to see the template details.
The Extraction Fields tab shows the header fields and line item fields from the schema you specified.
6. Choose the Extraction Fields tab and then choose Edit on that tab.
Note
This step and the ones that follow are optional and are only applicable if you want to assign a fixed
value to one or more extraction fields.
7. Enter a value that you wish to associate with all instances of a particular field.
For example, if you intend to use your template only for documents from one supplier, you could enter the
name of that supplier as the fixed value for the senderName field.
8. Repeat the preceding step for any other fields that you want to assign fixed values to.
9. Save your entries.
Context
To add documents to a template, you use the Document feature of the Document Information Extraction UI.
Adding documents to templates, as described here, helps improve accuracy.
Restriction
The document and the template that you wish to add it to must share the same schema. If the document
and template have different schemas, you can’t add the document to the template.
If no schema was selected when the document was uploaded to the Document Information Extraction UI,
you can’t add the document to a template. In this case, Add to Template is grayed out.
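The two restrictions above amount to a single check, shown here as a Python sketch with a hypothetical function name. Schemas are represented by their names, and `None` stands for a document uploaded without a schema.

```python
from typing import Optional


def can_add_to_template(document_schema: Optional[str], template_schema: str) -> bool:
    """Return True if the document can be added to the template."""
    # Documents uploaded without a schema can't be added to any template.
    if document_schema is None:
        return False
    # The document and the template must share the same schema.
    return document_schema == template_schema
```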
You now see the document details. It’s best if the file has at least two line items.
4. Edit the extraction results for the document as described in View and Edit Extraction Results [page 242].
You can confirm the document at this point. It’s not necessary to save the document. When you associate
a document with a template, the Document Information Extraction UI saves the extraction results
automatically.
Remember
If you associate a document with a template and then use that template to extract information from
the same document, the extraction values can differ from the ones you entered and confirmed during
editing.
The technical reason for differences of this kind is that the Document Information Extraction UI
extracts data based on heuristics and not on exact matching of bounding boxes.
5. To add this document to a template, choose Add to Template at the top of the pane on the right of the
screen.
6. Select the relevant template from the dropdown and choose Add.
The document file is added to the template that you selected. It’s displayed as an associated document on
the details page for this template.
7. Repeat the preceding steps to add more documents to your template.
Restriction
8. If you want to remove associated documents from a template, first choose the Template icon in the left
navigation pane.
9. Then select the relevant template.
10. Choose the (broken link) icon in the Action column of the Associated Documents tab.
11. Finally, choose OK to confirm the action.
12. Activate a template in status “DRAFT” to use it to extract results from documents similar to the ones
associated with it.
Context
If you want to make changes to a template, you can do so using the Edit function. You can change the template
name and description. In addition, you can select a different schema for the template. Changing the schema
makes a new set of extraction fields available for the template.
Restriction
If a template is currently active, you must deactivate it before you can edit it.
Procedure
The Edit Template dialog appears. Here, you can change the name and description by editing the
corresponding fields.
You can also select a different schema for your template. To change the schema, do the following.
4. Choose the Schema dropdown and select a schema from the list.
Note
This list includes only schemas that match the document type for which the template was originally
created.
Remember
If you’ve already edited extraction results for sample documents associated with your template, these
edits are preserved following the change of schema only for fields that appear in both the old and the
new schema. After changing the schema, you can annotate the newly added fields in your existing
sample documents.
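The effect of a schema change on existing edits can be pictured as an intersection of field names. The following Python sketch is illustrative only: the function name and the dictionary of annotated values are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def fields_after_schema_change(
    old_fields: Iterable[str],
    new_fields: Iterable[str],
    annotated: Dict[str, str],
) -> Tuple[Dict[str, str], List[str]]:
    """Return the preserved edits and the newly added fields to annotate."""
    old, new = set(old_fields), set(new_fields)
    # Edits survive only for fields that appear in both schemas.
    preserved = {name: value for name, value in annotated.items()
                 if name in old and name in new}
    # Fields that exist only in the new schema can be annotated afterwards.
    to_annotate = sorted(new - old)
    return preserved, to_annotate
```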
Context
You’ve created a template in a test client by following the steps in Add Template [page 253] and Add
Documents and Activate/Deactivate Template [page 253]. You’re now happy with your new template and want
to export it from the current client before importing it into your production client.
Procedure
1. Choose Export.
Document Information Extraction downloads the template to your local machine. The download includes
the schema.json and template.json files and a folder with the associated documents.
2. Choose Change Client and select the production client to which you want to import your template.
The Document Information Extraction UI displays the Templates list for the production client.
3. Choose (upload icon) and navigate to the folder you downloaded in Step 1.
4. Select the folder and choose Open.
The new template appears in the list. Users can now select this template when adding documents of the
appropriate type to the Document Information Extraction UI.
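Before importing, you might want to check that the downloaded folder is complete. The following Python sketch checks only for the parts named in Step 1. The function name is hypothetical, and because the name of the folder with the associated documents isn't stated here, the sketch accepts any subfolder.

```python
from pathlib import Path
from typing import List


def missing_export_parts(export_dir: Path) -> List[str]:
    """List the expected parts of a template export that are missing."""
    if not export_dir.is_dir():
        raise NotADirectoryError(str(export_dir))
    # The download includes the schema.json and template.json files.
    missing = [name for name in ("schema.json", "template.json")
               if not (export_dir / name).is_file()]
    # Assumption: any subfolder counts as the folder with associated documents.
    if not any(path.is_dir() for path in export_dir.iterdir()):
        missing.append("folder with associated documents")
    return missing
```

An empty list means that all parts named in Step 1 are present. The sketch doesn't validate the content of the files.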
This feature enables you to quickly and easily create templates when adding documents to the Document
Information Extraction UI.
Context
You’ve added a document by following the steps in Add Document [page 240] and View and Edit Extraction
Results [page 242].
To create a template based on document extraction results, you must use a schema when adding the
document.
Before creating a template from the document extraction results, make sure you meet the following
prerequisites:
• You’ve used an appropriate schema when adding your document for extraction.
• The document you want to base your template on has the status “READY”.
Procedure
The template detail screen appears, showing your new template with the preprocessing status “DONE”.
You can now use your template in the same way you’d use one created directly using the Template feature.
3. Activate, edit, export, or delete your template, as described in Add Documents and Activate/Deactivate
Template [page 253], Export/Import Template [page 256], and Delete Template [page 257].
Procedure
Restriction
If a template is currently active, you must deactivate it before you can delete it.
Find out about recommended approaches for optical character recognition, the main features of the Document
Information Extraction service, data enrichment, and extraction using generative AI.
The quality of your extraction results depends on a wide range of factors. This section is intended to help you
get the best out of the Document Information Extraction service. It includes the following information:
• General recommendations on how to get better extraction and enrichment results using OCR best
practices.
• Decision procedures, recommendations, and tips on how to use the schema configuration, template, and
document features of Document Information Extraction.
• Important considerations when using generative AI to extract information from documents automatically.
Related Information
To get better extraction and enrichment results, bear in mind the following when uploading document files to
the Document Information Extraction service:
Tip
Learn about best practices for using schemas to upload documents to the Document Information Extraction
UI.
For best results, we strongly recommend that you always use a schema when uploading standard document
types to the Document Information Extraction UI. You can also upload documents of this type directly, without
a schema or template. However, using a schema has the following benefits:
• You don’t have to select extraction fields manually every time you submit a document.
• There’s no risk of inconsistent settings for different documents.
Note
To use the Schema Configuration feature to create, copy, and edit schemas, you must have the
administrator rights provided by the following role collection:
• Document_Information_Extraction_UI_Templates_Admin
If you have the Document_Information_Extraction_UI_End_User role, you can use any available
schemas, except SAP schemas, to upload documents.
The steps involved in adding a schema differ depending on whether the document type is standard or custom.
For details of the respective processes, see the subtopics in this section.
Related Information
The Document Information Extraction UI supports the following standard document types:
• Invoice
• Payment advice
• Purchase order
The following image outlines the steps and settings for processing standard document types with or without a
template.
Tip
SAP schemas provide a set of typical fields with default extractors for standard document types. If
you don’t want to configure schemas for standard document types from scratch, you can select the
appropriate SAP schema unedited when you add a document or create a template on the Document
Information Extraction UI. No configuration is needed when you use SAP schemas in this way.
You can also create your own schema by copying the SAP schema for the relevant standard document type.
You can then edit this copy, choosing some or all the fields from the SAP schema as a basis for your own
schema and adding custom fields, as required.
When deciding whether to use a schema, bear in mind the following points:
• It’s better to use a preconfigured SAP schema than to use no schema at all.
• SAP schemas only support the setup type default.
You can use the following extraction methods for header fields in schemas for standard document types:
Restriction
The generative AI extraction method is available only with the service plan Document Information
Extraction, premium edition (premium_edition).
Remember
You can use different extraction types for header fields in the same schema. However, you can’t combine
different extraction types for line items in the same schema.
For example, if you use the setup type auto without a default extractor for one line item field, you must use
it for all the other line item fields that you add to your schema.
Default Extractors
Templates generally deliver better results for custom header fields than for custom line items. To get the
best extraction results when using a template or the machine learning models of the Document Information
Extraction service with standard document types, configure default extractors for header and line item fields as
follows:
• Header fields: Don’t use default extractors for custom header fields. You can then use a template to edit
them.
• Line items: Use default extractors, wherever possible.
Related Information
Custom documents are documents that don’t belong to the standard document types in Document
Information Extraction. There are many different types of custom document. Common examples include
powers of attorney, birth certificates, and résumés. If you want to process documents of this kind, always use a
schema.
The following image outlines the steps and settings for processing custom document types with and without a
template.
You can use the following combinations of extraction methods and setup types for header fields in schemas for
custom document types:
Note
The generative AI extraction method is available only with the service plan Document Information
Extraction, premium edition (premium_edition).
Remember
You can use different extraction types for header fields in the same schema. However, you can’t combine
different extraction types for line items in the same schema.
For example, if you use the setup type auto for one line item field, you must use it for all the other line item
fields that you add to your schema.
Related Information
Decide whether to use a template when uploading documents to the Document Information Extraction UI and
make the relevant settings.
When you opt to use a schema (recommended), you must also decide whether to use a template to upload
your documents. The associated procedure is as follows:
Note
To use the Template feature to create templates, you must have the administrator rights provided by the
following role collection:
• Document_Information_Extraction_UI_Templates_Admin
The Document Information Extraction UI delivers best results with standard table structures. If your
documents include custom fields, we recommend using a template. This approach allows you to edit extraction
results for fields that don’t have default extractors. Edit all custom header fields. If the line items in your
documents are in a standard table structure, also edit the line items. However, if the table has a custom
structure, don’t edit the line items.
If the documents don’t include custom fields, and only a few of the documents share the same template layout,
don’t use a template. In this case, upload the documents using a schema only.
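The recommendations above can be sketched as a small decision helper in Python. The function and its parameter names are hypothetical. The combination of no custom fields and many documents sharing the same layout isn't covered by the text, so the sketch returns `None` for the template decision in that case.

```python
from typing import Dict, Optional


def template_advice(
    has_custom_fields: bool,
    standard_table_structure: bool,
    many_documents_share_layout: bool,
) -> Dict[str, Optional[bool]]:
    """Summarize the template recommendations for a set of documents."""
    if has_custom_fields:
        return {
            "use_template": True,
            "edit_custom_header_fields": True,
            # Edit line items only if the table has a standard structure.
            "edit_line_items": standard_table_structure,
        }
    if not many_documents_share_layout:
        # No custom fields and few shared layouts: use a schema only.
        return {"use_template": False,
                "edit_custom_header_fields": False,
                "edit_line_items": False}
    # Not covered by the recommendations above.
    return {"use_template": None,
            "edit_custom_header_fields": None,
            "edit_line_items": None}
```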
Note
If there are extraction errors when using templates, refer to the subsections of these template best
practices.
Related Information
Follow best practices and be aware of limitations when using templates to extract information from custom and
standard document types.
Templates are essential when extracting information from custom document types, for which Document
Information Extraction has no pre-trained models. In addition, templates can help you fine-tune results when
extracting information from standard document types. (See Standard Document Types [page 260].)
Whether you use templates to extract information from custom or standard document types, note the
recommendations here and in Standard and Custom Tables [page 267]:
• Use templates only with well-structured form-like documents such as the following: structured forms,
application forms, certificates, prescriptions, and personal IDs.
• If possible, process one-page documents only. Otherwise, the results can be less accurate.
• If the same header field appears on more than one page, the Document Information Extraction UI extracts
this field only once.
• Templates support multiple tables per page, provided they all have a standard structure and the same table
headers. Multiple tables that are horizontally placed aren’t supported.
• Nested table structures (with items grouped in the same line) cause issues.
• Items that overlap horizontally (for example, different items in the same column) also cause problems.
• Header and line item fields with identical or very similar formatting prevent the template from
distinguishing the header from the main part of the table. As a result, the template can’t detect where
the table starts.
• If adjacent columns are too close to each other, the Document Information Extraction UI can’t distinguish
them. In such cases, the service extracts the contents of multiple columns as a single value.
If there are extraction errors when using templates, check for the following issues:
• Document for upload has significant page rotation/tilt (15 degrees or more).
• Size of pages and margins differs between document for upload and associated document.
• Position of image differs between document for upload and associated document.
• Line items in the document for upload differ slightly from the line items in the associated document.
• Images include scanning noise – for example, background images and bleed through, where text on the
back of the document is visible on the front.
• OCR results are poor.
These issues result in fields failing to map to their expected positions. In such cases, extraction can
either be incorrect (wrong value) or fail entirely (no value). If extraction fails, the system falls back to the
pre-trained global model, which can result in incorrect extraction.
Related Information
Compare the tables in your documents with examples of standard and custom structures.
If you use a template to extract information from tables, you get the best results from simple, well-structured
layouts (standard tables). By contrast, custom tables can cause issues.
Before using a template, compare the tables in your documents with the following examples of standard and
custom tables.
Remember
Whether you’re extracting information from standard or custom tables, bear the following layout-related
points in mind:
• If you use a template, make sure that the header and line item fields are formatted differently from
each other. If they have very similar or identical formatting, the template can’t distinguish the header
from the main part of the table and therefore can’t detect where the table starts.
• Make sure that adjacent table columns aren't too close to each other. If they are, the Document
Information Extraction UI can’t distinguish them. As a result, it extracts the contents of multiple
columns as a single value.
For best results, use tables with the standard structures shown here.
In the following examples, the column headings correspond to the header fields, and the line items appear
directly under them.
[Examples: two standard tables. In the second table, the Description column contains a description covering
several lines.]
As shown in both of the preceding tables, headers are arranged horizontally from left to right in standard
tables. If a column includes content that covers more than one line (as in the Description column of the second
table), this content isn’t nested. In other words, it’s not spread across multiple columns.
Custom Tables
Tables structured as shown in this section can cause issues during extraction and deliver poorer results.
Quantity 1 2
Nested Structures
Tip
If your documents include custom tables, we recommend using default extractors for all line items when
configuring the corresponding schema. If you then decide to use the Template function with your schema,
you don’t have to edit the extraction results for the line items.
Note
If you follow the guidance in this subsection but still have extraction errors, refer to the general
recommendations for using templates.
Make the recommended settings for uploading documents to the Document Information Extraction UI.
We recommend always using a schema when uploading documents to the Document Information Extraction
UI. Schemas enable you to manage fields for extraction centrally, reducing manual effort and inconsistencies.
If you want to use a schema without a template, simply select the appropriate schema and then upload your
documents to the Document Information Extraction UI.
If you want to use a schema with a template and know the template name, select the template from the
dropdown in the Select Document step. If you’re unsure which template to use, choose Detect Automatically.
The service then finds the best template for your document.
When uploading documents using a schema, you may find that a suitable template isn’t available. In this
case, you can create a template based on the extraction results for your documents. For details of how to
do this, see Create Template from Document Extraction Results [page 256].
To create templates in this way, you need the admin rights provided by the following role collection:
• Document_Information_Extraction_UI_Templates_Admin
Data enrichment is a powerful feature that matches vendors, customers, employees, and products found on a
document with master data uploaded to the Document Information Extraction service.
To improve the performance of the data enrichment feature, make sure that your master data is up to date and
activated. To get the best possible matching results, observe the following recommendations:
• Don’t use placeholder values for individual fields that lack a value. Remove these fields instead.
• Always include the keys name and address1 and populate them with a valid supplier or customer name
and address. Otherwise, the enrichment is unlikely to work as intended.
• Whenever possible, include taxId and bankAccount information in the businessEntity field. These
two fields improve the enrichment matching results.
• Always keep in mind that uploaded master data must be activated before it can be used for enrichment. If
automatic activation (default) is enabled, this process can take up to four hours.
Tip
With large numbers of data records and for better control, use manual data activation. While automatic
data activation is more convenient in many cases, it can lead to unexpected results, especially if
triggered during the upload of new data records.
• Make sure to select the correct subtype when uploading the data (supplier for vendors or senders, and
customer for buyers or receivers).
• Currently, products are matched by materialNumber only. This means that data enrichment only works
for product line items that include a materialNumber on the document.
• If you upload a product entity without a materialNumber, this entity won’t be matched. Always include a
valid materialNumber when uploading product master data.
• To take advantage of ongoing normalization improvements, reupload the entire master data from time to
time – for example, once a quarter. To optimize the matching of values, we make improvements of this kind
continuously.
Request Examples
Not recommended – Create Enrichment Data [page 167] request payload:
payload:
{
"value":[
Recommended – Create Enrichment Data [page 167] request payload (do not use fields with custom
placeholders or empty values):
payload:
{
"value":[
{
"id":"BE0001",
"name":"Emma Dowerg",
"accountNumber":"SK2421",
"address1":"Amalie-Klemm-Platz 0/9, 48581, Geithain",
"city":"Geithain",
"countryCode":"DE",
"postalCode":"48581",
"email":"e.dowerf@mustermail.com",
"bankAccount":"DE345982837402",
"taxId":"DE435531312"
}
]
}
type: businessEntity
clientId: c_00
subtype: supplier
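The recommendations above can be applied before upload with a small cleanup step. The following is a minimal Python sketch, not part of the service: the helper names (clean_record, build_payload) and the list of placeholder markers are assumptions made for illustration, and only the key names and the {"value": [...]} shape are taken from the recommended payload above.

```python
from typing import Any

# Hypothetical placeholder markers; adjust to whatever your source system emits.
PLACEHOLDERS = {"", "n/a", "na", "none", "null", "-", "unknown", "tbd"}


def clean_record(record: dict[str, Any]) -> dict[str, Any]:
    """Drop fields whose value is empty or a placeholder instead of uploading them."""
    cleaned: dict[str, Any] = {}
    for key, value in record.items():
        if value is None:
            continue
        if isinstance(value, str):
            value = value.strip()
            if value.lower() in PLACEHOLDERS:
                continue
        cleaned[key] = value
    return cleaned


def build_payload(records: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Build the {"value": [...]} body and reject records without name or address1."""
    value = []
    for record in records:
        cleaned = clean_record(record)
        missing = [key for key in ("name", "address1") if key not in cleaned]
        if missing:
            raise ValueError(f"record {record.get('id')!r} lacks required keys: {missing}")
        value.append(cleaned)
    return {"value": value}


payload = build_payload([
    {
        "id": "BE0001",
        "name": "Emma Dowerg",
        "address1": "Amalie-Klemm-Platz 0/9, 48581, Geithain",
        "city": "Geithain",
        "email": "N/A",  # placeholder value: removed from the payload
        "taxId": "",     # empty value: removed from the payload
    }
])
```

Raising an error for records without name or address1 is a design choice of this sketch; you could also skip such records and log them, because enrichment is unlikely to work for them anyway.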
Find out about best practices for using generative AI to extract information from documents.
Restriction
Extraction using generative AI is available with the service plan Document Information Extraction,
premium edition (premium_edition) only. See Service Plans [page 77] and Metering and Pricing [page
79].
You can also use an SAP BTP trial account to try out extraction using generative AI. Follow the tutorial:
Use Trial to Extract Information from Custom Documents with Generative AI and Document Information
Extraction .
Caution
Bear the following in mind when using the Document Information Extraction service to process documents
using generative AI:
Confidence Scores: The Document Information Extraction service returns confidence scores for extracted
results. These values are usually reliable when the service uses a pre-trained model. Be aware, however,
that they can’t be relied on when the service uses generative AI to extract information.
Coordinates: Result objects returned by the API and the Document Information Extraction UI include
coordinates indicating the assumed location of extracted items of information on the page. These
coordinates are intended to let users see where the service extracted information and check manually for
errors. Even if the extraction results are correct, some coordinates may be missing or incorrect. Therefore,
coordinates can’t be relied on when the service extracts information automatically using generative AI.
See also Get Result [page 138] and View and Edit Extraction Results [page 242].
The better you describe the information that you want to extract using generative AI, the better your results will
be.
When adding fields to a schema, pay particular attention to their names and associated descriptions.
Tip
When entering field names and descriptions, it’s often useful to imagine that you’re explaining what you
want to extract to a person with no prior knowledge.
• Consider the wording of names and descriptions carefully, making sure that they’re accurate, complete,
and unambiguous.
• Write your definitions in English, even if documents for extraction are in a different language.
• Make sure that field names are self-explanatory and don’t include abbreviations or acronyms.
Example
• If one field can have different names, include as many of these names as possible in your description.
The Order Number field may be called Your Reference in some documents.
• If there are multiple fields with similar names, add all the fields to your schema, even if only one is needed
in the downstream application. Doing so simplifies processing because you can be sure of extracting a
value automatically, which you can later correct manually, if necessary.
Example
The field names receiver material number and sender material number are very similar and therefore
could be confused with each other.
Example
If you want a value extracted from a document to be output in uppercase, you can specify this
formatting in the description.
All Document Information Extraction endpoints exposed to the end user have strict technical limits. See details
in the following table.
Note
The technical limits listed here are relevant only to users of the service plans Base Edition
(blocks_of_100) and Premium Edition (premium_edition) for enterprise accounts. See Service Plans
[page 77].
Note
The Document Information Extraction service supports extraction from single or multiple tables. A single
table can extend across multiple pages. It’s not possible to extract information from multiple tables if they
have different sets of line item fields.
Tip
See the following sections of the Document Information Extraction documentation for other useful
information:
Use only the following types of characters for the IDs of clients, enrichment data records, system and
company codes, and the name of templates, schemas, and schema header and line item fields:
Related Information
Free Tier Option and Trial Account Technical Constraints [page 276]
When using the free tier option for Document Information Extraction or a trial account, be aware of the
following technical limits:
Note
The technical limits listed here are relevant only to users of the Free service plan for enterprise accounts
and the Base Edition (blocks_of_100) service plan for trial accounts. See Service Plans [page 77].
Tip
The rolling period consists of the past 30 days. The total number of document pages available at any time is
calculated based on how many pages you’ve uploaded during these 30 days.
Tip
A default client is created following tenant provisioning, enabling you to use the service immediately.
Note
You can't change the details of the default client, a previously created customized client, or enrichment
data records. Instead, delete the client or data records, and then create new ones with the updated details. For
more information, see Client API [page 106] and Enrichment Data API [page 166].
The following table lists the header fields that Document Information Extraction can extract.
Category | Field Name | Field Label | Description | Document Type | Type | Supported Enrichment Data
amounts | grossAmount | Gross Amount | Invoice amount including taxes and shipping/handling costs. | invoice | Number |
amounts | grossAmount | Total Amount | Sum of subtotal, taxes, special handling charges, and shipping charges, without discounts, or total amount due and payable. | purchaseOrder | Number |
amounts | netAmount | Net Amount | Invoice amount without taxes and shipping/handling costs. | invoice | Number |
amounts | netAmount | Sub Total Amount | Amount without taxes and shipping/handling costs. | purchaseOrder | Number |
amounts | taxAmount | Tax Amount | The tax amount applied to this document. | invoice | Number |
amounts | taxId | Supplier Tax ID | The number used to identify the supplier's company for tax purposes. | invoice | String | Used for BusinessEntity [page 170] sender and receiver enrichment.
amounts | taxId | Customer Tax ID | Tax identifier of the organization sending the payment advice. | paymentAdvice | String | Used for BusinessEntity [page 170] sender and receiver enrichment.
amounts | taxId | Tax ID | Tax identifier of the sender’s business entity. Unique to each sender. | purchaseOrder | String | Used for BusinessEntity [page 170] sender and receiver enrichment.
amounts | taxIdNumber | Tax ID Number | Tax identifier number of the sender’s business entity. Unique to each sender. | purchaseOrder | String |
amounts | taxName | Tax Description | A brief description of the tax. For example: California sales tax. | invoice | String |
amounts | taxRate | Tax Rate | Primary tax rate applied to the document. | invoice | Number |
contact | barcode | Barcode | The decoded content of the QR code for business cards supports the vCard standard. Also known as VCF (Virtual Contact File), a vCard is a file format standard for electronic business cards. They can contain name and address information, phone numbers, email addresses, URLs, logos, photographs, and audio clips. | businessCard | String |
contact | buildingName | Building Name | Name of the building in the address. | businessCard | String |
contact | city | City | Name of the city in the address. | businessCard | String |
contact | departmentName | Department | The area one works in a company. | businessCard | String |
contact | firstName | First Name | The name that stands first in one's full name. | businessCard | String |
contact | houseNumber | House Number | Number of the house in the address. | businessCard | String |
contact | middleName | Middle Name | Name between one's first name and surname. | businessCard | String |
contact | namePrefix | Name Prefix | Title used before a person's name. | businessCard | String |
contact | nameSuffix | Name Suffix | Title used after a person's name. | businessCard | String |
contact | poBox | Post Office Box Number | Post office box number. | businessCard | String |
contact | role | Role | The position one has in a company. | businessCard | String |
contact | state | State | Name of the state in the address. | businessCard | String |
contact | streetName | Street Name | Name of the street in the address. | businessCard | String |
contact | website | Website | Set of related web pages located under a single domain name, typically created by a single person or company. | businessCard | String |
contact | zipCode | Zip Code | Postal code of the address. | businessCard | String |
details | barcode | Barcode | The decoded content of the QR code or | invoice | String |
details | purchaseOrderNumber | Purchase Order | Number of the buyer’s purchase order. | invoice | String |
document | documentDate | Invoice Date | Date of the invoice document. | invoice | Date |
document | documentDate | Payment Date | Date of the payment advice document. | paymentAdvice | Date |
document | documentDate | Purchase Order Date | Date of the purchase order document. | purchaseOrder | Date |
document | documentNumber | Invoice Number | Number that identifies this invoice. | invoice | String |
document | documentNumber | Payment Reference | Number of the payment advice that references the payment. | paymentAdvice | String |
document | documentNumber | Purchase Order Number | Number that identifies this purchase order. | purchaseOrder | String |
payment | discount | Discount | Amount deducted from gross amount. | invoice | String |
payment | dueDate | Due Date | Expected date of payment in extended ISO 8601 format (YYYY-MM-DD). | invoice | Date |
payment | paymentTerms | Payment Terms | Payment terms as found on the invoice document. Payment terms are a combination of the payment due date and the discount rate or penalty rate. | invoice | String |
receiver | receiverAddress | Buyer Address | Address of the organization that ordered the goods or services. | invoice | String | Used for BusinessEntity [page 170] receiver enrichment.
receiver | receiverContact | Buyer Contact | Name of the employee that should receive this invoice. | invoice | String | Used for Employee [page 171] enrichment.
receiver | receiverId | Supplier ID | A unique code that identifies the supplier. | purchaseOrder | String |
receiver | receiverName | Buyer Name | Name of the organization that ordered the goods or services. | invoice | String | Used for BusinessEntity [page 170] receiver enrichment.
receiver | receiverTaxId | Buyer Tax ID | Tax identifier of the buyer's business entity. Unique to each buyer. | invoice | String |
sender | senderAddress | Supplier Address | Address of the organization generating this invoice. | invoice | String | Used for BusinessEntity [page 170] sender enrichment.
sender | senderAddress | Customer Address | Address of the organization sending the payment advice. | paymentAdvice | String | Used for BusinessEntity [page 170] sender enrichment.
sender | senderAddress | Sender Address | Address of the sender, only one box for the street, city, and country/region of the sender. | purchaseOrder | String | Used for BusinessEntity [page 170] sender enrichment.
sender | senderBankAccount | Supplier Bank Account | Bank account of the organization generating this invoice. | invoice | String | Used for BusinessEntity [page 170] sender and receiver enrichment.
sender | senderBankAccount | Sender Bank Account | Bank account number of the sender. | purchaseOrder | String | Used for BusinessEntity [page 170] sender and receiver enrichment.
sender | senderCity | Sender City | City or town name of the sender's address. | purchaseOrder | String |
sender | senderDistrict | Sender District | District name of the sender's address. | purchaseOrder | String |
sender | senderEmail | Sender Email | Email address of the sender. | purchaseOrder | String |
sender | senderExtraAddressPart | Sender Extra Address | Any part of the sender's address not included in the other address fields. | purchaseOrder | String |
sender | senderFax | Sender Fax | Fax number of the sender. | purchaseOrder | String |
sender | senderHouseNumber | Sender House Number | House number of the sender's address. | purchaseOrder | String |
sender | senderId | Sender ID | A unique code that identifies the sender. | purchaseOrder | String |
sender | senderName | Supplier Name | Name of organization generating this invoice. | invoice | String | Used for BusinessEntity [page 170] sender enrichment.
sender | senderName | Customer Name | Name of the organization sending the payment advice. | paymentAdvice | String | Used for BusinessEntity [page 170] sender enrichment.
sender | senderName | Sender Name | Name of the sender of the document (usually the sending company). | purchaseOrder | String | Used for BusinessEntity [page 170] sender enrichment.
sender | senderPhone | Sender Phone | Telephone number of the sender. | purchaseOrder | String |
sender | senderPostalCode | Sender Postal Code | Postal code of the sender's address. | purchaseOrder | String |
sender | senderState | Sender State | State or province name of the sender's address. | purchaseOrder | String |
sender | senderStreet | Sender Street | Street name of the sender's address. | purchaseOrder | String |
shipTo | deliveryDate | Delivery Date | Date of the delivery in extended ISO 8601 format (YYYY-MM-DD). | invoice, purchaseOrder | Date |
shipTo | deliveryNoteNumber | Delivery Note Number | Unique identifier on the invoice following the goods. | invoice | String |
shipTo | shippingTerms | Shipping Terms | Indicate when the goods should be delivered and how. | purchaseOrder | String |
shipTo | shipToAddress | Shipping Address | Address where the goods will be shipped to: only one box for the street, city, and country/region. | purchaseOrder | String |
shipTo | shipToCity | Shipping City | City or town name of the shipping address. | purchaseOrder | String |
shipTo | shipToDistrict | Shipping District | District name of the shipping address. | purchaseOrder | String |
shipTo | shipToEmail | Shipping Email | Email address for the shipping address. | purchaseOrder | String |
shipTo | shipToExtraAddressPart | Shipping Extra Address | Any part of the shipping address not included in the other address fields. | purchaseOrder | String |
shipTo | shipToFax | Shipping Fax Number | Fax number for the shipping address. | purchaseOrder | String |
shipTo | shipToHouseNumber | Shipping House Number | House number of the shipping address. | purchaseOrder | String |
shipTo | shipToName | Shipping Company Name | Company name for the shipping address. | purchaseOrder | String |
shipTo | shipToPhone | Shipping Telephone Number | Telephone number for the shipping address. | purchaseOrder | String |
shipTo | shipToPostalCode | Shipping Postal Code | Postal code of the shipping address. | purchaseOrder | String |
shipTo | shipToState | Shipping State | State or province name of the shipping address. | purchaseOrder | String |
shipTo | shipToStreet | Shipping Street | Street name of the shipping address. | purchaseOrder | String |
When the barcode header field is requested for extraction, the Document Information Extraction service scans
the whole document for 1D and 2D barcodes and provides the extracted content of the barcode as a string
value. The service can detect multiple barcodes in the same document and provide all the detected content in
the extracted results. The most common types of 1D and 2D barcodes are supported by this field, for example:
• Code39
• Code128
• DataMatrix
• EAN
• Interleaved 2 of 5
• PDF417
• QRCode
• UPC
Document quality affects the extraction result. For example, a low-quality (low-resolution) image of a
scanned document may not return any value for the barcode header field if the barcode in the document
can't be identified. The quality of the barcode image also affects the prediction confidence score of the
barcode header field. Use high-quality (high-resolution) images to increase the chance of extracting
barcodes from the document.
The following table lists the line item fields that Document Information Extraction can extract.
Category | Field Name | Field Label | Description | Document Type | Type | Supported Enrichment Data
amounts | netAmount | Amount | Total amount of the line item (typically unit price * quantity). | invoice, paymentAdvice, purchaseOrder | Number |
details | customerMaterialNumber | Customer Material Number | Unique code that identifies a specific good or service in a customer catalog or system. | purchaseOrder | String | Used for Product [page 172] enrichment.
details | materialNumber | Material Number | Unique code that identifies a specific good or service in a supplier catalog or system. | invoice | String | Used for Product [page 172] enrichment.
details | purchaseOrderNumber | Purchase Order Number | Number of the associated purchase order (if available on line item field level). | invoice | String |
details | supplierMaterialNumber | Supplier Material Number | Unique code that identifies a specific good or service in a supplier catalog or system. | purchaseOrder | String | Used for Product [page 172] enrichment.
details | unitOfMeasure | Unit of Measure | The unit of measure UN/CEFACT code. For example: EA for each, HR for hour and YR for year. | invoice, purchaseOrder | String |
details | unitPrice | Unit Price | Price for a single instance of an object. | invoice, purchaseOrder | Number |
document | documentNumber | Document Number | Document number that is used by the receiver. | paymentAdvice | String |
item | itemNumber | Item Number | Item number that is used by the receiver. | purchaseOrder | String |
Get an overview of the security information that applies to Document Information Extraction. Learn about the
main security aspects of the service and its components.
Introduction
Data protection is associated with numerous legal requirements and privacy concerns. In addition to
compliance with general data privacy regulation, it is necessary to consider compliance with industry-specific
legislation in different countries/regions. SAP provides specific features and functions to support compliance
with regard to relevant legal requirements, including data protection. SAP does not give any advice on whether
these features and functions are the best method to support company, industry, regional, or country/region-
specific requirements. Furthermore, this information does not give any advice or recommendation in regards
to additional features that would be required in particular IT environments; decisions related to data protection
must be made on a case-by-case basis, under consideration of the given system landscape and the applicable
legal requirements.
Note
SAP software supports data protection by providing security features and specific data protection-relevant
functions such as functions for the simplified blocking and deletion of personal data. SAP does not provide
legal advice in any form. The definitions and other terms used in this document are not taken from any
given legal source.
Document Information Extraction may process personal data, such as employee names and email addresses,
depending on the information available in documents and enrichment data.
All data processed by the service is stored in the SAP BTP, Cloud Foundry environment. Document Information
Extraction generally processes the following data types:
Data | Purpose
Inference Documents | Refers to documents that are submitted by users to receive machine learning predictions.
Data Feedback Collection Documents | Refers to documents that are submitted by users to receive machine learning predictions, and to be used to retrain the service's machine learning models through the data feedback collection feature.
Documents Associated with Templates | Refers to documents that are submitted by users and associated with templates to extract information from other similar business documents.
Enrichment Data | Refers to enrichment data records, for example, supplier name and supplier address. The service matches your existing structured data (typically master data records) with the information extracted from documents.
Document Information Extraction does not persist any sensitive personal data. For this reason, it does not log
read access to sensitive personal data.
Information Report
The data from inference documents and data feedback collection documents used by Document Information
Extraction is controlled and managed by the consuming application which calls the Document Information
Extraction APIs. Document Information Extraction does not create or modify inference or retraining data
provided by the consuming application. Therefore it is not possible for Document Information Extraction to
provide a retrieval function to identify data of specific individuals.
It is recommended that the consuming application which uses Document Information Extraction provides
personal data reports to its users and transfers to Document Information Extraction for processing. After every
change of the data in the customer system, customers should call the Create Enrichment Data [page 167]
endpoint.
The following table shows the retention period and deletion details for all data types required by the
Document Information Extraction service.
Deletion of personal data is logged using audit logging services. For more information, see Audit Logging in the
Cloud Foundry Environment.
Inference Documents | The default retention period for inference data documents is 7 days. You can also use the documentRetentionTimeDays key with Create Configuration [page 115] to customize the retention period for inference documents uploaded to the service, from 1 to 30 days. You can delete inference data using the Delete Document [page 165] endpoint at any time, even before the retention period expires.
Data Feedback Collection Documents | There is no default retention period for retraining data documents. You can delete all retraining data using the Create Configuration [page 115] and Delete Configuration [page 124] endpoints at any time. You can also individually delete documents previously submitted for retraining using the Delete Document [page 165] endpoint at any time. If the performPIICheck subconfiguration is set to true, the service automatically scans all submitted documents and tries to exclude all documents where Personally Identifiable Information (PII) data is detected from being used for retraining and improving the service. It is the customer's responsibility to ensure that no personal data is submitted when using the data feedback collection feature.
Documents Associated with Templates | The documents uploaded to the document feature and associated with templates are not deleted automatically. To minimize the processing of personal data, do not use sample documents that contain personal data.
Enrichment Data | Enrichment data containing personal data is deleted automatically when customers delete the service instances. You also control the enrichment data retention period using the Delete Enrichment Data (Synchronous) - Deprecated [page 181] and Delete Enrichment Data (Asynchronous) [page 182] endpoints to delete enrichment data records at any point in time.
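The retention rule for inference documents (default of 7 days, configurable from 1 to 30 days) can be checked on the client side before a configuration request is sent. The following Python sketch is illustrative only: the function name retention_setting is hypothetical, and whether the service expects the value as a number or a string, and how the key is wrapped in the request body, are assumptions to verify against Create Configuration [page 115].

```python
DEFAULT_RETENTION_DAYS = 7   # default stated in the table above
MIN_RETENTION_DAYS = 1       # lower bound stated in the table above
MAX_RETENTION_DAYS = 30      # upper bound stated in the table above


def retention_setting(days: int = DEFAULT_RETENTION_DAYS) -> dict[str, int]:
    """Return the documentRetentionTimeDays entry after checking the documented range."""
    if isinstance(days, bool) or not isinstance(days, int):
        raise TypeError("days must be an integer")
    if not MIN_RETENTION_DAYS <= days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"days must be between {MIN_RETENTION_DAYS} and {MAX_RETENTION_DAYS}, got {days}"
        )
    # Assumption: the value is sent as an integer; check the API reference.
    return {"documentRetentionTimeDays": days}


setting = retention_setting(14)
```

Validating locally avoids a round trip for values that the documented range already excludes.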
Change Log
The application does not perform any update of enrichment data automatically. Any update of enrichment data
per customer request would be logged using audit logging services. For more information, see Audit Logging in
the Cloud Foundry Environment.
Consent
According to the Personal Data Processing Agreement for SAP Cloud Services, SAP acts as data processor. Thus,
customers are responsible for obtaining relevant consent to process personal data, including, when applicable,
approval by controllers to use SAP as a processor.
Here you can find a list of the security events that are logged by the Document Information Extraction service.
Authentication related events | Authentication success | Successful login attempt for tenant {tenant_id} on {instance_id} on {time} | See below the definitions of the notations used in the log events.
Deletion of dataset:{dataset_id} failed
The Document Information Extraction UI (User Interface) is a web application that supports the following
features:
To optimize your experience of Document Information Extraction, SAP Business Technology Platform (SAP
BTP) provides features and settings that help you use the software efficiently.
Note
Document Information Extraction runs on the SAP BTP cockpit. For this reason, the accessibility features
for SAP BTP cockpit apply. For more information, see the accessibility documentation for SAP BTP cockpit
on SAP Help Portal at Accessibility Features in SAP BTP Cockpit.
The Document Information Extraction UI is based on SAPUI5. It provides accessibility support in its tools
and customer documentation. For more information on keyboard handling for SAPUI5 UI elements and screen-
reader support for SAPUI5 controls, see Accessibility for End Users.
Find out how to get support, and explore solutions to potential issues.
If you encounter an issue with this service, we recommend that you follow the procedure below.
For more information about selected platform incidents, see Root Cause Analyses.
20.2 Troubleshooting
This section describes possible reasons for the following potential issues with Document Information Extraction:
If you are getting a 4** status code for your request (such as 400, 401, or 422), make sure that you
are submitting the request correctly. In most cases, the problem can be fixed in the request. Perhaps the
authentication information is missing or the request is using the wrong HTTP method (GET, POST, DELETE). Or
maybe the payload is invalid.
Possible reasons:
A 400 error means that the request is malformed. This can be because of one of the following reasons:
• The request does not have the correct Content-Type header (usually application/json)
• The request payload is not valid JSON
• The request payload does not contain some of the required fields and files
• The authorization token was not included in the headers. The error message will be "Authorization
token was not found in headers". The header should look like Authorization: Bearer
eyJhbGc....
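Several of the causes listed above can be caught before the request leaves the client. The following Python sketch is an illustration under stated assumptions, not service code: the function name preflight_problems is hypothetical, and it covers only the Content-Type header, JSON validity, and the presence of a Bearer token. It does not check required fields or files, because those depend on the endpoint.

```python
import json


def preflight_problems(
    headers: dict[str, str],
    body: str,
    expected_content_type: str = "application/json",
) -> list[str]:
    """Return a list of likely causes of a 400 response; an empty list means none found."""
    normalized = {name.lower(): value for name, value in headers.items()}
    problems: list[str] = []

    # Compare only the media type, ignoring parameters such as "; charset=utf-8".
    media_type = normalized.get("content-type", "").split(";")[0].strip().lower()
    if media_type != expected_content_type:
        problems.append(f"Content-Type is {media_type or 'missing'}, expected {expected_content_type}")

    authorization = normalized.get("authorization", "")
    if not authorization:
        problems.append("Authorization token was not found in headers")
    elif not authorization.startswith("Bearer ") or not authorization[len("Bearer "):].strip():
        problems.append("Authorization header should look like 'Bearer <token>'")

    try:
        json.loads(body)
    except json.JSONDecodeError as error:
        problems.append(f"Payload is not valid JSON: {error.msg}")

    return problems


issues = preflight_problems({"Content-Type": "text/plain"}, '{"value": [')
```

For endpoints that do not take a JSON body, pass a different expected_content_type and skip or adapt the JSON check.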
A 401 error means that you did not supply correct authentication information. This can be because of one of
the following reasons:
Possible reasons:
A 413 status indicates that the request you are making is too large. Either you are sending a file that is too large
or trying to process too many objects in a single request.
You get a 415 status code when you use the wrong content type or file format. See Supported Document Types
and File Formats [page 84].
Possible reasons:
You get a 422 status code when your request payload references a clientId, senderId, or documentId that does
not exist. For example, you will get this error if you try to create a document for a client that does not exist.
You may also get this error if the document you upload cannot be parsed.
You get a 429 status code when you have reached the rate limit for this user. You have made too many
requests.
You get a 500 status code for your request due to a server error and not an issue with the request. A 500 error
is usually an error in the Document Information Extraction application code. To report 500 errors, create an
incident on the component CA-ML-BDP, as described in Getting Support [page 295].
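The status codes described in this section can be summarized in a small lookup that a client might use when logging failed requests. The following Python sketch is only one possible arrangement: the names STATUS_HINTS and triage are hypothetical, and the hint texts paraphrase the explanations above rather than reproduce messages returned by the service.

```python
# Hints paraphrased from the troubleshooting descriptions above.
STATUS_HINTS = {
    400: "Malformed request: check the Content-Type header, the JSON payload, "
         "required fields and files, and the Authorization header.",
    401: "Authentication information is missing or incorrect.",
    413: "Request too large: the file is too large or the request processes too many objects.",
    415: "Wrong content type or file format: check the supported document types and file formats.",
    422: "The payload references a clientId, senderId, or documentId that does not exist, "
         "or the uploaded document cannot be parsed.",
    429: "Rate limit reached: too many requests were made for this user.",
    500: "Server error, not a problem with the request: create an incident on component CA-ML-BDP.",
}


def triage(status_code: int) -> str:
    """Map an HTTP status code to the most likely cause described in this section."""
    if status_code in STATUS_HINTS:
        return STATUS_HINTS[status_code]
    if 400 <= status_code < 500:
        return "Client error: check the authentication information, HTTP method, and payload."
    if status_code >= 500:
        return STATUS_HINTS[500]
    return "Not an error status."


message = triage(422)
```

Treating every other 5xx code like a 500 is an assumption of this sketch; the section itself describes only the 500 status code.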
Hyperlinks
Some links are classified by an icon and/or a mouseover text. These links provide additional information.
About the icons:
• Links with the icon : You are entering a Web site that is not hosted by SAP. By using such links, you agree (unless expressly stated otherwise in your
agreements with SAP) to this:
• The content of the linked-to site is not SAP documentation. You may not infer any product claims against SAP based on this information.
• SAP does not agree or disagree with the content on the linked-to site, nor does SAP warrant the availability and correctness. SAP shall not be liable for any
damages caused by the use of such content unless damages have been caused by SAP's gross negligence or willful misconduct.
• Links with the icon : You are leaving the documentation for that particular SAP product or service and are entering an SAP-hosted Web site. By using
such links, you agree that (unless expressly stated otherwise in your agreements with SAP) you may not infer any product claims against SAP based on this
information.
Example Code
Any software coding and/or code snippets are examples. They are not for productive use. The example code is only intended to better explain and visualize the syntax
and phrasing rules. SAP does not warrant the correctness and completeness of the example code. SAP shall not be liable for errors or damages caused by the use of
example code unless damages have been caused by SAP's gross negligence or willful misconduct.
Bias-Free Language
SAP supports a culture of diversity and inclusion. Whenever possible, we use unbiased language in our documentation to refer to people of all cultures, ethnicities,
genders, and abilities.
SAP and other SAP products and services mentioned herein as well as
their respective logos are trademarks or registered trademarks of SAP
SE (or an SAP affiliate company) in Germany and other countries. All
other product and service names mentioned are the trademarks of their
respective companies.