Hello Erik Åkerberg,
I understand you are looking to optimize latency for your model analysis.
First, to set a clear expectation: a processing time of 6-7 seconds is within the normal and expected range for the prebuilt-receipt model. And to answer your first question directly: No, there are no parameters to make the prebuilt-receipt model "lighter" or run a partial analysis. However, you can absolutely optimize the total time of the operation.
Here are the recommended strategies and workarounds to try:
1. Use the Asynchronous Pattern (Recommended)
Instead of making one long, blocking call and waiting 7 seconds for the result, you should:
- POST your request to the `.../analyze` endpoint. The service will respond immediately (in milliseconds) with a `202 Accepted` status and an `Operation-Location` URL in the header.
- Your application is now free to do other work or show a "processing" UI to the user.
- GET (poll) the `Operation-Location` URL. It will return a `running` status. You can check this every 1-2 seconds.
- When the analysis is done (after 6-7 seconds), the poll will return a `succeeded` status, and the response body will contain your full JSON result.
This is the most critical optimization you can make. It won't make the 6-7 second analysis itself faster, but it will make your application feel instantaneous to the user.
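The POST-then-poll flow above can be sketched as follows. This is a minimal illustration using only the Python standard library; the `ENDPOINT`, `KEY`, and `API_VERSION` values are placeholders you must replace, and you should verify the exact URL path and current GA API version against the Document Intelligence REST documentation.

```python
import json
import time
import urllib.request

# Placeholder values -- substitute your own resource endpoint and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<your-key>"
API_VERSION = "2023-07-31"  # example GA version; check the docs for the latest

def begin_analyze(image_bytes: bytes) -> str:
    """POST the receipt; the service answers 202 Accepted almost immediately
    and hands back an Operation-Location URL to poll."""
    url = (f"{ENDPOINT}/formrecognizer/documentModels/"
           f"prebuilt-receipt:analyze?api-version={API_VERSION}")
    req = urllib.request.Request(
        url, data=image_bytes, method="POST",
        headers={"Ocp-Apim-Subscription-Key": KEY,
                 "Content-Type": "application/octet-stream"})
    with urllib.request.urlopen(req) as resp:  # returns in milliseconds
        return resp.headers["Operation-Location"]

def get_result(operation_url: str) -> dict:
    """GET the Operation-Location URL and decode the JSON status body."""
    req = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": KEY})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def poll(get_status, interval=1.5, timeout=60.0):
    """Generic polling loop: call get_status() every `interval` seconds
    until it reports a terminal state ("succeeded" or "failed")."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        body = get_status()
        if body.get("status") in ("succeeded", "failed"):
            return body
        time.sleep(interval)
    raise TimeoutError("analysis did not complete in time")

# Usage (requires a live resource):
# op_url = begin_analyze(open("receipt.jpg", "rb").read())
# result = poll(lambda: get_result(op_url))
```

In practice you would run the polling on a background task so your UI thread stays responsive; the official Azure SDKs wrap this same begin/poll pattern in their `begin_analyze_document` long-running-operation helpers.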
2. Optimize Image Quality and Format
This directly addresses your pre-processing idea and can shave time off the analysis.
- Resolution: The model needs clear text, but it doesn't need a massive 10MB, 4K-resolution image. A very large file takes longer to upload and for the service to decode. The sweet spot is 150-300 DPI. Resizing images before sending is a good practice.
- File Format: Use a compressed format like JPEG (for photos) or PNG. Avoid sending uncompressed formats like BMP or large TIFFs, as the upload and ingestion will be slower.
- Clarity: While the model is robust, a blurry, crumpled, or poorly lit receipt will take the model longer to analyze. Pre-processing that straightens or improves the contrast of an image can help.
3. Co-locate Your Azure Resources
Ensure that the resource making the API call (e.g., your App Service, Azure Function, or VM) is deployed in the same Azure region as your Document Intelligence resource (e.g., both in East US 2). This minimizes the network round-trip time, which can easily add 1-2 seconds to every request if your resources are on opposite sides of the world.
4. Use the Latest GA Model Version
Always specify the latest stable, generally available (GA) model version in your API call. Newer models are not only more accurate but are often optimized for better performance.
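To make the version pinning explicit rather than relying on a default, you can build the request URL with the model ID and API version spelled out. The version string below is an example only; check the service documentation for the current GA release.

```python
def analyze_url(endpoint: str,
                model_id: str = "prebuilt-receipt",
                api_version: str = "2023-07-31") -> str:
    """Build the analyze URL with an explicitly pinned api-version.
    The default version string here is an example -- verify the
    current GA version in the Document Intelligence docs."""
    return (f"{endpoint}/formrecognizer/documentModels/"
            f"{model_id}:analyze?api-version={api_version}")
```

Pinning the version also protects you from behavior changes when the service's default version rolls forward.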
If this answer was helpful, please accept it and upvote it for visibility to other community members.