Build an Invoice Extraction API with FastAPI and Python

Build a FastAPI invoice extraction endpoint with the Python SDK. Covers file uploads, Pydantic response models, async batch processing, and deployment.


A FastAPI invoice extraction API is an HTTP endpoint that accepts uploaded invoices (PDF or image), passes them to a managed extraction SDK, and returns structured invoice data as a typed JSON response. The Invoice Data Extraction Python SDK reduces this to a single extract() call: your endpoint receives the file, the SDK handles all OCR, document interpretation, and field extraction, and your FastAPI route returns the result as a Pydantic model containing the invoice number, date, vendor details, line items, and totals. No Tesseract pipeline, no LLM prompt chains, no post-processing glue code.

FastAPI is well suited to this architecture. According to the Python Developers Survey 2025, FastAPI was the biggest winner among Python web frameworks, jumping from 29% to 38% adoption, a 30% year-over-year increase. That growth reflects what backend developers already know: FastAPI's async-first design and built-in Pydantic validation make it a natural fit for file-processing APIs.

This tutorial builds a web service, not a standalone script. If you need to extract invoice data from a local directory of files without an HTTP layer, see the guide on extracting invoice data with standalone Python scripts. The approach here covers everything a web service requires: UploadFile handling for multipart form data, Pydantic response schemas for typed output, dependency injection for SDK client management, and async endpoint patterns for batch workloads.

By the end of this tutorial, you will have a production-ready FastAPI endpoint that accepts PDF and image uploads, extracts structured invoice data through the SDK, and returns typed JSON responses. You will also implement async batch processing for high-volume workloads and structured error handling for production deployment.


Setting Up FastAPI with the Extraction SDK

A FastAPI project for invoice extraction needs only four dependencies. Install them into your virtual environment:

pip install fastapi uvicorn python-multipart invoicedataextraction-sdk

fastapi provides the framework and uvicorn the ASGI server that runs it. python-multipart is required for FastAPI's UploadFile handling (without it, FastAPI raises an error for any endpoint that declares form data). invoicedataextraction-sdk is the Python SDK that wraps the same extraction engine available through the Invoice Data Extraction API, giving you document parsing, field extraction, and structured output through a single method call.

For this tutorial, a single-file structure keeps things focused:

invoice-api/
├── main.py
├── .env
└── requirements.txt

In a production microservice, you would split route handlers, Pydantic models, and dependency providers into separate modules. For now, everything lives in main.py.

SDK Client Initialization

The SDK client needs an API key, which you generate and manage from your account dashboard. Store it as an environment variable and never hardcode it:

import os
from fastapi import FastAPI, Depends
from invoicedataextraction import InvoiceDataExtraction

app = FastAPI()

def get_extraction_client() -> InvoiceDataExtraction:
    return InvoiceDataExtraction(
        api_key=os.environ.get("INVOICE_DATA_EXTRACTION_API_KEY"),
    )

This get_extraction_client function is a FastAPI dependency. Inject it into any route handler with Depends(), and every endpoint shares the same client configuration without importing or instantiating it directly:

@app.post("/extract")
async def extract_invoice(
    client: InvoiceDataExtraction = Depends(get_extraction_client),
):
    # client is ready to use
    ...

This pattern gives you a single place to change configuration (swap API keys per environment, point to a staging URL) and makes testing straightforward. In your test suite, override the dependency to return a mock client instead of hitting the live API.

Why a Managed SDK Instead of DIY OCR

If you have spent any time comparing Python OCR libraries for invoice processing, you know what a DIY invoice extraction pipeline looks like: Tesseract or PaddleOCR to convert scans to raw text, custom parsing logic to locate invoice numbers and totals in varying layouts across vendors, table extraction to handle line items that span page breaks, and eventually an LLM layer to handle the edge cases that rule-based parsing misses. Each layer adds dependencies, failure modes, and code you have to own.

A managed extraction SDK collapses that entire stack into one function call. The SDK handles document interpretation, field extraction, and data structuring on the server side, including multi-language invoices and low-quality scans. Your FastAPI service sends files and receives structured JSON, with no model weights to host, no prompt engineering to maintain, and no OCR engine versions to track. Your endpoint code stays focused on request handling and business logic rather than extraction internals.

Developers who prefer working with raw HTTP requests can call the REST API directly instead of using the SDK wrapper. The SDK is a convenience layer over the same endpoints.


From File Upload to Structured Invoice Data

The endpoint you are building has three jobs: accept an uploaded invoice file, run it through the extraction SDK, and return typed invoice data that FastAPI can document automatically. Start with the data models that define what "structured invoice data" looks like in your API.

Defining Pydantic Response Models

Pydantic models serve double duty here. They validate the extraction output at runtime and generate the OpenAPI schema that FastAPI exposes at /docs. Define a model hierarchy that mirrors real invoice structure:

from pydantic import BaseModel
from typing import Optional

class InvoiceLineItem(BaseModel):
    description: str
    quantity: float
    unit_price: float
    line_total: float

class InvoiceData(BaseModel):
    invoice_number: str
    date: str
    vendor_name: str
    subtotal: Optional[float] = None
    tax: Optional[float] = None
    total: float
    currency: Optional[str] = None
    line_items: list[InvoiceLineItem] = []

class ExtractionResponse(BaseModel):
    success: bool
    extraction_id: str
    data: InvoiceData

InvoiceLineItem captures each product or service row. InvoiceData holds the header-level fields plus a list of line items. ExtractionResponse wraps everything with a success flag and a unique extraction ID for traceability. Optional fields like currency and subtotal account for invoices that omit them — the SDK returns all fields it can extract, and missing or unreadable values come back as null. This is typed Python, not a raw dictionary or an unstructured text dump from an OCR pipeline. Downstream API consumers get a guaranteed contract: every response has an invoice_number string and a total float, and the OpenAPI spec FastAPI generates from these models serves as live documentation.
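To see that contract enforcement in action, you can instantiate the models directly. This sketch repeats the model definitions above so it runs standalone:

```python
from pydantic import BaseModel, ValidationError
from typing import Optional

class InvoiceLineItem(BaseModel):
    description: str
    quantity: float
    unit_price: float
    line_total: float

class InvoiceData(BaseModel):
    invoice_number: str
    date: str
    vendor_name: str
    subtotal: Optional[float] = None
    tax: Optional[float] = None
    total: float
    currency: Optional[str] = None
    line_items: list[InvoiceLineItem] = []

# A payload with optional fields omitted validates cleanly...
inv = InvoiceData(
    invoice_number="INV-2026-0042",
    date="2026-03-15",
    vendor_name="Acme Office Supplies Ltd",
    total=540.0,
)

# ...but omitting a required field is rejected at construction time
caught = False
try:
    InvoiceData(invoice_number="INV-1", date="2026-01-01", vendor_name="X")
except ValidationError:
    caught = True  # the error names the missing 'total' field
```

The same validation runs automatically on every response FastAPI serializes, so a malformed extraction result fails loudly on the server instead of silently reaching a client.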

Building the Extraction Endpoint

The POST route accepts a file via FastAPI's UploadFile, which gives you the file object, its filename, and the declared content type in a single parameter. Before touching the SDK, validate that the upload is a format the extraction engine supports:

import os
import json
import tempfile
from fastapi import FastAPI, UploadFile, HTTPException
from invoicedataextraction import InvoiceDataExtraction

app = FastAPI(title="Invoice Extraction API")

client = InvoiceDataExtraction(
    api_key=os.environ.get("INVOICE_DATA_EXTRACTION_API_KEY"),
)

ALLOWED_TYPES = {
    "application/pdf",
    "image/jpeg",
    "image/png",
}

@app.post("/extract", response_model=ExtractionResponse)
async def extract_invoice(file: UploadFile):
    if file.content_type not in ALLOWED_TYPES:
        raise HTTPException(
            status_code=422,
            detail=f"Unsupported file type: {file.content_type}. "
                   f"Accepted formats: PDF, JPG, PNG.",
        )

The 422 response follows HTTP semantics for unprocessable content (415 Unsupported Media Type is a defensible alternative for format rejections). Clients sending a Word document or a TIFF get an immediate, descriptive rejection before any processing begins.

Calling the SDK

With the file validated, save it to a temporary location and pass the path to the SDK's extract() method. The method's prompt parameter accepts a plain string (up to 2,500 characters) describing what to extract in natural language, or a dict with structured field definitions where each field has a name and an optional prompt for field-specific instructions. The string form keeps the code readable for a single-endpoint service:

    with tempfile.NamedTemporaryFile(
        delete=False, suffix=os.path.splitext(file.filename)[1]
    ) as tmp:
        tmp.write(await file.read())
        tmp_path = tmp.name

    try:
        result = client.extract(
            files=[tmp_path],
            prompt="Extract invoice number, date, vendor name, line items with description quantity unit price and line total, subtotal, tax, total, and currency",
            output_structure="per_invoice",
            download={
                "formats": ["json"],
                "output_path": "./output",
            },
        )
    finally:
        os.unlink(tmp_path)

The output_structure parameter controls extraction granularity. Setting it to "per_invoice" produces one result object per invoice in the file. If you needed row-level detail instead (one object per product line across all invoices), you would switch to "per_line_item". The SDK handles uploading the file to the extraction engine, polling for completion, and downloading the JSON result to the output path you specify.
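The difference in granularity is easiest to see with sample data. The sketch below is purely illustrative: the field names mirror this tutorial's models, not a guaranteed SDK output shape.

```python
# One "per_invoice" result object holding nested line items...
per_invoice = {
    "invoice_number": "INV-2026-0042",
    "vendor_name": "Acme Office Supplies Ltd",
    "line_items": [
        {"description": "Printer Paper A4 (5 reams)", "line_total": 60.0},
        {"description": "Ergonomic Desk Chair", "line_total": 390.0},
    ],
}

# ...versus the flat rows "per_line_item" would give you instead:
# each row repeats the header fields alongside one product line.
per_line_item = [
    {**{k: v for k, v in per_invoice.items() if k != "line_items"}, **item}
    for item in per_invoice["line_items"]
]
```

Row-level output is convenient when the destination is a spreadsheet or a database table, since no further flattening is needed downstream.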

Parsing the Result Into Your Response Model

The SDK returns a result object with the extraction status, ID, and output URLs. When you request JSON format via the download parameter, the SDK writes the structured extraction data to your output directory. Read that file and map it onto your Pydantic models:

    if not result.get("success"):
        raise HTTPException(status_code=502, detail="Extraction failed.")

    output_dir = "./output"
    # os.listdir order is arbitrary; sort by mtime so [-1] is the newest result
    json_files = sorted(
        (f for f in os.listdir(output_dir) if f.endswith(".json")),
        key=lambda f: os.path.getmtime(os.path.join(output_dir, f)),
    )
    if not json_files:
        raise HTTPException(status_code=502, detail="No extraction output.")

    with open(os.path.join(output_dir, json_files[-1])) as f:
        extracted = json.load(f)

    invoice_raw = extracted[0] if isinstance(extracted, list) else extracted

    line_items = [
        InvoiceLineItem(**item)
        for item in invoice_raw.get("line_items", [])
    ]

    invoice_data = InvoiceData(
        invoice_number=invoice_raw.get("invoice_number", ""),
        date=invoice_raw.get("date", ""),
        vendor_name=invoice_raw.get("vendor_name", ""),
        subtotal=invoice_raw.get("subtotal"),
        tax=invoice_raw.get("tax"),
        total=invoice_raw.get("total", 0),
        currency=invoice_raw.get("currency"),
        line_items=line_items,
    )

    return ExtractionResponse(
        success=True,
        extraction_id=result.get("extraction_id", ""),
        data=invoice_data,
    )

Because ExtractionResponse is declared as the response_model on the route, FastAPI validates the return value against the schema before serializing it to JSON. Any missing required fields or type mismatches raise an error rather than sending malformed data to the client. A successful extraction returns a response like this:

{
  "success": true,
  "extraction_id": "ext_8f2a1b3c",
  "data": {
    "invoice_number": "INV-2026-0042",
    "date": "2026-03-15",
    "vendor_name": "Acme Office Supplies Ltd",
    "subtotal": 450.00,
    "tax": 90.00,
    "total": 540.00,
    "currency": "USD",
    "line_items": [
      {
        "description": "Printer Paper A4 (5 reams)",
        "quantity": 5,
        "unit_price": 12.00,
        "line_total": 60.00
      },
      {
        "description": "Ergonomic Desk Chair",
        "quantity": 1,
        "unit_price": 390.00,
        "line_total": 390.00
      }
    ]
  }
}

Complete Runnable Example

The full endpoint in a single file:

import os
import json
import tempfile
from fastapi import FastAPI, UploadFile, HTTPException
from pydantic import BaseModel
from typing import Optional
from invoicedataextraction import InvoiceDataExtraction

# --- Pydantic models ---

class InvoiceLineItem(BaseModel):
    description: str
    quantity: float
    unit_price: float
    line_total: float

class InvoiceData(BaseModel):
    invoice_number: str
    date: str
    vendor_name: str
    subtotal: Optional[float] = None
    tax: Optional[float] = None
    total: float
    currency: Optional[str] = None
    line_items: list[InvoiceLineItem] = []

class ExtractionResponse(BaseModel):
    success: bool
    extraction_id: str
    data: InvoiceData

# --- App and SDK client ---

app = FastAPI(title="Invoice Extraction API")

client = InvoiceDataExtraction(
    api_key=os.environ.get("INVOICE_DATA_EXTRACTION_API_KEY"),
)

ALLOWED_TYPES = {"application/pdf", "image/jpeg", "image/png"}

# --- Extraction endpoint ---

@app.post("/extract", response_model=ExtractionResponse)
async def extract_invoice(file: UploadFile):
    if file.content_type not in ALLOWED_TYPES:
        raise HTTPException(
            status_code=422,
            detail=f"Unsupported file type: {file.content_type}. "
                   f"Accepted formats: PDF, JPG, PNG.",
        )

    with tempfile.NamedTemporaryFile(
        delete=False, suffix=os.path.splitext(file.filename)[1]
    ) as tmp:
        tmp.write(await file.read())
        tmp_path = tmp.name

    try:
        result = client.extract(
            files=[tmp_path],
            prompt="Extract invoice number, date, vendor name, line items with description quantity unit price and line total, subtotal, tax, total, and currency",
            output_structure="per_invoice",
            download={
                "formats": ["json"],
                "output_path": "./output",
            },
        )
    finally:
        os.unlink(tmp_path)

    if not result.get("success"):
        raise HTTPException(status_code=502, detail="Extraction failed.")

    output_dir = "./output"
    # os.listdir order is arbitrary; sort by mtime so [-1] is the newest result
    json_files = sorted(
        (f for f in os.listdir(output_dir) if f.endswith(".json")),
        key=lambda f: os.path.getmtime(os.path.join(output_dir, f)),
    )
    if not json_files:
        raise HTTPException(status_code=502, detail="No extraction output.")

    with open(os.path.join(output_dir, json_files[-1])) as f:
        extracted = json.load(f)

    invoice_raw = extracted[0] if isinstance(extracted, list) else extracted

    line_items = [
        InvoiceLineItem(**item)
        for item in invoice_raw.get("line_items", [])
    ]

    invoice_data = InvoiceData(
        invoice_number=invoice_raw.get("invoice_number", ""),
        date=invoice_raw.get("date", ""),
        vendor_name=invoice_raw.get("vendor_name", ""),
        subtotal=invoice_raw.get("subtotal"),
        tax=invoice_raw.get("tax"),
        total=invoice_raw.get("total", 0),
        currency=invoice_raw.get("currency"),
        line_items=line_items,
    )

    return ExtractionResponse(
        success=True,
        extraction_id=result.get("extraction_id", ""),
        data=invoice_data,
    )

# --- Entry point ---

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)

Run it with python main.py or uvicorn main:app --reload during development. Once the server starts, open http://localhost:8000/docs in your browser. FastAPI generates interactive Swagger UI documentation directly from your Pydantic models. Every field in ExtractionResponse, InvoiceData, and InvoiceLineItem appears in the schema panel with its type and optionality. API consumers can test the file upload endpoint from that same page, inspect the exact response shape, and use the OpenAPI spec to generate client libraries in any language.
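You can also exercise the endpoint from the command line. Assuming the server is running locally and an invoice.pdf sits in your working directory:

```shell
# Multipart upload to the extraction endpoint; the type hint ensures
# the content_type check in the handler passes.
curl -X POST http://localhost:8000/extract \
  -F "file=@invoice.pdf;type=application/pdf"
```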


Async Batch Processing for Large Workloads

The single-file extraction endpoint works well for on-demand requests, but production workloads rarely arrive one invoice at a time. An accounts payable team uploads 200 supplier invoices at month-end. A document ingestion pipeline feeds in multi-page PDFs that take minutes to process. In both cases, a synchronous endpoint that blocks until extraction finishes will either time out or starve your API of worker threads.

The fix is a job-based pattern: accept the files, return a job ID immediately, and process the extraction in the background. The SDK's staged workflow methods are built for exactly this.

The Staged Workflow

Instead of the single extract() call, the SDK exposes individual methods that map directly onto a job-based processing flow:

  1. upload_files() sends documents to the extraction service and returns a session ID plus a list of file IDs.
  2. submit_extraction() kicks off the extraction job against those uploaded files and returns an extraction_id, which becomes your job reference.
  3. wait_for_extraction_to_finish() polls the service until the job reaches a terminal state. You configure the polling loop with an interval_ms (minimum 5,000 ms, default 10,000 ms) and an optional timeout_ms (defaults to None for no timeout).
  4. check_extraction() performs a single status check without entering a polling loop, returning the current state and a progress percentage for in-flight jobs.
  5. download_output() retrieves the finished results as JSON, CSV, or XLSX.

This separation gives you full control over each stage. In a FastAPI background task, wait_for_extraction_to_finish() handles the polling loop internally so your background worker can fire-and-forget.

Building the Batch Endpoint

The pattern pairs FastAPI's BackgroundTasks with the staged workflow. A POST endpoint accepts multiple files, validates them against the SDK's batch limits, and returns a 202 response with the extraction ID. A background task handles the waiting. A separate GET endpoint lets callers poll for results.

import os
import tempfile
from fastapi import FastAPI, UploadFile, BackgroundTasks, HTTPException
from fastapi.responses import JSONResponse
from invoicedataextraction import InvoiceDataExtraction

app = FastAPI()
client = InvoiceDataExtraction(
    api_key=os.environ.get("INVOICE_DATA_EXTRACTION_API_KEY"),
)

MAX_FILES = 6000
MAX_TOTAL_BYTES = 2 * 1024 * 1024 * 1024  # 2 GB
MAX_PDF_BYTES = 150 * 1024 * 1024          # 150 MB per PDF
MAX_IMAGE_BYTES = 5 * 1024 * 1024          # 5 MB per image


def validate_batch(files: list[UploadFile]) -> None:
    if len(files) > MAX_FILES:
        raise HTTPException(400, f"Batch exceeds {MAX_FILES} file limit")
    total_size = 0
    for f in files:
        f.file.seek(0, 2)
        size = f.file.tell()
        f.file.seek(0)
        total_size += size
        ext = f.filename.lower().rsplit(".", 1)[-1] if f.filename else ""
        if ext == "pdf" and size > MAX_PDF_BYTES:
            raise HTTPException(400, f"{f.filename} exceeds 150 MB PDF limit")
        if ext in ("jpg", "jpeg", "png") and size > MAX_IMAGE_BYTES:
            raise HTTPException(400, f"{f.filename} exceeds 5 MB image limit")
    if total_size > MAX_TOTAL_BYTES:
        raise HTTPException(400, "Total upload exceeds 2 GB limit")


@app.post("/extractions", status_code=202)
async def create_extraction(
    files: list[UploadFile],
    background_tasks: BackgroundTasks,
    prompt: str = "Extract invoice number, date, vendor name, line items, tax, and total",
):
    validate_batch(files)
    tmp_dir = tempfile.mkdtemp()
    file_paths = []
    for f in files:
        # basename guards against path traversal through a crafted filename
        path = os.path.join(tmp_dir, os.path.basename(f.filename or "upload"))
        with open(path, "wb") as out:
            out.write(await f.read())
        file_paths.append(path)

    upload = client.upload_files(files=file_paths)
    submission = client.submit_extraction(
        upload_session_id=upload["upload_session_id"],
        file_ids=upload["file_ids"],
        prompt=prompt,
        output_structure="per_invoice",
    )
    extraction_id = submission["extraction_id"]

    background_tasks.add_task(
        client.wait_for_extraction_to_finish,
        extraction_id=extraction_id,
        polling={"interval_ms": 10000},
    )

    return JSONResponse(
        status_code=202,
        content={"extraction_id": extraction_id, "status": "processing"},
    )

The status endpoint uses check_extraction() to return the current state without blocking:

@app.get("/extractions/{extraction_id}/status")
async def get_extraction_status(extraction_id: str):
    result = client.check_extraction(extraction_id=extraction_id)
    return result

When the caller sees a completed status, they can hit a download endpoint:

from fastapi.responses import FileResponse

@app.get("/extractions/{extraction_id}/download")
async def download_extraction(extraction_id: str, format: str = "json"):
    if format not in ("json", "csv", "xlsx"):
        raise HTTPException(
            status_code=422, detail="format must be json, csv, or xlsx"
        )
    output_path = f"./outputs/{extraction_id}.{format}"
    client.download_output(
        extraction_id=extraction_id,
        format=format,
        file_path=output_path,
    )
    return FileResponse(output_path, filename=f"extraction.{format}")

Polling Configuration

The polling parameter on wait_for_extraction_to_finish() accepts two fields:

  • interval_ms controls how frequently the SDK checks for completion. The minimum is 5,000 ms. For large batches, the default of 10,000 ms avoids unnecessary requests while still surfacing results promptly.
  • timeout_ms sets a ceiling on total wait time. When set to None (the default), the method polls indefinitely until the extraction completes or fails. If you set a timeout and the job exceeds it, the SDK raises an error, but the extraction continues server-side. You can pick it up later with check_extraction().

For background tasks where no HTTP connection is waiting, leaving timeout_ms at None is typically the right choice. The background worker simply waits until the job finishes.
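If you do set a timeout, the resume pattern is a loop around single status checks. The helper below is a generic sketch: check_fn stands in for a call to client.check_extraction(), and the "completed"/"failed" status strings are assumptions about the service's terminal states, not documented values.

```python
import time

def poll_until_done(check_fn, interval_s=10.0, max_wait_s=None):
    """Poll check_fn() until it reports a terminal state.

    check_fn: zero-arg callable returning a dict with a 'status' key
              (a stand-in for client.check_extraction(extraction_id=...)).
    Returns the final status dict, or the last in-flight one if
    max_wait_s elapses first -- the job keeps running server-side.
    """
    deadline = None if max_wait_s is None else time.monotonic() + max_wait_s
    while True:
        result = check_fn()
        if result.get("status") in ("completed", "failed"):
            return result
        if deadline is not None and time.monotonic() >= deadline:
            return result  # local timeout; resume later with another check
        time.sleep(interval_s)

# Simulated job that finishes on the third status check
states = iter(["processing", "processing", "completed"])
final = poll_until_done(lambda: {"status": next(states)}, interval_s=0.01)
```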

Batch Limits and Credit Consumption

The validation logic above mirrors the SDK's actual constraints: up to 6,000 files per upload session, a 2 GB total upload size, individual PDFs up to 150 MB, and images up to 5 MB each. Enforcing these at the endpoint level gives callers clear error messages instead of opaque SDK failures.

On the cost side, the SDK and web platform share the same credit pool with no separate API subscription fees. Each successfully processed page consumes 1 credit, whether it arrives through your FastAPI endpoint or the web interface. Every account receives 50 free pages per month, and additional credits are available pay-as-you-go. If you are building a batch service that processes hundreds or thousands of pages, factor per-page credit consumption into your capacity planning and pass costs through to your users accordingly.
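For capacity planning, the per-page math is simple enough to put in a helper. The 1-credit-per-page rate and 50-page free allowance come from the paragraph above; the function itself is just illustrative arithmetic:

```python
def billable_credits(pages_per_month: int, free_pages: int = 50) -> int:
    """Credits consumed per month after the free allowance (1 credit/page)."""
    return max(0, pages_per_month - free_pages)

# 200 supplier invoices averaging 2 pages each = 400 pages/month
monthly_pages = 200 * 2
cost = billable_credits(monthly_pages)  # 350 credits after the free 50
```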

For developers who want to understand the HTTP API that the SDK wraps underneath these method calls, the invoice extraction API developer quickstart covers the raw endpoints and authentication flow.


Error Handling and Production Deployment

A working endpoint is not a production-ready endpoint. The difference comes down to how your service handles failures, validates input before burning credits, and scales under real traffic. This section covers the patterns specific to running an extraction-heavy FastAPI service.

Structured SDK Error Handling

The Python SDK exposes two exception classes you need to catch: SdkError for client-side failures (file system issues, network timeouts, upload orchestration) and ApiResponseError for server-side rejections (invalid files, extraction failures). Import both from the SDK's errors module:

from invoicedataextraction.errors import SdkError, ApiResponseError

Both exceptions attach a structured body with a consistent shape:

{
  "success": false,
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description",
    "retryable": true,
    "details": null
  }
}

The retryable flag is the key field for your error-handling logic. When it is true, the failure is transient (a network blip, a temporary service issue) and the caller should retry. When false, the request itself is invalid and retrying will produce the same result. Map this directly to your HTTP responses:

from fastapi import HTTPException
from invoicedataextraction.errors import SdkError, ApiResponseError

async def handle_extraction(client, file_path: str, prompt: str):
    try:
        result = client.extract(
            files=[file_path],
            prompt=prompt,
            output_structure="per_invoice",
        )
        return result
    except ApiResponseError as e:
        error_body = e.body["error"]
        if error_body["retryable"]:
            raise HTTPException(
                status_code=503,
                detail={
                    "error": error_body["code"],
                    "message": error_body["message"],
                    "retry": True,
                },
            )
        raise HTTPException(
            status_code=422,
            detail={
                "error": error_body["code"],
                "message": error_body["message"],
                "retry": False,
            },
        )
    except SdkError as e:
        error_body = e.body["error"]
        code = error_body["code"]
        if code == "SDK_NETWORK_ERROR":
            raise HTTPException(
                status_code=502,
                detail="Extraction service unreachable. Retry shortly.",
            )
        if code == "SDK_UPLOAD_ERROR":
            raise HTTPException(
                status_code=502,
                detail="File upload to extraction service failed.",
            )
        raise HTTPException(
            status_code=500,
            detail=f"Extraction failed: {error_body['message']}",
        )

This gives callers actionable information: a 503 with a retry flag means "back off and try again," a 422 means "fix your request," and a 502 means the upstream extraction service had a transient issue.
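On the caller's side, the retry flag maps naturally onto a backoff loop. This sketch uses an injected request function rather than a real HTTP client, so the names and the simulated responses are illustrative:

```python
import time

def call_with_retry(request_fn, max_attempts=3, base_delay_s=1.0):
    """Retry request_fn() only when the API marks the failure retryable.

    request_fn: zero-arg callable returning (status_code, body_dict),
                standing in for an HTTP POST to the /extract endpoint.
    """
    for attempt in range(max_attempts):
        status, body = request_fn()
        if status < 400:
            return body
        detail = body.get("detail")
        retryable = isinstance(detail, dict) and detail.get("retry")
        if not retryable or attempt == max_attempts - 1:
            raise RuntimeError(f"extraction failed with HTTP {status}: {body}")
        time.sleep(base_delay_s * (2 ** attempt))  # exponential backoff

# Simulated service: one transient 503, then success
responses = iter([
    (503, {"detail": {"error": "TEMPORARY", "retry": True}}),
    (200, {"success": True, "extraction_id": "ext_1"}),
])
result = call_with_retry(lambda: next(responses), base_delay_s=0.01)
```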

Input Validation Before Extraction

Validate files before they reach the SDK. This avoids wasting time on uploads that will be rejected and gives callers immediate feedback with proper HTTP 422 responses.

from fastapi import UploadFile, HTTPException

PDF_MAX_SIZE = 150 * 1024 * 1024      # 150 MB
IMAGE_MAX_SIZE = 5 * 1024 * 1024       # 5 MB
BATCH_MAX_SIZE = 2 * 1024 * 1024 * 1024  # 2 GB
ALLOWED_TYPES = {
    "application/pdf": PDF_MAX_SIZE,
    "image/jpeg": IMAGE_MAX_SIZE,
    "image/png": IMAGE_MAX_SIZE,
}

async def validate_upload(file: UploadFile):
    if file.content_type not in ALLOWED_TYPES:
        raise HTTPException(
            status_code=422,
            detail=f"Unsupported file type: {file.content_type}. "
                   f"Accepted: PDF, JPEG, PNG.",
        )
    contents = await file.read()
    await file.seek(0)
    if len(contents) == 0:
        raise HTTPException(status_code=422, detail="Empty file.")
    max_size = ALLOWED_TYPES[file.content_type]
    if len(contents) > max_size:
        limit_mb = max_size // (1024 * 1024)
        raise HTTPException(
            status_code=422,
            detail=f"File exceeds {limit_mb} MB limit for "
                   f"{file.content_type}.",
        )
    return contents

For batch endpoints, accumulate the total size across all files in the request and reject the batch if it exceeds the 2 GB limit. Doing this at the FastAPI layer means invalid requests never leave your server.

Rate Limiting Awareness

The extraction API enforces per-key rate limits: 600 upload requests/min, 30 submit requests/min, 120 poll requests/min, and 30 download requests/min. For a single-user tool, these limits are generous. For a high-throughput service handling concurrent users, the submit limit of 30 per minute is the bottleneck you will hit first.

For high-throughput services, queue or throttle extraction requests to stay within limits, either through an async queue (asyncio.Queue, Celery) or timestamp-based tracking per endpoint category. If you do exceed a rate limit, the SDK raises an ApiResponseError with retryable set to true, so the error handling above already covers it.
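A minimal in-process throttle for the submit budget can be a sliding-window counter. This is a sketch for a single-worker deployment; with multiple Uvicorn workers you would need shared state (Redis, for example) instead:

```python
import asyncio
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most max_calls per window_s seconds (single process only)."""

    def __init__(self, max_calls: int, window_s: float):
        self.max_calls = max_calls
        self.window_s = window_s
        self.calls: deque = deque()
        self.lock = asyncio.Lock()

    async def acquire(self) -> None:
        # Holding the lock while sleeping serializes waiters, which is
        # acceptable here since the goal is to pace submits anyway.
        async with self.lock:
            now = time.monotonic()
            # Drop timestamps that have left the window
            while self.calls and now - self.calls[0] >= self.window_s:
                self.calls.popleft()
            if len(self.calls) >= self.max_calls:
                await asyncio.sleep(self.window_s - (now - self.calls[0]))
            self.calls.append(time.monotonic())

# 30 submit requests per minute, matching the documented limit
submit_limiter = SlidingWindowLimiter(max_calls=30, window_s=60.0)

async def throttled_submit(submit_fn):
    await submit_limiter.acquire()
    return submit_fn()
```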

Deploying with Uvicorn

Invoice extraction endpoints are I/O-bound. Your FastAPI handlers spend most of their time waiting on HTTP calls to the extraction API, not doing CPU work. This means async handlers paired with multiple Uvicorn workers give you the best throughput:

uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4

Four workers can handle four concurrent extraction requests that are each awaiting SDK responses. Adjust the count based on your expected concurrency and the rate limits above. On Linux, Gunicorn with Uvicorn workers provides process management (automatic restarts, graceful shutdown):

gunicorn app.main:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000

Once deployed, verify the endpoint at /docs, where FastAPI's auto-generated Swagger UI lets you upload files, trigger extractions, and inspect responses. This same interface serves as live documentation for other developers consuming your API.

About the author

David Harding

Founder, Invoice Data Extraction

David Harding is the founder of Invoice Data Extraction and a software developer with experience building finance-related systems. He oversees the product and the site's editorial process, with a focus on practical invoice workflows, document automation, and software-specific processing guidance.

Editorial process

This page is reviewed as part of Invoice Data Extraction's editorial process.

If this page discusses tax, legal, or regulatory requirements, treat it as general information only and confirm current requirements with official guidance before acting. The updated date shown above is the latest editorial review date for this page.
