Extraction Guide
Upload your documents, describe what you need, and download structured Excel data. It only takes a few clicks.
Quick Start
Upload files
PDFs or images containing the data you want to extract.
Describe what to extract
Optionally provide a prompt in your own words — or let the AI decide.
Download your spreadsheet
Your extracted data, structured in a downloadable Excel file.
Uploads
Upload PDFs and images for extraction — from a single document to thousands at once.
Supported File Types
PDF Files
- Regular & scanned PDFs
- Up to 150MB per file
- Up to 5,000 pages long
- Can contain multiple invoices
Images
- JPG and PNG files
- Up to 5MB per file
- One invoice per image
Batch Processing
Process up to 6,000 documents in a single extraction task. Need larger batches? Message us and we can accommodate your needs.
Upload batches exactly as you receive them. The AI can handle invoices, credit notes, and statements in the same job, applying different rules to each. Email cover pages, remittance advice, and other non-invoice pages can be automatically filtered out via your prompt. See Prompt Capabilities for examples.
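If you assemble batches with a script, the limits above can be checked locally before uploading. The following is a minimal sketch, not part of the product: the function names `check_batch` and `scan_folder` are hypothetical, "MB" is assumed to mean 1024 × 1024 bytes, and treating `.jpeg` as equivalent to `.jpg` is also an assumption. Page counts are not checked.

```python
from pathlib import Path

# Limits taken from this guide. Reading "MB" as 1024 * 1024 bytes is an assumption.
MB = 1024 * 1024
MAX_BYTES = {
    ".pdf": 150 * MB,
    ".jpg": 5 * MB,
    ".jpeg": 5 * MB,  # assumption: .jpeg is handled like .jpg
    ".png": 5 * MB,
}
MAX_DOCUMENTS = 6000


def check_batch(files):
    """Return a list of problems for (file_name, size_in_bytes) pairs."""
    problems = []
    if len(files) > MAX_DOCUMENTS:
        problems.append(f"{len(files)} files exceeds the {MAX_DOCUMENTS}-document limit")
    for name, size in files:
        limit = MAX_BYTES.get(Path(name).suffix.lower())
        if limit is None:
            problems.append(f"{name}: unsupported file type")
        elif size > limit:
            problems.append(f"{name}: {size} bytes exceeds the limit of {limit} bytes")
    return problems


def scan_folder(folder):
    """Collect (file_name, size_in_bytes) pairs for every file in a local folder."""
    return [(p.name, p.stat().st_size) for p in Path(folder).iterdir() if p.is_file()]
```

An empty list from `check_batch(scan_folder("my_batch"))` means every file in the (hypothetical) folder `my_batch` fits the stated size and type limits.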
Prompt
Tell the AI what to extract — from a simple list of fields to detailed business rules.
Let AI Write Your Prompt
Not sure where to start? Upload your files, click Suggest prompt, and the AI generates a tailored extraction prompt for you.
How it works
- The AI examines your documents to identify key data points
- A tailored extraction prompt is generated automatically
- Review and adjust, then save to your library for reuse
Writing Prompts
Start simple and add detail when you need precise control. When your prompt leaves something unspecified, the system uses conservative, accounting-friendly judgment.
Simple Prompts
List the fields you need. The AI selects formats and handles document structure.
“Extract invoice number, invoice date, vendor name, net amount, tax, total”
→ AI returns exactly these fields, correctly formatted
“I’m processing invoices for payment. Extract invoice number, date, vendor, amount due, payment terms”
→ Goal helps AI handle edge cases; listed fields define the output
“Extract line items: description, quantity, unit price, line total”
→ AI creates one row per line item with invoice details repeated
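Because invoice details are repeated on every line-item row, the rows can be regrouped by invoice afterwards. The sketch below assumes a CSV download with the hypothetical column headers `Invoice Number` and `Line Total`; the actual headers depend on your prompt, and the sample data is invented for illustration.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Invented sample of a line-item export; real column names depend on your prompt.
SAMPLE = """Invoice Number,Description,Quantity,Unit Price,Line Total
INV-001,Paper,2,5.00,10.00
INV-001,Toner,1,40.00,40.00
INV-002,Chairs,4,25.00,100.00
"""


def totals_by_invoice(csv_text):
    """Sum 'Line Total' per 'Invoice Number' across line-item rows."""
    totals = defaultdict(Decimal)  # Decimal() starts each invoice at 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Invoice Number"]] += Decimal(row["Line Total"])
    return dict(totals)


print(totals_by_invoice(SAMPLE))
```

Comparing these sums with each invoice's stated total is one quick way to spot a missed or duplicated line item.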
Detailed Prompts
Define exact fields, formats, and business rules for repeatable, auditable workflows.
“I'm preparing AP data for our month-end close.
Extract:
- Invoice Number (alphanumeric, top-right)
- Invoice Date (YYYY-MM-DD)
- Vendor Legal Name (prefer extracting from footer)
- Net Amount (pre-tax invoice total)
- VAT Rate (if no VAT is listed use 0, use Excel type percentage)
- VAT Amount (if no VAT is present use 0)
- Total Amount (invoice final total)
- Document Type (classify as Invoice or Credit Note)
Rules:
- For credit notes, prefix Invoice Number with 'CR-' and show amounts as negative.
- One row for each invoice or credit note.
- Skip any pages that are email cover sheets or summary pages.”
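Rules written this explicitly are easy to re-check after download. The following sketch tests each extracted row against the credit-note rule from the example prompt above. It uses the column names from that prompt; the function name `check_row` is hypothetical, and the check that Net Amount plus VAT Amount equals Total Amount is an added assumption that will not hold for invoices with other charges.

```python
from decimal import Decimal


def check_row(row):
    """Return a list of rule violations for one extracted row (a dict of strings)."""
    issues = []
    net = Decimal(row["Net Amount"])
    vat = Decimal(row["VAT Amount"])
    total = Decimal(row["Total Amount"])
    # Assumption: no charges other than net and VAT contribute to the total.
    if net + vat != total:
        issues.append("Net Amount + VAT Amount does not equal Total Amount")
    if row["Document Type"] == "Credit Note":
        if not row["Invoice Number"].startswith("CR-"):
            issues.append("credit note is missing the 'CR-' prefix")
        if total > 0:
            issues.append("credit note total should be negative")
    return issues
```

An empty list means the row is consistent with the rules checked here; it does not confirm that the values match the source document.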
Describe Your Goal for Smarter Results
Adding context about your finance process helps the AI handle edge cases you haven't anticipated.
“Extract invoice number, date, and total”
“I'm processing supplier invoices for payment. Extract invoice number, date, and total”
Why this helps
Describing your goal — such as the finance process you're performing — helps the AI make smarter decisions about edge cases (like how to handle bundled documents or ambiguous values) so you don't have to anticipate every scenario in your prompt.
Example goals
Prompt Library
Save prompts for one-click reuse — ensuring consistent, repeatable results across future batches.
Save and reuse
Save prompts by workflow, client, or document type. Apply any saved prompt with one click to ensure every batch follows the same extraction rules.
When to use separate prompts
You don't need separate prompts for different vendors or layouts — the AI adapts. Use separate prompts when the extraction logic itself differs:
- Different tasks (VAT reporting vs. expense tracking)
- Different document types (invoices vs. bank statements)
- Client-specific output formats
- Vendor-specific handling rules
Results
Review, download, and verify your extracted data.
Download Formats
Your output can be downloaded as Excel (.xlsx), CSV (.csv), or JSON (.json). Excel files use native spreadsheet cell types where applicable. CSV and JSON return extracted values as text.
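Since CSV values arrive as text, a script that consumes the CSV has to convert them itself. Here is a minimal sketch, assuming hypothetical column headers `Invoice Number`, `Invoice Date`, and `Total`, and assuming the prompt asked for dates as YYYY-MM-DD and amounts without currency symbols.

```python
import csv
import io
from datetime import date
from decimal import Decimal

# Invented sample; real column names and formats depend on your prompt.
SAMPLE = "Invoice Number,Invoice Date,Total\nINV-001,2024-03-31,1250.50\n"


def load_typed_rows(csv_text):
    """Parse CSV text and convert the date and amount columns to typed values."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "Invoice Number": row["Invoice Number"],  # keep identifiers as text
            "Invoice Date": date.fromisoformat(row["Invoice Date"]),
            "Total": Decimal(row["Total"]),  # Decimal avoids float rounding
        })
    return rows


print(load_typed_rows(SAMPLE))
```

Both conversions raise an error on values that do not match the expected format, which makes unexpected formats visible instead of silently passing them on.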
Reviewing Your Output
Review failed pages
Failed pages are flagged in the Status & Results column. Click View Pages to see details and re-attempt if needed.
Check Review Needed warnings
If the AI is not confident enough in a row or value, it adds a warning requiring manual verification. These are flagged with a Review Needed badge in View Results and, when enabled, added as an extra column in your extracted data.
Check prompt notes
If the AI made assumptions because your prompt left something open, a badge appears in the Prompt Notes column. Each note includes copyable prompt suggestions so you can be more explicit next time.
Source file column
A Source File column shows which uploaded file and page each row was extracted from.
Spot-check for accuracy
Review random samples from the spreadsheet against your original documents to verify extraction accuracy.
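For large outputs, the random sample can be drawn with a short script so the selection is unbiased and repeatable. This is a sketch only; the function name `pick_spot_checks`, the default sample size of 10, and the fixed seed are illustrative choices, not product recommendations.

```python
import random


def pick_spot_checks(rows, sample_size=10, seed=0):
    """Return a reproducible random sample of rows to compare with the source files."""
    rng = random.Random(seed)  # a fixed seed lets a second reviewer draw the same sample
    k = min(sample_size, len(rows))
    return rng.sample(rows, k)
```

Pass the list of rows read from your downloaded file, then use each sampled row's Source File value to locate the original page.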
Review Needed
Review Needed marks extraction results that require manual verification before they are used downstream. When the AI is not confident enough in a value, it records a warning instead of leaving that uncertainty hidden in the spreadsheet.
Flags low-confidence results
When the AI is not confident enough in an extracted value, it marks that result as requiring manual verification.
Explains what to check
Each warning describes the issue, identifies the affected field, and includes source file context so reviewers can understand and verify it quickly.
How to use Review Needed
Review Needed warnings are available in View Results. By default, they are also added as an extra Review Needed column in your extracted data, so you can review the warnings in the dashboard or directly in your spreadsheet.
Open the Review Needed badge in View Results, or review the Review Needed column in your spreadsheet output.
Use the affected field and source file reference to verify the extracted value against the original document, then confirm or correct it before using the data.
If the spreadsheet column is included, delete it after review if you want a clean extracted-data file for import or recordkeeping.
A Review Needed warning does not mean the task failed and it does not change the extracted data. It means the AI was not confident enough for that result to be used without human verification. In Preferences, you can disable the Review Needed spreadsheet column and control XLSX highlighting for Review Needed cells and affected data cells.
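If you work from the CSV download, collecting the warnings and removing the column can be scripted. The sketch below assumes the column header is exactly `Review Needed` and that an empty cell means no warning; both are assumptions about the file layout, and the sample data is invented.

```python
import csv
import io

REVIEW_COLUMN = "Review Needed"  # assumed header of the warning column

# Invented sample with one flagged row and one unflagged row.
SAMPLE = (
    "Invoice Number,Quantity,Review Needed\n"
    'INV-001,9,"Quantity: printed 9, handwritten 10"\n'
    "INV-002,3,\n"
)


def strip_review_column(csv_text):
    """Return (rows without the review column, [(spreadsheet_row, warning), ...])."""
    clean_rows, warnings = [], []
    # Data starts on spreadsheet row 2 because row 1 holds the headers.
    for line_number, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        note = (row.pop(REVIEW_COLUMN, "") or "").strip()
        if note:
            warnings.append((line_number, note))
        clean_rows.append(row)
    return clean_rows, warnings


rows, warnings = strip_review_column(SAMPLE)
print(warnings)
```

Work through the collected warnings first, then export `rows` once every flagged value has been confirmed or corrected.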
Example Review Needed warning
Affected field
Quantity
Warning
The extracted Quantity is 9, matching the printed value, but a handwritten value of 10 appears beside it. Verify whether the extracted Quantity should use the printed value or the handwritten value.
Source file
[Page 1] supplier-invoice.pdf
This is the type of issue Review Needed is designed to catch: the extracted value may be correct, but the source document needs human judgment before the data can be relied upon.
AI Prompt Notes
When your prompt leaves room for interpretation, the AI tells you what it assumed and suggests how to make your prompt more explicit next time. No notes means no prompt assumptions were reported.
Multiple possible matches
Your prompt says ‘Total’ but your documents have both a line item total and an invoice total — the AI tells you which interpretation it used.
Inexact field names
Your prompt says ‘Net’ but your documents sometimes label it ‘Subtotal’ — the AI tells you how it handled this.
Mixed document types
Your files contain invoices with attached purchase orders — the AI tells you which pages it extracted from and which it ignored.
Unspecified scenarios
Your prompt didn’t mention credit notes but the AI encountered some — it tells you how it handled them, such as treating amounts as negative.
AI Prompt Notes
Prompt uncertainty report · 1 observation · 2 prompt suggestions
During this extraction, your prompt left some details open. I made interpretations based on your instructions and documents. Review these notes if you want to make future prompts more explicit.
Documents to extract from
Your files often contain a ‘Tax Invoice’ with an attached ‘Delivery Note’. I treated the ‘Tax Invoice’ pages as the main source of data, and ignored the attached ‘Delivery Note’ pages as supporting context.
To confirm this handling:
Extract from ‘Tax Invoice’ only
To extract from both:
Extract from ‘Tax Invoice’ and ‘Delivery Note’
Advanced prompting
You can use the software by simply telling the AI what to do. For more detailed information about prompt capabilities or using the Structured Prompt input, see below.
Structured Prompt
An alternative way to provide extraction instructions — same capabilities as the free-text prompt, with guaranteed column headers and order.
In the extraction workspace, choose Structured Prompt from the prompting method switcher.
When to Use Structured Prompt
With Structured Prompt, your spreadsheet is guaranteed to use your exact column headers in your exact order. With the free-text Prompt, column names and order are typically consistent but the AI may make adjustments unless you explicitly specify otherwise.
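Fixed headers matter most when another system imports the file by column name or position. As a sketch of how a downstream script might guard against an unexpected layout, the check below compares the first row of a CSV download with the headers you expect. The function name `headers_match` and the sample headers are hypothetical.

```python
import csv
import io


def headers_match(csv_text, expected):
    """True if the CSV header row equals the expected headers, in the same order."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, []) == list(expected)


sample = "Invoice Number,Invoice Date,Total\nINV-001,2024-03-31,1250.50\n"
print(headers_match(sample, ["Invoice Number", "Invoice Date", "Total"]))
```

A check like this is most useful with free-text prompts, where column names and order may be adjusted by the AI.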
Components of a Structured Prompt
Column Headers
Each column header becomes an exact column name in your spreadsheet. Use clear, descriptive names that convey the data point's meaning.
Good column names
Avoid
Per-Column Prompt (Optional)
Add specific guidance for individual columns. Use these to clarify ambiguities or specify formatting for that particular data point.
"The date the invoice was issued, NOT the payment due date"
"Do not include currency symbol, use 2 decimal places"
"Extract from the table beneath Description"
"If crossed out and handwritten, use the handwritten value"
Additional Task-Wide Prompt (Optional)
Add instructions that apply to the entire extraction, not just a single column. This is where you describe your goal, specify what each row should represent, and set task-wide formatting or handling rules.
Examples of task-wide instructions:
Tips
Describe your goal in the Task-Wide Prompt
When the AI understands your workflow, it makes smarter decisions about edge cases. For example, knowing you're “processing for payment” tells the AI that bundled documents should be treated as one transaction. Add context like “I need this for quarterly VAT return” or “I'm reconciling against POs.”
Review prompt notes to refine your prompt
After extraction, check prompt notes for assumptions the AI made where your prompt left room for interpretation. Use this feedback to add clarifying instructions to your structured prompt for future extractions.
Prompt Controls & Capabilities
Define exact fields, formats, and rules to control how your data is extracted and structured.
Fields & Output Structure
Define which fields to capture, how to name columns, and what each row represents.
“Extract 'Invoice Number', 'Invoice Date', and 'Total'”
“Use the column header 'Supplier_Name'”
“Create one row per invoice”
“One row per line item, repeat 'Invoice Number' on each row”
Business Logic & Rules
Set hints, default values, and conditional logic to handle real-world variations.
Hint: “'Product Code' is in the 'Description' column, begins with 'SKU-'”
Default: “If 'Tax Amount' is missing, set to 0”
Fallback: “Find 'PO Number' in header, else use 'Reference'”
Conditional: “If 'Currency' is 'USD', use 'State Tax'; if 'EUR', use 'VAT'”
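To make the intended outcome of the Default, Fallback, and Conditional examples concrete, they can be written out as explicit logic. This sketch only illustrates what each rule asks for; it is not how the AI implements them, and the function names and the `fields` dictionary are hypothetical.

```python
def tax_amount(fields):
    """Default: if 'Tax Amount' is missing, set it to 0."""
    value = fields.get("Tax Amount")
    return value if value not in (None, "") else "0"


def po_number(fields):
    """Fallback: use 'PO Number' if present, otherwise 'Reference'."""
    return fields.get("PO Number") or fields.get("Reference")


def tax_by_currency(fields):
    """Conditional: USD uses 'State Tax', EUR uses 'VAT'."""
    source = {"USD": "State Tax", "EUR": "VAT"}.get(fields.get("Currency"))
    # The example rule does not say what to do for other currencies.
    return fields.get(source) if source else None
```

Note that the conditional rule leaves other currencies unspecified. Gaps like this are where the AI has to make an assumption, so it helps to state the remaining case in your prompt.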
Document & Page Handling
Apply rules to specific document types or filter out unwanted pages.
“Ignore pages titled 'Email Cover Sheet'”
“For credit notes, prefix Invoice Number with 'CR-' and show amounts as negative”
“From Statements of Account, extract each invoice as a separate row”
Data Formatting & Classification
Control how values are stored in Excel and automatically categorize transactions.
“Format all dates as YYYY-MM-DD”
“Ensure all currency fields have 2 decimal places”
“Add an 'Expense Category' column — classify as 'Office Supplies', 'Software', 'Travel', or 'Utilities'”
“Add a 'Payment Priority' column — 'Urgent' if overdue or due within 7 days, else 'Standard'”
Note: Excel may display native cell types (e.g. numbers and dates) according to your local settings.
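Classification rules such as the 'Payment Priority' example are simple enough to recompute when spot-checking results. The sketch below spells that rule out, assuming "due within 7 days" includes the seventh day and that the due date and reference date are already parsed. The function name `payment_priority` is hypothetical.

```python
from datetime import date, timedelta


def payment_priority(due_date, today):
    """'Urgent' if overdue or due within 7 days of today, else 'Standard'."""
    # Overdue dates are earlier than today, so one comparison covers both cases.
    if due_date <= today + timedelta(days=7):
        return "Urgent"
    return "Standard"


print(payment_priority(date(2024, 6, 12), today=date(2024, 6, 10)))
```

If the boundary matters for your process, state it in the prompt (for example, whether an invoice due in exactly 7 days counts as urgent).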
Ready to extract data from your documents?
50 pages free every month · No credit card required