A fake receipt is a receipt document that has been forged, altered, or generated from scratch, increasingly by AI image tools, to misrepresent an expense. The forgery might be a doctored photo of a real coffee-shop slip with the total nudged up by fifty dollars, a PDF invented in a word processor and dressed up to look like a chain-restaurant printout, or, more recently, a convincingly grubby thermal-roll image produced by a text-to-image model in under a minute. The common thread is intent: the document exists to extract money, tax relief, or reimbursement that is not owed.
Knowing how to spot fake receipts in 2026 means accepting that the skill has changed. According to Accounting Today's reporting on a Medius survey of finance professionals, in a poll of 1,000 practitioners across the US and UK, 32 percent said they would not be able to recognise a fake receipt generated by an AI image tool, and 30 percent reported a rise in fake receipts since the start of the prior year. That 32 percent measures the gap between what experienced reviewers can see with their eyes and what now actually lands in their inbox. Visual inspection alone no longer closes it.
Receipts sit inside a well-defined fraud category. Expense reimbursement fraud is one of the asset misappropriation schemes that the Association of Certified Fraud Examiners identifies as the largest category of occupational fraud by frequency, and receipt documents, whether paper scans, PDFs, or phone photos, are the primary evidence the scheme relies on. Treat fake receipt detection as the same verification problem you would apply to bank statements, using the same layered detection framework but tuned to the artefacts receipts actually carry.
This article works through a four-tier framework for fraudulent receipts and AI-generated fake receipts — visual red flags, content-logic checks, file forensics, and structured-data validation at scale — that holds for a pasted-together Photoshop job and for a freshly rendered AI image, whether you are reviewing a single expense claim this afternoon or reconciling a month of company-wide submissions against receipt OCR output.
Tier 1 — Visual Red Flags on the Receipt Itself
Before running any math or forensics, scan the receipt for the surface-level defects that betray most forged receipts within seconds. This tier is the reviewer's first-pass filter, not a final verdict. A clean pass here means nothing on its own, but a clear fail usually justifies kicking the submission back before you spend any more time on it.
Fonts. Real point-of-sale receipts draw from a narrow family of monospaced or semi-monospaced typefaces, and they use them consistently from the header to the total line. The most common fake receipt red flags here are font mixing within a single receipt (for example, a monospaced POS font for the line items sitting next to a cleanly typeset serif on the total), inconsistent weight or kerning between adjacent lines, and fonts that simply do not match what the named merchant's POS system would print. A designer-grade typeface, a proportional font that any word processor ships with, or a crisp serif for the grand total are all tells. If the receipt looks typeset rather than printed by a register, treat it with suspicion.
Spacing and alignment. POS printers lay out receipts on a fixed character grid, so columns stay tight all the way down. Look for misaligned columns where item description, quantity, and price do not line up consistently, line items that hold alignment at the top and drift partway down, and subtotal, tax, and total rows that hang off a different right margin than the line items above them. Fractional pixel offsets and centred totals where the POS would have right-aligned them are common in forged receipts produced in a word processor or image editor.
Logos and merchant branding. Pixelated or low-resolution logos, colour-shifted versions, stretched aspect ratios, and logos placed in atypical positions for the chain are all worth flagging. A national chain's logo centred at the top of a receipt where it would normally sit at the bottom above a tagline is a classic mistake. Where the claimed amount is material, cross-reference the merchant's current logo variant on their website or a recent genuine receipt rather than relying on memory, since chains rebrand and claimants sometimes copy an old logo from a stale image search.
Thermal paper artefacts. Genuine thermal paper receipts behave in ways that are difficult to reproduce. Real prints carry faint horizontal banding from the thermal head, fade unevenly depending on how the roll was cut and how the paper was handled, and almost always show a slight curl or edge wear by the time they reach an expense submission. A scan that is uniformly clean, perfectly flat, and evenly exposed with no banding, no darker patches near the thumb-grip area, and no creases at the fold is often not a real thermal receipt at all but a digital render photographed or exported to look like one.
AI-generated uniformity. Free fake receipt generator detection starts at this tier too: watch for mechanically perfect spacing, absent print defects, and symmetrically applied wear that a real receipt never accumulates. A growing number of browser-based generators now produce images that clear casual visual inspection, so the absence of an obvious visual red flag should not be read as a pass. The concrete forensic tells for AI fabrications — producer strings, font embedding, and metadata signatures — sit in a dedicated section below.
Tier 2 — Content-Logic Checks and Receipt Math
Once a receipt has survived a visual scan, the next layer of scrutiny is the content itself. Does the arithmetic hold? Does the tax rate match the jurisdiction printed on the receipt? Does the merchant actually exist, and was it open when the receipt claims the transaction took place? These are the checks that expose fabricated documents that looked fine at first glance.
The Three Identities a Legitimate Receipt Must Satisfy
Every real point-of-sale receipt obeys three arithmetic relationships. A fabricated one, assembled in a hurry or generated by a model that did not think through its own numbers, tends to break at least one of them.
- Sum of line-item amounts equals the subtotal. Add the printed line items. The result must match the subtotal displayed on the receipt.
- Subtotal multiplied by the applicable tax rate equals the tax amount, within normal rounding. A real POS rounds each line or the final tax to the cent in a predictable way.
- Subtotal plus tax plus any listed tips or fees equals the total. No orphan amounts, no unexplained gap.
The failure modes are recognisable once you know what to look for. Tax stated as exactly 10.00 percent of a non-round subtotal, producing a suspiciously clean 7.50 on a 75.00 subtotal when the real local rate is 8.25 percent. A subtotal that does not reconcile with the displayed line items, because the fabricator adjusted one number without recomputing the others. A total that differs from subtotal-plus-tax by three or four cents in a way a genuine POS, which rounds deterministically, would never produce. Running the math takes less than a minute per receipt and catches a significant share of fraudulent receipts before any other check.
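These three identities are mechanical enough to script. Here is a minimal sketch, assuming the line totals, subtotal, tax rate, tax amount, and total have already been read off the receipt; the function name and the two-cent tolerance are illustrative, not a standard API:

```python
# Hypothetical helper: the three arithmetic identities as a single check.
# Field order and tolerance are illustrative assumptions, not a standard.
def check_receipt_math(line_totals, subtotal, tax_rate, tax_amount, total, tol=0.02):
    """Return the list of identities the receipt fails (empty list means pass)."""
    failures = []
    # Identity 1: printed line items must sum to the subtotal.
    if abs(sum(line_totals) - subtotal) > tol:
        failures.append("line items != subtotal")
    # Identity 2: subtotal * rate must reproduce the tax, within normal rounding.
    if abs(round(subtotal * tax_rate, 2) - tax_amount) > tol:
        failures.append("subtotal * rate != tax")
    # Identity 3: subtotal + tax must equal the total, no orphan cents.
    if abs(subtotal + tax_amount - total) > tol:
        failures.append("subtotal + tax != total")
    return failures

# The 8.25 percent example from the text: a fabricator typed a clean 7.50
# tax on a 75.00 subtotal, but the real local rate implies 6.19.
print(check_receipt_math([30.00, 45.00], 75.00, 0.0825, 7.50, 82.50))
```

Note that on this example the total still reconciles with the fabricated tax; only the second identity fails, which is why all three have to run.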
Jurisdictional Tax-Rate Sanity
The applicable rate depends entirely on where the transaction happened. US receipts carry state sales tax and, in many locations, county or city surcharges on top, so a receipt issued in Los Angeles will not show the same rate as one issued in Portland, Oregon, which has no state sales tax at all. UK receipts should show the standard VAT rate or, for qualifying categories, the reduced rate. EU receipts carry the VAT rate of the member state where the supply was made.
Two cross-checks cover almost every case you will encounter in practice. For US receipts, look up the combined state-and-local sales tax rate for the address printed on the receipt and compare it to the rate implied by the tax and subtotal amounts. A receipt showing 10 percent tax in a 6 percent state is not plausible. For UK and EU receipts, VAT verification involves confirming the stated rate matches the jurisdiction and that any VAT registration number on the receipt resolves to the merchant on the relevant national register (HMRC for the UK, VIES for EU cross-border checks). These two cross-checks, jurisdictional sales tax and VAT verification, are the ones a reviewer will actually run, and together they catch most rate-related fabrications.
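The same cross-check can be expressed as a join against a reference table. Here is a sketch under the assumption that the jurisdiction has already been extracted from the printed address; the two rates shown are real headline rates, but a production table would come from a maintained sales-tax and VAT dataset:

```python
# Illustrative reference table keyed by (country, region); "*" is a
# country-wide fallback. A real deployment would join against a maintained
# jurisdictional rate dataset, not a hand-typed dict.
REFERENCE_RATES = {
    ("US", "OR"): 0.0,    # Oregon: no state sales tax
    ("GB", "*"): 0.20,    # UK standard VAT
}

def implied_rate_matches(subtotal, tax_amount, country, region, tol=0.005):
    """True if tax/subtotal sits within tol of the jurisdiction's reference rate."""
    expected = REFERENCE_RATES.get((country, region),
                                   REFERENCE_RATES.get((country, "*")))
    if expected is None:
        return None   # jurisdiction not in the table: route to manual review
    return abs(tax_amount / subtotal - expected) <= tol

# A Portland, Oregon receipt showing any sales tax at all is implausible.
print(implied_rate_matches(40.00, 4.00, "US", "OR"))
```

The `None` branch matters in practice: a jurisdiction you cannot look up is not a pass, it is a gap in the reference table.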
Merchant Verification
A receipt is a claim that a specific merchant served a specific customer at a specific time. Test the claim. Confirm the merchant name and address resolve to a real business at that location using business-registry data, a Google Business listing, or the merchant's own website. A merchant that does not exist at the address printed on the receipt is the strongest possible signal of fabrication.
Then check the timestamp against the merchant's operating hours. A receipt from a café that closes at 18:00 cannot legitimately carry a 02:15 timestamp. A receipt from a restaurant that is closed on Mondays cannot carry a Monday date. Contradictions between the timestamp and the merchant's actual hours are high-signal because they are rarely accidental. Mismatches between a receipt and its paired invoice are a common failure mode of fabricated expense packages, which is why it helps to spot tampering on vendor invoices before payment when the two documents cover the same transaction.
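The operating-hours contradiction is equally scriptable once hours are known. In this sketch the hours dictionary stands in for whatever a business-listing lookup returns, and `within_hours` is a hypothetical helper:

```python
from datetime import datetime, time

def within_hours(ts: datetime, hours_by_weekday: dict) -> bool:
    """hours_by_weekday maps weekday (0 = Monday) to an (open, close) pair;
    a missing day means the merchant is closed that day."""
    window = hours_by_weekday.get(ts.weekday())
    if window is None:
        return False                      # closed that day (e.g. Mondays)
    open_t, close_t = window
    return open_t <= ts.time() <= close_t

# Hypothetical cafe open 07:00-18:00, Tuesday through Saturday only.
cafe_hours = {d: (time(7, 0), time(18, 0)) for d in range(1, 6)}

# The 02:15 receipt from a cafe that closes at 18:00.
print(within_hours(datetime(2026, 3, 4, 2, 15), cafe_hours))
```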
The 75-Dollar Threshold and Which Receipts to Scrutinise First
For US readers, IRS Publication 463 sets a 75-dollar documentation threshold for most business travel and entertainment expenses: below that amount, a receipt is not strictly required to substantiate the deduction, while at or above it a receipt is. That threshold is a useful heuristic for triage. Receipts above 75 dollars are precisely the ones most likely to be fabricated, because they are the ones that require documentation to support a deduction or reimbursement in the first place. Scrutinise those first.
Round-Number and Threshold Tells
Fabricated receipts and inflated reimbursement claims cluster at round-dollar totals in a way real transactions do not. Real meals, real taxi rides, and real hotel incidentals rarely come out to exactly 25.00, 50.00, or 100.00. When a reviewer sees a run of receipts hitting clean round numbers, or a pattern of claims that repeatedly land one or two dollars below the per-claim approval threshold without ever crossing it, that is a statistical tell rather than proof on any single receipt. Treat it as a pattern check across a claimant's history, not a per-receipt rule.
Duplicate Patterns and Date Contradictions
Verify receipt authenticity against the context the claim sits in. A receipt whose date does not match the associated travel itinerary, calendar entries, or work location for that day is a candidate for fraud even when the document itself is spotless. So is a receipt whose merchant-plus-date-plus-amount combination has already appeared in a prior expense report from the same claimant or from a colleague, the classic signature of a receipt submitted twice or shared between employees. These content-level contradictions catch fraudulent receipts that would otherwise sail through because they were never fake in the first place, only recycled.
Tier 3 — PDF and File-Level Forensics
Once a receipt arrives as a digital file, the reviewer gains a set of checks that are simply not available on paper. The file itself carries a forensic record of how it was made, and for fabricated receipts that record rarely matches a point-of-sale origin. This tier is how you detect fake receipts that look visually plausible and survive arithmetic checks but were built in a design tool rather than printed by a real till.
The PDF Producer String
The single most productive file-level check is the PDF producer string. Every PDF embeds the name of the software that wrote it. Legitimate receipts rendered as PDF come out of a narrow set of tools: chain-specific POS export modules (Square, Toast, Lightspeed, Oracle MICROS, Shopify POS), thermal-printer-to-PDF drivers, or the email-receipt systems run by major retailers. The producer strings for those systems are consistent and recognisable, and a given merchant almost always produces the same string across thousands of its receipts.
Fabricated PDFs give themselves away here more often than anywhere else. Common tells include:
- Adobe Illustrator or Adobe Photoshop — a designer built the file.
- Canva — the most frequent source of amateur receipt fabrication in 2025 and 2026.
- Microsoft Word or LibreOffice / OpenOffice — a Word template was exported.
- Google Docs — the same pattern via the Docs export pipeline.
- macOS Quartz PDFContext — usually a Pages or Preview export. Not conclusive alone, but suspicious on a restaurant receipt that should have come from a POS.
- Generic converters like iLovePDF, Smallpdf, wkhtmltopdf, or Chromium Print to PDF — the receipt was assembled in a browser or HTML tool and printed to PDF.
To view the producer string in Adobe Acrobat or Acrobat Reader, open the file and go to File > Properties > Description; the PDF Producer and Application fields are listed there. Some browsers' built-in PDF viewers (Firefox's, for example) expose the same fields via right-click > Document Properties. If you work at the command line, pdfinfo receipt.pdf (part of Poppler) prints the Producer, Creator, CreationDate, and ModDate on their own labelled lines, which is the fastest way to triage a batch.
The check is simple: does the producer string belong on a receipt from this merchant? A Starbucks e-receipt produced by Canva is a fabrication. A Hilton folio produced by Adobe Illustrator is a fabrication. A taxi receipt produced by Microsoft Word is a fabrication. Build a short mental catalogue of what legitimate producers look like for the vendors you see most often and the outliers will stand out immediately.
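That catalogue does not have to stay mental. Here is a hedged sketch of producer-string triage; the vendor lists are deliberately incomplete starting points drawn from the examples above, not authoritative catalogues:

```python
# Illustrative substring lists; extend both as you see more vendors.
POS_PRODUCERS = ("square", "toast", "lightspeed", "micros", "shopify")
DTP_PRODUCERS = ("canva", "illustrator", "photoshop", "microsoft word",
                 "libreoffice", "openoffice", "google docs", "quartz pdfcontext",
                 "wkhtmltopdf", "ilovepdf", "smallpdf", "chromium", "skia")

def triage_producer(producer: str) -> str:
    p = producer.lower()
    if any(v in p for v in POS_PRODUCERS):
        return "likely-pos"
    if any(v in p for v in DTP_PRODUCERS):
        return "desktop-publishing"   # fabrication signal on a till receipt
    return "unknown"                  # grow the catalogue as outliers appear

print(triage_producer("Canva"))
print(triage_producer("Mac OS X 10.15 Quartz PDFContext"))
```

An "unknown" result is not a pass; it is a prompt to look the producer up and file it on one side of the catalogue or the other.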
Font Embedding
PDFs declare every font they use. On a real thermal POS receipt exported to PDF, the embedded font list is short — typically one or two monospace fonts tied to the printer family, with names like PrinterFontA, ReceiptFont, EPSON-FONT-ROMAN, Monaco, Courier, Lucida Console, or a specific font ID assigned by the POS vendor. The glyph set is narrow, the character spacing is fixed, and there are no decorative weights.
Fabricated receipts routinely fail this check. A long list of embedded fonts, or the presence of Helvetica Neue, Inter, Arial, Calibri, Open Sans, Roboto, or any designer display face, is inconsistent with a thermal-printer origin. Canva receipts in particular tend to embed three to six fonts including at least one sans-serif display family that no POS terminal has ever shipped with.
To see the font list in Acrobat, open File > Properties > Fonts. At the command line, pdffonts receipt.pdf (a sibling tool to pdfinfo in the Poppler suite) lists every face along with its type, encoding, and whether it is embedded or merely referenced. If the list reads like a designer's font menu rather than a printer's ROM, you are not looking at a till receipt.
Creation and Modification Timestamps
Every PDF carries two timestamps in its metadata: CreationDate (when the file was first written) and ModDate (when it was last modified). Both should be consistent with the transaction as claimed.
The specific contradictions to look for:
- Creation date far later than the claimed transaction date. A receipt that says the meal was on 3 March at 14:37, but whose CreationDate is 24 March, is a strong signal the file was built well after the fact. Real e-receipts are generated within seconds to minutes of the transaction, not weeks later.
- Creation date earlier than the claimed transaction date. Less common, but almost always indicates a template was filled in and the timestamp was never regenerated.
- Modification date hours or days after creation. POS-generated PDFs are usually written once and never touched again, so CreationDate and ModDate match or differ by at most a second or two. A ModDate that trails CreationDate by hours or days almost always means somebody opened the file in an editor and changed something — a number, a date, a line item.
- Timestamps that sit outside the merchant's stated business hours. A receipt claimed at lunchtime but created at 02:14 local time is worth a second look.
Read these off the File > Properties > Description panel in Acrobat, or from the CreationDate and ModDate lines in pdfinfo output. Treat them as part of the same evidence set as the transaction date printed on the face of the receipt.
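Those contradictions reduce to date arithmetic once the raw PDF date strings are in hand. This sketch ignores timezone offsets for brevity (a real check should honour them), and the one-day and one-hour windows are illustrative thresholds:

```python
from datetime import datetime, timedelta

def parse_pdf_date(raw: str) -> datetime:
    """Parse the D:YYYYMMDDHHMMSS prefix of a PDF date string,
    discarding the timezone suffix for brevity."""
    digits = raw.removeprefix("D:")[:14]
    return datetime.strptime(digits, "%Y%m%d%H%M%S")

def timestamp_flags(creation: str, moddate: str, claimed: datetime) -> list:
    created, modified = parse_pdf_date(creation), parse_pdf_date(moddate)
    flags = []
    if created - claimed > timedelta(days=1):
        flags.append("created long after claimed transaction")
    if created < claimed - timedelta(days=1):
        flags.append("created before claimed transaction")
    if modified - created > timedelta(hours=1):
        flags.append("edited after creation")
    return flags

# The 3 March meal whose file was written on 24 March, then touched again
# two days later.
print(timestamp_flags("D:20260324101500+00'00'",
                      "D:20260326184500+00'00'",
                      datetime(2026, 3, 3, 14, 37)))
```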
Image-Only Versus Text-Layer PDFs
A fast, no-tooling heuristic: try to select text in the PDF. Three outcomes, each of which tells you something.
- Selectable text in a POS-style monospace font. Consistent with a native POS export. Good.
- Nothing selectable — the whole page behaves like an image. Consistent with a scanned paper receipt or a photograph of a receipt embedded in a PDF. Neutral on its own; combine with other tiers.
- Selectable text in a designer font, with precise kerning and decorative layout. Consistent with Canva, Illustrator, or a Word template. For a receipt claimed to come from a thermal printer, this is a fabrication signal.
The third case is the one that catches AI-image-to-PDF pipelines and desktop-publishing forgeries. Thermal receipts do not produce kerned, proportional type.
EXIF Data on Photographed Receipts
When the receipt arrives as a JPEG, HEIC, or PNG photograph rather than a PDF, the relevant metadata lives in EXIF data instead. EXIF fields of interest include:
- Make and Model (camera or phone make and model)
- DateTimeOriginal (when the image was captured)
- GPSLatitude / GPSLongitude (capture location, if the device recorded it)
- Software (editing application, if the image has been touched)
A receipt claimed from a restaurant in Manchester on 12 April whose EXIF shows an iPhone in Madrid on 9 April is a problem. So is a receipt whose Software field reads Photoshop or a mobile retouch app — the image has been edited since capture. A missing or zeroed EXIF block is not itself proof of fraud (many messaging apps and cloud pipelines strip EXIF routinely), but when combined with other red flags it adds weight.
On Windows, right-click the image, choose Properties > Details. On macOS, open in Preview and choose Tools > Show Inspector > Exif. At the command line, exiftool receipt.jpg prints the full block. Because EXIF is trivially strippable and can be forged with readily available tools, treat it as supporting evidence rather than a single point of truth.
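Treating EXIF as supporting evidence can itself be scripted. This sketch weighs an already-extracted EXIF dictionary (keys follow exiftool's naming) against the claimed transaction date; the editor list and the helper name are illustrative assumptions:

```python
def exif_concerns(exif: dict, claimed_date: str) -> list:
    """Return review notes for a photographed receipt.
    claimed_date uses EXIF's own YYYY:MM:DD date formatting."""
    concerns = []
    # Weak signal: many messaging apps strip EXIF, so absence alone is
    # supporting evidence, never proof.
    if not exif.get("Make") and not exif.get("Model"):
        concerns.append("no camera make/model")
    software = exif.get("Software", "")
    if any(editor in software for editor in ("Photoshop", "Snapseed", "GIMP")):
        concerns.append(f"edited in {software}")
    captured = exif.get("DateTimeOriginal", "")
    if captured and not captured.startswith(claimed_date):
        concerns.append("capture date != claimed date")
    return concerns

# Claimed for 12 April; photo captured 9 April, later opened in Photoshop.
print(exif_concerns(
    {"Make": "Apple", "Model": "iPhone 15",
     "Software": "Adobe Photoshop 25.0",
     "DateTimeOriginal": "2026:04:09 21:12:04"},
    claimed_date="2026:04:12"))
```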
The POS-Versus-Desktop-Publishing Pattern
Pull the individual markers together and a clear pattern emerges for how to detect fake receipts at the file level. Genuine POS output shares a tight cluster of characteristics:
- A producer string from a known POS vendor, thermal driver, or merchant email pipeline.
- One or two embedded monospace fonts, narrow glyph set.
- Page dimensions that match a thermal roll (typically 72 to 80 mm wide, variable length) or a standard receipt preview.
- A selectable text layer in the POS font, or a clean image-only scan.
- Creation and modification timestamps within minutes of each other and close to the claimed transaction time.
- No embedded logos as vector art, no decorative graphics, no designer layout assets.
Desktop-publishing output reverses nearly all of those signals: an Adobe, Canva, Office, or browser producer string; multiple designer fonts; A4 or letter page dimensions; proportional typography; timestamps divorced from the claimed transaction; embedded logos, shapes, and layout artifacts. Whenever a file's characteristics cluster on the desktop-publishing side rather than the POS side, that is a strong forensic signal regardless of how convincing the face of the receipt looks.
The receipt PDF metadata check — producer string, font list, creation and modification timestamps, text-layer origin, EXIF where relevant — is a named checkpoint within this tier. Run it on every digital receipt before the file moves further through the workflow.
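One way to operationalise the cluster is a simple additive score over the signals above. The weights and review threshold here are illustrative and would need tuning against your own receipt population:

```python
# Each boolean is the outcome of one check from this tier; the weights are
# illustrative assumptions, not calibrated values.
def dtp_score(producer_is_pos: bool, monospace_fonts_only: bool,
              thermal_page_width: bool, timestamps_near_claim: bool,
              has_designer_assets: bool) -> int:
    score = 0
    score += 0 if producer_is_pos else 2      # strongest single tell
    score += 0 if monospace_fonts_only else 2
    score += 0 if thermal_page_width else 1   # A4/letter page on a "till receipt"
    score += 0 if timestamps_near_claim else 1
    score += 2 if has_designer_assets else 0  # vector logos, shapes, layout art
    return score   # 0 = clean POS cluster; 4 or more = review queue

# A Canva-style file: wrong producer, designer fonts, A4 page, embedded logo.
print(dtp_score(False, False, False, True, True))
```

The point of a score rather than a hard rule is that any one signal can be innocent; it is the cluster that is forensic.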
AI-Generated Receipts: Concrete Forensic Tells
The general warning that AI can now fabricate receipts is no longer useful on its own. GPT-4o's image mode, purpose-built receipt generators, and the ten-plus free receipt-creation tools now indexed on the public web have moved fake-receipt production from a skilled forgery task to a prompt-level one. Anyone with a browser tab can produce a visually plausible receipt in under a minute. The review layer has to match that shift, and the good news is that AI-generated receipts leave specific, verifiable marks that a reviewer can check without specialist tooling. AI-generated receipt fraud fits inside the broader AP controls for AI-generated document fraud that now covers invoices, bank statements, and pay stubs — the techniques below are the receipt-specific surface of that problem.
Producer strings and the absence of camera metadata. AI-generated receipts saved as PDFs typically carry producer strings tied to the generation pipeline rather than to a POS system or a phone-scanning app. You will see values like Skia/PDF, Chromium, Canva, Adobe Illustrator, Pillow, or the name of the generating AI service itself, often with a recent creation timestamp and no corresponding modification history. When an AI-generated receipt is saved as a JPG or PNG, the EXIF block usually tells an even cleaner story: no Make, no Model, no lens or exposure data, no GPS, no capture timestamp. Where EXIF does exist, it points to the image-production pipeline rather than to a physical camera — a desktop export tool, a server-side renderer, or a screenshot utility. A phone-captured receipt should carry a phone camera's EXIF. A vendor-emailed receipt should carry a consistent producer string for that vendor's billing system across multiple receipts. AI fabrications fail both tests.
Typesetting that is too clean for thermal paper. Real thermal POS receipts produce pixel-quantised monospaced output. Characters sit on a fixed grid, kerning is absent, and edges show the dot pattern of a 203 or 300 dpi thermal head. AI generators render text with modern typesetting conventions: proportional kerning, smooth anti-aliased edges, and even tracking across every line. Perfectly kerned text across what should be a thermal printout is itself a signal. Zoom to 400 percent on the merchant name, the line-item block, and the total. Real thermal receipts show visible dot structure and occasional missing pixels where the head skipped; AI receipts show clean vector-quality glyphs.
Uniform aging and symmetric wear. When an AI tool has been asked to produce a "realistic" receipt with wear, the wear is usually added as a filter rather than generated where a physical receipt would actually wear. Real receipts fade near cut edges, crumple in the middle where they were folded into a pocket, and darken unevenly along thermal-exposure lines. AI versions tend to fade uniformly top-to-bottom, add creases that are spatially symmetric, or overlay coffee stains at the same opacity on every corner. If the receipt looks aged but the damage is too well-distributed, treat the aging itself as suspect.
POS-specific footer data that looks plausible but does not match. A real receipt from a given chain carries register numbers, operator initials, transaction IDs in that chain's exact formatting, loyalty barcodes, and a tax-registration number footer that follows the jurisdiction's pattern. AI generators produce footer text that looks the right shape but fails specific checks. You will see transaction IDs with the right digit count but the wrong prefix, loyalty barcodes that do not decode to the printed number, and tax-registration numbers that do not validate against their issuing authority. For UK receipts, check that a printed VAT number passes the HMRC VAT checker. For US receipts claiming to be from a chain, check whether the register-number format matches known POS output for that chain.
Hallucinated addresses and tax rates that are internally consistent but geographically wrong. This is the single highest-yield check for AI-generated receipts. AI image models frequently invent merchant addresses that do not resolve to any actual location the claimed chain operates at, or invent tax rates that are applied cleanly against the subtotal but do not match the jurisdiction the address sits in. A receipt showing a Seattle Starbucks address with a 6 percent tax rate is mathematically consistent with its own subtotal and still wrong, because Washington combined sales tax at that address sits closer to 10.25 percent. Cross-check the printed address against the chain's store locator, then cross-check the printed tax rate against the jurisdiction that address belongs to. A single mismatch on either axis is strong evidence of fabrication.
Industry signal. This is not a fringe problem. The ICAEW has published practitioner guidance for accountants on AI-receipt fraud within its wider work on AI-enabled expense abuse, and expense-audit platforms including AppZen have reported rising volumes of flagged AI-generated fake receipts in recent reporting cycles as these tools have become freely available. The direction of travel is clear enough to plan around: AI-generated receipt fraud is moving from a curiosity into ordinary expense-review caseload, and the specific tells above are what you are looking for when you triage a suspicious claim.
Tier 4 — Structured-Data Validation at Scale
Per-receipt review collapses at Monday-morning batch volume, and more importantly it misses the patterns that only appear in aggregate. A single $48.72 dinner receipt from a plausible-looking bistro will pass every visual check. Ten of those receipts, submitted by four different employees across three weeks, with totals clustered at round-dollar amounts and tax rates that do not match any real jurisdiction, will not — but only if you are looking at the ten together rather than the one. Fabricated batches also tend to share fingerprints: the same AI prompt or receipt generator produces internally consistent math with an implausible tax rate, applies that rate everywhere, and scatters near-duplicates across claimants. The detection signal is in the distribution.
This is where structured extraction earns its second job. The same pipeline that captures merchant, date, and total for your accounting system also produces the columns you need for programmatic verification. Before any of these checks can run, every receipt has to become structured data — how receipt OCR captures the fields used for these checks is the prerequisite step, and the quality of extraction determines the ceiling on what you can verify downstream.
The columns every receipt should land in
Every receipt in the batch gets broken into the same flat record:
- Merchant name and merchant address (street, city, state or region, postcode, country)
- Transaction date and transaction time
- Line items — one row per item with description, quantity, unit price, and line total
- Subtotal, tax rate, tax amount, total
- Payment method (card type, last four digits if present, cash, mobile wallet)
- Transaction or receipt ID
- Source file name and page number for cross-reference back to the original
Once every receipt in the batch occupies a consistent row (or a consistent cluster of rows, for line-item extraction), the verification checks are no longer manual — they are column rules.
Math validation as a column rule
The three arithmetic identities from Tier 2 become formulas that run against every row at once:
- Sum of line totals equals subtotal. Group by receipt ID, sum line totals, compare to the extracted subtotal column. Flag any row where the delta exceeds a small rounding tolerance (typically one or two cents, scaled for multi-line receipts).
- Subtotal times tax rate equals tax amount. Multiply the two extracted columns, compare against the extracted tax-amount column, flag discrepancies.
- Subtotal plus tax equals total. The simplest check. The one most commonly fudged by generators that type in a final number without re-verifying the components.
Any receipt that fails any of the three by more than a cent or two goes straight to the human review queue. In practice this is the single highest-yield automated check for fabricated receipts, because genuine POS systems never get this wrong — the math is computed by the register, not typed in by the forger.
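As a column rule, the three identities look like this. The sketch runs over flat extracted rows using only the standard library; the field names mirror the column list above but are otherwise illustrative:

```python
from collections import defaultdict

def flag_math_failures(receipts, line_items, tol=0.02):
    """receipts and line_items are lists of dicts sharing a 'receipt_id' key.
    Returns the IDs that fail any of the three identities."""
    line_sums = defaultdict(float)
    for item in line_items:
        line_sums[item["receipt_id"]] += item["line_total"]
    flagged = []
    for r in receipts:
        rid = r["receipt_id"]
        bad = (abs(line_sums[rid] - r["subtotal"]) > tol
               or abs(r["subtotal"] * r["tax_rate"] - r["tax_amount"]) > tol
               or abs(r["subtotal"] + r["tax_amount"] - r["total"]) > tol)
        if bad:
            flagged.append(rid)   # straight to the human review queue
    return flagged

receipts = [
    {"receipt_id": "R1", "subtotal": 40.00, "tax_rate": 0.10, "tax_amount": 4.00, "total": 44.00},
    {"receipt_id": "R2", "subtotal": 40.00, "tax_rate": 0.10, "tax_amount": 4.00, "total": 45.00},
]
items = [{"receipt_id": "R1", "line_total": 40.00},
         {"receipt_id": "R2", "line_total": 40.00}]
print(flag_math_failures(receipts, items))   # R2's total does not reconcile
```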
Jurisdictional tax-rate validation
With merchant address and tax rate in adjacent columns, you can join the batch against a reference table of jurisdictional rates — US state and local combined sales tax, UK VAT at 20% standard and 5% reduced, EU VAT rates by country, Canadian GST and HST, Japanese consumption tax, and so on. Any receipt whose applied rate does not match its claimed jurisdiction gets flagged.
This is the check that catches AI-generated receipts which are internally consistent but externally wrong — a "Denver, Colorado" receipt applying a flat 8% when Denver's combined rate is currently closer to 8.81%, or a London receipt showing 19% VAT (a German rate) instead of 20%. The math will reconcile; the geography will not.
Merchant cross-reference
Extracted merchant name and address join against a business registry, a mapping API, or your own historical merchant database. You are confirming three things: the business exists, it exists at the address printed on the receipt, and its business type is consistent with the claim ("Acme Hardware" billing for a steakhouse tab is its own flag). Addresses that do not resolve to a real location, or that map to a residential block, or that belong to a business that closed two years before the transaction date, come up as fake receipt detection candidates without anyone opening the file.
Duplicate and near-duplicate detection
Group the batch by merchant, date, and total. Then widen the window and repeat across the prior three, six, and twelve months of claims history.
- Exact matches on merchant, date, and total across two employees or two submissions are almost always collision cases — the same dinner expensed twice, the same Uber split incorrectly, or a duplicate submission of a recycled receipt.
- Near-duplicates — same merchant, totals differing by a few cents, dates within a few days — are the signature of manipulated receipts where a claimant edited a real document to generate a second one. Near-duplicate clustering is one of the strongest signals in the entire workflow, because it is invisible at the per-receipt level and obvious once the batch is grouped.
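The grouping itself is a few lines. This sketch does a quadratic pairwise pass, which is fine at expense-report scale; the 25-cent and three-day windows are illustrative thresholds to tune against your own claim history:

```python
from datetime import date
from itertools import combinations

def near_duplicates(claims, amount_tol=0.25, day_window=3):
    """claims: list of (claimant, merchant, date, total) tuples.
    Returns the pairs sharing a merchant with close totals and close dates."""
    pairs = []
    for a, b in combinations(claims, 2):
        same_merchant = a[1] == b[1]
        close_amount = abs(a[3] - b[3]) <= amount_tol
        close_date = abs((a[2] - b[2]).days) <= day_window
        if same_merchant and close_amount and close_date:
            pairs.append((a, b))
    return pairs

claims = [
    ("alice", "Bistro 22", date(2026, 3, 5), 48.72),
    ("bob",   "Bistro 22", date(2026, 3, 5), 48.72),  # same dinner, expensed twice
    ("carol", "Bistro 22", date(2026, 3, 7), 48.60),  # edited near-duplicate
]
print(len(near_duplicates(claims)))
```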
Outlier and round-number distributions
Plot the total-amount distribution across the batch, per employee, and per merchant. Genuine receipt totals form a messy long-tail distribution with cents scattered across the final digit. Fabricated claims cluster — at $50, $75, $100, $150 — because humans inventing numbers gravitate to round figures, and because generators often use round-number templates. Individual employees whose personal claim distribution shifts noticeably after a policy change (for example, whose average claim size jumps right after a threshold-based approval rule is introduced) also surface in the per-claimant distribution.
None of these anomalies are visible in any single receipt. They only appear once you have the batch as a table.
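The round-number tell is one of the easiest to compute once totals sit in a column. The 30 percent review threshold below is an illustrative starting point, not a standard:

```python
def round_number_share(totals):
    """Fraction of totals landing on a whole-dollar amount."""
    round_hits = sum(1 for t in totals if abs(t - round(t)) < 0.005)
    return round_hits / len(totals)

# Hypothetical per-claimant batches: messy genuine cents versus a run of
# clean round figures.
genuine = [48.72, 12.31, 103.18, 9.95, 66.40]
suspect = [50.00, 75.00, 100.00, 48.72, 150.00]
for batch in (genuine, suspect):
    share = round_number_share(batch)
    print(f"{share:.0%}", "REVIEW" if share > 0.30 else "ok")
```

Run the same function per claimant and per merchant, and watch how the share moves after any policy or threshold change.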
Executing Tier 4 with Invoice Data Extraction
Invoice Data Extraction is built to produce exactly the structured substrate this tier depends on. The workflow is the same one you would use for routine AP processing, repurposed as a verification control: you run structured receipt extraction and validation at scale by uploading the batch, writing a prompt describing the fields, and receiving a spreadsheet the checks run against.
The extraction interface is a single prompt field over a file upload area. Receipts and expense claims are explicitly supported document types, alongside invoices, payslips, purchase orders, bank statements, and the rest of the financial-document set. Batch processing handles up to 6,000 files per job; single PDFs can run up to 5,000 pages, which covers both a month of individual receipt images and a pre-consolidated PDF of hundreds of expense submissions.
A realistic verification prompt looks like this:
"Extract receipts for fraud-detection review. One row per receipt. Fields: Merchant Name, Merchant Street, Merchant City, Merchant State, Merchant Postcode, Merchant Country, Transaction Date (YYYY-MM-DD), Transaction Time (HH:MM, 24-hour), Subtotal, Tax Rate (percentage), Tax Amount, Total, Payment Method, Card Last Four, Transaction ID. Also extract line items into a second sheet: Receipt ID, Description, Quantity, Unit Price, Line Total. If Tax Rate is not stated on the receipt but Tax Amount and Subtotal are, compute Tax Rate as Tax Amount divided by Subtotal. If any of Subtotal, Tax Amount, or Total is missing, set its value to 0. Add a column Math Check: flag as 'FAIL' where Subtotal + Tax Amount differs from Total by more than 0.01, or where Subtotal times Tax Rate differs from Tax Amount by more than 0.01, otherwise 'PASS'. Format all dates as YYYY-MM-DD and ensure all currency fields have 2 decimal places."
The prompt-based configuration supports this kind of business-logic layering directly — conditionals, defaults, fallbacks, and classification columns described in the platform's prompt controls. A conditional like "if currency is USD, extract tax from the state tax field" and a default like "if tax amount is missing, set its value to 0" belong to the same family of rules as "flag rows where subtotal times tax rate does not equal tax amount within one cent." Encoding the math check as a column inside the extraction instruction means the flag arrives on the same row as the receipt it applies to, with the source file and page number already attached for click-through verification. Line-item extraction produces the child rows required for the sum-of-line-totals identity, using the standard one-row-per-line-item pattern with the receipt identifier repeated on every line.
The output is a structured Excel, CSV, or JSON file. The math check runs during extraction. The tax-rate join, merchant cross-reference, duplicate grouping, and distributional analysis run afterward against the columns, in whatever environment your team already uses — Excel with a reference sheet of jurisdictional rates, Power Query, a Python notebook, or an analytics tool pointed at the CSV. Every row carries a reference to the source file and page number, so a flagged receipt can be opened in one click to confirm the underlying document against the claim.
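The downstream math and tax-rate checks described above can be sketched in plain Python over the extracted rows. The jurisdiction rate table, field names, and one-cent tolerance are illustrative assumptions; the same joins run equally well in Excel with a reference sheet or in Power Query:

```python
# Illustrative reference rates; a real check would join against a
# maintained table of jurisdictional sales-tax rates.
JURISDICTION_RATES = {"NY": 0.08875, "CA": 0.0725}

def check_row(row, tol=0.01):
    """Return the list of failed checks for one extracted receipt row."""
    failures = []
    # Identity check: subtotal + tax must reconcile to the stated total.
    if abs(row["subtotal"] + row["tax"] - row["total"]) > tol:
        failures.append("MATH")
    # Rate check: stated tax must match the jurisdiction's rate.
    expected_rate = JURISDICTION_RATES.get(row["state"])
    if expected_rate is not None:
        if abs(row["subtotal"] * expected_rate - row["tax"]) > tol:
            failures.append("TAX_RATE")
    return failures

row = {"subtotal": 100.00, "tax": 12.00, "total": 112.00, "state": "NY"}
print(check_row(row))  # ['TAX_RATE'] -- the arithmetic adds up, but 12% is not NY's rate
```

Note how the two checks are independent: a fabricated receipt can pass the internal arithmetic and still fail the external rate join, which is exactly why the tier layers both.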
Triage, Escalation, and a Reviewer's Checklist
A flag on a receipt is rarely a binary verdict of fraud, so the response should be proportionate to the evidence and documented well enough to be reviewable later.
Three graduated responses
1. Re-check. One tier-1 or tier-2 concern on an otherwise clean receipt — slightly irregular spacing, a total that rounds suspiciously, a tax amount that's a little off but plausibly attributable to rounding. Return the receipt to the employee with a specific request: a clearer scan, a photo of the physical receipt, or a supporting artefact that corroborates the purchase (the calendar invite for the client dinner, the itinerary that includes the hotel, the corporate-card line item). Most legitimate claims produce one within a day.
2. Escalate. Two or more tiers show concerns, or any single tier-3 or AI-specific tell appears (producer string mismatched with the claimed merchant, embedded fonts typical of desktop publishing, metadata authored after the claimed purchase date, rendering quirks consistent with generative output). Pull the receipt from the batch and hand it to internal audit or compliance with documented evidence: which checks failed, screenshots of the metadata, the extracted data row with the failing fields highlighted, and any duplicate-match results from the batch-level validation.
3. Reject or deny reimbursement. A documented tier-3 or tier-4 failure that cannot be reconciled against a legitimate cause. Examples: a producer string that is demonstrably inconsistent with the claimed merchant's point-of-sale system; math that does not reconcile and cannot be traced to a tip, service fee, or rounding; a duplicate submission that matches a claim already paid on a prior expense report. Record the specific basis for the rejection in the case file and retain the receipt, the extracted data, and the evidence for the period required by your organisation's retention policy.
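The three graduated responses above reduce to a small decision rule. The encoding below is one possible sketch of that rule, not a prescribed implementation; the input shapes are assumptions:

```python
def triage(tier_concerns, tier3_or_ai_tell=False, confirmed_tier34_failure=False):
    """Map evidence to one of the three graduated responses.
    tier_concerns: set of tier numbers (1-4) that raised at least one concern."""
    if confirmed_tier34_failure:
        return "reject"            # documented, irreconcilable tier-3/4 failure
    if len(tier_concerns) >= 2 or tier3_or_ai_tell:
        return "escalate"          # multiple tiers, or any tier-3 / AI tell
    if tier_concerns:
        return "re-check"          # single concern on an otherwise clean receipt
    return "pass"

print(triage({2}))                                  # re-check
print(triage({1, 2}))                               # escalate
print(triage(set(), tier3_or_ai_tell=True))         # escalate
print(triage({3}, confirmed_tier34_failure=True))   # reject
```

Encoding the rule explicitly, even this crudely, is what makes the triage consistent across reviewers and auditable after the fact.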
Whichever response is taken, attach the evidence to the case. A re-check that resolves in the employee's favour should still carry the note explaining what triggered it; if the same employee triggers another re-check next month on the same type of signal, that pattern is only visible if the earlier case was documented.
Where receipt checks sit in the wider control environment
Receipt verification is one control among several, and it is most useful when the others are also operating. Corporate-card transaction matching — cross-referencing every claimed purchase against the card feed — is the most effective single control you can pair it with, because a fabricated receipt for a purchase that never happened has no corresponding card charge. Approver hierarchies that automatically escalate unusual claims (amounts above a threshold, vendors not on the approved list, claims from employees with recent anomalies) give a second pair of eyes on higher-risk items. Reconciliation against employee calendars and travel records catches the claim that matches a card charge in amount but was never really incurred in the course of work.
Expense-management platforms such as Ramp handle the card-feed matching layer reliably; what they don't see is the fabricated receipt itself — a receipt generated in a templating tool or by an AI model for a purchase that either never happened or happened at a different amount or merchant than claimed. The receipt-verification framework sits on top of the card feed and catches exactly that gap. This is the distinct shape of expense reimbursement receipt fraud: the document is the lever, because the document is what the reimbursement decision relies on when the purchase isn't reflected on a corporate card. Expense reimbursement fraud as a broader category includes inflated mileage, duplicate claims, and ghost vendors; receipt authenticity is one of its most common attack surfaces.
The same four-tier pattern — visual, content-logic, file-level, structured-data — extends cleanly to pay-stub verification with year-to-date math and payroll logic and to invoice authenticity; the document type changes, the tiers don't.
Reviewer's checklist
Print this and keep it at the review desk. Each bullet is a check a reviewer should be able to complete in seconds.
Tier 1 — Visual
- Compare font family and weight against known POS output for this chain.
- Check alignment of columns, line spacing, and character kerning for inconsistencies.
- Look for logo distortion, low-resolution artefacts, or colour mismatches.
- Verify paper-type cues (thermal-paper fade patterns, perforation edges) on photographed originals.
- Confirm receipt layout matches the merchant's current format, not a format retired years ago.
Tier 2 — Content logic
- Compute subtotal plus tax plus tip and compare against the stated total; investigate any variance.
- Compute subtotal multiplied by the jurisdiction's tax rate and compare against the stated tax.
- Cross-check merchant name, address, and tax ID against a registry or the merchant's own website.
- Verify the date falls on an operating day (not a closed holiday or before the merchant existed).
- Confirm the time of day is plausible for the merchant's operating hours.
- Check that line items are consistent with what this merchant actually sells.
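The operating-day and operating-hours checks in this tier can be sketched as a lookup against reference data. The merchant-hours table and opening dates below are invented for illustration; a real check would use the merchant's published hours and a registration registry:

```python
from datetime import datetime

# Illustrative reference data (Monday=0 ... Sunday=6).
MERCHANT_HOURS = {"Cafe Uno": {"open": 7, "close": 19, "closed_days": {6}}}
MERCHANT_OPENED = {"Cafe Uno": datetime(2019, 5, 1)}

def plausible_timestamp(merchant, ts):
    """True/False if the claimed timestamp is plausible for this merchant;
    None when no reference data exists (inconclusive, not a pass)."""
    hours = MERCHANT_HOURS.get(merchant)
    opened = MERCHANT_OPENED.get(merchant)
    if hours is None or opened is None:
        return None
    if ts < opened:
        return False   # receipt predates the merchant's existence
    if ts.weekday() in hours["closed_days"]:
        return False   # claimed purchase on a closed day
    return hours["open"] <= ts.hour < hours["close"]

print(plausible_timestamp("Cafe Uno", datetime(2026, 3, 3, 8, 40)))  # True (a Tuesday morning)
print(plausible_timestamp("Cafe Uno", datetime(2026, 3, 1, 8, 40)))  # False (a Sunday)
```

Returning `None` rather than `True` for unknown merchants matters: absence of reference data should route the receipt to a human, not wave it through.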
Tier 3 — File forensics
- Inspect the PDF producer string against the expected POS producer for this merchant.
- Review creation and modification timestamps against the claimed purchase date.
- Check for embedded fonts typical of desktop publishing rather than POS output.
- Flag producer strings associated with generative AI tools or generic templating software.
- Examine image metadata (EXIF) on photographed receipts for inconsistencies with the claim.
- Look for re-saved-JPEG signatures or layer artefacts that suggest editing.
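The producer-string and timestamp checks operate on metadata that has already been pulled from the file (for example with pypdf's `PdfReader.metadata` or exiftool). The sketch below assumes that extraction has happened and works on a plain dict; the suspect-producer list is illustrative, not exhaustive:

```python
from datetime import datetime

# Producer strings typical of desktop editing or generic templating
# rather than point-of-sale output -- an illustrative list only.
SUSPECT_PRODUCERS = ("microsoft word", "adobe photoshop", "canva", "reportlab")

def forensic_flags(meta, claimed_purchase):
    """meta: dict with 'producer', 'created', 'modified' already extracted
    from the PDF. Returns the list of tier-3 flags raised."""
    flags = []
    producer = (meta.get("producer") or "").lower()
    if any(s in producer for s in SUSPECT_PRODUCERS):
        flags.append("SUSPECT_PRODUCER")
    created, modified = meta.get("created"), meta.get("modified")
    if created and created.date() < claimed_purchase.date():
        flags.append("CREATED_BEFORE_PURCHASE")  # file predates the purchase
    if created and modified and modified > created:
        flags.append("MODIFIED_AFTER_CREATION")  # file was edited after creation
    return flags

meta = {"producer": "Adobe Photoshop 25.0",
        "created": datetime(2026, 2, 27, 9, 0),
        "modified": datetime(2026, 2, 28, 9, 0)}
print(forensic_flags(meta, datetime(2026, 3, 1)))
# ['SUSPECT_PRODUCER', 'CREATED_BEFORE_PURCHASE', 'MODIFIED_AFTER_CREATION']
```

Each flag is a reason to escalate, not to reject outright: a legitimate scan can carry an editing-suite producer string, which is why tier 3 feeds the escalation path rather than the denial path.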
Tier 4 — Structured-data validation
- Run the extracted batch through programmatic math reconciliation across every row.
- Apply jurisdictional tax-rate sanity checks to every tax line.
- Cross-reference extracted merchant names against a known-merchant list.
- Detect duplicates across the current batch and against previously paid claims.
- Match claimed amounts and merchants against the corporate-card transaction feed.
- Route any row that fails two or more programmatic checks into the escalation queue automatically.
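The final routing rule in the checklist can be sketched as a one-pass split over the checked rows. The row shape is an assumption carried over from the earlier checks:

```python
def route(rows):
    """Split rows into an escalation queue (two or more failed
    programmatic checks) and a pass-through pile."""
    escalate, keep = [], []
    for row in rows:
        (escalate if len(row["failed_checks"]) >= 2 else keep).append(row)
    return escalate, keep

rows = [
    {"id": "R1", "failed_checks": []},
    {"id": "R2", "failed_checks": ["MATH", "DUPLICATE"]},
    {"id": "R3", "failed_checks": ["TAX_RATE"]},
]
queue, passed = route(rows)
print([r["id"] for r in queue])  # ['R2']
```

R3, with its single failed check, stays in the normal flow as a re-check candidate rather than landing in the escalation queue, which mirrors the graduated responses described earlier.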
Related Articles
Explore adjacent guides and reference articles on this topic.
AI-Generated Invoice Fraud: Detection and AP Controls
AI-generated invoice fraud demands more than visual review. Learn the AP controls that matter: provenance checks, logic tests, and vendor verification.
How to Spot a Fake Pay Stub: Red Flags and Math Checks
Learn how to spot a fake pay stub using red flags, payroll math, YTD checks, and employer verification before you rely on proof of income.
How to Detect Fake Invoices: Red Flags Before Payment
Practical guide for AP teams to spot fake invoices using visual checks, math validation, PDF metadata review, and structured extraction.
Extract invoice data to Excel with natural language prompts
Upload your invoices, describe what you need in plain language, and download clean, structured spreadsheets. No templates, no complex configuration.