How to Extract Line Items from UK Builders' Merchant Invoices

Extract line items from Travis Perkins, Jewson, Howdens, Selco, and MKM invoices into Excel or CSV for bookkeeping, VAT checks, and job costing.

Reading time: 9 min
Topics: Industry Guides, Construction, UK, Excel, job costing, builders merchants

When you extract line items from builders' merchant invoices issued by Travis Perkins, Jewson, Howdens, Selco, or MKM, the useful output is one row per product line, with the invoice-level fields repeated on every row: supplier, invoice number, invoice date, account reference, VAT treatment, and invoice totals. For UK trade suppliers, the export that saves the most work is an Excel or CSV file that keeps SKUs, quantities, unit prices, line totals, and delivery or order references intact, because those are the fields bookkeepers need for VAT checks, job costing, and spend analysis.

That is why a Travis Perkins invoice-to-spreadsheet workflow or a Jewson PDF-to-Excel workflow usually falls apart when it only captures header totals. A builder or construction bookkeeper does not just need proof that an invoice existed. They need to see what was bought, how much was bought, what VAT treatment applied, and which branch, delivery note, or account reference ties the purchase back to a supplier account or project.

This article stays focused on that extraction problem. It is about turning recurring builders' merchant purchase invoices into structured rows that are ready for bookkeeping, project-cost tracking, and accounting prep. It is not a guide to logging into merchant portals to download PDFs, and it is not about subcontractor CIS sales invoices, which follow a different documentation workflow.

The fields that repeat across Travis Perkins, Jewson, Howdens, Selco, and MKM invoices

Across the main UK builders' merchants, the page design changes but the bookkeeping structure barely does. A usable extraction has to keep the invoice-level identifiers, including supplier name, invoice number, invoice date, delivery date where shown, customer or account reference, branch or depot, order or delivery reference, subtotal, VAT amount, and grand total. If those fields are missing, the line items lose their audit value even when the product descriptions are technically present.

The line-item rows need their own discipline. A line-item extract from a Howdens invoice is only useful if it keeps the product code or SKU, description, quantity, unit price, line total, and any line-level VAT treatment that appears on the document. The same rule applies to Travis Perkins timber lines, Jewson civils materials, Selco sundries, or MKM plumbing and heating supplies. Merchant layouts vary, but they all carry enough repeated structure to support a repeatable extraction approach.

Part numbers, abbreviations, and depot references are not clutter. In builders' merchant workflows they are often the only precise way to tell one material line from another, especially when descriptions are shortened to fit narrow columns. That matters for supplier queries, stock checks, job-cost reviews, and month-end analysis where "timber" or "fittings" is too vague to trust.

HMRC explicitly recognizes that builders' merchant invoices do not always use long plain-English descriptions. In its guidance on builders' merchant invoice descriptions, HMRC says coded descriptions may be accepted on VAT invoices in the builders' merchant trade where customers order from illustrated catalogues and invoices show part numbers as the description of the goods, provided an up-to-date catalogue is available for inspection. That is a strong reason to preserve merchant codes and descriptions exactly as they appear instead of flattening them during extraction.

Build the spreadsheet around one row per line item

The safest spreadsheet design is simple: one row per line item, with the invoice-level identifiers repeated on every row. That means every materials row should still carry the supplier, invoice number, invoice date, account reference, branch or depot, and invoice totals. Repeating those fields is not wasteful. It is what makes the data filterable, reconcilable, and ready for bookkeeping without another round of manual stitching.

For most builders' merchant invoices, the core columns are supplier, invoice number, invoice date, delivery date where available, account reference, branch or depot, order or delivery reference, product code, description, quantity, unit price, line total, VAT rate, VAT amount, invoice subtotal, and invoice total. If a job code, plot number, or site reference appears on the source invoice, that should stay with the exported row as well. It is also worth keeping a source filename or source-page reference column when the workflow allows it, because that makes audit checks far easier once the sheet has been filtered or split by project. The point is to keep each material line self-contained so the sheet still makes sense after sorting, filtering, or splitting by project.
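The row design described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive schema: the column names, the `flatten_invoice` and `rows_to_csv` helpers, and the sample invoice are all hypothetical, and the sketch assumes the extraction step has already produced a dictionary per invoice with a `lines` list.

```python
import csv
import io

# Hypothetical column names; rename them to match your own sheet.
HEADER_FIELDS = [
    "supplier", "invoice_number", "invoice_date", "account_ref",
    "branch", "delivery_ref", "invoice_subtotal", "invoice_vat",
    "invoice_total", "source_file",
]
LINE_FIELDS = [
    "product_code", "description", "quantity",
    "unit_price", "line_total", "vat_rate",
]


def flatten_invoice(invoice):
    """Return one dict per line item, repeating the header fields on each."""
    header = {name: invoice.get(name, "") for name in HEADER_FIELDS}
    rows = []
    for line in invoice["lines"]:
        row = dict(header)  # copy so every row is self-contained
        row.update({name: line.get(name, "") for name in LINE_FIELDS})
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a stable column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADER_FIELDS + LINE_FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data, for illustration only.
example_invoice = {
    "supplier": "Example Merchant Ltd",
    "invoice_number": "INV-0001",
    "invoice_date": "2024-04-03",
    "account_ref": "ACC-123",
    "branch": "Example Depot",
    "delivery_ref": "DN-456",
    "invoice_subtotal": "150.00",
    "invoice_vat": "30.00",
    "invoice_total": "180.00",
    "source_file": "example.pdf",
    "lines": [
        {"product_code": "TIM-001", "description": "CLS 38x89 2.4M",
         "quantity": "20", "unit_price": "5.00", "line_total": "100.00",
         "vat_rate": "20"},
        {"product_code": "FIX-010", "description": "SCREWS 5x80 BOX 200",
         "quantity": "5", "unit_price": "10.00", "line_total": "50.00",
         "vat_rate": "20"},
    ],
}

rows = flatten_invoice(example_invoice)
csv_text = rows_to_csv(rows)
```

Because each row carries its own copy of the header fields, the sheet can be sorted, filtered, or split by project without losing the link back to the source invoice.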

Normalization matters just as much as field selection. Dates should be standardized, quantities and prices should land as numbers, column names should stay stable across merchants, and line totals should remain distinct from invoice totals. That is what turns a merchant-invoice-to-CSV workflow for construction into something that works in pivot tables and review sheets instead of becoming another cleanup task.
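As a rough sketch of that normalization step, the helpers below convert date strings to ISO format and amounts to exact decimals. The function names and the list of accepted date formats are assumptions for illustration; the formats are read day-first, as is usual on UK invoices, and month names are parsed using the default English locale.

```python
from datetime import datetime
from decimal import Decimal

# Assumed date layouts, all day-first except the ISO form at the end.
DATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%d %b %Y", "%d %B %Y", "%Y-%m-%d")


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or raise if no known format matches."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")


def normalize_amount(text):
    """Strip the pound sign and thousands separators, return a Decimal."""
    cleaned = text.strip().replace("£", "").replace(",", "")
    return Decimal(cleaned)
```

Using `Decimal` rather than floating-point numbers keeps pence exact, which matters once line totals are summed and compared with invoice totals. Raising on an unknown date format, instead of guessing, surfaces a new merchant layout early.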

This is also why invoice line item extraction is more than a technical feature name. In practice, it is the row design that lets an Excel export of Selco invoices support supplier analysis and lets an MKM bookkeeping sheet roll straight into reconciliation, month-end review, or per-job material costing.

Use a prompt that captures merchant detail without manual rekeying

A prompt-based workflow works best when it tells the extractor exactly how the rows should be built. For recurring OCR jobs on UK trade supplier invoices, the instruction pattern is straightforward: create one row for each line item, repeat the invoice-level identifiers on each row, extract the line-level amounts and VAT fields, and standardize the invoice date. A practical prompt can be as direct as: "Create one row for each line item. Repeat the invoice number, invoice date, and supplier on every row. Extract quantity, unit price, line total, VAT rate, VAT amount, account reference, and delivery reference. Ignore summary pages and statements. Format dates as YYYY-MM-DD." If the invoice shows an account reference, branch or depot, or delivery reference, those should be captured because they are often the fields a bookkeeper needs later to resolve supplier queries or assign spend to a job.

The prompt should also tell the extractor what not to treat as invoice data. Builders' merchant batches sometimes include summary pages, statement pages, email covers, or supporting attachments. If those pages are left in scope, the output fills up with false rows. A good prompt removes that ambiguity up front and keeps the spreadsheet limited to the invoice lines that belong in the books.

This is where invoice data extraction for recurring supplier PDFs becomes practical rather than theoretical. The useful workflow is not "convert PDF to spreadsheet" in the abstract. It is to upload the recurring merchant invoices, describe the fields and row structure needed in plain language, and get back a structured file that already matches the bookkeeping job.

Invoice Data Extraction is built around exactly that prompt-first workflow. The product lets users upload invoice PDFs, describe what to extract in natural language, and download structured Excel, CSV, or JSON output. For recurring merchant formats, saved prompts are especially useful because the same extraction rules can be reused month after month even when line counts, page counts, and product mixes change.

Validate VAT, delivery references, and merchant quirks before posting

Once the rows are extracted, the next question is whether they can be trusted. Start with the arithmetic. The line totals should roll up to the invoice subtotal and total, and the VAT treatment should still make sense after export. On builders' merchant invoices that often means checking whether 20%, 5%, and 0% lines have stayed attached to the right materials rather than being collapsed into a single summary number.
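The arithmetic check can be sketched as below. It is an illustration under stated assumptions: the rows for a single invoice use the hypothetical column names `line_total`, `vat_rate`, `invoice_subtotal`, `invoice_vat`, and `invoice_total`, all amounts are already `Decimal` values, and a two-pence tolerance is allowed because merchants may round VAT per line or per invoice.

```python
from collections import defaultdict
from decimal import Decimal

TOLERANCE = Decimal("0.02")  # assumed rounding allowance


def check_invoice(rows):
    """Return a list of problems found in the rows of ONE invoice."""
    problems = []
    first = rows[0]  # header fields are repeated, so any row will do

    # 1. Line totals should roll up to the invoice subtotal.
    net_sum = sum((r["line_total"] for r in rows), Decimal("0"))
    if abs(net_sum - first["invoice_subtotal"]) > TOLERANCE:
        problems.append(f"Net lines sum to {net_sum}, "
                        f"subtotal says {first['invoice_subtotal']}")

    # 2. VAT recomputed per rate should match the invoice VAT amount.
    net_by_rate = defaultdict(Decimal)
    for r in rows:
        net_by_rate[r["vat_rate"]] += r["line_total"]
    expected_vat = sum(
        (net * rate / 100 for rate, net in net_by_rate.items()),
        Decimal("0"),
    ).quantize(Decimal("0.01"))
    if abs(expected_vat - first["invoice_vat"]) > TOLERANCE:
        problems.append(f"VAT by rate gives {expected_vat}, "
                        f"invoice says {first['invoice_vat']}")

    # 3. Subtotal plus VAT should equal the grand total.
    gross = first["invoice_subtotal"] + first["invoice_vat"]
    if abs(gross - first["invoice_total"]) > TOLERANCE:
        problems.append(f"Subtotal plus VAT is {gross}, "
                        f"total says {first['invoice_total']}")
    return problems
```

An empty list means the three checks passed. A VAT mismatch is a useful signal that mixed-rate lines may have been collapsed into a single rate during extraction.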

The second check is document context. Delivery references, order references, branch codes, and account numbers need to survive the extraction if the output will be used for job costing or supplier follow-up. Builders' merchants also shorten descriptions, wrap product lines across multiple rows, and mix summary or statement pages into the same PDF set. If those quirks are not handled carefully, the spreadsheet looks full but still fails the real bookkeeping test.
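One of those quirks, descriptions wrapped across multiple rows, can be handled with a simple heuristic. The sketch below assumes that a raw row with no product code, quantity, or line total is the continuation of the description above it. That rule and the `merge_wrapped_lines` name are illustrative assumptions, and a real batch may need a stricter test.

```python
def merge_wrapped_lines(raw_rows):
    """Fold continuation rows into the description of the previous row."""
    merged = []
    for row in raw_rows:
        is_continuation = (
            bool(merged)
            and not row.get("product_code")
            and not row.get("quantity")
            and not row.get("line_total")
        )
        if is_continuation:
            previous = merged[-1]
            previous["description"] = (
                previous["description"] + " " + row.get("description", "")
            ).strip()
        else:
            merged.append(dict(row))  # copy so the input is left untouched
    return merged
```

Comparing the row count before and after merging against the line count on the source PDF is a quick way to confirm the heuristic has not swallowed a genuine line.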

VAT detail deserves its own review because this is where a superficially neat export can still create rework later. The safest habit is to compare a few exported rows back to the source PDF and confirm that the VAT coding, taxable amounts, and invoice totals match what would be expected under normal UK VAT invoice requirements. That is especially important when the same supplier invoice includes different material types or non-stock charges on one document. For self-build projects, keeping those invoice fields intact also makes it easier to maintain a VAT431NB invoice schedule for self-build VAT claims.

It is also worth keeping the workflow boundary clear. Merchant purchase invoices are not the same as subcontractor CIS invoices or applications for payment. They sit on the purchasing side of the books, so the validation routine should be built around materials spend, VAT support, and traceable supplier references rather than subcontractor deduction rules.

Put the extracted data to work in month-end and project-costing workflows

The value of extracting data from UK builders' merchant invoices shows up after the spreadsheet is finished. A clean line-item export gives the bookkeeper or accountant a usable staging file for Xero, QuickBooks, or Sage prep, with supplier references, dates, VAT fields, and material lines already separated. That reduces manual rekeying, but more importantly it makes the month-end review faster because the data can be filtered by supplier, site, branch, invoice, or product code instead of being trapped inside PDFs. It also makes it easier to support builders' merchant statement matching in Xero, QuickBooks, or Sage.

The same export also supports operational checks that header-only invoice capture cannot. A builder can see material spend by job, compare repeat purchases across merchants, flag unusual unit prices, and hand a clearer pack to the person posting the entries. For firms buying from Travis Perkins, Jewson, Howdens, Selco, or MKM every month, that consistency matters more than a one-off conversion because the same suppliers recur across multiple jobs and reporting periods.
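Two of those checks, spend by job and unusual unit prices, can be sketched as follows. The column names (`job_ref`, `product_code`, `unit_price`, `line_total`), the function names, and the 25% threshold are assumptions for illustration, and amounts are expected to be `Decimal` values.

```python
from collections import defaultdict
from decimal import Decimal
from statistics import median


def spend_by_job(rows):
    """Sum net line totals per job reference."""
    totals = defaultdict(Decimal)
    for r in rows:
        totals[r.get("job_ref") or "UNASSIGNED"] += r["line_total"]
    return dict(totals)


def flag_unusual_prices(rows, threshold=Decimal("0.25")):
    """Return rows whose unit price is far from the product's median price."""
    prices = defaultdict(list)
    for r in rows:
        prices[r["product_code"]].append(r["unit_price"])

    flagged = []
    for r in rows:
        typical = median(prices[r["product_code"]])
        # Skip zero medians to avoid dividing by zero.
        if typical and abs(r["unit_price"] - typical) / typical > threshold:
            flagged.append(r)
    return flagged
```

The median is used instead of the mean so that a single mis-keyed price does not drag the baseline toward itself. Lines without a job reference are grouped under `UNASSIGNED` so they stay visible in the totals instead of silently dropping out.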

That is the practical standard to aim for. If the output preserves the invoice identifiers, line-item detail, VAT context, and merchant references needed for reconciliation, it can move straight into construction invoice processing workflows and month-end review. If it only produces invoice headers, the manual work has mostly just shifted downstream.
