Convert a Bill of Quantities PDF to Excel (NRM2 Tender BOQ)

UK QS workflow for converting an NRM2 tender BOQ PDF into pricing-ready Excel — preserving Work Section hierarchy and reconciling to the collection sheet.

Published
Updated
Reading Time
23 min
Topics:
Industry GuidesConstructionUKExcelNRM2bill of quantitiestender pricingquantity surveying

Converting a bill of quantities pdf to excel, for an NRM2 tender, comes down to five structural moves the spreadsheet has to land before pricing starts. One row per measured item. The Work Section / Work Group / Item hierarchy preserved as Excel grouping rather than flattened into the description column. Preliminaries, Measured Work, Provisional Sums, Prime Cost Sums, and Dayworks kept in distinct sections, because each is priced under different rules. The Rate and Amount columns left blank so build-up rates can drop in without disturbing the original quantities. And per-Work-Section quantity totals reconciled back to the BOQ's collection sheet as the verification step.

Everything that follows in this article is detail on those five moves, applied to a real tender BOQ rather than a template.

The orientation is short, because the audience already has it. An NRM2 BOQ is organised hierarchically — Work Section at the top, Work Group beneath it, then individual measured items. Item descriptions sit in the conventional [code] [description] form. The item rows carry quantity, unit, rate, and amount columns. Five distinct section types live inside the document — Preliminaries, Measured Work, Provisional Sums, Prime Cost Sums, and Dayworks — and the rules of measurement treat each one differently. As factual anchor: NRM2 (RICS New Rules of Measurement: Detailed measurement for building works) became operative on 1 January 2013 and formally replaced SMM7 on 1 July 2013, organising building works into 41 indexed Work Sections from Preliminaries (Section 1) through Builder's Work in Connection (Section 41) — the RICS NRM2 Work Section structure is the spine the converted spreadsheet has to reproduce.

The audience this is written for is the QS converting a real tender BOQ this afternoon — not the QS looking for a downloadable template, and not the QS evaluating a takeoff package they'll buy next quarter. The tender pack is on the desk. The BOQ inside it runs to several hundred pages. The pricing has to start tomorrow morning, and what the QS needs is a pricing-ready Excel that respects the structure of the document it came from. Older tender BOQs that turn up on framework agreements or carried-over refurbishment work may be SMM7-compliant rather than NRM2-compliant; the section numbering shifts, but the structural shape of the document is similar enough that the same conversion approach holds.

Specifying the extraction so the structure survives

Most BOQ-to-Excel conversions fail before they begin, because the output isn't specified. The QS opens the PDF, copies a page of items, pastes into Excel, finds that the Work Section heading has collapsed into the first item's description and that the columns have drifted, fixes a few rows by hand, and within twenty minutes is rebuilding the BOQ row by row. The fix is to specify the output before any extraction runs.

Eight columns, at minimum:

  • Work Section — as its own column, or as an Excel outline / grouping level. Not concatenated into the description.
  • Work Group — the sub-grouping under the Work Section, again as its own column.
  • Item Code — the item identifier the BOQ uses (for example 4.1.A, D20.1.1).
  • Item Description — the prose description, complete and rejoined where the PDF split it across pages.
  • Quantity — the BOQ's stated quantity, unrounded.
  • Unit — normalised to a clean enumerated set (see below).
  • Rate — left blank, ready for the QS's build-up rate.
  • Amount — left blank, or pre-populated with the placeholder formula =quantity * rate so the row calculates as soon as the rate goes in.

The single biggest reason copy-and-paste BOQ extractions become unusable is that the Work Section disappears into the description string. Section 4 of an NRM2 BOQ might span seventy pages and include several hundred items; once those items live in a flat list with the section identifier buried in the description prose, the QS can no longer filter, group, or roll up by section. The conversion has to put the Work Section in its own column. Excel then handles grouping, subtotalling, and outline-level collapsing as a property of that column.

NRM2 BOQs use a recognisable enumerated set of units of measurement: m, , , item, t, no., kg, hr, sum. PDF rendering produces surface variation on most of these — M2, sqm, square metres, and m^2 are the same unit, but Excel sees four distinct strings unless the conversion normalises them. Coerce the unit column into the canonical set as part of the extraction. The result is a Unit column the QS can pivot on, filter on, and use as a sense-check on the rate column further down.

One row per measured item. The same rule applies to preliminary items, provisional sum items, prime cost sum items, and dayworks rate-schedule items — each row carries its section context as data in the Work Section / Work Group columns, not as a header buried in a heading row. Don't aggregate items, don't concatenate rows, and don't split a single item into multiple rows because the description wrapped across two visual lines on the page.

NRM2 item descriptions are conventionally written as [code] [description]. Splitting code and description into separate columns is worth the few extra minutes it takes during the conversion specification, because the priced BOQ gets queried by code at every downstream stage — variation pricing references the BOQ rate by item code, drawings get reconciled against codes, and the post-tender clarifications come back referencing item codes from the issued document. A code column the QS can filter on saves the search-the-description-string workaround forever.

The QS who would rather not transcribe several hundred pages of items by hand can run the same specification through an extraction tool. The interaction model is the one a QS would already recognise from any modern AI tool: a single prompt-and-upload field, the BOQ PDF goes in, and a natural-language prompt names the columns, the section grouping, and the unit set. Something on the order of: "Extract one row per item from this NRM2 BOQ. Columns: Work Section, Work Group, Item Code, Item Description, Quantity, Unit, Rate (leave blank), Amount (leave blank). Normalise units to m, m², m³, item, t, no., kg, hr, sum. Keep Preliminaries, Measured Work, Provisional Sums, Prime Cost Sums, and Dayworks in distinct sections." That prompt is the configuration. Both native and scanned BOQ PDFs go through the same interface, single PDFs run up to 5,000 pages, and tender packs containing multiple BOQ files can be processed as a batch of up to 6,000 documents in a single job — which matters when the tender pack arrives as one BOQ per Work Section split across separate PDFs. The output comes back as a structured .xlsx (or .csv, or .json), and a QS who runs repeat tender conversions for the same client or the same procuring authority can save the prompt to a prompt library and re-apply it the next time round. This is one place the extract structured data from construction PDFs option is genuinely earning its place in the workflow — the extraction follows the QS's specification, rather than producing a generic flat dump.

The same extraction discipline applies to the supplier-side documents the construction QS handles in parallel. The wider treatment of invoice data extraction for the construction industry covers how the same prompt-driven approach lands on builders' merchant invoices, subcontractor applications, and CIS-deducted payments — which the QS converting a tender BOQ this afternoon will be handling on the live job from the same desk next week.

Preliminaries, Provisional Sums, Prime Cost Sums, and Dayworks — why each section needs its own treatment

The eight-column layout from the previous section is the measured-work layout. It does not fit Preliminaries, it does not fit Provisional Sums or Prime Cost Sums in the same way, and it does not fit Dayworks at all. Forcing every BOQ section into one flat table is the second-most-common reason a converted BOQ comes out broken — the prelims get mangled, the PS items get priced as if they were measured items, and the dayworks rate schedule disappears into a column it doesn't belong in. Each of the five section types has to be extracted in a shape the rules of measurement actually allow.

Preliminaries sit at Section 1 in NRM2 and are priced one of three ways depending on how the BOQ has set them up. Sometimes the prelims are a single fixed lump sum at the bottom of a list. Sometimes they're a structured preliminaries pricing schedule with separate columns for time-related charges (site management, scaffolding hire, welfare facilities) and fixed charges (mobilisation, demobilisation, one-off attendances). Sometimes it's a hybrid. The conversion has to respect whichever shape the BOQ used. If the BOQ has separate time-related and fixed-charge columns, those columns survive into the spreadsheet as columns, not as rows mashed into one Amount field. If it's a single sum, it stays as a single sum row. What it cannot become is a per-item rate-times-quantity layout, because that's not how prelims are priced — there's no contract quantity for "site agent for 36 weeks" the way there is for of brickwork.

Measured Work is what the eight-column layout was built for. The bulk of the BOQ lives here. One row per measured item, full Quantity, Unit, Rate, and Amount columns, Work Section grouping preserved, codes split from descriptions. This is where the structural rules from the previous section get their full workout, and it's the part of the spreadsheet the QS spends most of the pricing window inside.

Provisional Sums need a Type column. NRM2 distinguishes "defined" and "undefined" provisional sums, and the BOQ usually labels each — defined PS items have known scope, programme implications, and design intent, and the contractor is expected to allow for them in the programme; undefined PS items have none of that, and the contractor isn't required to make programme allowance. The conversion extracts each PS item as its own row, marks it in a Type column with a value like Provisional Sum (Defined) or Provisional Sum (Undefined), and leaves the Rate column blank. The QS prices a PS only when scope is firmed up, and the contract administrator's instruction is what triggers the work in the first place. Keeping PS items visible but unpriced is the discipline that stops the QS from accidentally pricing them as measured items because they happen to share the row layout.

Prime Cost Sums also belong in the Type column, marked as Prime Cost Sum. PC sums are the BOQ's placeholder for work to be carried out by a nominated subcontractor or supplier whose price isn't yet known — typically specialist mechanical and electrical packages, or specified-supplier kitchen and joinery items where the design team has nominated the supplier but not yet agreed the price. The PC value in the BOQ is a sum the contractor includes in the tender, and that sum is replaced with the actual subcontractor's price when the sub-tender comes back. The conversion has to extract PC items into their own rows, flag them as Prime Cost Sum, and pass them through to the PC-sum sub-tender process — never silently absorb a PC sum into the measured-work pricing, because the PC value isn't the contractor's price to commit to.

Dayworks are different again. The dayworks section in an NRM2 BOQ is a rate schedule, not a list of measured items: labour rates by trade and grade, plant rates by item or hour, and materials handling percentages. The conversion extracts the rate schedule itself, not measured items, and the result lives in its own sheet inside the workbook. Dayworks rates aren't priced against a contract quantity; they sit waiting for the contract administrator to instruct work that doesn't fit the measured-work rates and can't be valued on the BOQ rates either. A clean dayworks rate schedule is what makes that valuation possible without going back to the original PDF mid-job.

A usable converted BOQ is therefore not a single flat list of rows. It's a workbook: a measured-work block with the eight-column layout running across each Work Section, a preliminaries block whose shape mirrors the BOQ's prelims pricing schedule, PS and PC items flagged in a Type column so they're visible but not mistakenly priced, and a dayworks rate schedule sitting in its own block ready to be referenced. Each block is in the same workbook so the QS can move between them, but each respects the rules the section is priced under.

Page break realities — continuation pages, mid-item splits, and headers and footers

Three failure modes account for almost every broken BOQ extraction in practice. Carry-forward subtotals get picked up alongside the items and the quantity totals double-count. Item descriptions split across page breaks and the second half of the description ends up glued to the next item's code. Repeating page furniture — project name, employer, BOQ section name, page numbers — gets extracted as data and turns up as a row inserted between every page's worth of items. Each one has a clean rule.

Carry-forward and brought-forward subtotals. NRM2 BOQs use Carried Forward totals at the bottom of each page and Brought Forward totals at the top of the next page so that the page-by-page totals add up cleanly when the document is read in print. They are running subtotals, not items. The reliable identifier is the description column — text containing Carried Forward, Brought Forward, c/f, or b/f, usually paired with a missing or running-total quantity column. The extraction has to extract item rows only and exclude every carry-forward and brought-forward line, otherwise the spreadsheet's quantity column will count the same items twice (once as items, once as the section's running subtotal). The check on this is the collection-sheet reconciliation in the next section — if the section total is exactly the BOQ total plus a stray subtotal value, this rule was missed.

Mid-item page breaks. An NRM2 item description can run several lines, especially in measured work for finishes, M&E, and joinery where the description carries SMM-style supplementary information about the work being measured. A long description can split across a page break — the first half on the bottom of one page, the second half on the top of the next. The PDF text layer carries no cue that it's the same item; the visual layout is doing all the work. The extraction has to delimit items by the item code prefix, not by the page break — an item starts when a code-prefixed line appears, and continues until the next code-prefixed line, even if a page break sits between them. Rejoin the description halves so the row in the spreadsheet carries the complete description, otherwise variation pricing later in the project gets done against half a description.

Headers, footers, and repeating page furniture. Tender BOQs typically run the project name, the contractor or employer name, the BOQ section identifier, and a page number across the top and bottom of every page. None of that belongs in the spreadsheet. The reliable identifier is positional repetition — text that appears in the same location on every page in a section is page furniture, not data. The extraction recognises the repeated pattern and drops it before assembling rows. Without this step, the spreadsheet ends up with a project-name row inserted every twenty or thirty rows, and every one of those false rows pollutes filters, sums, and section grouping.

Scanned BOQs. Older tender packs, and many tender returns coming back from main contractors, arrive as scanned PDFs rather than native PDFs — the document was printed, signed, scanned, and re-issued, and the text layer is OCR output rather than the publishing tool's original text. The structural rules above don't change. Carry-forward exclusion still applies. Mid-item rejoin still applies. Page-furniture drop still applies. The difference is in input quality: OCR introduces degraded characters where the scan was light or the original was poor, drifted column boundaries on pages that came through the scanner skewed, and the occasional run-together row where two visual lines got merged. The extraction has to handle that quality and produce the same clean output. A scanned tender BOQ isn't a different job; it's the same conversion specification applied to a more difficult input.

These rules sound mechanical, and they are. The reason to internalise them rather than fix issues row-by-row in the spreadsheet is that the verification step in the next section catches every one of them as a quantity divergence. Treat the carry-forward, mid-item, and page-furniture rules as the proof the extraction was correctly specified, and the collection-sheet reconciliation as the proof those rules actually landed.

Pricing-ready output and reconciling to the collection sheet

A converted BOQ is the pricing spreadsheet the QS is about to start working in. The structure has to be set up so build-up rates can drop in cleanly, the original quantities stay untouched, and the conversion can be proved correct before the first rate is keyed.

Rate and Amount columns. The Rate column starts blank, ready for the QS's build-up rate. The Amount column either starts blank too, or carries the placeholder formula =quantity * rate so that the row calculates the moment the rate goes in. Either approach is fine, and the choice usually comes down to the QS's preference and the pricing workflow they're going to use. What matters is that the Rate column doesn't carry any number that could be mistaken for a contractor's rate — no carried-through values from a similar previous tender, no sample rates from a price book, nothing. A blank Rate column on a freshly converted BOQ is the only safe state, because it forces the QS to make a deliberate decision on every rate rather than inheriting one by accident.

Quantities and units are sacrosanct. The Quantity and Unit columns reproduce exactly what the BOQ specified. These are the contract quantities the priced rates will be applied to once the contract is awarded — they are part of the legal document, not editorial detail. If the BOQ says 12.345 m², the spreadsheet says 12.345 m². No rounding to two decimals "for tidiness", no recalculation from a takeoff, no "tidying up" of unusual fractional values. An extraction that silently rounds a 12.345 to 12.35 isn't a faithful conversion; it's a quietly wrong tender response, and the error compounds at every rate-times-quantity calculation downstream. The discipline is to leave the Quantity and Unit columns alone, ideally with workbook protection on those two columns so they can't be over-typed by accident during pricing.

Section subtotals. Add a subtotal row at the bottom of each Work Section that sums the Amount column for that section — =SUM(Fxxx:Fyyy) over the section's Amount range. This is the row the QS scans first when reviewing pricing for sense ("does Section 4 brickwork look right against the floor area?"), and it mirrors the BOQ's own collection structure. Do the same at Work Group level if the BOQ runs deep enough to make that useful — the goal is that the spreadsheet's structural totals roll up the way the BOQ's printed totals do.

Collection-sheet reconciliation. Most NRM2 BOQs end with a collection sheet — sometimes labelled Summary or Section Collections — that lists each Work Section and its summed quantities, summed amounts, or both. This is the verification step. Total each Work Section's extracted quantities (and amounts, where the BOQ's collection sheet shows them) and compare against the collection sheet, section by section. If everything matches, the structural rules from the previous section landed correctly and the conversion is trustworthy. If anything diverges, the cause is almost always one of three things:

  • A Carried Forward or Brought Forward subtotal got picked up as a row, so the section total is now the BOQ total plus a stray running subtotal.
  • A description that split across a page break didn't get rejoined, so one item is now two rows with half a description each, and the quantity column is counting the same physical work twice.
  • A header or footer row got extracted as data, so the section is carrying an item it doesn't have.

The audit-trail move is to flag the section that diverges, identify which of the three causes is responsible, fix it, and reconcile again — before pricing starts. Catching a quantity divergence at this stage is a five-minute fix; catching it after the bid has gone in is a returned tender or a contractual exposure.

The converted BOQ is a working document. Once the reconciliation passes, the spreadsheet is going to grow. The QS will add Rate columns and build-up rate workings on adjacent sheets, contingency lines for items the design isn't yet firm on, addendum rows from tender clarifications, and a tender summary at the front. Set the workbook up so those edits don't require the original extraction to be redone — keep the extracted columns labelled clearly, protect the Quantity and Unit columns against overtype, name ranges where it helps, and resist the urge to delete the conversion's structural columns even if they look redundant once pricing is in. The BOQ will be the contract financial reference document for the duration of the project, and the structure that survived the conversion is what makes it useful at every stage that follows.

From priced BOQ to AfP, variations, and the final account

The reason the conversion has to be done this carefully is what happens next. Once the bid wins and the contract is awarded, the converted, priced BOQ becomes the financial spine of the project — every application for payment is assessed against it, every variation is valued from it, and the final account is built on it. Convert the tender pack BOQ to Excel for pricing today and the structural decisions become the ones the QS lives with for the next eighteen months.

Subcontract applications for payment. Once subcontract packages are let against the priced BOQ, the agreed subcontract sum for each package is the figure the subcontractor's monthly application is assessed against. The Work Section structure of the priced BOQ is what each application gets broken down into for percentage-complete or measured-progress assessment — the subcontractor claims a percentage against each Work Section, the QS walks the site and forms a view, and the difference between the claim and the assessment is what drives the payment notice and any pay less notice that follows. A clean per-Work-Section subtotal in the spreadsheet is what makes that assessment quick to do and quick to defend where the subcontractor's claimed percentage doesn't match the work physically on the ground. The procedural detail of how to assess a subcontractor's application for payment and issue a pay less notice sits in its own treatment; the point here is that the Work Section structure the conversion preserved is the structure the assessment runs in. Those assessed AfPs, alongside the client-side payment certificates and the purchase ledger postings against each package, are what feed the contractor's monthly package-level cost value reconciliation workbook — which is the report the commercial team uses to see whether each Work Section is actually earning the margin the priced BOQ assumed.

Variation valuation. Under JCT contracts, variations are valued first on the BOQ rates where the variation is similar in character and conditions to the measured work, then on a fair valuation derived from the BOQ rates where the conditions differ enough to make the BOQ rate unsuitable, and then on dayworks where there is no other reasonable basis. Under NEC option B and option D — the priced contracts that use a BOQ as the basis for compensation event valuation — the same priced BOQ rates feed the compensation event quote, subject to the contract data and the nature of the change. Either contract family treats the priced BOQ from the conversion as the starting reference document, and the rate column the QS dropped in during pricing is what every variation is valued against. The Item Code column the conversion preserved earns its keep here too — variation instructions reference BOQ items by code, and a code column the QS can filter on saves the description-string search workaround at the moment a variation is hottest. The contractual distinction between the application for payment that flows through the contract administrator and a tax invoice that flows through the QS's accounts payable function — covered in the treatment of JCT and NEC applications for payment vs VAT invoices — is worth keeping in mind whenever a variation valuation is being prepared, because the document that gets issued isn't the same as the document that gets paid against.

Final account. The final account starts from the priced contract sum — the BOQ's bottom line at the moment the contract was awarded — and is adjusted by agreed variations, prime-cost sum reconciliations against actual nominated subcontractor prices, provisional-sum expenditure against contract administrator instructions, fluctuations where the contract allows them, and any contractual entitlements that have crystallised over the course of the project. The Work Section structure that survived the conversion is what the final account is presented in — agreeing the final account is, in part, agreeing the per-Work-Section figures with the contractor's QS or the employer's QS at a meeting eighteen months after the bid went in. A converted BOQ that lost its hierarchy at the start is a final account that has to be reconstructed by hand at the worst possible moment, with the original PDF in one window and the spreadsheet in another, working backwards from the bid total to a Work Section breakdown that should have been there from day one.

The supplier-invoice side. The same extraction discipline the QS just applied to the BOQ also applies to the supplier invoices flowing against the live contract: builders' merchant invoices for materials and consumables, plant hire invoices, subcontractor applications coming in alongside the assessment process above, and CIS-deducted payments going out the other way. The QS who handles both pricing the tender and processing the AP that follows it can step across to the related treatment of how to extract line items from UK builders' merchant invoices for the supplier-side equivalent of the workflow this article has covered.

SMM7. Older priced BOQs the QS may inherit on long-running framework agreements or refurbishment work with carried-over tender documentation will be SMM7-compliant rather than NRM2-compliant. The structural shape of the document is similar enough that the conversion approach above holds — only the section numbering and a few measurement conventions differ.

Extract invoice data to Excel with natural language prompts

Upload your invoices, describe what you need in plain language, and download clean, structured spreadsheets. No templates, no complex configuration.

Exceptional accuracy on financial documents
1–8 seconds per page with parallel processing
50 free pages every month — no subscription
Any document layout, language, or scan quality
Native Excel types — numbers, dates, currencies
Files encrypted and auto-deleted within 24 hours
Continue Reading