Convert a Bill of Quantities PDF to Excel (NRM2 Tender BOQ)

UK QS workflow for converting an NRM2 tender BOQ PDF into pricing-ready Excel — preserving Work Section hierarchy and reconciling to the collection sheet.

Published
Updated
Reading Time
18 min
Topics:
Industry GuidesConstructionUKExcelNRM2bill of quantitiestender pricingquantity surveying

Converting an NRM2 bill of quantities PDF into Excel comes down to five structural moves the spreadsheet has to land before pricing starts. One row per measured item. The Work Section / Work Group / Item hierarchy preserved as Excel grouping rather than flattened into the description column. Preliminaries, Measured Work, Provisional Sums, Prime Cost Sums, and Dayworks kept in distinct sections, because each is priced under different rules. The Rate and Amount columns left blank so build-up rates can drop in without disturbing the original quantities. And per-Work-Section quantity totals reconciled back to the BOQ's collection sheet as the verification step.

An NRM2 BOQ is organised as Work Section, Work Group, then measured item. Item rows normally carry code, description, quantity, unit, rate, and amount, while Preliminaries, Measured Work, Provisional Sums, Prime Cost Sums, and Dayworks each need different treatment. As factual anchor: NRM2 became operative on 1 January 2013 and formally replaced SMM7 on 1 July 2013, organising building works into 41 indexed Work Sections from Preliminaries through Builder's Work in Connection — the RICS NRM2 Work Section structure is the spine the converted spreadsheet has to reproduce. Older SMM7 tender BOQs differ in section numbering and some measurement conventions, but the same conversion approach holds.

Specifying the extraction so the structure survives

Most BOQ-to-Excel conversions fail before they begin, because the output isn't specified. The QS opens the PDF, copies a page of items, pastes into Excel, finds that the Work Section heading has collapsed into the first item's description and that the columns have drifted, fixes a few rows by hand, and within twenty minutes is rebuilding the BOQ row by row. The fix is to specify the output before any extraction runs.

Eight columns, at minimum:

  • Work Section — as its own column, or as an Excel outline / grouping level. Not concatenated into the description.
  • Work Group — the sub-grouping under the Work Section, again as its own column.
  • Item Code — the item identifier the BOQ uses (for example 4.1.A, D20.1.1).
  • Item Description — the prose description, complete and rejoined where the PDF split it across pages.
  • Quantity — the BOQ's stated quantity, unrounded.
  • Unit — normalised to a clean enumerated set (see below).
  • Rate — left blank, ready for the QS's build-up rate.
  • Amount — left blank, or pre-populated with the placeholder formula =quantity * rate so the row calculates as soon as the rate goes in.

The single biggest reason copy-and-paste BOQ extractions become unusable is that the Work Section disappears into the description string. Section 4 of an NRM2 BOQ might span seventy pages and include several hundred items; once those items live in a flat list with the section identifier buried in the description prose, the QS can no longer filter, group, or roll up by section. The conversion has to put the Work Section in its own column. Excel then handles grouping, subtotalling, and outline-level collapsing as a property of that column.

NRM2 BOQs use a recognisable enumerated set of units of measurement: m, , , item, t, no., kg, hr, sum. PDF rendering produces surface variation on most of these — M2, sqm, square metres, and m^2 are the same unit, but Excel sees four distinct strings unless the conversion normalises them. Coerce the unit column into the canonical set as part of the extraction. The result is a Unit column the QS can pivot on, filter on, and use as a sense-check on the rate column further down.

One row per measured item. The same rule applies to preliminary items, provisional sum items, prime cost sum items, and dayworks rate-schedule items — each row carries its section context as data in the Work Section / Work Group columns, not as a header buried in a heading row. Don't aggregate items, don't concatenate rows, and don't split a single item into multiple rows because the description wrapped across two visual lines on the page.

NRM2 item descriptions are conventionally written as [code] [description]. Splitting code and description into separate columns is worth the few extra minutes it takes during the conversion specification, because the priced BOQ gets queried by code at every downstream stage — variation pricing references the BOQ rate by item code, drawings get reconciled against codes, and the post-tender clarifications come back referencing item codes from the issued document. A code column the QS can filter on saves the search-the-description-string workaround forever.

For an extraction tool, treat the prompt as the conversion specification: name the columns, the grouping levels, the unit set, and the section types that must stay separate. A usable prompt is: "Extract one row per item from this NRM2 BOQ. Columns: Work Section, Work Group, Item Code, Item Description, Quantity, Unit, Rate (leave blank), Amount (leave blank). Normalise units to m, m², m³, item, t, no., kg, hr, sum. Keep Preliminaries, Measured Work, Provisional Sums, Prime Cost Sums, and Dayworks in distinct sections." Both native and scanned BOQ PDFs can produce structured .xlsx, .csv, or .json output; the extract structured data from construction PDFs workflow earns its place when the extraction follows the QS's specification rather than producing a generic flat dump.

The same extraction discipline applies to the supplier-side documents the construction QS handles in parallel. The wider treatment of invoice data extraction for the construction industry covers how the same prompt-driven approach lands on builders' merchant invoices, subcontractor applications, and CIS-deducted payments — which the QS converting a tender BOQ this afternoon will be handling on the live job from the same desk next week.

Preliminaries, Provisional Sums, Prime Cost Sums, and Dayworks — why each section needs its own treatment

The eight-column layout from the previous section is the measured-work layout. It does not fit Preliminaries, it does not fit Provisional Sums or Prime Cost Sums in the same way, and it does not fit Dayworks at all. Forcing every BOQ section into one flat table is the second-most-common reason a converted BOQ comes out broken — the prelims get mangled, the PS items get priced as if they were measured items, and the dayworks rate schedule disappears into a column it doesn't belong in. Each of the five section types has to be extracted in a shape the rules of measurement actually allow.

Preliminaries sit at Section 1 in NRM2 and are priced one of three ways depending on how the BOQ has set them up. Sometimes the prelims are a single fixed lump sum at the bottom of a list. Sometimes they're a structured preliminaries pricing schedule with separate columns for time-related charges (site management, scaffolding hire, welfare facilities) and fixed charges (mobilisation, demobilisation, one-off attendances). Sometimes it's a hybrid. The conversion has to respect whichever shape the BOQ used. If the BOQ has separate time-related and fixed-charge columns, those columns survive into the spreadsheet as columns, not as rows mashed into one Amount field. If it's a single sum, it stays as a single sum row. What it cannot become is a per-item rate-times-quantity layout, because that's not how prelims are priced — there's no contract quantity for "site agent for 36 weeks" the way there is for of brickwork.

Measured Work is what the eight-column layout was built for. The bulk of the BOQ lives here. One row per measured item, full Quantity, Unit, Rate, and Amount columns, Work Section grouping preserved, codes split from descriptions. This is where the structural rules from the previous section get their full workout, and it's the part of the spreadsheet the QS spends most of the pricing window inside.

Provisional Sums need a Type column. NRM2 distinguishes "defined" and "undefined" provisional sums, and the BOQ usually labels each — defined PS items have known scope, programme implications, and design intent, and the contractor is expected to allow for them in the programme; undefined PS items have none of that, and the contractor isn't required to make programme allowance. The conversion extracts each PS item as its own row, marks it in a Type column with a value like Provisional Sum (Defined) or Provisional Sum (Undefined), and leaves the Rate column blank. The QS prices a PS only when scope is firmed up, and the contract administrator's instruction is what triggers the work in the first place. Keeping PS items visible but unpriced is the discipline that stops the QS from accidentally pricing them as measured items because they happen to share the row layout.

Prime Cost Sums also belong in the Type column, marked as Prime Cost Sum. PC sums are the BOQ's placeholder for work to be carried out by a nominated subcontractor or supplier whose price isn't yet known — typically specialist mechanical and electrical packages, or specified-supplier kitchen and joinery items where the design team has nominated the supplier but not yet agreed the price. The PC value in the BOQ is a sum the contractor includes in the tender, and that sum is replaced with the actual subcontractor's price when the sub-tender comes back. The conversion has to extract PC items into their own rows, flag them as Prime Cost Sum, and pass them through to the PC-sum sub-tender process — never silently absorb a PC sum into the measured-work pricing, because the PC value isn't the contractor's price to commit to.

Dayworks are different again. The dayworks section in an NRM2 BOQ is a rate schedule, not a list of measured items: labour rates by trade and grade, plant rates by item or hour, and materials handling percentages. The conversion extracts the rate schedule itself, not measured items, and the result lives in its own sheet inside the workbook. Dayworks rates aren't priced against a contract quantity; they sit waiting for the contract administrator to instruct work that doesn't fit the measured-work rates and can't be valued on the BOQ rates either. A clean dayworks rate schedule is what makes that valuation possible without going back to the original PDF mid-job.

A usable converted BOQ is therefore not a single flat list of rows. It's a workbook: a measured-work block with the eight-column layout running across each Work Section, a preliminaries block whose shape mirrors the BOQ's prelims pricing schedule, PS and PC items flagged in a Type column so they're visible but not mistakenly priced, and a dayworks rate schedule sitting in its own block ready to be referenced. Each block is in the same workbook so the QS can move between them, but each respects the rules the section is priced under.

Page break realities — continuation pages, mid-item splits, and headers and footers

Three failure modes account for almost every broken BOQ extraction in practice. Carry-forward subtotals get picked up alongside the items and the quantity totals double-count. Item descriptions split across page breaks and the second half of the description ends up glued to the next item's code. Repeating page furniture — project name, employer, BOQ section name, page numbers — gets extracted as data and turns up as a row inserted between every page's worth of items. Each one has a clean rule.

Carry-forward and brought-forward subtotals. NRM2 BOQs use Carried Forward totals at the bottom of each page and Brought Forward totals at the top of the next page so that the page-by-page totals add up cleanly when the document is read in print. They are running subtotals, not items. The reliable identifier is the description column — text containing Carried Forward, Brought Forward, c/f, or b/f, usually paired with a missing or running-total quantity column. The extraction has to extract item rows only and exclude every carry-forward and brought-forward line, otherwise the spreadsheet's quantity column will count the same items twice (once as items, once as the section's running subtotal). The check on this is the collection-sheet reconciliation in the next section — if the section total is exactly the BOQ total plus a stray subtotal value, this rule was missed.

Mid-item page breaks. An NRM2 item description can run several lines, especially in measured work for finishes, M&E, and joinery where the description carries SMM-style supplementary information about the work being measured. A long description can split across a page break — the first half on the bottom of one page, the second half on the top of the next. The PDF text layer carries no cue that it's the same item; the visual layout is doing all the work. The extraction has to delimit items by the item code prefix, not by the page break — an item starts when a code-prefixed line appears, and continues until the next code-prefixed line, even if a page break sits between them. Rejoin the description halves so the row in the spreadsheet carries the complete description, otherwise variation pricing later in the project gets done against half a description.

Headers, footers, and repeating page furniture. Tender BOQs typically run the project name, the contractor or employer name, the BOQ section identifier, and a page number across the top and bottom of every page. None of that belongs in the spreadsheet. The reliable identifier is positional repetition — text that appears in the same location on every page in a section is page furniture, not data. The extraction recognises the repeated pattern and drops it before assembling rows. Without this step, the spreadsheet ends up with a project-name row inserted every twenty or thirty rows, and every one of those false rows pollutes filters, sums, and section grouping.

Scanned BOQs. Older tender packs, and many tender returns coming back from main contractors, arrive as scanned PDFs rather than native PDFs — the document was printed, signed, scanned, and re-issued, and the text layer is OCR output rather than the publishing tool's original text. The structural rules above don't change. Carry-forward exclusion still applies. Mid-item rejoin still applies. Page-furniture drop still applies. The difference is in input quality: OCR introduces degraded characters where the scan was light or the original was poor, drifted column boundaries on pages that came through the scanner skewed, and the occasional run-together row where two visual lines got merged. The extraction has to handle that quality and produce the same clean output. A scanned tender BOQ isn't a different job; it's the same conversion specification applied to a more difficult input.

These rules sound mechanical, and they are. The reason to internalise them rather than fix issues row-by-row in the spreadsheet is that the verification step in the next section catches every one of them as a quantity divergence. Treat the carry-forward, mid-item, and page-furniture rules as the proof the extraction was correctly specified, and the collection-sheet reconciliation as the proof those rules actually landed.

Pricing-ready output and reconciling to the collection sheet

A converted BOQ is the pricing spreadsheet the QS is about to start working in. The structure has to be set up so build-up rates can drop in cleanly, the original quantities stay untouched, and the conversion can be proved correct before the first rate is keyed.

Rate and Amount columns. The Rate column starts blank, ready for the QS's build-up rate. The Amount column either starts blank too, or carries the placeholder formula =quantity * rate so that the row calculates the moment the rate goes in. Either approach is fine, and the choice usually comes down to the QS's preference and the pricing workflow they're going to use. What matters is that the Rate column doesn't carry any number that could be mistaken for a contractor's rate — no carried-through values from a similar previous tender, no sample rates from a price book, nothing. A blank Rate column on a freshly converted BOQ is the only safe state, because it forces the QS to make a deliberate decision on every rate rather than inheriting one by accident.

Quantities and units are sacrosanct. The Quantity and Unit columns reproduce exactly what the BOQ specified. These are the contract quantities the priced rates will be applied to once the contract is awarded — they are part of the legal document, not editorial detail. If the BOQ says 12.345 m², the spreadsheet says 12.345 m². No rounding to two decimals "for tidiness", no recalculation from a takeoff, no "tidying up" of unusual fractional values. An extraction that silently rounds a 12.345 to 12.35 isn't a faithful conversion; it's a quietly wrong tender response, and the error compounds at every rate-times-quantity calculation downstream. The discipline is to leave the Quantity and Unit columns alone, ideally with workbook protection on those two columns so they can't be over-typed by accident during pricing.

Section subtotals. Add a subtotal row at the bottom of each Work Section that sums the Amount column for that section — =SUM(Fxxx:Fyyy) over the section's Amount range. This is the row the QS scans first when reviewing pricing for sense ("does Section 4 brickwork look right against the floor area?"), and it mirrors the BOQ's own collection structure. Do the same at Work Group level if the BOQ runs deep enough to make that useful — the goal is that the spreadsheet's structural totals roll up the way the BOQ's printed totals do.

Collection-sheet reconciliation. Most NRM2 BOQs end with a collection sheet — sometimes labelled Summary or Section Collections — that lists each Work Section and its summed quantities, summed amounts, or both. This is the verification step. Total each Work Section's extracted quantities (and amounts, where the BOQ's collection sheet shows them) and compare against the collection sheet, section by section. If everything matches, the structural rules from the previous section landed correctly and the conversion is trustworthy. If anything diverges, the cause is almost always one of three things:

  • A Carried Forward or Brought Forward subtotal got picked up as a row, so the section total is now the BOQ total plus a stray running subtotal.
  • A description that split across a page break didn't get rejoined, so one item is now two rows with half a description each, and the quantity column is counting the same physical work twice.
  • A header or footer row got extracted as data, so the section is carrying an item it doesn't have.

The audit-trail move is to flag the section that diverges, identify which of the three causes is responsible, fix it, and reconcile again — before pricing starts. Catching a quantity divergence at this stage is a five-minute fix; catching it after the bid has gone in is a returned tender or a contractual exposure.

The converted BOQ is a working document. Once the reconciliation passes, the spreadsheet is going to grow. The QS will add Rate columns and build-up rate workings on adjacent sheets, contingency lines for items the design isn't yet firm on, addendum rows from tender clarifications, and a tender summary at the front. Set the workbook up so those edits don't require the original extraction to be redone — keep the extracted columns labelled clearly, protect the Quantity and Unit columns against overtype, name ranges where it helps, and resist the urge to delete the conversion's structural columns even if they look redundant once pricing is in. The BOQ will be the contract financial reference document for the duration of the project, and the preserved Work Section structure is what makes it useful at every stage that follows.

From priced BOQ to AfP, variations, and the final account

Once the bid wins and the contract is awarded, the priced BOQ becomes the reference for AfPs, variations, CVR reporting, and final account work. That is why the conversion has to preserve the hierarchy instead of producing a flat item dump.

That same Work Section structure supports later subcontractor AfP assessments, variation pricing, CVR reporting, and final account assembly. Subcontractor applications are assessed against the priced Work Sections. Variations are valued against the BOQ rates or against a fair valuation derived from them. The monthly package-level cost value reconciliation workbook uses the same section structure to test whether the package is earning the margin the tender assumed. Final account work starts from the priced contract sum and adjusts it for agreed variations, prime-cost sum reconciliations, provisional-sum expenditure, fluctuations, and crystallised entitlements.

A converted BOQ that kept Work Section, Work Group, Item Code, Quantity, Unit, Rate, and Amount intact can support all of that without reconstruction. A conversion that lost the hierarchy has to be rebuilt later with the original PDF in one window and the spreadsheet in another, at exactly the stage when the QS has the least time to fix it.

Extract invoice data to Excel with natural language prompts

Upload your invoices, describe what you need in plain language, and download clean, structured spreadsheets. No templates, no complex configuration.

Exceptional accuracy on financial documents
1–8 seconds per page with parallel processing
50 free pages every month — no subscription
Any document layout, language, or scan quality
Native Excel types — numbers, dates, currencies
Files encrypted and auto-deleted within 24 hours
Continue Reading