Extract Construction Supplier Invoices and E-Way Bills (India)

Pair Indian construction supplier tax invoices with their e-way bills and extract a per-truck project-tracker row keyed on vehicle number.

Reading time: 24 min
Topics: Industry Guides, Construction, India, e-way bill, GST, project tracker

It is Friday afternoon. On the desk sits last week's bundle: roughly thirty paired PDFs from the cement plant, the TMT supplier, two aggregate quarries, the M-sand vendor, and the ready-mix concrete batching plant. Each pair is a tax invoice plus the e-way bill that travelled with the truck. The site engineer's weekly project meeting is on Monday morning, and the project tracker needs a per-truck row for every one of those deliveries before then.

The work is concrete. Pairing a construction supplier's tax invoice with its e-way bill for a project tracker means lifting the Rule 46 invoice fields (HSN, taxable value, GST) and the Part B transit fields (vehicle number, transporter ID) from the two PDFs and joining them on the vehicle number. The output is one row per truck delivery, keyed on vehicle number and date, ready for project costing, on-site delivery acknowledgement, and GSTR-2B reconciliation.

The vehicle number is the load-bearing detail. It is the natural primary key tying the supplier's tax invoice, the e-way bill that accompanied the consignment, and the on-site delivery acknowledgement signed at the project gate into the same tracker row. Generic invoice OCR pulls the tax invoice but loses the vehicle number, which defeats the entire point of project-level tracking — the reader cannot answer "which truck delivered the cement on Wednesday" from a tracker that has no Vehicle-number column.

The walk-through that follows is organised around producing that row.

The three documents that arrive with every site delivery

Every truck that pulls into a project gate in India brings — or should bring — three documents. The reader handles all three already; what changes once they are framed as a working set is that each one carries different columns of the project tracker, and none of them is dispensable.

The supplier's tax invoice. Issued by the cement plant, the steel mill, the quarry, or whichever supplier raised the consignment, the tax invoice is where the project costing starts. It carries the supplier's GSTIN, the buyer's GSTIN, the line items, the HSN, the taxable value, the GST split (CGST and SGST on intra-state, IGST on inter-state), and the place of supply. For the Rule 46 field list the tax invoice is required to carry under GST, see Rule 46 mandatory fields on an Indian GST tax invoice — this article does not redefine that field list, it uses it.

The e-way bill. Generated on the e-way bill portal by the supplier, the recipient, or the transporter (whoever is moving the goods), the e-way bill is in two parts. Part A is the consignment data — supplier and recipient GSTIN, document number, value of goods, HSN, place of dispatch and delivery, reason for transportation — and mostly mirrors the invoice header. Part B is the transit data — vehicle number, transporter ID, mode of transport, and the rail/airway/bill-of-lading number where applicable. For the regulatory framework around Part A, Part B, the consignment-value threshold, and the validity rules tied to distance, the companion piece on the India e-way bill system: Part A, Part B, thresholds and validity has the definitions; the work that follows here picks up after them.

The on-site acknowledgement. This is the document the regulatory pages tend to skip and the practitioner cannot. It takes different forms by material:

  • A weighbridge slip for cement and aggregate, with gross-tare-net weights captured at the site weighbridge as the truck enters and again as it leaves.
  • A stockyard receipt for steel, with bar count, length, and tonnage signed off at the steel yard against the consignment.
  • A delivery challan signed at the site gate for finished goods like tiles, electricals, plumbing fittings, and ready-mix concrete, with the security register countersigning the truck in and out.

This is where the delivery is confirmed against the invoice and the e-way bill, and where the site's material-received report takes shape: the row is not real until the goods have crossed the gate.

The project tracker needs all three because each one carries something the others do not. The invoice carries the financial data — HSN, taxable value, GST, the supplier's legal identity. The e-way bill Part B carries the truck — vehicle number and transporter. The on-site acknowledgement carries the closing column — was this consignment received on site, yes or no.

The pattern of pairing follows the documents themselves. Every supplier tax invoice for a consignment above the applicable e-way bill threshold has a paired e-way bill (the threshold variation by state is covered later). Every dispatch on a road consignment has a paired on-site acknowledgement at the receiving project site. The bundle on the desk is, in practice, a stack of these triples — one tax invoice, one e-way bill, one weighbridge slip or signed challan — repeated thirty times.

Part A fields the extraction has to lift from the e-way bill

Part A is a defined set of fields, but the practical question is not what Part A contains — the reader knows that — it is which of those fields the extraction must lift and where each one lands in the project-tracker row. The list below pairs the Part A fields to the tracker columns they produce.

  • GSTIN of the recipient. Confirms the consignment is addressed to the right legal entity. This matters when a builder runs separate GST registrations per state, or when head office and a project-site SPV bill under different GSTINs. A misrouted GSTIN here is the difference between input tax credit landing in the right entity's GSTR-2B and getting stranded.
  • Document number and document date. The supplier's tax invoice number and date, mirrored from the invoice into Part A. These feed the Tax-invoice-number column and the Date column on the tracker. Where Part A's document number does not match the tax invoice number on the paired PDF, that is a supplier-side data slip worth surfacing immediately — the e-way bill and the invoice will not reconcile downstream until it is resolved.
  • Value of the consignment. The invoice value as recorded on the e-way bill. Useful as a cross-check against the taxable value on the tax invoice; small mismatches surface supplier-side data-entry slips well before they reach GSTR-2B.
  • HSN code. The supplier's HSN at the consignment level. This feeds the HSN column on the tracker. In construction, a single dominant HSN per consignment is the norm — the cement truck has 2523, the steel truck has 7213 or 7214 — but multi-HSN consignments do exist for mixed dispatches from a hardware merchant and for finishing-trade suppliers carrying tiles, fittings, and adhesives in one load.
  • Place of dispatch (PIN). The source PIN of the consignment, which maps to the supplier's plant, depot, or stockyard. Useful for distinguishing same-supplier deliveries that originate from different plants — a cement major dispatching from two different plants for two different sites in the same week is common.
  • Place of delivery (PIN). The destination PIN. This is how the tracker assigns the delivery to a project site when a builder runs multiple sites in the same state and the supplier ships to whichever site is on the day's schedule. The PIN is the routing key — get it right at extraction and the Project-site column populates itself.
  • Reason for transportation. Supply, sales return, job work, line sale, export, SKD/CKD, recipient-not-known, or others. Useful for separating fresh purchases from job-work returns and for filtering the tracker to the rows that genuinely add to project cost in a given period.

One workflow detail drives extraction discipline more than any other: Part A cannot be edited once the e-way bill is generated. Whatever the supplier keyed at generation is what the audit trail reflects. The extraction has to capture Part A as it stands on the PDF, not silently reconcile it back to the invoice — a quiet correction during extraction breaks the audit chain even when it produces a tidier tracker.

The single most common slip a careful read of Part A catches is a place-of-delivery PIN that maps to a different project site than the one the invoice was raised against. The supplier's billing address and the ship-to address are different fields and they get crossed at the supplier's data-entry stage with surprising regularity. The result, on the tracker, is a row misclassified to the wrong site — and a quantity surveyor at the right site wondering why his weighbridge shows an extra delivery the costing does not.
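The Part A cross-checks described above can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the field names, the `PIN_TO_SITE` map, the sample PINs and site codes, and the one-rupee tolerance are all hypothetical. The sketch only raises flags for review. It never overwrites what Part A says, in keeping with the audit-trail point above.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical map from place-of-delivery PIN to project-site code.
PIN_TO_SITE = {"600119": "CHN-OMR", "600053": "CHN-AMB"}


@dataclass
class InvoiceHeader:
    number: str
    value: Decimal   # the invoice figure the tracker checks Part A against
    site: str        # site the invoice was raised against


@dataclass
class PartA:
    doc_number: str
    consignment_value: Decimal
    delivery_pin: str


def part_a_flags(inv: InvoiceHeader, part_a: PartA,
                 tolerance: Decimal = Decimal("1.00")) -> list[str]:
    """Return review flags; Part A values are reported, never corrected."""
    flags = []
    if part_a.doc_number.strip().upper() != inv.number.strip().upper():
        flags.append("document number differs from invoice number")
    if abs(part_a.consignment_value - inv.value) > tolerance:
        flags.append("consignment value differs from invoice value")
    site = PIN_TO_SITE.get(part_a.delivery_pin)
    if site is None:
        flags.append("delivery PIN not mapped to a project site")
    elif site != inv.site:
        flags.append(f"delivery PIN maps to {site}, invoice raised for {inv.site}")
    return flags
```

An empty list means the pair is consistent on these three checks. Anything else goes to a reviewer with both PDFs open.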

Part B fields and why the vehicle number is the join key

Part B is shorter than Part A and carries the fields the project tracker is actually keyed on. Four to lift, in order of how often they get used.

  • Vehicle number. The truck's registration, in the standard format of state code, RTO district code, registration series, and registration number. For road consignments, this is the single most important Part B field. Everything else in the tracker hangs off it.
  • Transporter ID and transporter name. A GSTIN-linked transporter ID where the carrier is GST-registered, or an enrolled transporter ID issued by the e-way bill portal where the transporter is unregistered. Feeds the Transporter-ID column on the tracker. Useful for transporter-level analysis — which carriers are running the project's deliveries, where freight is concentrated, where short-deliveries cluster.
  • Mode of transport. Road, rail, air, or ship. Indian construction supply is overwhelmingly road. Rail surfaces for long-haul cement and structural-steel movements between regions, particularly into the north-east and the deep south. Air and ship are rare for site-bound construction inputs but do appear for imported finishings, specialised fittings, and the occasional consignment of fast-track project equipment.
  • RR / Airway Bill / Bill of Lading number. Relevant when the mode is rail, air, or ship respectively. Capture this in a parallel column rather than overloading the road-only Vehicle-number field — the tracker reads cleaner when the rail RR number is in its own column rather than crammed into the same cell as a truck registration.
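Because the vehicle number is keyed by hand on several documents, it helps to normalise it before using it as a key. The sketch below is one way to do that in Python. The pattern is an assumption that covers only the common state-code, RTO-code, series, number layout. Other registration formats (for example BH-series plates) would need their own handling, and both function names are hypothetical.

```python
import re

# Assumed pattern: 2-letter state code, 1-2 digit RTO code,
# 0-3 letter series, 1-4 digit number. Not an official specification.
_VEHICLE_RE = re.compile(r"^[A-Z]{2}\d{1,2}[A-Z]{0,3}\d{1,4}$")


def normalise_vehicle_number(raw: str) -> str:
    """Strip spaces, hyphens and other separators, then upper-case."""
    return re.sub(r"[^A-Za-z0-9]", "", raw).upper()


def looks_like_registration(raw: str) -> bool:
    """True when the normalised value fits the assumed road-vehicle pattern."""
    return bool(_VEHICLE_RE.match(normalise_vehicle_number(raw)))
```

A value that fails the check is a candidate for the parallel RR / Airway Bill / Bill of Lading column, or for manual review, rather than for the Vehicle-number column.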

Now, the join. Reconciling a truck delivery against its invoice and e-way bill works because the same vehicle number appears on multiple documents, not just on Part B. The same truck registration shows up:

  • On Part B of the e-way bill, where the supplier or transporter keyed it at the start of the trip.
  • On the supplier's delivery challan or dispatch note that travels with the consignment.
  • On the weighbridge slip when the truck registers at the project gate.
  • On the security register the gate maintains for every vehicle in and out.

That common string is what allows the project tracker to reconcile a financial document (the tax invoice), a transit document (the e-way bill), and a physical receipt (the weighbridge slip or stockyard receipt) into a single row keyed on one identifier. Date and consignment value are secondary checks that catch supplier-side data-entry slips when the vehicle number alone is ambiguous — the same registered truck may cross a state border twice in a week with the same supplier, and the date plus the consignment value separate the two trips into two distinct rows.
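The join itself can be sketched as a lookup on the pair (vehicle number, date). This is a minimal Python illustration with hypothetical record shapes: plain dictionaries with `vehicle` and `date` keys, and dates held as ISO strings. It leaves out the consignment-value tie-breaker mentioned above. Rows with more than one matching acknowledgement are marked ambiguous so that a person can resolve them.

```python
from collections import defaultdict


def join_on_vehicle(ewb_rows, ack_rows):
    """Attach gate acknowledgements to e-way bill rows by (vehicle, date)."""
    acks = defaultdict(list)
    for ack in ack_rows:
        acks[(ack["vehicle"], ack["date"])].append(ack)

    joined = []
    for ewb in ewb_rows:
        matches = acks.get((ewb["vehicle"], ewb["date"]), [])
        joined.append({
            **ewb,
            "acknowledged": "Y" if matches else "N",
            # More than one slip for the same truck and day needs a human look.
            "ambiguous": len(matches) > 1,
        })
    return joined
```

The same truck arriving on two different dates yields two separate rows, each matched only to the acknowledgement from its own day.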

Part B is the one part of the e-way bill that can be updated in transit. Vehicles change at trans-shipment points; transporters hand over loads; trucks break down and the consignment moves to a different vehicle. The e-way bill is incomplete without Part B, and the latest Part B is what the PDF in hand should reflect. The extraction should pull the Part B as it appears on that PDF, accepting that the vehicle number on a single consignment may have been updated more than once before the truck reached the project site.

The operational consequence of getting this wrong is direct. Generic invoice OCR pulls the tax invoice but loses the vehicle number — it has no concept of the e-way bill as a paired document — and the project-tracker row that comes out has a Vehicle-number column that is blank. With that column blank, the site engineer's "which truck delivered the cement on Wednesday" question has no answer in the tracker, the weighbridge reconciliation breaks, and the row becomes an audit liability rather than an audit asset. Extracting Part B alongside the invoice is what separates a project tracker that supports site work from one that fails it.

The 14-column project-tracker row this extraction produces

The destination is a single row per truck delivery, with fourteen columns. Most readers will recognise every column from their existing tracker; what is new is having the schema written down so each column points back to the document it sources from. The shape is the same whether the supplier is an aggregate quarry, a cement plant, a TMT mill, or a sand vendor: the row schema absorbs them all.

| # | Column | Carries | Source document |
| --- | --- | --- | --- |
| 1 | Date | Dispatch date on the e-way bill, or invoice date on the tax invoice, whichever the builder's tracker uses as the canonical delivery date | E-way bill Part A / tax invoice |
| 2 | Project site | Site name or code derived from the place-of-delivery PIN, with manual override where the supplier ships to head office and onward | E-way bill Part A |
| 3 | Vehicle number | Truck registration number (the join key) | E-way bill Part B |
| 4 | Supplier | Supplier legal name and GSTIN | Tax invoice header |
| 5 | Material | Line-item description (cement OPC 53, TMT bars Fe 500, M-sand, ready-mix concrete grade M25) | Tax invoice line |
| 6 | Quantity | Bags, tonnes, cubic metres, or brass, cross-checked against the weighbridge slip or stockyard receipt | Tax invoice line + on-site acknowledgement |
| 7 | Rate | Per-unit rate | Tax invoice line |
| 8 | Taxable value | Pre-GST line value | Tax invoice line |
| 9 | GST | CGST + SGST on intra-state, IGST on inter-state | Tax invoice |
| 10 | HSN | Consignment-level HSN (cement 2523, steel 7213/7214, aggregate 2517, sand 2505) | E-way bill Part A |
| 11 | Transporter ID | GSTIN-linked or enrolled transporter ID | E-way bill Part B |
| 12 | E-way bill number | The 12-digit EWB number | E-way bill Part A |
| 13 | Tax invoice number | Supplier's invoice number, mirrored to Part A | Tax invoice / e-way bill Part A |
| 14 | Acknowledged on site (Y/N) | Whether the consignment crossed the project gate | Weighbridge slip, stockyard receipt, or signed delivery challan |

Read across the row and the data-flow is visible. Columns 1 and 2 anchor the row in time and place. Column 3 is the join key. Columns 4 through 9 come from the tax invoice and carry the financial substance. Columns 10 through 13 come from the e-way bill and carry the transit substance, with the HSN and EWB number sitting on Part A and the transporter sitting on Part B. Column 14 is the closing column, populated from whichever on-site document is in use for that material, and it is what turns the row from a paper reconciliation into an audit-readiness record.
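For readers who keep the tracker in code as well as in a spreadsheet, the fourteen columns can be written down as a typed record. The Python sketch below is illustrative only: the field names and types are assumptions chosen for readability, not a schema the extraction platform prescribes. The comments point each field back to its source document as described above.

```python
from dataclasses import dataclass, fields
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class TrackerRow:
    delivery_date: date                # 1  e-way bill Part A / tax invoice
    project_site: str                  # 2  derived from place-of-delivery PIN
    vehicle_number: str                # 3  e-way bill Part B (join key)
    supplier: str                      # 4  tax invoice header (name and GSTIN)
    material: str                      # 5  tax invoice line
    quantity: Decimal                  # 6  tax invoice line, checked on site
    rate: Decimal                      # 7  tax invoice line
    taxable_value: Decimal             # 8  tax invoice line
    gst: Decimal                       # 9  tax invoice
    hsn: str                           # 10 e-way bill Part A
    transporter_id: Optional[str]      # 11 e-way bill Part B, may be absent
    ewb_number: Optional[str]          # 12 e-way bill Part A, may be absent
    tax_invoice_number: str            # 13 tax invoice / Part A
    acknowledged_on_site: Optional[bool] = None  # 14 filled in at the site


# Column order for export, taken from the dataclass definition itself.
COLUMN_NAMES = [f.name for f in fields(TrackerRow)]
```

Columns 11 and 12 are optional because a consignment may legitimately travel without an e-way bill. Column 14 starts empty because it is filled in at the project office rather than from the PDFs.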

The extraction step that produces this row is mechanical once the schema is fixed. Invoice Data Extraction takes a batch of paired tax-invoice and e-way bill PDFs. The user describes the row schema in a natural-language prompt (invoice number, date, supplier, GSTIN, material, quantity, rate, taxable value, GST, HSN, vehicle number, transporter ID, EWB number, one row per e-way bill), and the platform returns a structured Excel, CSV, or JSON file with one row per truck delivery. The prompt can be saved to the prompt library and reused so subsequent weekly bundles produce the same shape automatically. The Acknowledged-on-site column is filled in from the weighbridge or gate register at the project office; the rest comes from the paired PDFs. Extracting paired Indian construction invoices and e-way bills into this fourteen-column row is then a single batch task, with the same prompt producing the same shape across every truck.

Multi-truck consolidation: three trucks, three rows

A single cement supplier dispatches three trucks to the same project site on the same morning. The supplier raises one tax invoice per truckload (sometimes one invoice per dispatch lot, but the per-truckload pattern is the more common one for bagged cement, ready-mix concrete, and graded aggregate). One e-way bill is generated per truck. Three trucks means three e-way bills, three weighbridge slips at the gate, and three rows in the tracker, one per vehicle number, not a single consolidated line for the day's cement delivery. This multi-truck pattern is everyday reality at any active project site.

The reason per-truck granularity matters is operational, not pedantic. Site reconciliation runs at the truck level: the site engineer compares each weighbridge slip against the delivered tonnage and the supplier-claimed tonnage, truck by truck, and any variance gets logged against that specific vehicle. A consolidated single row makes that comparison impossible. Disputes — short-deliveries, tare-weight challenges, quality rejections at the gate — are also raised at the truck level, against a specific truck registration on a specific date, and the row that documents the dispute has to identify the truck or the dispute has nothing to point at. Costing rolls up cleanly from per-truck rows to per-day, per-week, or per-bill-of-quantities totals, but it cannot be split back into per-truck data from a consolidated row without re-reading the source PDFs — work that defeats the point of having extracted the data in the first place.

The extraction implication follows directly. A prompt or rule that defaults to one row per invoice will collapse multi-truck consignments into a single line whenever a supplier raises one invoice for several truckloads. The grain the tracker needs is not the invoice grain but the e-way bill grain: one row per e-way bill, keyed on vehicle number, with the supplier's invoice number repeated across the rows it covers. When the relationship inverts — one truck carrying line items from several supplier invoices, which is rare for cement and steel but does happen with consolidated electrical and plumbing dispatches — the rule still holds, with the tax-invoice-number column carrying both invoice numbers separated by a comma rather than the row collapsing.
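The grain rule above (one row per e-way bill, invoice number repeated or comma-joined) can be sketched in a few lines of Python. The record shapes are hypothetical: each e-way bill record is assumed to carry a list of the invoice numbers it covers, which is an illustration of the rule rather than the layout of any real export.

```python
def rows_per_eway_bill(invoices, eway_bills):
    """Build one tracker row per e-way bill, never one per invoice."""
    by_number = {inv["invoice_number"]: inv for inv in invoices}
    rows = []
    for ewb in eway_bills:
        numbers = ewb["invoice_numbers"]
        suppliers = sorted({by_number[n]["supplier"] for n in numbers})
        rows.append({
            "vehicle_number": ewb["vehicle_number"],
            "ewb_number": ewb["ewb_number"],
            # Repeated across trucks, or comma-joined when one truck
            # carries goods from several invoices.
            "tax_invoice_number": ", ".join(numbers),
            "supplier": ", ".join(suppliers),
        })
    return rows
```

One invoice covering three trucks yields three rows that share the invoice number. One truck covering two invoices yields one row with both numbers in the Tax-invoice-number column.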

Construction supplier mix and the HSN codes you'll see

The supplier categories on a typical month's bundle are recognisable to any project accountant working in Indian construction: cement, structural steel and reinforcement, aggregate, sand, ready-mix concrete, and a long tail of finishings such as electricals, plumbing, tiles, paint, and hardware. The HSN codes that ride on Part A of the e-way bill cluster around a small landmark set, and knowing them shortens the time it takes to spot a misclassified consignment. The bulk of the project's spend sits on cement and steel, which is where knowing the HSN landmarks pays back fastest.

  • Cement. HSN 2523 covers Portland cement, aluminous cement, slag cement, and similar hydraulic cements. The OPC 53 / OPC 43 / PPC / PSC distinction shows in the line-item description, but the HSN stays at 2523 across all of them. White cement sits within the same heading.
  • TMT bars and structural steel. HSN 7213 covers bars and rods, hot-rolled, in irregularly wound coils. HSN 7214 covers other bars and rods of iron or non-alloy steel. The Fe 500 / Fe 500D / Fe 550D grade distinction shows in the line item; the HSN stays in the 7213 to 7214 band. Structural steel sections (angles, channels, beams) move into the 7216 family — worth checking when a steel supplier raises a mixed consignment of TMT and structurals.
  • Aggregate. HSN 2517 covers pebbles, gravel, broken or crushed stone of a kind commonly used for concrete aggregates. The 20 mm, 10 mm, and 6 mm grading typically shows in the line description; HSN sits at 2517 throughout.
  • Sand. HSN 2505 covers natural sands of all kinds. Manufactured sand (M-sand), which is increasingly the default at project sites where natural sand is regulated or scarce, often surfaces under 2517 instead of 2505 because it is mechanically a crushed stone product. The supplier you have today and the supplier you have next month may classify M-sand differently — the extraction needs to capture whatever the supplier filed, and the tracker needs to be ready to normalise the two HSNs into a single material category at reporting time.
  • Ready-mix concrete. RMC most commonly shows under HSN 3824 (other chemical products and preparations of the chemical or allied industries), although some suppliers raise it under 6810 (articles of cement, concrete, or artificial stone). The tracker should accept both and let the material-category column do the grouping.
  • Electricals, plumbing, tiles. A broader range: HSN 8544 for insulated electrical conductors and cables, 3917 for plastic tubes and pipes, 7411 for copper tubes, 6907 for ceramic and porcelain tiles (the former heading 6908 was folded into 6907 in the 2017 HS revision, so a 6908 on supplier paperwork is worth a second look), and 3208 to 3209 for paints and varnishes. These are typically lower-volume in tonnage but higher-variety in HSN; the tracker's HSN column earns its keep on the finishing-trade rows even more than on the bulk-material rows.

For the global construction-vertical extraction picture (common challenges across construction document workflows in any geography, and how line-item extraction lands across the different document types contractors handle), the broader piece on invoice data extraction for the construction industry sits alongside this article. This article addresses the India-specific paired-extraction layer: the e-way bill and the vehicle-number join key that the global piece does not need to cover.

The workflow value of HSN landmarking is in the reporting view. When the project accountant pulls the tracker for a project-cost-by-material-class report — bulk-material spend versus structural-steel spend versus finishings spend, week by week — the HSN column groups cement, steel, aggregate, sand, and finishings cleanly without anyone having to map line-item descriptions to categories by hand. A tracker that captures HSN consistently is one query away from that report; a tracker that does not is several hours of manual classification away from it.
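The grouping behind that report can be sketched as a lookup on the four-digit HSN heading. In the Python sketch below, the class names, the fallback label, and the rule that treats a 2517 row with "sand" in its description as sand (the M-sand normalisation discussed above) are all assumptions made for illustration. A real tracker would pick its own categories.

```python
from collections import defaultdict
from decimal import Decimal

# Landmark headings from this section, mapped to hypothetical class names.
HSN_TO_CLASS = {
    "2523": "cement",
    "7213": "steel", "7214": "steel", "7216": "steel",
    "2517": "aggregate",
    "2505": "sand",
    "3824": "ready-mix concrete", "6810": "ready-mix concrete",
}


def material_class(hsn: str, description: str = "") -> str:
    """Map an HSN code (4, 6 or 8 digits) to a reporting class."""
    heading = hsn.strip()[:4]
    # M-sand filed under 2517 is reported with sand, not aggregate.
    if heading == "2517" and "sand" in description.lower():
        return "sand"
    return HSN_TO_CLASS.get(heading, "finishings/other")


def spend_by_class(rows):
    """Sum taxable value per material class across tracker rows."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[material_class(row["hsn"], row.get("material", ""))] += row["taxable_value"]
    return dict(totals)
```

The HSN as filed stays untouched on the row. The normalisation happens only in the reporting view.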

State threshold sidebar and the audit-readiness check

Under Rule 138 of the CGST Rules 2017, an e-way bill is required for the movement of goods of consignment value above ₹50,000. That threshold applies to all inter-state movements nationally, and is the default for intra-state movements in most states.

Several states diverge on the intra-state side. Per the ICAI Handbook on E-Way Bill under GST, states including Tamil Nadu, West Bengal, Delhi, and Bihar set the intra-state consignment-value threshold at ₹1 lakh, higher than the ₹50,000 that applies in most other states and to all inter-state movements under Rule 138. As of May 2026, that ₹1 lakh intra-state threshold has held for some years in those states, but states change these thresholds periodically through commercial-tax department circulars; the practitioner should reconfirm against the latest circular for the originating state before relying on the threshold for a specific consignment. A few states also carry product-specific exemptions and threshold variations that the central rule does not, which is worth a five-minute check against the state circular when a recurring supplier moves into a new state.

The operational consequence for the tracker is that not every consignment will have a paired e-way bill. Above the applicable threshold, the e-way bill is mandatory and the tracker row will have all fourteen columns populated. Below the threshold — for instance an intra-state delivery of a small lot of fittings under ₹50,000 in a state on the central threshold, or under ₹1 lakh in Tamil Nadu, West Bengal, Delhi, or Bihar — the supplier may issue only the tax invoice and a delivery challan with no e-way bill at all. The tracker has to accommodate those rows: the Vehicle-number column gets sourced from the delivery challan or the gate register instead of Part B, the Transporter-ID column stays blank, and the EWB-number column stays blank. That is fine; what is not fine is a row above the threshold where the EWB-number column is blank, which is a compliance gap rather than a workflow exception.
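The distinction between a workflow exception and a compliance gap can be expressed as a small check. The Python sketch below hard-codes the thresholds quoted in this section purely for illustration. They are assumptions to be reconfirmed against the current state circulars, and the function names, status labels, and row keys are hypothetical.

```python
from decimal import Decimal

DEFAULT_THRESHOLD = Decimal("50000")

# Intra-state thresholds as quoted in this article; reconfirm before use.
INTRA_STATE_THRESHOLDS = {
    "Tamil Nadu": Decimal("100000"),
    "West Bengal": Decimal("100000"),
    "Delhi": Decimal("100000"),
    "Bihar": Decimal("100000"),
}


def applicable_threshold(origin_state: str, destination_state: str) -> Decimal:
    """Inter-state movements use the central threshold; intra-state may differ."""
    if origin_state != destination_state:
        return DEFAULT_THRESHOLD
    return INTRA_STATE_THRESHOLDS.get(origin_state, DEFAULT_THRESHOLD)


def ewb_status(row: dict) -> str:
    """Classify a tracker row by whether its e-way bill situation is acceptable."""
    if row.get("ewb_number"):
        return "paired"
    threshold = applicable_threshold(row["origin_state"], row["destination_state"])
    if row["consignment_value"] > threshold:
        return "compliance gap"
    return "sub-threshold exception"
```

Rows marked as a compliance gap are the ones to chase with the supplier. Rows marked as a sub-threshold exception simply document why the EWB-number column is blank.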

The project tracker is, in this light, the practitioner's audit-readiness file. The Acknowledged-on-site column closes the loop on whether the consignment was actually received. The EWB-number column documents that an e-way bill was issued wherever the threshold required one. Together they answer the two questions a GST audit and a project-finance review will both ask: did the goods arrive, and where required by law was an e-way bill issued. A tracker that systematically pairs the invoice with the e-way bill, and that documents the no-e-way-bill exception against the threshold logic for sub-threshold consignments, is doing both jobs at once, with no separate audit file to maintain.

From the tracker to GSTR-2B, project costing, and the Tally voucher import

Once the per-truck rows are sitting in the project tracker, four downstream paths matter. None of them needs the data reshaped — the row schema feeds each one directly.

GSTR-2B reconciliation. The project tracker becomes the primary purchase-register source for the builder's GSTR-2B matching. The supplier's GSTR-1 flows into the recipient's GSTR-2B; the tracker's per-truck rows roll up to per-supplier-per-period totals that match against the GSTR-2B view at supplier GSTIN, invoice number, and taxable value. Mismatches surface as supplier-side data slips (wrong recipient GSTIN, transposed invoice number, taxable value off by a paisa) before they cost the builder its input tax credit. The tracker's HSN and tax-invoice-number columns are what carry the load here. For the matching mechanic in detail, the existing piece on GSTR-2B input tax credit reconciliation against the purchase register covers the workflow end-to-end; the project tracker plugs into it as the purchase-register input.
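The roll-up and match can be sketched as follows. This is a minimal Python illustration, not the GSTR-2B file format. It assumes each per-truck row carries only that truck's share of the taxable value, so that summing rows with the same invoice number does not double count. It also assumes the GSTR-2B side has already been reduced to a dictionary keyed on (supplier GSTIN, invoice number). All key names are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def rollup(rows):
    """Sum per-truck taxable values up to (supplier GSTIN, invoice number)."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[(row["supplier_gstin"], row["tax_invoice_number"])] += row["taxable_value"]
    return dict(totals)


def mismatches(tracker_totals, gstr2b_totals):
    """Return keys where the two sides differ, or where one side is missing."""
    out = {}
    for key in tracker_totals.keys() | gstr2b_totals.keys():
        ours, theirs = tracker_totals.get(key), gstr2b_totals.get(key)
        if ours != theirs:
            out[key] = (ours, theirs)
    return out
```

Using `Decimal` rather than floats keeps a one-paisa difference visible instead of lost in rounding. A `None` on either side of a mismatch means the invoice is missing from that side altogether.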

Project costing spreadsheet. The same per-truck rows feed the project-costing workbook, typically organised per-site, per-line-item, per-week. The HSN, taxable value, and quantity columns let the quantity surveyor reconcile actual material consumption against the bill of quantities: cement bags issued to slab against bags purchased, TMT tonnage against the structural drawing, M-sand cubic metres against the plaster estimate. The Vehicle-number column lets the QS pull a list of every truck that delivered to a site in a given week, by supplier, without re-reading any source PDFs. The JSON export reaches the costing spreadsheet through the same row schema: the JSON output has the e-way bill fields and the invoice fields in one record, and the costing workbook reads that JSON straight into a per-site sheet with no intermediate reshaping.
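Splitting the JSON output into per-site sheets takes only a few lines. The Python sketch below assumes the export is a JSON array of flat records with a `project_site` key. That shape and that key name are assumptions for illustration, since the actual output depends on how the extraction prompt names its columns.

```python
import json
from collections import defaultdict


def rows_by_site(json_text: str) -> dict:
    """Group exported records into one list per project site."""
    sheets = defaultdict(list)
    for record in json.loads(json_text):
        sheets[record["project_site"]].append(record)
    return dict(sheets)
```

Each list can then be written to its own worksheet by whichever spreadsheet library the costing workbook already uses.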

TallyPrime voucher import. Builders running Tally as the books of account import the rows as purchase vouchers. Conventions differ: some firms create one voucher per tax invoice and book the multi-truck consignment as a single voucher with several Item lines, while others create one voucher per e-way bill to mirror the per-truck grain in Tally itself. Either way, the HSN, taxable value, and GST columns map onto Tally's Item and Tax fields directly. The piece on importing the extracted purchase vouchers into TallyPrime covers the field mapping and the import format Tally accepts.

MIS dashboard. The same Excel or CSV output drops into the builder's MIS dashboard for management reporting — material spend by site, supplier concentration, average rate per material class, e-way bill compliance percentage by supplier, weighbridge-acknowledged percentage by site. No additional transformation is needed beyond the row schema; the dashboard reads the file and the pivot does the rest.

The export targets are Excel (.xlsx) for the project tracker and the costing workbook, CSV (.csv) for the Tally voucher import staging file, and JSON (.json) for the MIS dashboard and any direct system feed. Pulling cement delivery invoice and e-way bill data to Excel is the workhorse path; CSV and JSON are alternatives where the downstream system reads them more cleanly. By Monday morning, the bundle that started as thirty paired PDFs is fourteen columns wide and one row per truck — ready for the site engineer's meeting, the Tally voucher import, and the GSTR-2B match.
