To extract apparel wholesale invoices to Excel in India without losing the size matrix, keep the style or colour row as the parent record, expand each size column into a separate line-item row, and carry HSN, quantity, rate, taxable value, GST rate, place of supply, and CGST/SGST or IGST with every row. That gives you a spreadsheet you can review row by row before GSTR-1 preparation, Tally import, or customer ledger update.
This matters because garment wholesale invoices are not simple item lists. A Surat or Tiruppur distributor may issue one tax invoice to a retailer where a single style has quantities under S, M, L, XL, and XXL, one rate, one HSN, one discount line, and a tax split based on the buyer's state. If an OCR tool reads those size columns as loose numbers, the Excel output may show quantities without style context. If it collapses the row into one item, the size-wise quantity disappears.
That row shape is what makes the file usable under month-end pressure. On the 9th or 10th of the month, when a distributor's accountant has 80 PDF invoices and GSTR-1 is due soon after, the review job is not to admire an OCR table. It is to check whether customer GSTINs, state codes, HSNs, taxable values, and tax splits still tie back to the original invoices — and when an invoice contains 12 pieces of a shirt in M, 8 in L, and 5 in XL, those should land as three reviewable rows rather than one merged description or three orphaned quantity cells.
Why Generic OCR Breaks Garment Wholesale Invoices
Most generic invoice OCR expects a line table where each row already represents one item. Apparel wholesale invoices often do not work that way. A single line may read like a complete product row, then spread the actual quantities across size columns — S, M, L, XL, XXL, or a local size run used by the distributor — with HSN, rate, discount, and tax fields sitting at the end of the row and applying to every size cell. Where the invoice instead prints one row per size variant, the style, colour, HSN, and rate are often printed once and visually carried down by blank cells; a human accountant reads the blank as "same as above" while a weak extractor reads it as missing data. The result is the same two failure modes: either the matrix flattens into loose size quantities without style or tax context, or the row collapses into a single item that preserves the total pieces but loses the size dimension the retailer, warehouse, and accounts team need later.
The extraction also has to keep invoice-level details tied to every expanded row. Invoice number, date, distributor name, customer GSTIN, place of supply, and taxable value are part of the tax invoice record, and they are the fields the accountant uses to trace each Excel row back to the PDF. For the broader field list that Indian GST invoices need to carry, use India GST invoice field requirements as the reference point, then add the apparel-specific size and style fields on top.
The Target Row: One Line Per Style-Size, With HSN And Tax At Line Level
Before you run conversion, define the row grain. The parent row is the style or SKU with its colour or shade; the size cells are child quantities; the extractor should expand each child quantity into its own row while repeating the parent context. The safest grain is one row per invoice, style, colour, and size, with customer, HSN, quantity, rate, taxable value, GST rate, tax split, and place of supply carried as columns on that row. The source invoice is doing several jobs at once (sales record, stock movement record, tax document, customer balance evidence), and the row has to carry all of them.
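The expansion rule can be sketched in a few lines of Python. This is a minimal illustration, not the behaviour of any particular extraction tool: the function name, the field names, and the sample values (including the style code and HSN) are hypothetical.

```python
from typing import Dict, List, Optional


def expand_size_matrix(
    parent: Dict[str, object],
    size_qty: Dict[str, Optional[int]],
) -> List[Dict[str, object]]:
    """Return one row per non-zero size quantity, repeating the parent context."""
    rows = []
    for size, qty in size_qty.items():
        if not qty:  # skip blank (None) and zero size cells
            continue
        row = dict(parent)  # copy style, colour, HSN, rate, and trace fields
        row["size"] = size
        row["quantity"] = qty
        rows.append(row)
    return rows


# Hypothetical parent row: one style and colour printed once on the invoice.
parent = {
    "invoice_no": "INV-001",
    "style": "ST-101",
    "colour": "navy",
    "hsn": "6203",
    "unit_rate": 450,
}

# Size cells as read from the matrix; S is zero and XXL is blank.
rows = expand_size_matrix(parent, {"S": 0, "M": 12, "L": 8, "XL": 5, "XXL": None})
for r in rows:
    print(r["style"], r["colour"], r["size"], r["quantity"])
```

Skipping blank and zero cells here is a choice. A business that wants zero quantities visible for review could emit them with an explicit flag instead.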
Header and trace fields repeat across every expanded size row from the same invoice so a reviewer can filter the spreadsheet and still know where each line came from: invoice number, invoice date, distributor GSTIN, customer name, customer GSTIN, billing state, shipping state, place of supply, source PDF name, and page number. Line fields describe the garment sale: style or SKU, product description, colour or shade, size, HSN, quantity, unit rate, discount, taxable value, GST rate, CGST amount, SGST amount, IGST amount, and line total.
Apparel invoices often tuck colour details ("navy", "rust", "shade 14", a fabric finish) beside the description rather than in a neat column. Keep colour as a structured field, because stock teams and customer-account work usually care about it even when GSTR-1 does not.
For combo packs sold as one commercial unit with internal size quantities, expose the pack structure (pack description, size, quantity in pieces, billable unit) rather than guessing; if the invoice simply lists separate sizes under one style, one row per size is cleaner. The source_file and page_number columns are not optional in practice: when a taxable value total does not match, or a customer disputes a size breakup, the reviewer needs to jump back to the original PDF.
HSN and GST rate belong with the line, not the invoice total. A distributor selling mixed garment categories on one invoice may use HSN 6203 for woven men's garments, 6204 for woven women's garments, 6103 or 6104 for the knitted equivalents. The 56th GST Council rate-rationalisation release lists apparel and clothing accessories under Chapters 61 and 62 with separate per-piece sale-value bands: from 22 September 2025, lines not exceeding Rs. 2,500 per piece stay at 5%, while lines exceeding Rs. 2,500 per piece move from 12% to 18%. The threshold is per piece, not invoice total, so one customer invoice can carry both rate bands and the spreadsheet must keep the rate attached to each line. Place of supply needs the same line-level discipline: a Gujarat distributor (state code 24) selling to a Maharashtra retailer (state code 27) triggers IGST; within-Gujarat sales trigger CGST plus SGST. Capture the buyer GSTIN, billing or shipping state, place of supply, and the printed tax split so the accountant can catch cases where the invoice's tax treatment does not match the customer state before the data feeds GSTR-1.
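The per-piece band lends itself to a simple line-level check. The sketch below assumes each extracted row already carries a per-piece sale value, a printed GST rate, and an invoice date. The function names, field names, and sample lines are hypothetical, and the sketch covers only the bands described above for invoices dated on or after 22 September 2025.

```python
from datetime import date
from decimal import Decimal

RATE_CHANGE_DATE = date(2025, 9, 22)
PER_PIECE_THRESHOLD = Decimal("2500")


def expected_apparel_rate(per_piece_value: Decimal, invoice_date: date) -> Decimal:
    """Expected GST rate (percent) for a Chapter 61/62 apparel line."""
    if invoice_date < RATE_CHANGE_DATE:
        # Earlier rate bands are deliberately left out of this sketch.
        raise ValueError("invoice date is before the bands covered here")
    if per_piece_value <= PER_PIECE_THRESHOLD:
        return Decimal("5")
    return Decimal("18")


def rate_exceptions(lines):
    """Return lines whose printed GST rate differs from the expected band."""
    return [
        line
        for line in lines
        if line["gst_rate"]
        != expected_apparel_rate(line["per_piece_value"], line["invoice_date"])
    ]


# Hypothetical lines from one invoice carrying both rate bands.
inv_date = date(2025, 10, 3)
lines = [
    {"style": "ST-101", "per_piece_value": Decimal("450"), "gst_rate": Decimal("5"), "invoice_date": inv_date},
    {"style": "JK-220", "per_piece_value": Decimal("3200"), "gst_rate": Decimal("18"), "invoice_date": inv_date},
    {"style": "JK-221", "per_piece_value": Decimal("2800"), "gst_rate": Decimal("5"), "invoice_date": inv_date},
]
flagged = rate_exceptions(lines)
print([line["style"] for line in flagged])
```

A flagged line is a prompt for review, not a verdict. Which value counts as the per-piece sale value (for example before or after discount) is a call for the accountant, and the check simply makes the candidates visible.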
Extract A Batch With Instructions For Size Expansion
Once the target Excel shape is clear, the extraction instruction should say exactly how the size matrix must be expanded. A generic prompt such as "extract invoice data" is too loose for apparel wholesale invoices. It may capture invoice totals and customer names but still miss the business-critical row structure.
When you use AI invoice data extraction on apparel invoice PDFs, write the prompt around the output row you want. Invoice Data Extraction lets users upload PDF, JPG, or PNG invoice files, describe the required data in a natural-language prompt, and download structured Excel, CSV, or JSON. For a Surat textile distributor's invoices, the instruction can be direct: "Extract each apparel tax invoice into one row per style-size combination. For every row, include invoice number, invoice date, customer name, customer GSTIN, place of supply, style or SKU, product description, colour or shade, size, HSN, quantity, unit rate, discount, taxable value, GST rate, CGST, SGST, IGST, line total, source file name, and page number. If a product row has size columns such as S, M, L, XL, and XXL, create one output row for each non-zero size quantity and repeat the style, colour, HSN, rate, and tax fields."
That instruction tells the extractor not to treat the matrix as a visual table only. It explains the accounting grain. For a Tiruppur garment supplier's invoice-to-spreadsheet workflow, you can adjust the size labels, product wording, and extra fields, but the core rule stays the same: one output row per size-bearing garment line, with the tax and source context repeated.
The review step still belongs to the accountant. The extracted Excel or CSV is the working file before GSTR-1 upload, Tally import, or customer-balance update. Review totals by invoice, check HSN and GST rate exceptions, and confirm that inter-state and intra-state tax splits survived the conversion. The extraction step should remove repetitive data entry, not remove professional review.
Handle Discounts, Freight, Samples, And Packing Lines Without Losing The Matrix
The difficult rows are usually not the main garment lines. They are the adjustments around them. A line-level discount should sit on the style-size row it affects. An invoice-level discount is different: it may need allocation across taxable lines, or it may need to be flagged separately so the accountant can decide how to treat it before trusting taxable value totals.
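If the accountant decides to allocate an invoice-level discount, one common approach is to spread it in proportion to each line's value and push any rounding residue onto the last line so the shares still sum to the printed discount. The sketch below shows that arithmetic only. Proportional allocation is an assumption here, not a rule stated on the invoice, and the function name and figures are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import List

PAISA = Decimal("0.01")


def allocate_discount(line_values: List[Decimal], invoice_discount: Decimal) -> List[Decimal]:
    """Split an invoice-level discount across lines in proportion to line value."""
    total = sum(line_values, Decimal("0"))
    if total == 0:
        raise ValueError("cannot allocate against a zero invoice value")
    shares = []
    for value in line_values[:-1]:
        share = (invoice_discount * value / total).quantize(PAISA, rounding=ROUND_HALF_UP)
        shares.append(share)
    # The last line absorbs the rounding residue so the shares tie to the discount.
    shares.append(invoice_discount - sum(shares, Decimal("0")))
    return shares


# Three size rows of one style at Rs. 450 each: 12, 8, and 5 pieces.
line_values = [Decimal("5400"), Decimal("3600"), Decimal("2250")]
shares = allocate_discount(line_values, Decimal("500"))
net_values = [v - s for v, s in zip(line_values, shares)]
print(shares, net_values)
```

Keeping the original line value, the allocated share, and the net value in separate columns lets a reviewer undo the allocation if the discount should have been treated differently.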
Freight and transport charges should not be silently merged into garment values. If the invoice prints freight as a separate taxable or non-taxable charge, extract it as its own row type with the printed HSN or SAC and tax treatment. The same applies to returnable packing, cartons, hangers, and other charges that appear below the garment matrix. These lines may not have sizes, but they still affect invoice totals and reconciliation.
Free-supply and sample lines need explicit flags. Apparel distributors may include sample pieces, replacement items, or promotional supply lines at nil or discounted value. If the spreadsheet only captures positive taxable rows, those lines can disappear even though they explain a quantity movement or a customer conversation later. A sample_flag or line_type column gives reviewers a way to separate commercial sales from exceptions.
Recipient-side work is the mirror image. The retailer receiving the invoice may need the same structured rows to match purchase records and tax credits, so GSTR-2B ITC reconciliation for recipient retailers becomes easier when the distributor invoice was extracted with customer GSTIN, HSN, taxable value, and tax amount intact. The same retailer's AP team usually handles other purchase categories alongside stock — store fit-out items, billing terminals, and back-office laptops — and the discipline of preserving HSN, GST splits, and section-194Q flags carries across; the workflow for pulling Indian IT hardware purchase invoices into an asset-aware Excel sheet follows the same row-grain logic applied to fixed assets instead of garments. For businesses where these same size-matrix invoices also feed stock valuation or landed-cost analysis, the related apparel size-matrix landed-cost workflows show why preserving style, size, and quantity at the earliest extraction stage matters beyond GST filing.
Review The Excel File Before GSTR-1 Or Tally Import
Start the review with control totals, not individual cells. Count the PDFs processed, count the invoice numbers extracted, and check whether any file produced no rows. Then total taxable value and tax amount by invoice and compare those totals with the original PDFs. If the invoice-level totals do not tie, fix that before spending time on size details.
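These control totals can be computed directly from the extracted rows. The sketch below assumes each row has source_file, invoice_no, taxable_value, and tax_amount fields, and that the printed invoice totals are available for comparison from a separate source. All names and figures are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def control_totals(processed_files, rows):
    """Summarise file coverage and per-invoice totals for the extracted rows."""
    files_with_rows = {r["source_file"] for r in rows}
    totals = defaultdict(lambda: {"taxable_value": Decimal("0"), "tax_amount": Decimal("0")})
    for r in rows:
        totals[r["invoice_no"]]["taxable_value"] += r["taxable_value"]
        totals[r["invoice_no"]]["tax_amount"] += r["tax_amount"]
    return {
        "pdf_count": len(processed_files),
        "invoice_count": len(totals),
        "files_without_rows": sorted(set(processed_files) - files_with_rows),
        "totals_by_invoice": dict(totals),
    }


def untied_invoices(totals_by_invoice, printed_taxable):
    """List invoices whose summed taxable value differs from the printed total."""
    return sorted(
        inv
        for inv, printed in printed_taxable.items()
        if totals_by_invoice.get(inv, {}).get("taxable_value") != printed
    )


rows = [
    {"source_file": "a.pdf", "invoice_no": "INV-001", "taxable_value": Decimal("5400"), "tax_amount": Decimal("270")},
    {"source_file": "a.pdf", "invoice_no": "INV-001", "taxable_value": Decimal("3600"), "tax_amount": Decimal("180")},
]
summary = control_totals(["a.pdf", "b.pdf"], rows)
untied = untied_invoices(summary["totals_by_invoice"], {"INV-001": Decimal("9000")})
print(summary["files_without_rows"], untied)
```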
Next, review customer GSTIN and place of supply. Sort or filter by state code and tax split so inter-state IGST invoices do not sit among within-state CGST and SGST invoices. For an India apparel wholesaler, this catches many return-preparation errors early because the same customer may have stores or billing addresses in different states.
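A small check can surface rows where the printed tax split and the place of supply disagree. The sketch below relies on two points: the first two characters of a GSTIN are the state code, and the same-state versus different-state rule described earlier. It ignores special cases, the GSTINs are made-up placeholders, and the function and field names are hypothetical.

```python
from decimal import Decimal


def state_code_from_gstin(gstin: str) -> str:
    """The first two characters of a GSTIN are the state code."""
    return gstin[:2]


def expected_split(supplier_state: str, place_of_supply: str) -> str:
    return "CGST+SGST" if supplier_state == place_of_supply else "IGST"


def printed_split(row) -> str:
    has_igst = row["igst"] > 0
    has_local = row["cgst"] > 0 or row["sgst"] > 0
    if has_igst and not has_local:
        return "IGST"
    if has_local and not has_igst:
        return "CGST+SGST"
    return "UNCLEAR"  # no tax printed, or both kinds printed


def split_exceptions(rows, supplier_state: str):
    """Return (invoice_no, reasons) for rows that need a second look."""
    flagged = []
    for row in rows:
        reasons = []
        if printed_split(row) != expected_split(supplier_state, row["place_of_supply"]):
            reasons.append("tax split does not match place of supply")
        if state_code_from_gstin(row["customer_gstin"]) != row["place_of_supply"]:
            reasons.append("GSTIN state differs from place of supply")
        if reasons:
            flagged.append((row["invoice_no"], reasons))
    return flagged


zero = Decimal("0")
rows = [
    {"invoice_no": "INV-001", "customer_gstin": "27XXXXX0000X1Z0", "place_of_supply": "27", "igst": Decimal("450"), "cgst": zero, "sgst": zero},
    {"invoice_no": "INV-002", "customer_gstin": "24XXXXX0000X1Z0", "place_of_supply": "24", "igst": zero, "cgst": Decimal("135"), "sgst": Decimal("135")},
    {"invoice_no": "INV-003", "customer_gstin": "27XXXXX0000X1Z0", "place_of_supply": "27", "igst": zero, "cgst": Decimal("50"), "sgst": Decimal("50")},
]
flagged = split_exceptions(rows, supplier_state="24")  # Gujarat distributor
print(flagged)
```

Rows with no tax printed, such as nil-value sample lines, fall into the "UNCLEAR" bucket and are flagged too. That is intentional in this sketch, since they deserve a manual look anyway.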
Then test the size expansion. Pick several high-value invoices and compare the original matrix with the Excel rows. The total pieces for each style should equal the sum of the size rows. Blank size cells should not become phantom rows. Zero quantities should either be omitted or clearly flagged, depending on how the business wants to review exceptions.
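For the sampled invoices, the size test can be run as a comparison between the summed size rows and the total pieces keyed in from the original matrix. The sketch below assumes the business has chosen to omit zero quantities, so any zero or blank quantity row counts as a phantom. The function name, field names, and figures are hypothetical.

```python
from collections import defaultdict


def size_expansion_issues(rows, printed_pieces):
    """Compare summed size rows with printed totals and flag phantom rows.

    printed_pieces maps (invoice_no, style) to the total pieces on the PDF.
    """
    summed = defaultdict(int)
    issues = []
    for row in rows:
        qty = row["quantity"]
        if not qty:  # blank or zero cell that became a row
            issues.append(("phantom_row", row["invoice_no"], row["style"], row["size"]))
            continue
        summed[(row["invoice_no"], row["style"])] += qty
    for key, printed in printed_pieces.items():
        if summed.get(key, 0) != printed:
            issues.append(("pieces_mismatch", key[0], key[1], summed.get(key, 0), printed))
    return issues


rows = [
    {"invoice_no": "INV-001", "style": "ST-101", "size": "M", "quantity": 12},
    {"invoice_no": "INV-001", "style": "ST-101", "size": "L", "quantity": 8},
    {"invoice_no": "INV-001", "style": "ST-101", "size": "XL", "quantity": 5},
    {"invoice_no": "INV-001", "style": "ST-101", "size": "XXL", "quantity": 0},
]
issues = size_expansion_issues(rows, {("INV-001", "ST-101"): 25})
print(issues)
```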
After that, filter by HSN and GST rate. The review should make mixed-rate invoices visible, especially where per-piece value affects the rate band. Check whether freight, packing, sample, and discount rows have line types that keep them separate from normal garment sales. If the spreadsheet will become the working Excel file for a textile wholesaler's GSTR-1, these exception flags help the preparer decide what belongs in the return and what needs manual treatment.
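Making mixed-rate invoices visible is a grouping exercise. The sketch below assumes a line_type column with the value "garment" for normal sales (other values, such as "freight" or "sample", are left out of the rate comparison). The column values, function name, and sample rows are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def mixed_rate_invoices(rows):
    """Return invoices whose garment lines carry more than one GST rate."""
    rates = defaultdict(set)
    for row in rows:
        if row.get("line_type", "garment") == "garment":
            rates[row["invoice_no"]].add(row["gst_rate"])
    return {inv: sorted(found) for inv, found in rates.items() if len(found) > 1}


rows = [
    {"invoice_no": "INV-001", "line_type": "garment", "hsn": "6203", "gst_rate": Decimal("5")},
    {"invoice_no": "INV-001", "line_type": "garment", "hsn": "6203", "gst_rate": Decimal("18")},
    {"invoice_no": "INV-001", "line_type": "freight", "hsn": "", "gst_rate": Decimal("18")},
    {"invoice_no": "INV-002", "line_type": "garment", "hsn": "6104", "gst_rate": Decimal("5")},
    {"invoice_no": "INV-002", "line_type": "freight", "hsn": "", "gst_rate": Decimal("18")},
]
mixed = mixed_rate_invoices(rows)
print(mixed)
```

INV-002 is not reported because its only 18% line is freight, which is why the line types need to be reliable before this filter is trusted.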
For accounting import, keep the extracted file as a reviewed source sheet before reshaping it into the final import format. If the team is importing invoice data into TallyPrime, the garment extraction file may need column renaming, ledger mapping, voucher type decisions, and tax ledger alignment before import. The clean extraction file is still valuable because it gives the reviewer a structured, traceable base instead of forcing them back into each PDF.
For repeat distributor formats, save the prompt wording that worked well and reuse it next month. Speed comes from a stable row grain and field set; the review checklist stays the same.