Extract Apparel Commercial Invoices for HTSUS Classification

How US apparel importers extract HTSUS classification fields from supplier commercial invoices: fiber composition, knit vs woven, shell vs lining, and origin.

Topics: Industry Guides, Apparel, US, HTSUS classification, commercial invoice, import compliance, customs entry

Eighty PDFs sit in the inbox: thirty from a Vietnam knitwear factory, thirty from a Bangladesh woven-bottoms vendor, twenty from a Cambodia outerwear supplier. The CBP entry filing is due in two days. The generic OCR tool already pulled invoice numbers, totals, and SKUs cleanly. What it missed is the field that decides whether half the lines belong in HTSUS Chapter 61 or Chapter 62, the field that splits a 16% duty rate from a 32% one on a women's dress, and the field that 19 CFR 141.89 specifically requires for textile wearing apparel: the fiber composition broken out separately for the outer shell and the lining.

To extract an apparel commercial invoice for HTSUS classification, the per-line output must carry fiber composition with shell and lining as separate fields, knit-vs-woven construction, country of manufacture, FOB unit value, and a garment description granular enough to land the 10-digit US tariff suffix. The shell-vs-lining fiber breakdown is mandated by 19 CFR 141.89 for textile wearing apparel, and the construction field drives the choice between HTSUS Chapter 61 (knit garments) and Chapter 62 (woven garments). When a supplier types a 6-digit HS code on the invoice, that code is advisory only: CBP requires the importer to classify from the underlying source data, and the supplier's number carries no binding weight at entry summary.

That gap, between what the supplier prints and what CBP needs to see on the entry, is the work. The rest of this article walks the field set CBP demands under 19 CFR 141.86 and the textile addenda at 141.89, the two distinct invoice layouts apparel manufacturers actually send, the decision tree the extracted fields feed from chapter down to statistical suffix, the layered May 2026 tariff stack that has elevated the cost of a wrong classification, the apparel-specific edge cases that catch importers, and the three handoff targets the normalised file produces — ABI self-filing, broker handoff packet, or in-house pre-classification review.

The CBP Field Set Your Extraction Must Produce

Before the supplier PDFs come anywhere near a classifier, the extraction has to land a fixed column shape on the per-row output. The general commercial invoice data extraction fields under 19 CFR 141.86 — invoice number, date, terms of sale, supplier and consignee names and addresses, currency, gross totals — are necessary but not sufficient for apparel. The textile addenda at 19 CFR 141.89 sit on top of that base and add the fields that actually drive the HTSUS classification work.

The field set the per-row output must carry, line by line:

  • Fiber composition by percentage weight, with each fiber present at five percent or more disclosed individually. This is the central textile-specific field, and the foundation for every classification decision below the heading. The textile invoice requirements at 19 CFR § 141.89 call for commercial invoices for wearing apparel to disclose component fiber percentages by weight for fibers present at five percent or more, with separate breakdowns of the fibers in the outer shell and in the lining. Fibers below the five-percent threshold can be reported as "other fibers" in aggregate; fibers at or above five percent must be named individually with their weight percentage.

  • Separate shell and lining breakdowns for any garment where shell and lining are distinct constructions. Lined jackets, blazers, swimwear with built-in lining, lined dresses, most outerwear, and many women's woven separates all fall here. The supplier often prints a single composite percentage covering the whole garment ("100% polyester"); that value is wrong for classification purposes when the lining is a different fiber or construction. The extraction must read the underlying detail (often in a fabric-content footnote, a packing-list addendum, or a manufacturer affidavit attached to the invoice) and split shell and lining into two distinct fields on the output. This is the highest-leverage extraction problem in apparel, and the field most often missed by generic invoice-OCR tools that treat fiber composition as a single string.

  • FTC manufacturer Registered Identification Number (RN). The RN is issued by the Federal Trade Commission under 16 CFR Part 303 and identifies the firm responsible for the textile product on the US side. It is required on the garment's care label and is normally printed on the supplier's invoice or in the supplier's master data. The RN supports the textile labeling rules; it does not substitute for CBP's separate requirement to identify the manufacturer (the actual cut-and-sew factory) on the entry, but the two fields are commonly conflated by suppliers. Capture the RN as its own field; capture the manufacturer name and address as separate fields.

  • Country of manufacture at the level of physical garment assembly. CBP wants the country where the cut-and-sew operation happened, not the country of the fabric mill, the yarn spinner, or the brand's headquarters. Suppliers routinely confuse this — a Hong Kong-domiciled trading company may print "Hong Kong" as country of origin on an invoice for a garment cut and sewn in Cambodia, or a Mexican distributor may print "Mexico" when the garment was actually produced in Honduras under a joint program. The extraction should default to the cut-and-sew country when the invoice carries it explicitly and flag any line where origin is ambiguous so the importer can reconcile against the country-of-origin marking on the garment itself or against the manufacturer affidavit.

  • Garment description granular enough for a 10-digit HTSUS suffix. "Ladies' top" does not get there. "Women's knit pullover, 100% cotton, machine-knit, long sleeve, no lining, no ornamentation" gets there. The description has to carry the cues classification depends on: construction (knit, woven, knit-to-shape), gender and age category (men's, women's, boys', girls', babies'), garment type (pullover, T-shirt, dress, blouse, suit, ensemble, jacket, trouser), sleeve length, neckline, fabric weight or denier where it matters, lining presence, ornamentation type and placement, and any feature that drives a subheading split (water-resistance for outerwear, knit-to-shape construction, pile or terry surface). The extraction should preserve the supplier's own description verbatim and add a normalised description field where useful cues can be tagged.

  • FOB unit value per line. Customs values are calculated on transaction value, and FOB at the named port of export is the cleanest base. Suppliers regularly mix Incoterms across a single invoice — line items quoted FOB on the body, freight and insurance broken out separately, with a CIF or DDP grand total at the bottom — or quote everything CIF inclusive. The extraction has to identify which Incoterm governs each line and isolate the FOB component. Where the supplier quotes only CIF or DDP, the extraction captures the as-stated value and flags the line for back-calculation by the importer or broker. The full picture on Incoterms handling sits in the broader treatment of FOB and Incoterms on commercial invoices, but for this field the rule is simple: never collapse FOB and CIF into a single column.

  • Quantity per size variant, with pack/packaging detail preserved. Apparel ships in pieces; the entry summary line carries quantity in pieces. Where the supplier ships pre-pack assortments (fixed size ratios per shipping carton) the extraction must either expand the assortment into per-size rows or capture the assortment ratio as an explicit field — collapsing the ratio loses the per-size detail the entry summary line wants.
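
The shell/lining split and the five-percent rule from the list above can be sketched in a few lines of Python. This is a minimal illustration, not a production parser: the input format ("Shell: ...; Lining: ...", parts separated by a semicolon) and the function names `parse_composition`, `apply_five_percent_rule`, and `needs_review` are assumptions made for the example, and real supplier text will vary.

```python
import re

# Hypothetical input format, assumed for this sketch:
#   "Shell: 65% polyester, 35% cotton; Lining: 100% polyester"
PART_RE = re.compile(r"(shell|lining)\s*:\s*([^;]+)", re.IGNORECASE)
FIBER_RE = re.compile(r"(\d+(?:\.\d+)?)\s*%\s*([A-Za-z ]+)")


def parse_composition(text):
    """Return {'shell': {...}, 'lining': {...}} mapping fiber -> percent by weight."""
    parts = {"shell": {}, "lining": {}}
    for part, body in PART_RE.findall(text):
        for pct, fiber in FIBER_RE.findall(body):
            parts[part.lower()][fiber.strip().lower()] = float(pct)
    return parts


def apply_five_percent_rule(fibers):
    """Name fibers at 5% or more individually; aggregate the rest as 'other fibers'."""
    named = {f: p for f, p in fibers.items() if p >= 5.0}
    minor = sum(p for p in fibers.values() if p < 5.0)
    if minor:
        named["other fibers"] = minor
    return named


def needs_review(text):
    """Flag a single composite percentage that carries no shell/lining labels."""
    return PART_RE.search(text) is None
```

A bare "100% polyester" string is flagged by `needs_review` rather than silently treated as the shell, which keeps the composite-percentage failure mode visible to whoever reconciles the line.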

Two additional rules constrain the field set. First, the invoice must be in English, or accompanied by an accurate English translation that carries the same field-level detail; a non-English invoice without a parallel English version is non-compliant for entry. Second, virtually all textile shipments require formal entry regardless of value — the $250 informal-entry threshold that applies to many other commodities does not apply to textiles, so the field set is mandatory even on low-value sample shipments.

A note on construction. Knit-vs-woven is part of the garment description, but the supplier's description often does not state it explicitly — the supplier may simply write "polo shirt" or "blouse" and assume the construction is obvious. The extraction has to capture construction as its own derived field even when the supplier omits it, either by reading construction cues from the description (jersey, interlock, single-knit, fully fashioned, woven, twill, satin, oxford) or by flagging the line for human review when the cues are absent. Construction is what determines Chapter 61 versus Chapter 62; an output that has fiber composition right but construction blank is not finished.
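
One way to derive the construction field is a simple cue lookup with a review flag, sketched below. The cue lists reuse the terms named in the paragraph above and are illustrative rather than exhaustive; `derive_construction` is a hypothetical helper, and a description with conflicting or absent cues is routed to human review instead of being guessed.

```python
import re

# Cue terms taken from the text above; a real list would be longer.
KNIT_CUES = ("knit", "jersey", "interlock", "fully fashioned")
WOVEN_CUES = ("woven", "twill", "satin", "oxford")


def _has_cue(text, cues):
    # Match each cue at a word start so "single-knit" and "knitted" both count.
    return any(re.search(rf"\b{re.escape(cue)}", text) for cue in cues)


def derive_construction(description):
    """Return (construction, needs_review) from free-text description cues."""
    text = description.lower()
    knit = _has_cue(text, KNIT_CUES)
    woven = _has_cue(text, WOVEN_CUES)
    if knit and not woven:
        return "knit", False
    if woven and not knit:
        return "woven", False
    # No cues, or cues pointing both ways: leave blank and flag the line.
    return None, True
```

Returning `None` with a flag, rather than a default value, keeps a blank construction from quietly becoming a Chapter 61 or Chapter 62 decision.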

The Two Apparel Invoice Layouts and How to Normalise Them

Apparel commercial invoices arrive in one of two structurally distinct shapes. The extraction work is fundamentally different per shape, but the per-row output schema is the same regardless of which layout came in.

Layout A: size-matrix-across-columns. One row per style and colour combination. Size labels (XS, S, M, L, XL, XXL, or numeric sizes such as 0–14 for women's, 28–40 for men's bottoms) run across the row as column headers. The cells under those headers carry quantities. FOB unit value typically appears once per row, on the right side of the matrix or in a totals column, implicitly applying to every size variant on that line. Fiber composition and country of manufacture are usually stated once per row in adjacent columns or in a separate fabric-content footnote keyed to the style code. This is the dominant layout from larger Asian manufacturers — Chinese, Vietnamese, Bangladeshi, and Cambodian apparel factories almost always invoice this way — because it mirrors how the production floor packs the goods (one carton-line per style, sized within the carton).

Layout B: one-row-per-size-variant. One row per style, colour, and size combination. Quantity is a column on the row; FOB unit value is a column on the row. The layout is taller and more verbose for the same shipment. Smaller manufacturers, EU and Turkish suppliers, and many Latin American vendors (Mexico, Honduras, Guatemala, El Salvador) default to this shape. It is easier to extract directly because each row already carries the per-size detail the output needs, but it produces longer invoices and requires careful handling when pre-pack assortments are involved — the supplier may collapse a pre-pack into a single line with quantity in packs rather than pieces.

Whichever layout came in, the per-row output is one row per style/colour/size combination. The schema is fixed:

  • style or SKU
  • colour
  • size
  • quantity (in pieces)
  • FOB unit value
  • fiber composition (shell as separate field, lining as separate field)
  • country of manufacture
  • FTC RN
  • supplier-quoted HS, captured for reference and flagged as advisory
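
As a rough sketch, the schema above could be held in a Python dataclass with a small self-check. The class name `ApparelInvoiceRow`, the field names, and the 0.5-point rounding tolerance are assumptions for illustration; the field list itself follows the schema above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ApparelInvoiceRow:
    """One output row per style/colour/size combination."""
    style: str
    colour: str
    size: str
    quantity_pieces: int
    fob_unit_value: float
    shell_fibers: Dict[str, float]           # fiber -> percent by weight
    country_of_manufacture: str
    lining_fibers: Dict[str, float] = field(default_factory=dict)
    ftc_rn: Optional[str] = None
    supplier_hs: Optional[str] = None        # captured for reference only
    supplier_hs_is_advisory: bool = True

    def problems(self) -> List[str]:
        """Return basic consistency issues; an empty list means none were found."""
        issues = []
        if self.quantity_pieces <= 0:
            issues.append("quantity must be a positive piece count")
        if not self.shell_fibers:
            issues.append("shell fiber composition missing")
        for label, fibers in (("shell", self.shell_fibers),
                              ("lining", self.lining_fibers)):
            # Assumed tolerance of 0.5 points for supplier rounding.
            if fibers and abs(sum(fibers.values()) - 100.0) > 0.5:
                issues.append(f"{label} fiber percentages do not sum to 100")
        return issues
```

Keeping shell and lining as two separate mappings, instead of one composition string, is the design choice that preserves the 19 CFR 141.89 breakdown through to the classifier.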

For Layout A, normalisation means pivoting the size matrix into long format. Each cell in the size matrix becomes a row on the output, with the FOB unit value, fiber composition, country of manufacture, and FTC RN propagated across all the size variants for that style/colour line. For Layout B, normalisation is mostly a pass-through, with one verification step: the FOB unit value column has to be confirmed as FOB and not CIF, DDP, or LDP, against the Incoterms stated on the invoice or in the supplier's standing terms. Where the supplier mixes Incoterms across the invoice — some lines FOB, some CIF, freight and insurance lines broken out separately — the extraction has to read each line's Incoterm and isolate FOB cleanly.
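
The Layout A pivot can be sketched with plain dictionaries. This is a minimal example under stated assumptions: the wide rows have already been read off the PDF into dicts keyed by column header, the size labels are known in advance, and `pivot_size_matrix` is a hypothetical name.

```python
SIZE_LABELS = ("XS", "S", "M", "L", "XL", "XXL")


def pivot_size_matrix(wide_rows, size_labels=SIZE_LABELS):
    """Turn one row per style/colour into one row per style/colour/size."""
    long_rows = []
    for wide in wide_rows:
        # Everything that is not a size column is shared across the size variants.
        shared = {k: v for k, v in wide.items() if k not in size_labels}
        for size in size_labels:
            qty = wide.get(size)
            if not qty:          # skip empty, zero, or missing cells
                continue
            long_rows.append({**shared, "size": size, "quantity": int(qty)})
    return long_rows


wide = [{
    "style": "ST-1001", "colour": "navy", "fob_unit_value": 4.20,
    "country_of_manufacture": "VN", "S": 100, "M": 200, "L": 0,
}]
for row in pivot_size_matrix(wide):
    print(row["style"], row["size"], row["quantity"], row["fob_unit_value"])
```

A useful check after the pivot is that the summed quantity of the long rows equals the row total printed on the invoice; a mismatch usually means a cell was dropped or merged when the matrix was read.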

The OCR pitfall on Layout A is structural. Generic invoice-OCR tools handle invoice headers and totals well because the data is laid out in predictable label-value pairs. They fail on size-matrix bodies because the body is a table whose cells are read by spatial position, not by adjacent labels — and apparel size matrices typically have narrow size columns, faint or absent gridlines, and densely packed numeric cells with no separators. A tool that reads the row text linearly drops or merges cells. A tool that detects the matrix structure but cannot tell that the column headers are size labels (rather than units, dates, or other quantities) attaches the quantities to the wrong field. The extraction has to read the matrix as a matrix — recognising that XS through XXL across the top are size dimensions of the same product line — and pivot accordingly.

Pre-pack assortments are the second consistent friction point. A supplier ships in fixed size ratios — 1-2-2-2-1 across S-M-L-XL-XXL per pack, or 1-1-2-2-2-1-1 across a fuller range — and the invoice may state the assortment ratio in a header note rather than expanding it on the body. The extraction either expands the assortment into per-size rows (multiplying pack count by the ratio for each size) or captures the assortment ratio as an explicit field on the row. Either approach is acceptable; the choice depends on whether the broker's ABI software accepts assortment data on the entry summary line or whether it needs per-size pieces. Confirm with the broker before committing to one shape.
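
The expansion option is simple arithmetic: pieces per size equal pack count times the ratio for that size. A minimal sketch, assuming the ratio arrives as a hyphen-separated string such as "1-2-2-2-1" (the helper names are hypothetical):

```python
def parse_ratio(text):
    """Parse an assortment ratio such as '1-2-2-2-1' into a list of ints."""
    return [int(part) for part in text.split("-")]


def expand_prepack(pack_count, sizes, ratio):
    """Return pieces per size for a fixed-ratio pre-pack."""
    if len(sizes) != len(ratio):
        raise ValueError("size list and assortment ratio differ in length")
    return {size: pack_count * r for size, r in zip(sizes, ratio)}


pieces = expand_prepack(50, ["S", "M", "L", "XL", "XXL"], parse_ratio("1-2-2-2-1"))
print(pieces)                # {'S': 50, 'M': 100, 'L': 100, 'XL': 100, 'XXL': 50}
print(sum(pieces.values()))  # 400 pieces, i.e. 50 packs of 8
```

Raising on a length mismatch, rather than truncating, surfaces the case where the header note describes a different size range from the one on the invoice body.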

In practice, the layout-normalisation problem becomes a one-time prompt definition rather than an invoice-by-invoice cleanup task. Invoice Data Extraction takes a batch of supplier PDF apparel invoices and an extraction prompt — written in plain English by the customs-ops staffer — that describes the per-row schema, the size-matrix expansion logic, and the FOB-isolation rule. The same prompt runs against the whole batch, applying the rules consistently across every invoice regardless of which supplier sent it, and produces the normalised Excel, CSV, or JSON file with one row per style/colour/size combination. The work shifts from per-invoice cleanup to per-supplier prompt refinement when a new supplier's layout appears, and the batch can run unattended once the prompt is stable.

The HTSUS Decision Tree the Extracted Data Drives

The classification work that follows extraction is a sequence of decisions, and each decision points back to a specific field on the per-row output. Walking the tree from top to bottom is the cleanest way to see why the field set is shaped the way it is.

Knit vs woven: Chapter 61 vs Chapter 62. The first major fork. Chapter 61 covers articles of apparel and clothing accessories, knitted or crocheted. Chapter 62 covers articles of apparel and clothing accessories, not knitted or crocheted: woven, in practice, plus a handful of non-knit non-woven constructions. The construction field on the per-row output drives the chapter selection directly. When the extraction has had to derive construction from descriptive cues rather than from an explicit supplier statement, this is the decision where that derived value bites; a wrong construction call here misroutes the rest of the classification down a different chapter. For apparel, the Chapter 61 versus Chapter 62 call made from the invoice is the load-bearing one. Chapter 63 (other made-up textile articles) catches a small number of apparel-adjacent items such as certain shawls and accessories, but the apparel mass sits firmly in 61 and 62. Importers whose lines run into Chapter 64 instead (outer sole material, upper material, athletic-vs-non-athletic, FOB tier) should see the parallel treatment for footwear commercial invoice extraction for HTSUS classification, which walks the Chapter 64 field set and decision tree on the same shape.

The 85% fiber rule for the heading. Once the chapter is set, the extracted fiber composition determines the heading. When a single fiber accounts for 85% or more of the weight of the garment, the heading is determined by that fiber. A knit pullover that is 90% cotton lands at 6110.20 (sweaters, pullovers, sweatshirts, waistcoats and similar articles, knitted or crocheted, of cotton). A woven men's shirt that is 100% cotton lands at 6205.20 (men's or boys' shirts, of cotton). The heading rule is mechanical when the 85% threshold is met; the extraction's job is to make the fiber percentages unambiguous on the output so the rule applies cleanly.

Essential character determination when no single fiber reaches 85%. Most apparel blends sit below the 85% threshold. A 60% cotton / 40% polyester knit, a 55% wool / 45% acrylic knit, a 70% rayon / 30% nylon woven — none of these have a single fiber that takes the heading by force. The General Rules of Interpretation, particularly GRI 3(b), govern: the heading is determined by the fiber that imparts essential character to the garment. For most apparel that is the chief component by weight, but the analysis is more textured for blends with similar weight shares, for textile composites, and for cases where the visible surface fiber differs from the underlying ground. The classifier works from the per-row fiber breakdown the extraction produced; if shell and lining were collapsed into a single composite percentage, the essential-character determination is being made on the wrong number.
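
The two fiber paragraphs above reduce to a small triage function, sketched here as this article frames the rule. It is a pre-classification aid only, not a classification: `fiber_basis` is a hypothetical helper, and the 15-point `close_margin` used to flag "similar weight shares" for closer review is an arbitrary assumption, not a CBP threshold.

```python
def fiber_basis(shell_fibers, threshold=85.0, close_margin=15.0):
    """Report which fiber leads the shell and on what basis, per the text above."""
    if not shell_fibers:
        raise ValueError("shell fiber composition is required")
    ranked = sorted(shell_fibers.items(), key=lambda kv: kv[1], reverse=True)
    top_fiber, top_pct = ranked[0]
    if top_pct >= threshold:
        return {"fiber": top_fiber, "basis": "85% rule", "review": False}
    # Below the threshold: chief component by weight as the starting point,
    # flagged when the runner-up is close (assumed margin, see lead-in).
    close = len(ranked) > 1 and (top_pct - ranked[1][1]) < close_margin
    return {"fiber": top_fiber,
            "basis": "GRI 3(b) essential character",
            "review": close}


print(fiber_basis({"cotton": 90.0, "polyester": 10.0}))
print(fiber_basis({"cotton": 60.0, "polyester": 40.0}))
print(fiber_basis({"wool": 55.0, "acrylic": 45.0}))
```

The function takes the shell breakdown specifically; feeding it a composite shell-plus-lining percentage would reproduce the "wrong number" problem described above.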

Gender, age category, and garment type. Both Chapter 61 and Chapter 62 subdivide first by gender and age (men's or boys', women's or girls', babies') and then by garment type — sweaters, T-shirts, dresses, suits, ensembles, jackets, trousers, shirts, blouses, skirts, and so on. The extracted garment description is what pins these dimensions down. A description that reads "women's woven dress, sleeveless, 70% rayon / 30% polyester, lined, no ornamentation" lands the gender, the construction, the heading-level garment type, and a pointer toward the lining-driven subheading. A description that reads "ladies' dress" leaves the classifier to fill in the rest from inference, which is where errors enter.

Subheading drivers. The four-digit heading splits into six-digit subheadings, and the extracted fields keep doing work at this depth. Lining presence and lining fiber drive subheading selection on jackets, blazers, and many women's wovens — which is precisely why 19 CFR 141.89 mandates the separate shell-and-lining breakdown in the first place. Ornamentation drives subheading splits on women's wovens when lace, embroidery, sequins, or similar ornamentation appears above HTSUS-defined thresholds; the extraction has to capture ornamentation cues in the description for the subheading determination to land. Water-resistance for outerwear is its own subheading split under HTSUS Additional U.S. Note 2 to Chapter 62, and qualifying water-resistance is established by a specific spray test, not by a marketing claim — the extraction captures any water-resistance attribute the supplier states, and the classifier verifies. Construction details such as knit-to-shape, pile, terry, or specific weave types continue to slice subheadings further.

The 10-digit US statistical suffix. The final two digits are the US-specific statistical breakout, sitting below the eight-digit US tariff rate line, which in turn extends the international six-digit HS subheading. The suffix captures attributes that the global HS does not: for example, a garment's specific cotton-vs-MMF blend at a finer level, the size range (women's regular vs misses' vs juniors' on certain headings), the specific subcategory of outerwear (anorak vs windbreaker), or the specific category of knit pullover (men's vs boys' under cotton). The extracted field set has to carry enough specificity for the suffix to land, which is why "ladies' top" fails and "women's knit pullover, 100% cotton, machine-knit, long sleeve" succeeds.

The authoritative references the classifier consults. This article is not the classification doctrine; CBP's Informed Compliance Publications are. CBP's "Classification: Apparel Terminology Under the HTSUS" covers the heading-level vocabulary and the gender, age, and construction distinctions. CBP's "Classification of Knit-to-Shape Garments under HTSUS" (ICP083) covers the knit-to-shape rules specifically. USITC's HTSUS Chapters 61 and 62 in the current revision carry the actual headings, subheadings, and statistical suffixes with their corresponding duty rates. The job here is to show how the extracted fields feed the tree; the ICPs and the HTSUS chapters are what the classifier opens when the actual decision has to land.

A worked example brings the thread together. A women's pullover invoice line comes through as: 70% cotton / 30% polyester, knit construction (jersey), no lining, made in Vietnam, FOB $4.20 per piece, quantity 1,200 pieces across S-M-L-XL. The extraction lands those fields cleanly on the per-row output. The classifier reads the construction (knit) and goes to Chapter 61. The garment type (pullover) points at heading 6110 (sweaters, pullovers, sweatshirts, waistcoats and similar articles, knitted or crocheted). The 85% fiber rule fails — neither cotton at 70% nor polyester at 30% reaches the threshold — so GRI 3(b) governs, and cotton imparts essential character as the chief component of a knit pullover. That lands subheading 6110.20 (of cotton). Gender (women's) and the US statistical breakout produce the 10-digit suffix per the current HTSUS revision. Every decision point in that chain pointed at a field the extraction produced; the classifier's judgment is what reads those fields against the chapter rules.

Why Classification Accuracy Matters More Right Now (As of May 2026)

As of May 2026, US apparel importers are paying duty in three layers. Getting the classification wrong does not cost the MFN rate spread alone; the error compounds across every layer above it.

MFN base rates. Under HTSUS Chapters 61 and 62 in the current revision, apparel duty rates fall between roughly 10% and 32% by fiber composition and garment type. The specific rate is what the classification produces — and small classification differences produce large rate differences. A women's woven dress that is correctly classified as 100% cotton (6204.42) sits at one MFN rate; the same garment misclassified at a polyester subheading (6204.43) lands at a meaningfully higher rate. Multiplied across a six-figure-quantity import, the spread becomes real money.

Section 301 List 4: +7.5% on Chinese-origin apparel. USTR's Section 301 List 4 imposes a 7.5% additional duty on a wide range of Chinese-origin goods, including most apparel under Chapters 61 and 62. The List 4 layer stacks on top of the MFN rate. The country-of-manufacture extraction directly determines whether a line carries the Section 301 add-on, which is why reconciling country of origin against supplier invoices matters at the extraction stage, not after entry. A line that should have carried "China" as country of manufacture but was extracted as "Hong Kong" understates the duty stack; a line genuinely produced in Vietnam or Bangladesh that gets miscoded as Chinese overstates it. CBP's audit-trail expectation runs back to the underlying invoice and the manufacturer affidavit.

Section 122 stop-gap: +10% on most imports. Section 122 of the Trade Act of 1974 has been implemented as a 10% additional duty on most imports, in effect through approximately 24 July 2026. The Section 122 layer stacks on top of MFN and Section 301. This is the headline freshness fact in May 2026 and the immediate reason classification stakes are elevated: a misclassification on a Chinese-origin women's woven dress now compounds the MFN rate error with the +7.5% Section 301 layer and the +10% Section 122 layer on top. A mis-stated chapter that would have cost a few percent in 2018 now costs that few percent plus the layered margin.

The 21 February 2026 Supreme Court ruling on IEEPA tariffs. The Supreme Court on 21 February 2026 struck down tariffs imposed under the International Emergency Economic Powers Act, closing one of the paths the prior administration had used to add layered duties on imports. Section 122 was implemented as the stop-gap response after that ruling — it is the bridge tariff while the trade-policy posture is reorganised under statutes the Court has not invalidated.

The March 2026 Section 301 investigations. USTR opened new Section 301 investigations in March 2026 that may produce successor duties replacing Section 122 when it expires. This is live policy in motion, not a forecast. The successor duties may target specific country-of-origin patterns, specific HTSUS chapters, or specific product categories; until USTR publishes the final action, the shape is unsettled. What is settled is that the post-Section-122 environment is unlikely to revert to the pre-2024 baseline.

The implication for the classification work is direct. A line that should have been Chapter 62 (woven) but was filed as Chapter 61 (knit), or vice versa, now carries three compounding errors: the MFN rate is wrong, the Section 301 application may be wrong (because some apparel categories under Section 301 are heading-specific), and the Section 122 base on which that error sits has scaled the entire duty bill up by 10%. A CBP Form 28 (Request for Information) or Form 29 (Notice of Action) issued against the entry forces a post-summary correction (PSC) and a duty true-up — but the layered duty environment means the true-up is larger than the same correction would have been two years ago. The classification has to be right on filing because the layered duty stack has amplified the cost of every error against the entry.
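
The layered arithmetic can be sketched as follows. Several things here are assumptions for illustration: the three layers are treated as additive ad valorem rates on the same customs value, Section 122 is applied to every line (the text says "most imports"), the 16% and 32% MFN figures are the illustrative spread from the opening of this article rather than rates looked up for a specific subheading, and the names `DutyStack`, `build_stack`, and `duty_due` are hypothetical. Verify the live rates before relying on any figure.

```python
from dataclasses import dataclass
from datetime import date

SECTION_301_LIST4 = 0.075  # per the text above; Chinese-origin lines only
SECTION_122 = 0.10         # per the text above; assumed to apply to every line


@dataclass(frozen=True)
class DutyStack:
    mfn_rate: float
    section_301: float
    section_122: float
    as_of: date            # date-stamp for the assumed stack

    @property
    def total_rate(self):
        # Assumption: layers add on the same customs value.
        return self.mfn_rate + self.section_301 + self.section_122


def build_stack(mfn_rate, country, as_of):
    s301 = SECTION_301_LIST4 if country.upper() in ("CN", "CHINA") else 0.0
    return DutyStack(mfn_rate, s301, SECTION_122, as_of)


def duty_due(customs_value, stack):
    return round(customs_value * stack.total_rate, 2)


snapshot = date(2026, 5, 1)
correct = build_stack(0.16, "China", snapshot)
misfiled = build_stack(0.32, "China", snapshot)
print(duty_due(10_000.00, correct))    # 3350.0
print(duty_due(10_000.00, misfiled))   # 4950.0
```

Carrying `as_of` on the stack means any workpaper reused later for landed-cost analysis or a post-summary correction still shows which tariff snapshot the numbers assumed.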

The same compounding applies downstream. The per-SKU landed cost calculation for apparel shipped in pre-pack assortments depends on duty paid as a major component, and a mis-stated duty stack at entry pushes the landed cost wrong on every reorder, every margin analysis, and every allocation of carrying cost that flows from the per-SKU number. Getting the entry right at the extraction stage holds the financial-reporting line straight across every downstream consumer of that data.

A practical note on staying current. Section 122 is a statutory measure with a defined expiry; the March 2026 Section 301 investigations have their own statutory timeline. Verify current state at the CBP Trade Remedies page or via the Congress.gov R48549 consolidated tariff actions tracker before filing on any given entry. Date-stamp the assumed duty stack on the per-row workpaper if the file will be reused for landed-cost analysis or post-summary correction; the snapshot above is May 2026, and the policy will keep moving.

Apparel Edge Cases the Extraction Has to Handle

A handful of apparel-specific structures break the standard extract-then-classify pipeline if the per-row output does not flag them at the extraction stage. The fix in each case is to make the extraction capture the underlying signal so the classifier can apply the right rule downstream.

Sets and ensembles. HTSUS distinguishes between an "ensemble" — a set of matched garments designed and put up for sale together as a coordinated outfit, with specific composition rules in Chapter 61 and Chapter 62 — and a "set" of articles put up together for retail sale that does not meet the ensemble definition. Ensembles classify under specific ensemble headings. Sets classify under GRI 3(b) by reference to the article that imparts essential character. The supplier invoice rarely flags this distinction; a multi-piece SKU might be a true ensemble, a retail set, or just two unrelated items that happened to ship in one carton. The extraction has to capture the line-level grouping — one row per multi-piece SKU with each component as a sub-field, or a parent-child row structure on the per-row output — so the classifier can read the components, apply the ensemble or set test, and route the line correctly.

Ornamentation thresholds for women's woven garments. Chapter 62 carves out specific subheadings for women's wovens with ornamentation above HTSUS-defined thresholds. Lace, braid, embroidery, sequins, beading, and similar ornamentation drive subheading splits when present at threshold levels. The supplier description treats these as decorative shorthand — "lace-trimmed neckline", "embroidered chest panel", "sequin detail at hem" — but for classification they are field values that determine duty rate. The extraction has to capture ornamentation cues from the description as a separate flag on the per-row output, not bury them inside a free-text description column where the classifier may miss them on review.

Knit-to-shape garments under CBP ICP083. Knit-to-shape construction — where a garment or panel is knitted to its final shape rather than cut from a roll of knit fabric — has its own classification rules covered in CBP's ICP083. The supplier description rarely uses the phrase "knit-to-shape" directly. The cues are construction-specific: "fully fashioned" sleeves and bodies, "linked" or "looped" seams, intarsia patterning, jacquard panels knit to garment shape, formed openings without cut-and-sew finishing. The extraction has to capture these construction terms as flags so the classifier can apply ICP083's rules. A line tagged simply "knit pullover" may be a cut-and-sew jersey or a fully fashioned piece, and the two classify differently.

Mixed-construction garments where shell and lining drive different chapters. A woven shell with a knit lining, or a knit shell with a woven lining, is the case where 19 CFR 141.89's separate-breakdown requirement bites hardest. The shell construction usually drives the chapter (the visible outer surface determines the garment's classification character), but mixed-construction cases require the classifier to read both shell and lining and apply the GRI rules with care. If the extraction collapses shell and lining into a single composite percentage — a common failure mode for generic invoice-OCR tools — the chapter split signal is gone before the classifier ever sees the line. Capture shell and lining as separate, parallel field sets on the per-row output every time the supplier's documentation provides them.

Samples and prototypes. Samples of negligible commercial value, prototypes, and pre-production garments may qualify for specific HTSUS treatment under Chapter 98 — most commonly as samples solely for use in taking orders, marked and mutilated where required, or as US Goods Returned for goods previously exported and brought back. The treatment depends on accurate description and value-marking on the invoice, which the supplier may not provide consistently. The extraction has to flag sample lines distinctly on the per-row output (a "line type" column with values for production, sample, prototype, or returned goods is the cleanest pattern) so they are not commingled with commercial production lines on the entry summary. A sample line filed as a production line is a misdeclaration; a production line filed as a sample is duty fraud. Either error is worth catching at extraction.

Supplier-pre-classified HS entries that conflict with CBP guidance. Many overseas manufacturers print a 6-digit HS code on the commercial invoice as a courtesy. The supplier's HS is helpful as a starting point — it shows what the supplier thought the garment was — but it is never binding on the importer. Under CBP's framework the importer is the classifier of record. Capture the supplier-quoted HS as its own column on the per-row output, flagged as advisory. When the supplier's HS conflicts with what the underlying field set supports — say, the supplier quoted 6204.43 (women's woven dresses, of synthetic fibers) but the extracted fiber breakdown shows 80% cotton with no synthetic fiber above five percent — the underlying field set wins. The supplier's number does not survive contact with the actual fiber composition.

The supplier-quoted HS column is what makes pre-classification review tractable as a workflow rather than as a per-line analysis exercise. Sort the per-row output by supplier-quoted HS, batch-review by suspected chapter, and use the sort to surface the lines where the supplier's HS does not match the construction or fiber breakdown captured on the same row. Anomalies bubble to the top of the file; clean lines flow through to handoff. This is the practical advantage of capturing the supplier's HS as a flagged column rather than treating it as either authoritative (which it is not) or as noise (which discards a useful sort key).
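The sort-and-surface step can be sketched as below. It checks only the chapter-level signal (a supplier HS in Chapter 61 implies knit, in Chapter 62 woven) against the extracted shell construction; a fiber-level check would follow the same pattern. The row keys `supplier_hs` and `shell_construction` are assumed names, and the check is a review aid, not a classification.

```python
from typing import List, Optional

# Chapter 61 covers knit garments, Chapter 62 woven garments.
EXPECTED_CONSTRUCTION = {"61": "knit", "62": "woven"}


def supplier_hs_flag(row: dict) -> Optional[str]:
    """Return a reason string when the advisory HS needs review, else None."""
    hs = (row.get("supplier_hs") or "").replace(".", "").strip()
    if len(hs) < 2:
        return "no supplier HS quoted"
    expected = EXPECTED_CONSTRUCTION.get(hs[:2])
    if expected is None:
        return f"supplier HS chapter {hs[:2]} is outside Chapters 61-62"
    if row["shell_construction"] != expected:
        return f"supplier HS implies {expected}, shell is {row['shell_construction']}"
    return None


def review_queue(rows: List[dict]) -> List[dict]:
    """Annotate each row with a flag, then sort flagged rows to the top."""
    annotated = [dict(r, flag=supplier_hs_flag(r)) for r in rows]
    return sorted(
        annotated,
        key=lambda r: (r["flag"] is None, r.get("supplier_hs") or ""),
    )


rows = [
    {"line_no": 1, "supplier_hs": "6204.43", "shell_construction": "woven"},
    {"line_no": 2, "supplier_hs": "6110.20", "shell_construction": "woven"},
    {"line_no": 3, "supplier_hs": "6109.10", "shell_construction": "knit"},
]
queue = review_queue(rows)
```

In this sample, line 2 sorts first because its supplier HS sits in the knit chapter while the extracted shell is woven; the two consistent lines follow in HS order.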

Handing the Normalised File Off — ABI, Broker, or In-House Pre-Classification

The per-row file feeds three downstream targets, and each shapes the deliverable slightly differently on top of the same underlying data. Understanding the three shapes is what turns the extraction work into a clean handoff rather than a re-format exercise.

ABI / customs-software import for importer-of-record self-filing. Where the importer is the importer of record and self-files its own entries, the extracted per-row file feeds directly into ABI-certified entry-filing software as the source-of-truth for entry summary line items. The column shape mirrors the entry-summary structure: HTSUS number (filled in by the classifier), country of origin, quantity, unit value (FOB), invoice line reference, manufacturer ID, and the supporting fiber and construction fields retained for audit. CBP Form 7501 (October 2025 revision) is the entry summary the data ultimately populates, line by line; the per-row file is the workpaper that produces the 7501. Maintain the linkage from each row on the file back to the source invoice and the source line — invoice number, supplier, line item — because CBP audit responses run from the entry summary back through the workpaper to the underlying document, and a clean trail collapses what would otherwise be a multi-day reconstruction.

Broker handoff packet, where the broker files on the importer's behalf. This is the most common shape: the importer extracts and normalises, the broker classifies and files. The deliverable is typically an Excel workbook (one tab per supplier invoice, or one consolidated tab keyed by invoice number), the underlying invoice PDFs as supporting documents, and any manufacturer affidavits, such as an FTC RN attestation, a country-of-manufacture attestation, or USMCA short-supply declarations where applicable. Brokers vary in their preferred column order and naming conventions, but the underlying field set is consistent across them: per-line fiber breakdown with shell and lining separated, construction (knit or woven), country of manufacture, FOB unit value, quantity in pieces, granular garment description, and the supplier-quoted HS as an advisory column. Some brokers also ask for the supplier's full address and the manufacturer ID number on the same row. On the receiving side, the broker's invoice-processing automation typically loads the workbook into the broker's own classification and entry-filing system, which is why the column shape and naming need to match what that system expects.

In-house pre-classification review before broker submission. Where the importer's compliance team pre-classifies in-house and gives the broker a clean classified file, the deliverable carries the same field set as the broker handoff packet plus a proposed HTSUS column the broker reviews and confirms. Pre-classification is not a separate process from extraction; it is what the per-row output enables. Sort the per-row file by supplier-quoted HS or by garment type, walk the rows against CBP's apparel ICPs and the current HTSUS chapters, fill the proposed HTSUS column, and surface anomalies — lines where the supplier-quoted HS conflicts with the construction or fiber breakdown, or where the description is too thin to land a confident subheading — for human review. The broker then reviews the proposed classifications, confirms or adjusts, and files. The extracted file is what makes batch pre-classification tractable; without the underlying field set on every row, pre-classification would have to walk back to the source invoice for each line.

The shape across all three targets is fundamentally the same. The extraction produces one normalised per-row file with the classification-relevant fields cleanly separated; the handoff format is a presentation-layer choice on top of that file. ABI software wants one column ordering; a broker wants another; in-house pre-classification wants the same with a proposed HTSUS column added. None of them re-do the extraction work. The article's payoff is here: the supplier-PDF-to-CBP-entry-data conversion is the work, and the work happens once.
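The presentation-layer idea can be sketched as a set of column layouts applied to the same rows. The column names and orderings below are assumptions for illustration; a real ABI package or broker will publish its own expected layout.

```python
import csv
import io
from typing import Dict, List

# Hypothetical column orderings for the three handoff targets.
LAYOUTS: Dict[str, List[str]] = {
    "abi": ["htsus", "country_of_origin", "quantity", "fob_unit_value",
            "invoice_no", "line_no", "manufacturer_id"],
    "broker": ["invoice_no", "line_no", "description", "shell_fibers",
               "lining_fibers", "construction", "country_of_origin",
               "fob_unit_value", "quantity", "supplier_hs"],
    "in_house": ["invoice_no", "line_no", "description", "shell_fibers",
                 "lining_fibers", "construction", "country_of_origin",
                 "fob_unit_value", "quantity", "supplier_hs",
                 "proposed_htsus"],
}


def render(rows: List[dict], target: str) -> str:
    """Project the normalised rows into one target's column order as CSV."""
    columns = LAYOUTS[target]
    buf = io.StringIO()
    # extrasaction="ignore" drops fields the target does not want;
    # restval="" leaves not-yet-filled columns (e.g. htsus) blank.
    writer = csv.DictWriter(buf, fieldnames=columns,
                            extrasaction="ignore", restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = [{
    "invoice_no": "INV-001", "line_no": 3,
    "description": "Women's lined jacket",
    "shell_fibers": "65% polyester / 35% cotton",
    "lining_fibers": "100% polyester",
    "construction": "woven", "country_of_origin": "VN",
    "fob_unit_value": "12.40", "quantity": 600, "supplier_hs": "6202.40",
}]
broker_csv = render(rows, "broker")
abi_csv = render(rows, "abi")
```

The same `rows` list feeds every call to `render`; only the layout changes, which is the sense in which the extraction is done once and the handoff format is chosen afterwards.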

That single piece of upstream work is where Invoice Data Extraction fits. The product takes a batch of supplier PDF apparel commercial invoices (up to 6,000 files in a single job, with single PDFs up to 5,000 pages) and a natural-language extraction prompt describing the per-row schema, and produces a structured Excel, CSV, or JSON file with the classification-relevant fields cleanly separated. Extracting apparel commercial invoices into structured Excel or CSV is the upstream step; the manual classification work, the broker review, the entry filing itself, and any ABI software interaction stay with the importer's customs-ops staffer or broker. The product does not classify, does not file, and does not integrate with ABI software directly; what it produces is the normalised file the three handoff targets above consume. The same prompt runs across the whole batch, every row references back to the source file and page for verification, and the pre-classification workflow for US apparel imports starts from a clean structured file rather than from a folder of PDFs.
