A folder of Vietnamese supplier PDFs lands in your inbox on a Monday. Mixed running-shoe and casual-sneaker shipment, twelve styles across three factories, entry filing on Wednesday. Your generic OCR tool runs through the batch and gives you a clean spreadsheet of style numbers, FOB totals, and country of manufacture. None of that gets you to a Chapter 64 entry. The columns CBP needs — outer-sole constituent material with surface-area dominance, upper constituent material with surface-area dominance, gender, FOB unit value bracket, athletic indication, waterproof indication — are not on the supplier's commercial invoice in any form an OCR pass can lift. Your spreadsheet has totals; it does not have a classification.
Extracting Chapter 64 fields from supplier PDFs end-to-end means populating a structured target — the CF 5523 Interim Footwear Invoice column list — from the invoice batch plus targeted manufacturer affidavits where the supplier data is missing. Automated footwear commercial invoice extraction is the upstream step that produces the IFI column shape; the manual Chapter 64 work then begins from that normalized file rather than from PDFs. If you also handle apparel SKUs, the apparel-side counterpart on Chapter 61/62 fiber composition covers the parallel problem with a different column list and a different classification logic.
The CF 5523 Interim Footwear Invoice as the column list you are building
CBP form 5523, the Interim Footwear Invoice, is the structured data target the extraction pass produces. Every broker page tells you the form exists. None walk through how to populate it from what your Vietnamese, Chinese, Indonesian, Cambodian, Indian, Italian, Brazilian, or Mexican supplier actually puts on the commercial invoice. Treat CF 5523 not as a form to fill blindly but as the column list your per-style file is being built toward. The same file then feeds three downstream uses: in-house pre-classification, broker handoff, or ABI import.
The column set, in the order an extraction pass naturally produces them:
- Importer of record (your company name and IRS number)
- Manufacturer name and address
- Country of manufacture
- Style or SKU number
- Full footwear description
- Gender (men's, women's, unisex, boys', girls', infants')
- Outer-sole constituent material with surface-area dominance
- Upper constituent material with surface-area dominance
- Athletic or non-athletic indication
- Waterproof or water-resistant indication
- FOB unit value per pair
- FOB unit value tier (the bracket the value falls into)
- Supplier-quoted HTSUS subheading
- Manufacturer affidavit references for sole and upper composition
What the supplier almost always gives you on the commercial invoice itself: style/SKU, full description, country of manufacture, FOB unit value, and gender when it appears in the description. What the supplier rarely gives you on the invoice: outer-sole and upper material with surface-area dominance, athletic versus non-athletic indication beyond what the description implies, and waterproof or water-resistant indication where it is not in the product name. Those are the columns that drive the heading and the suffix. Those are the columns the extraction has to reach beyond the invoice to fill.
The bridge for the missing columns is a manufacturer affidavit. Most US footwear importers run a standing affidavit template with each factory: per style, the factory states the outer-sole material category and surface-area percentages, the upper material category and surface-area percentages, athletic versus non-athletic call with construction-feature justification, and waterproof or water-resistant statement with the testing standard if any. The affidavit is the authoritative source for the surface-area-dominance columns. The supplier commercial invoice is the source for everything else.
The extraction file should treat the affidavit gap as a workflow column rather than a hidden risk. For each per-style row, flag whether the surface-area-dominance columns came from the affidavit (resolved) or from the invoice description alone (provisional, needs affidavit). That flag is what tells your customs ops team which affidavit requests need to be raised before the broker packet is built rather than caught at entry, when the deadline pressure is highest and the cost of going back to the manufacturer is the highest.
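The resolved-versus-provisional flag can be sketched as a small per-style record. This is a minimal illustration, not a prescribed schema: the `StyleRow` class, its field names, and the affidavit reference strings are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StyleRow:
    """One per-style row; field names are illustrative assumptions."""
    style: str
    outer_sole_category: Optional[str] = None
    upper_category: Optional[str] = None
    sole_affidavit_ref: Optional[str] = None   # filled once the factory affidavit arrives
    upper_affidavit_ref: Optional[str] = None


def affidavit_status(row: StyleRow) -> str:
    """'resolved' only when both dominance columns are backed by an affidavit."""
    if row.sole_affidavit_ref and row.upper_affidavit_ref:
        return "resolved"
    return "provisional"


def affidavit_requests(rows: List[StyleRow]) -> List[str]:
    """Styles whose affidavits still need to be requested before the broker packet."""
    return [r.style for r in rows if affidavit_status(r) == "provisional"]
```

Running `affidavit_requests` over the batch gives customs ops the list of styles to chase before the broker packet is built.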
Standardization is the other half of what the extraction does. Suppliers write outer-sole material as "rubber sole", "TPR sole", "PU/TPU sole", "thermoplastic outer sole", "phylon midsole with rubber outsole", and similar variants. The extraction normalizes these into the Chapter 64 outer-sole material category — rubber, plastics, leather, composition leather, textile, wood, or other — while preserving the supplier wording in a separate descriptor column for affidavit cross-reference. The same standardization applies on the upper side: "leather upper", "synthetic upper", "PU upper", "knit textile upper", "engineered mesh upper" map to the categories leather, rubber/plastic, or textile. Inconsistent supplier wording becomes a single canonical value in the per-style row.
Suppliers often add a 10-digit HTSUS code to the commercial invoice. The extraction captures that as a separate column for reconciliation, not as the final classification. The supplier-quoted code is advisory: it reflects how the supplier classified the same style for export, often based on similar styles classified before, and is a useful starting hypothesis for the importer's own determination. It is not authoritative and should not flow through to the entry as a classified row without being reconciled against the determined columns the extraction produces.
Most of this workflow shape is shared with the generic commercial invoice fields a data extractor should capture on a non-footwear shipment. The Chapter 64 entry adds the IFI addenda (sole and upper surface-area dominance, athletic indication, waterproof indication, FOB tier) that the standard set does not include. From a prompting standpoint, the IFI columns become a list of named fields the extraction is asked to produce per row, with the affidavit-gap flag column included as a prompt-driven instruction for any row missing the surface-area-dominance fields. Once the prompt is written and saved, the same file shape comes out of every shipment without rebuilding the column list.
Extracting outer-sole material under Note 4(b)
The outer-sole column is where the IFI gets hard. Note 4(b) of HTSUS Chapter 64 sets the rule: when the outer sole is composed of more than one material, the constituent material is the one with the greatest external surface area in contact with the ground. Spikes, bars, nails, protectors, and similar attachments are excluded from the surface-area measurement. CBP's Footwear Informed Compliance Publication and the trade-body restatements use this language consistently. The restatements also list thin layers of textile material not embedded in the sole among the exclusions, per FDRA's Key Footwear Definitions for outer sole constituent material.
That rule does most of the work, but it leaves a single edge case that drives the largest share of disputed footwear rulings: the textile outer sole. A textile-fabric outer sole only counts as the outer sole for classification if it is a separately identifiable component before being applied to the upper, and it has the durability and strength normally required of an outer sole. Otherwise, the next layer below the textile is treated as the outer sole. A thin textile layer cemented to a rubber midsole is not an outer sole; the rubber underneath is. A separately constructed felt or jute sole on a slipper, designed and durable enough to be worn out, is. The line between the two is fact-specific and a frequent source of CBP rulings.
Translate the Note 4(b) rule into extraction columns:
- Outer-sole material category, normalized to one of rubber, plastics, leather, composition leather, textile, wood, or other
- Multi-material indicator, with named percentages where the supplier or affidavit provides them
- Textile-outer-sole flag, set to yes for any row where the outer sole reads as textile fabric, requiring the durability and separately-identifiable-component review before the row can be classified
The flag is the deliverable. The extraction should not attempt to resolve the textile-outer-sole durability test from a supplier description alone — the test turns on construction details (separately constructed sole, durability suitable for outdoor wear, cementing or stitching pattern) that the description rarely captures. Set the flag, hand the row to broker review or to a manufacturer affidavit on sole construction, and let the determination come back as a resolved input.
The supplier-side data sources for outer-sole material rank by authority. The commercial invoice description is the weakest source — usually partial, often "rubber sole" or "TPR/PU outsole" with no surface-area breakdown. The manufacturer's bill of materials is more complete and frequently arrives alongside the commercial invoice or in the same supplier portal. The supplier's product spec sheet, which sits with the buying team, often carries panel-by-panel material call-outs that the commercial invoice omits. The manufacturer affidavit on sole composition, signed by the factory, is the authoritative source and the one the broker review will rely on for the resolved row. The extraction step pulls everything available from the commercial invoice and any spec sheets in the same batch, normalizes the outer-sole material category, and leaves the affidavit reference as the column that gets filled in last.
Standardization on supplier wording is direct. "Rubber", "TPR", "thermoplastic rubber", and "vulcanized rubber" all map to rubber. "PU", "TPU", "EVA", and "phylon" map to plastics. Leather and composition leather are kept distinct rather than collapsed: both satisfy the outer-sole requirement of headings 6403 and 6404, but they can route to different subheadings within 6403 once the heading is selected. Where the supplier names a multi-material outer sole, such as "rubber outsole with EVA midsole, 70/30", the extraction captures the percentages directly into the multi-material indicator column, since those numbers feed the surface-area-dominance call.
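A minimal sketch of that normalization, assuming supplier wording arrives as a short single-material phrase. The `OUTER_SOLE_TERMS` table holds only the terms named above, and the function name is hypothetical. Anything it cannot match, including multi-material wording such as "PU/TPU", comes back with no category so it can be reviewed rather than guessed.

```python
from typing import Dict, Optional

# Term table limited to the wording discussed in the text; extend per supplier.
OUTER_SOLE_TERMS = {
    "rubber": "rubber",
    "tpr": "rubber",
    "thermoplastic rubber": "rubber",
    "vulcanized rubber": "rubber",
    "pu": "plastics",
    "tpu": "plastics",
    "eva": "plastics",
    "phylon": "plastics",
    "leather": "leather",
    "composition leather": "composition leather",
}


def normalize_outer_sole(wording: str) -> Dict[str, Optional[str]]:
    """Map supplier wording to a category while keeping the original text."""
    term = wording.strip().lower()
    # Strip a trailing sole noun; the longest suffix is checked first.
    for suffix in (" outer sole", " outsole", " sole"):
        if term.endswith(suffix):
            term = term[: -len(suffix)]
            break
    return {"supplier_wording": wording, "category": OUTER_SOLE_TERMS.get(term)}
```

Keeping `supplier_wording` next to `category` preserves the descriptor column used for affidavit cross-reference.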
The carve-outs at extraction time matter for the surface-area baseline. Spikes, cleats, bars, nails, protectors, and similar attachments are excluded from the surface-area measurement under Note 4(b). The extraction should capture these as descriptors in a separate column — useful for the broker review and for athletic-flag reasoning — but should not factor them into the constituent-material category. A studded soccer cleat with a rubber sole and TPU studs has rubber as the constituent material of the outer sole, not TPU, because the studs are excluded.
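The exclusion can be sketched as a dominance calculation that skips attachments before comparing surface areas. The component tuples, the `EXCLUDED_ATTACHMENTS` names, and the percentages are illustrative assumptions. In practice the ground-contact figures would come from the manufacturer affidavit.

```python
from typing import Dict, List, Optional, Tuple

# Attachment names are assumptions drawn from the carve-out list in the text.
EXCLUDED_ATTACHMENTS = {"spikes", "cleats", "studs", "bars", "nails", "protectors"}

# (component name, material category, ground-contact surface-area percentage)
Component = Tuple[str, str, float]


def outer_sole_constituent(components: List[Component]) -> Optional[str]:
    """Return the material with the greatest ground-contact area, ignoring attachments."""
    totals: Dict[str, float] = {}
    for name, material, pct in components:
        if name.lower() in EXCLUDED_ATTACHMENTS:
            continue  # kept as a descriptor elsewhere, never counted here
        totals[material] = totals.get(material, 0.0) + pct
    if not totals:
        return None
    # On an exact tie, max() returns the material seen first; treat ties as review cases.
    return max(totals, key=totals.get)
```

For the soccer cleat above, `outer_sole_constituent([("outsole", "rubber", 100.0), ("studs", "plastics", 15.0)])` returns rubber because the studs never enter the totals.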
Extracting upper material and the reinforcement-versus-structural-panel line
The upper rule mirrors the outer-sole rule in shape but is harder in practice. The constituent material of the upper is the material with the greatest external surface area, ignoring accessories and reinforcements. The carve-out is the source of most upper-side classification disputes — what counts as a reinforcement that gets excluded from surface-area measurement, and what counts as a structural panel that is itself part of the upper, is fact-specific and unstable across suppliers.
The working definition CBP rulings have settled into:
- A reinforcement is added to the upper for support over an existing material. A toe cap stitched over a leather toe, an overlay tape at the heel, a rand stitched over an existing textile body. The reinforcement sits over another material that is itself the upper at that location. It is excluded from surface-area measurement.
- A structural panel is itself a constituent piece of the upper. The leather quarter, the textile vamp, the synthetic forefoot panel — each is the upper at the location it occupies, not an addition over another material. Structural panels are included in surface-area measurement.
The line is fact-specific because the same physical piece — a leather overlay across the eyestay — can be a reinforcement on one shoe and a structural panel on another, depending on whether there is a continuous material underneath it. CBP rulings on this line are where the reinforcement-versus-panel disputes get litigated, and they are why the upper column is rarely cleanly resolvable from a supplier description alone.
Translate the rule into extraction columns:
- Upper material category, normalized to one of leather, textile, rubber/plastic, or other
- Multi-material indicator with named percentages where available
- Named accessories and reinforcements visible from the supplier description or the product spec sheet (eyelets, toe caps, heel overlays, lace tabs, decorative stitching, padding inserts)
- Reinforcement-versus-panel flag, set for any row where the supplier describes overlays, panels, or material combinations that cross the line
The flag does the same job as the textile-outer-sole flag in the prior section: it surfaces ambiguous calls for broker review or manufacturer affidavit before the row is classified, rather than asking the extraction to resolve a fact-specific CBP-ruling determination from supplier text.
Supplier-side data sources for the upper material follow the same ranking. The commercial invoice description usually names the dominant material — "leather upper" or "textile upper" — but rarely the percentage breakdown, the panel-by-panel material call-outs, or the specific reinforcement components. The manufacturer's bill of materials and the product spec sheet often include panel-by-panel descriptions, particularly for athletic and lifestyle footwear where the upper is engineered from multiple textile and synthetic materials. The manufacturer affidavit on upper composition, signed by the factory and naming each panel with its material and surface-area percentage, is the authoritative source.
The apportioning problem on multi-material uppers is where the heading shifts. A combination upper that is 55% leather and 45% textile sits in 6403; the same shoe at 45% leather and 55% textile sits in 6404. The duty-rate spread between the two headings is often material on a meaningful shipment volume, which is why the named-percentage column is the column the broker review most often asks for. The extraction should capture the named percentages where the supplier provides them on the invoice or in the spec sheet, and should flag the row for affidavit-driven apportioning where the description is qualitative ("leather/textile combination upper") without numbers.
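One way to capture named percentages and flag qualitative descriptions is sketched below. The regular expression and the function name are assumptions, and the pattern recognizes only the three upper categories used in this article. Supplier percentages may be stated by weight rather than surface area, so the parsed shares are provisional until an affidavit confirms them.

```python
import re
from typing import Dict

# Matches e.g. "55% leather"; deliberately does not match "60% synthetic leather".
PCT = re.compile(r"(\d+(?:\.\d+)?)\s*%\s*(leather|textile|rubber/plastic)", re.IGNORECASE)


def apportion_upper(description: str) -> Dict[str, object]:
    """Pull named upper percentages from supplier text, or flag the row for an affidavit."""
    shares = {material.lower(): float(pct) for pct, material in PCT.findall(description)}
    if not shares:
        # Qualitative wording only: apportioning has to come from the factory.
        return {"shares": {}, "dominant": None, "needs_apportioning_affidavit": True}
    dominant = max(shares, key=shares.get)
    return {"shares": shares, "dominant": dominant, "needs_apportioning_affidavit": False}
```

A description reading "55% leather / 45% textile upper" yields leather as the provisional dominant material, while "leather/textile combination upper" yields no shares and sets the flag.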
Standardization on the upper side has one critical distinction the extraction must preserve: leather versus PU. Real leather upper classifies into 6403 (and into the leather-upper subheadings below it). PU upper, often described by suppliers as "synthetic leather", "PU leather", or "vegan leather", classifies into 6402 or 6404 depending on the upper's other materials and the sole. The two read similarly in supplier descriptions and they are sometimes used interchangeably in supplier wording, but they classify into different headings. The extraction should keep "leather" and "PU/synthetic leather" as separate normalized values rather than collapsing them, and should flag any row where the supplier wording is ambiguous between the two.
Textile types — woven, knit, non-woven — do not change the heading at the 6-digit level. All three are textile uppers under 6404. They can affect statistical reporting and, for some footwear types, the 10-digit suffix selection within 6404. The extraction should preserve the textile type as a descriptor column rather than collapsing all textile uppers into a single value, since the broker may need it later in the workflow.
The 6-digit heading ladder from 6401 to 6405
With the outer-sole and upper material columns populated, the 6-digit heading is determined. Chapter 64's heading ladder runs:
- 6401 — waterproof footwear with outer sole and upper of rubber or plastics, where the uppers are neither fixed to the sole nor assembled by stitching, riveting, nailing, screwing, plugging, or similar processes. In practice, this is the one-piece molded or vulcanized waterproof boot heading — galoshes, rubber rain boots, classic molded duck boots.
- 6402 — other footwear with outer sole and upper of rubber or plastics. Most rubber-soled synthetic-upper sneakers, casual sandals, and non-waterproof rubber/plastic footwear land here.
- 6403 — footwear with outer sole of rubber, plastics, leather, or composition leather and upper of leather. Dress shoes, leather boots, and leather-upper athletic shoes sit in this heading.
- 6404 — footwear with outer sole of rubber, plastics, leather, or composition leather and upper of textile materials. Most athletic shoes with engineered mesh or knit textile uppers, canvas sneakers, and textile-upper casual shoes sit here.
- 6405 — other footwear. House slippers, footwear with outer soles of wood or cork, and miscellaneous constructions that do not fit 6401 to 6404.
The decision logic the extracted columns drive is direct. If both outer sole and upper are rubber/plastic and the shoe is waterproof under the heading definition, the row is 6401. If both are rubber/plastic and not waterproof, 6402. If the upper is leather (regardless of whether the outer sole is rubber, plastic, leather, or composition leather), 6403. If the upper is textile (with the same outer-sole flexibility), 6404. Anything that does not fit one of those four — wood or cork outer soles, house slippers, unusual constructions — falls into 6405. The waterproof flag is what splits 6401 from 6402, which is precisely why the IFI carries waterproof as its own column rather than burying it in the description.
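The decision logic in the paragraph above can be written as a short function. This is a sketch of the article's simplified ladder only: the category strings and the `waterproof_6401` parameter name are assumptions, and the full heading texts and chapter notes still govern the actual determination.

```python
RUBBER_PLASTIC_SOLES = {"rubber", "plastics"}
STANDARD_SOLES = {"rubber", "plastics", "leather", "composition leather"}


def heading(outer_sole: str, upper: str, waterproof_6401: bool = False) -> str:
    """Pick the 4-digit heading from the normalized material columns.

    waterproof_6401 should be True only when the shoe meets the 6401 heading
    definition, not merely because the supplier calls it waterproof.
    """
    if outer_sole in RUBBER_PLASTIC_SOLES and upper == "rubber/plastic":
        return "6401" if waterproof_6401 else "6402"
    if outer_sole in STANDARD_SOLES and upper == "leather":
        return "6403"
    if outer_sole in STANDARD_SOLES and upper == "textile":
        return "6404"
    return "6405"  # wood, cork, textile outer soles, and other constructions
```

For example, `heading("rubber", "textile")` gives 6404, and `heading("wood", "leather")` falls through to 6405.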
The heading ladder explains why the IFI is structured the way it is. Two material columns and one waterproof flag pick the heading. None of the rest of the IFI columns — gender, athletic flag, FOB tier — affects the heading; they all affect the 10-digit suffix below it.
Foxing-band and welt language drive subheading selection rather than heading selection — a foxing-like band on a 6404 textile-upper-rubber-sole shoe routes to different subheading rates, and a Goodyear-welted boot in 6403 sits in different subheadings than a cement-construction leather boot. Preserve the supplier's foxing-band and welt language as a descriptor column and let the broker review apply the rule at the subheading level.
The supplier-quoted HS code's role at the heading level is reconciliation, not adoption. Suppose the supplier quotes 6404.11.20 on the commercial invoice for a shoe whose extracted upper material reads as leather. The 6404 heading requires a textile upper. The discrepancy is a flag the extraction surfaces immediately: either the supplier-quoted code is wrong, or the upper material extraction is wrong (perhaps the supplier described "leather upper" colloquially when the actual material is PU/synthetic leather). The broker review resolves the discrepancy from the manufacturer affidavit and the product spec, and the importer's determination — not the supplier's — flows through to the entry. The supplier-quoted code is a starting hypothesis; the extracted columns are what the importer is actually responsible for.
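The reconciliation check is a comparison of the first four digits of the supplier-quoted code against the heading the extracted columns indicate. A minimal sketch, with a hypothetical function name and status strings:

```python
from typing import Optional


def reconcile_supplier_code(supplier_code: Optional[str], determined_heading: str) -> str:
    """Compare the supplier-quoted code with the importer's determined heading."""
    digits = "".join(ch for ch in (supplier_code or "") if ch.isdigit())
    if len(digits) < 4:
        return "no supplier code"
    # Only the heading is compared here; subheading reconciliation is a separate step.
    return "match" if digits[:4] == determined_heading else "diverges"
```

In the example above, `reconcile_supplier_code("6404.11.20", "6403")` returns "diverges", which is the signal to send the row to broker review rather than adopt either value.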
House slippers and footwear with outer soles of wood or cork are the most common 6405 rows in a US footwear importer's mix. Indoor-only constructions with felt or jute outer soles fall into 6405 even when the upper is leather or textile, because 6405 is the catch-all for outer-sole categories that fall outside the rubber/plastic/leather/composition-leather set the other headings require. The extraction should carry a "house slipper" or "indoor-only footwear" flag drawn from the supplier description, since the heading determination is driven as much by the supplier's stated use as by the material columns themselves at the 6405 boundary.
Athletic, gender, FOB tier, and waterproof as 10-digit suffix columns
Once the heading is determined, the 10-digit suffix selection is what distinguishes a duty rate of 8.5% from one of 37.5%. Four columns drive the suffix layer, and all four read as separate extraction columns rather than text embedded inside HTS lookup-tool subheading descriptions: athletic versus non-athletic, gender, FOB unit value bracket, and waterproof versus water-resistant.
Athletic versus non-athletic. The athletic determination is the most disputed of the four because it splits the duty rate inside 6404 (and to a lesser degree inside 6402 and 6403) by a meaningful spread. Within 6404, an athletic shoe with a textile upper and rubber outer sole sits in athletic-suffix subheadings; a similar-looking lifestyle sneaker without the construction features sits in general-purpose-suffix subheadings, often at a different rate. CBP rulings draw the line on construction features rather than on the supplier's product name. A cleated outsole, ankle support, lace-eyelet count, sport-specific design (running, court, training, trail) — these are what CBP looks at. "Athletic-style" or "athleisure" in the supplier description is not enough on its own.
The extraction should pull two separate columns:
- The supplier's stated use, captured as written ("running shoe", "casual sneaker", "court shoe", "trail runner", "lifestyle sneaker", "athleisure")
- An athletic flag set to yes only where construction features in the supplier description or the spec sheet support the call (cleated outsole, named sport, performance midsole)
For the ambiguous middle — a "court shoe" with a cup-sole construction, or a "running-style" shoe with a flat outsole — the broker-review flag column carries the row through to a manual call rather than committing the extraction to athletic versus non-athletic on insufficient signal.
Gender. Gender drives suffix selection across most 6402, 6403, and 6404 subheadings. The extraction pulls "men's", "women's", "unisex", "boys'", "girls'", or "infants'" from the supplier description, the size run, and the style number where the gender prefix is encoded. Suppliers are inconsistent here. A description of "EUR 36-46 mixed sizes" usually means a unisex assortment running from women's to men's, but the gender call requires interpretation against the size run rather than the description alone. Infants' footwear often arrives as a small assortment within a larger style — five pairs of infants' size 18 inside a 200-pair shipment of toddlers' and children's sizes — and the extraction should carry the size-run column alongside the gender column so the broker review can verify the split.
FOB unit value bracket. The FOB unit value column drives suffix bracket selection at thresholds that have historically included $3, $6.50, $12 per pair, and similar boundaries within several subheadings. The extraction captures the FOB unit value per pair from the invoice. The reclassification risk is the bracket boundary: when a supplier discount, a quantity-discount tier, or a year-end rebate on the invoice pushes the unit value below a bracket boundary, the suffix changes and the duty rate changes with it. Pulling both the gross FOB unit value and any per-pair discount applied — separately — into the per-style row lets the broker review confirm which bracket the entered value falls into and whether a downstream supplier-rebate adjustment will require a post-summary correction.
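A sketch of the bracket lookup, keeping gross value and discount separate as described. The thresholds are the historical examples named above and must be checked against the current HTSUS subheading text. The inclusive upper bounds ("not over") and the $0.10 boundary margin are assumptions for illustration.

```python
from bisect import bisect_left
from decimal import Decimal

# Assumed thresholds and labels; verify against the subheading being used.
THRESHOLDS = [Decimal("3.00"), Decimal("6.50"), Decimal("12.00")]
LABELS = ["not over $3", "over $3, not over $6.50", "over $6.50, not over $12", "over $12"]
BOUNDARY_MARGIN = Decimal("0.10")


def fob_tier(gross_unit_value: str, per_pair_discount: str = "0") -> dict:
    """Return the net FOB value, its bracket, and a near-boundary flag."""
    gross = Decimal(gross_unit_value)
    discount = Decimal(per_pair_discount)
    net = gross - discount
    return {
        "gross": gross,
        "discount": discount,
        "net": net,
        # bisect_left places a value equal to a threshold in the lower bracket
        "tier": LABELS[bisect_left(THRESHOLDS, net)],
        "near_boundary": any(abs(net - t) <= BOUNDARY_MARGIN for t in THRESHOLDS),
    }
```

A style invoiced at $12.40 with a $0.50 per-pair discount nets to $11.90, which drops it from the top bracket into the one below and sets the near-boundary flag.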
The Incoterms framing on the supplier invoice affects this column directly. The IFI needs the FOB value, which is the unit value at the named loading port before freight and insurance. When a supplier invoice is quoted CIF (cost, insurance, freight) or DDP (delivered duty paid), the invoice unit value includes freight, insurance, and, under DDP, duty, none of which belong in the FOB figure CBP wants. Backing out those layers requires the freight and insurance lines from the invoice or a supplier breakdown, and the Incoterm itself stated on the invoice header. The extraction should capture the Incoterm explicitly, the invoice-stated unit value, and any visible freight or insurance lines so the broker review can derive the FOB unit value cleanly. If you process under multiple Incoterms across suppliers, the guide on how Incoterms wording on the commercial invoice affects FOB unit value covers the derivation problem in more detail.
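The back-out arithmetic can be sketched as follows. This is a simplification under stated assumptions: freight, insurance, and duty are shipment-level totals spread evenly across pairs, only FOB, CIF, and DDP are handled, and any other charges a real DDP quote may bundle are ignored. The function and parameter names are hypothetical.

```python
from decimal import Decimal
from typing import Optional

CENT = Decimal("0.01")


def fob_unit_value(incoterm: str, invoice_unit_value: str, pairs: int,
                   freight: str = "0", insurance: str = "0",
                   duty: str = "0") -> Optional[Decimal]:
    """Derive a per-pair FOB value from the invoice-stated unit value."""
    unit = Decimal(invoice_unit_value)
    term = incoterm.strip().upper()
    if term == "FOB":
        return unit.quantize(CENT)
    if term == "CIF":
        extras = Decimal(freight) + Decimal(insurance)
    elif term == "DDP":
        extras = Decimal(freight) + Decimal(insurance) + Decimal(duty)
    else:
        return None  # unhandled Incoterm: leave the derivation to broker review
    return (unit - extras / Decimal(pairs)).quantize(CENT)
```

For a 1,000-pair CIF line at $14.00 per pair with $1,500 freight and $100 insurance, the derived FOB unit value is $12.40.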
Waterproof and water-resistant. Waterproof is the column that splits 6401 from 6402 at the heading level, and at the 10-digit level it also distinguishes water-resistant suffixes within several headings. The supplier description carries the signal: "waterproof", "water-resistant", "GORE-TEX-lined", "membrane-lined", "seam-sealed", "treated leather". The extraction normalizes these into a three-value flag — waterproof, water-resistant, or neither — and preserves the supplier wording in a separate descriptor column. Waterproof determinations turn on construction and testing rather than supplier marketing language, so where supplier wording is partial or marketing-led, the row gets the broker-review flag rather than an auto-yes call.
Sizing for adults versus youth versus infants enters as a fifth dimension in some subheadings, derived from the size run rather than from a separate column. The extraction's size-run column feeds this: a per-style row with sizes 1Y to 6Y reads as youth, one with sizes 36 to 46 reads as adult, and one with sizes 0 to 5 (infants' US) reads as infants. The broker review then applies the size cut-off to the suffix selection.
Suffix accuracy is more expensive when wrong in 2026 than it was a few years ago because Chapter 64 MFN rates (roughly 0–37.5%, most subheadings between 8.5% and 20%) now stack with Section 301 List 4A on Chinese-origin shoes and a Section 122 stop-gap layer on most imports. Verify the current layers against the USITC Harmonized Tariff Schedule and USTR's Section 301 page before filing. A wrong athletic flag, gender, FOB tier, or origin call on a 10,000-pair container compounds across every layer above the MFN rate, and the resulting per-SKU error flows downstream into landed-cost reporting, the same problem that per-SKU landed cost from pre-pack apparel assortments raises on the apparel side.
The three handoff targets and the broker-review flag column
The same per-style extraction file feeds three different downstream uses, and the IFI column shape is designed to serve all three without rebuild.
Broker handoff Excel. The customs broker receives the per-style extraction as an Excel file alongside the original supplier invoice PDFs and any manufacturer affidavits, with one row per style/SKU and the IFI columns populated. The broker-review flag column is what drives which rows the broker classifies and which the importer has already classified internally. Resolved rows pass through; flagged rows get the broker's attention. This is the same handoff pattern that broker-side commercial invoice processing for customs entry filing follows in generic form, with the footwear-specific column set carrying the IFI addenda the broker needs for Chapter 64.
ABI or customs-software import file. For importers of record running self-filing through ABI or a customs-software platform, the same per-style file feeds the line-level entry data. The 10-digit HTSUS code per style is the key field; the supporting columns — sole material category, upper material category, FOB tier, athletic flag, waterproof flag, gender, country of manufacture — feed the supporting documentation submission and the post-entry audit trail. CBP audit requests on Chapter 64 entries routinely ask for the surface-area-dominance basis behind the heading determination; carrying those columns alongside the HTSUS code in the entry file means the audit response is already assembled.
In-house pre-classification or PSC workpaper. For importers running pre-classification reviews ahead of entry, or working a post-summary correction after entry has cleared, the same file format serves as the workpaper. Each row carries the IFI columns plus a classification-rationale field drawn from the affidavit references and any broker-review notes. A PSC workpaper that reuses the same extraction shape as the original entry file makes the correction traceable line-by-line to the supporting evidence.
The broker-review flag column is what makes the same file work for all three handoffs. The flag carries the edge cases the extraction should hand off rather than auto-resolve:
- Footwear with uppers of multiple materials where panel-area apportioning is needed and the supplier did not provide percentages
- Textile outer soles requiring the durability and separately-identifiable-component test
- Reinforcement-versus-structural-panel ambiguity on the upper
- House slippers versus outdoor footwear where the supplier description is ambiguous between indoor and outdoor use
- Waterproof boot versus water-resistant boot where the supplier wording is partial or marketing-led
- Ski boots and other specialized footwear with construction-specific treatment (ski-boot constructions, motorcycle boots, and skating boots with ice or roller skates attached, which Note 1 to Chapter 64 excludes from the chapter and sends to Chapter 95)
- FOB unit value within $0.10 of a bracket boundary, where a supplier discount or a quantity-tier rebate could push the row into the next bracket
- Supplier-quoted HS code that diverges from what the extracted columns indicate
The flag is a column rather than a row exclusion. A flagged row still produces a structured entry — the supplier-side data is extracted, the provisional material categories populated, the supplier-quoted HS captured — and the flag tells the downstream consumer (broker, ABI filer, pre-classification reviewer) where the determination needs human attention before the entry is committed. With the same saved prompt running against every shipment, the team's effort goes into the rows that need attention rather than into rebuilding the column list each time.
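The flag-as-column idea can be sketched as a function that collects review reasons for a row without dropping it. The dictionary keys and reason strings are hypothetical, and only three of the edge cases listed above are shown.

```python
from typing import Dict, List


def review_reasons(row: Dict[str, object]) -> List[str]:
    """Collect the reasons a row needs human attention; an empty list means resolved."""
    reasons: List[str] = []
    if row.get("upper_multi_material") and not row.get("upper_percentages"):
        reasons.append("upper apportioning needed")
    if row.get("outer_sole_category") == "textile":
        reasons.append("textile outer sole test")
    quoted = "".join(ch for ch in str(row.get("supplier_hts") or "") if ch.isdigit())
    if quoted and row.get("heading") and quoted[:4] != row["heading"]:
        reasons.append("supplier HS code diverges")
    return reasons


def with_review_flag(row: Dict[str, object]) -> Dict[str, object]:
    """Return the row with a broker_review column added; the row itself is always kept."""
    return {**row, "broker_review": "; ".join(review_reasons(row))}
```

A row with no reasons gets an empty `broker_review` value and passes through. A flagged row still carries every extracted column alongside the reasons.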