How to Extract Australian Payslips to Excel

Convert Australian payslip PDFs into a clean Excel spreadsheet: map the Fair Work fields, super and PAYG, then extract YTD for payroll migration or back-pay.

Topics: Financial Documents, Payroll, Australia, PAYG, superannuation guarantee, payroll migration

To extract Australian payslips to Excel, upload the payslip PDFs and prompt the tool to return one row per payslip, or one row per employee per pay period, with columns for gross pay, PAYG, superannuation (SG), net pay, and any allowances, loadings and penalty rates. Keep a source-file and page reference on every row so each figure can be traced back to the payslip it came from. That structure is what turns a folder of PDFs into a spreadsheet you can reconcile, migrate, or audit against, rather than a flat dump of numbers.

The reason a generic "name, gross, tax, net" extraction falls short here is that an Australian payslip carries a specific set of fields, and the Fair Work Act requires most of them to be shown. A clean extraction has to respect that structure. The mandatory fields are the employer's name and ABN, the employee's name, the date of payment and the pay period it covers, gross and net pay, and every deduction with both its amount and what it is for. For anyone paid by the hour, the payslip must also show the ordinary hourly rate, the number of hours worked at that rate, and the amount paid. On top of that sit the components that drive most payroll complexity: loadings, allowances, bonuses, incentive payments and penalty rates, each shown as a rate and an amount, and superannuation contributions with the fund name or number and the amount paid. If you need a field-by-field walk-through before you extract, our guide on how to read an Australian payslip line by line covers each one.

The single point country-agnostic OCR tools get wrong is superannuation. Super is paid by the employer on top of gross pay; it is not deducted from the employee's pay. An extraction that files the SG amount in a "deductions" column corrupts every total built on top of it, which is why super needs its own column from the start.
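A row-level check is one way to catch a misfiled super amount early. The sketch below assumes hypothetical column names (`gross`, `payg`, `other_deductions`, `super_sg`, `net`) and made-up figures, and it assumes a payslip with no salary-sacrifice arrangement; it is an illustration of the idea, not a fixed schema.

```python
from decimal import Decimal

def net_reconciles(row: dict) -> bool:
    """Return True when net == gross - PAYG - other deductions.

    Super (SG) is deliberately left out of the subtraction: it is an
    employer contribution paid on top of gross, not a deduction.
    """
    expected_net = row["gross"] - row["payg"] - row["other_deductions"]
    return expected_net == row["net"]

# Hypothetical extracted row; all figures are invented for illustration.
row = {
    "gross": Decimal("2000.00"),
    "payg": Decimal("372.00"),
    "other_deductions": Decimal("0.00"),
    "super_sg": Decimal("240.00"),
    "net": Decimal("1628.00"),
}

print(net_reconciles(row))  # True

# If the SG amount had been filed under deductions, the check fails.
misfiled = dict(row, other_deductions=row["super_sg"])
print(net_reconciles(misfiled))  # False
```

A failing row is not proof of a misfiled super amount, but it is a cheap signal that the deductions column deserves a second look.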

A few figures are worth pinning down because they change what a correct row looks like. The superannuation guarantee rate is 12% of ordinary time earnings (OTE) from 1 July 2025. Payday super, the requirement that an employee's SG reaches their fund within seven business days of payday, starts on 1 July 2026, so payslips dated on or after that point should show super aligned with each pay event rather than accrued for a later quarterly run. PAYG withholding and any HECS-HELP repayments appear as their own amounts, separate from gross and net.
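The 12% figure gives a simple sanity check on the super column. The sketch below assumes OTE has been extracted as its own value (gross can include amounts that are not OTE) and only applies the rate to payment dates on or after 1 July 2025; the function name and the rounding to the nearest cent are assumptions for illustration.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

SG_RATE = Decimal("0.12")         # rate from 1 July 2025
SG_RATE_START = date(2025, 7, 1)

def expected_sg(ote: Decimal, payment_date: date) -> Optional[Decimal]:
    """Return 12% of OTE rounded to cents, or None for earlier dates.

    Dates before 1 July 2025 fall under a different rate, which this
    sketch does not model, so it declines to answer rather than guess.
    """
    if payment_date < SG_RATE_START:
        return None
    return (ote * SG_RATE).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Hypothetical figures for illustration only.
print(expected_sg(Decimal("1850.00"), date(2025, 8, 14)))  # 222.00
print(expected_sg(Decimal("1850.00"), date(2025, 6, 12)))  # None
```

Comparing this value with the extracted Super (SG) amount flags rows worth checking by hand; a difference may simply mean the OTE base differs from the figure you assumed.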

The gap between a generic field list and the real Australian structure is where the downstream work breaks: a tool that pulls only name, gross, tax and net drops the award allowances, loadings and penalty rates you need intact when you later reconstruct year-to-date balances for a payroll migration or assemble pay data for an underpayment review.


Setting Up a Clean Extraction: Prompt and Column Structure

The practical workflow does not involve building a template or mapping fields by hand. You describe the data you want and the shape you want it in, then download a structured Excel, CSV, or JSON file. The most reliable way to frame the request is to tell the tool what the data is for, the way you would brief a new payroll assistant: something along the lines of "I'm preparing payroll data from a batch of Australian payslips." A goal-oriented prompt like that gives the extraction enough context to handle the edge cases that a bare field list would miss, such as a penalty rate shown on a separate line or an allowance that only appears in some periods.

The first decision is the row grain. For routine bulk work, one row per payslip is the natural unit. When you need to compare periods, for instance to track how an allowance moved month to month, use one row per employee per pay period instead. That per-period grain is the one the migration and remediation workflows below both rely on, so it is worth choosing deliberately rather than by default.

For the columns, name them to match the Australian structure rather than a generic payroll layout: Gross, PAYG, Super (SG), Net, and then a separate column for each named allowance, loading and penalty rate you need to keep distinct. Super stays in its own column and is never folded into a deduction, for the reason set out above. You can ask for the columns in a specific order and give them the exact headers your accounting system expects on import. This is where the tool earns its place: you can extract Australian payslips into a structured spreadsheet with the AU-correct columns already named, instead of reshaping a generic export afterwards.
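If you run the same extraction for several clients, it can help to keep the column list in one place and generate the prompt from it. The sketch below is only an illustration: the header names, the example allowance, loading and penalty columns, and the prompt wording are all assumptions, not syntax the tool requires.

```python
# Core columns named to match the Australian payslip structure.
BASE_COLUMNS = [
    "Employee Name", "Pay Period Start", "Pay Period End", "Payment Date",
    "Gross", "PAYG", "Super (SG)", "Net",
]
# Traceability columns kept on every row.
TRACE_COLUMNS = ["Source File", "Source Page"]

def build_prompt(components):
    """Compose a goal-oriented prompt with an explicit column order."""
    headers = BASE_COLUMNS + list(components) + TRACE_COLUMNS
    return (
        "I'm preparing payroll data from a batch of Australian payslips. "
        "Return one row per employee per pay period with these columns, "
        "in this order: " + ", ".join(headers) + ". "
        "Keep Super (SG) in its own column; do not treat it as a deduction."
    )

# Hypothetical named components for one client's award.
prompt = build_prompt(["Tool Allowance", "Casual Loading", "Saturday Penalty"])
print(prompt)
```

Swapping the component list per client keeps the core columns identical across batches, which makes the resulting spreadsheets easier to stack and compare.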

Two output details make the result usable rather than just neat. First, native Excel typing: amounts come back as numbers and dates as dates, so the spreadsheet works in formulas, sorts and pivot tables straight away without re-typing a single cell. Second, every row carries a reference to the source file and page it was drawn from. For Australian payroll work that traceability is not a nicety. When a reconstructed figure has to stand up to an ATO-facing reconciliation or an underpayment review, being able to point from any cell back to the exact payslip is what makes the spreadsheet defensible.

Real payslip batches are rarely tidy. They arrive in mixed layouts from different payroll systems, some as native PDFs and some as scanned or photographed payslips that a plain OCR converter would struggle to read cleanly, and sometimes several payslips are concatenated into one PDF. The same prompt processes all of them, so the output stays consistent across the batch regardless of the source format, and the work scales the same whether you are handling ten payslips or several thousand. That consistency at volume is the line between this and pasting a handful of payslips into a general-purpose chatbot, and it is the same engine described in the general payroll PDF-to-Excel extraction workflow applied to Australian payslips specifically. For bulk payslip extraction in Australia, where a bureau might process a different client's stack every week, that repeatability is the whole point.


Workflow A: Rebuilding YTD and Opening Balances for a Payroll Switch

When a business moves payroll systems, whether from MYOB to Xero or across to Employment Hero (formerly KeyPay), Reckon, Payroller or QuickBooks, the new system needs each employee's year-to-date figures and opening balances to carry on paying correctly. The catch is that those figures usually cannot be pulled across automatically. The Xero API cannot retrieve prior MYOB pay runs, so YTD has to be rebuilt by hand from historical payslips and payroll reports. The problem is sharpest on a mid-financial-year switch, where there is a substantial year already accrued and no clean handover of it.

This is the source-side gap that migration guides tend to skip. They explain where to enter opening balances in the destination system but assume you already have the numbers. Extraction is how you assemble them. Run the historical payslips and payroll reports through as one row per employee per pay period, then sum those rows in the spreadsheet to reconstruct each employee's YTD, broken out into gross, PAYG, super, and each allowance and loading separately. Because each component sits in its own column, the opening balances you key into the new system match the way it expects them, rather than arriving as a single lumped total you then have to pick apart.
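The summing step described above can be done with spreadsheet formulas or a pivot table; the same logic is sketched here in Python for clarity. The column names, employee names and figures are hypothetical, and the sketch assumes every row passed in belongs to the same financial year.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical amount columns; add one per allowance or loading you track.
AMOUNT_COLUMNS = ["gross", "payg", "super_sg", "tool_allowance"]

def rebuild_ytd(rows):
    """Sum per-period rows into one YTD record per employee."""
    ytd = defaultdict(lambda: {col: Decimal("0") for col in AMOUNT_COLUMNS})
    for row in rows:
        for col in AMOUNT_COLUMNS:
            # A component missing from a period counts as zero.
            ytd[row["employee"]][col] += row.get(col, Decimal("0"))
    return dict(ytd)

# Invented rows: one per employee per pay period.
rows = [
    {"employee": "A. Citizen", "gross": Decimal("2000.00"),
     "payg": Decimal("372.00"), "super_sg": Decimal("240.00"),
     "tool_allowance": Decimal("25.00")},
    {"employee": "A. Citizen", "gross": Decimal("2100.00"),
     "payg": Decimal("398.00"), "super_sg": Decimal("252.00")},
    {"employee": "B. Example", "gross": Decimal("1500.00"),
     "payg": Decimal("210.00"), "super_sg": Decimal("180.00")},
]

ytd = rebuild_ytd(rows)
print(ytd["A. Citizen"]["gross"])     # 4100.00
print(ytd["A. Citizen"]["super_sg"])  # 492.00
```

Keeping each component as its own key mirrors the one-column-per-component layout, so the totals map directly onto the opening-balance fields in the destination system.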

The figures also have to reconcile to what the ATO already holds. Under Single Touch Payroll, the end-of-financial-year income statement replaced the old payment summary, and an employee's YTD amounts move to "tax ready" once the employer finalises, which is due by 14 July. Reconstructed YTD has to line up with those finalised figures, because opening balances that are out by even a small margin produce STP mismatches at year end that someone then has to chase down. Getting the reconstruction right at migration time is what keeps the first end-of-year run under the new system clean. Once the new system is live, you can reconcile payday super in Xero and MYOB as an ongoing check on the super side.
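A minimal sketch of that reconciliation follows, assuming you have keyed the finalised figures for each employee into a second table. The dictionary layout, field names and amounts are hypothetical; the point is only that every difference is listed with its size so it can be traced back to source rows.

```python
from decimal import Decimal

def ytd_mismatches(reconstructed, finalised):
    """List (employee, field, rebuilt, official, difference) for each gap."""
    mismatches = []
    for employee, figures in finalised.items():
        rebuilt = reconstructed.get(employee, {})
        for field, official in figures.items():
            value = rebuilt.get(field, Decimal("0"))
            if value != official:
                mismatches.append(
                    (employee, field, value, official, value - official)
                )
    return mismatches

# Invented figures: the rebuilt PAYG is 10.00 short of the finalised one.
reconstructed = {
    "A. Citizen": {"gross": Decimal("4100.00"), "payg": Decimal("760.00")},
}
finalised = {
    "A. Citizen": {"gross": Decimal("4100.00"), "payg": Decimal("770.00")},
}

for item in ytd_mismatches(reconstructed, finalised):
    print(item)
```

The check compares exactly rather than within a tolerance, on the reasoning above that even small differences surface later as STP mismatches.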

This kind of reconstruction is possible because the underlying records are required to exist. Australian employers must keep employee records, including pay and deductions, hours of work, leave taken and accrued, and superannuation contributions, for seven years, per the Australian Government's 7-year employee record-keeping requirement. The historical payslips and payroll reports a migration draws on are not incidental paperwork; they are records the business is obliged to hold, which is what makes payslip-and-report reconstruction a sound basis for opening balances rather than a guess.


Workflow B: Assembling Pay Data for an Underpayment Review

The stakes on this work changed on 1 January 2025, when intentional underpayment of wages became a criminal offence in Australia. The Fair Work Ombudsman also runs a Payroll Remediation Program for employers correcting underpayments. Both mean that when you review historical pay, the data you assemble has to be accurate, complete across periods, and traceable back to source, because it may end up supporting a remediation calculation that is examined closely.

The extraction job is to pull the pay components from each payslip across the periods under review (ordinary hours and rate, allowances, loadings, penalty rates, and super) into one row per employee per pay period. That gives you a complete, period-by-period record of what was actually paid, which is the foundation any back-pay calculation rests on. The same remediation shape applies in other jurisdictions; the approach mirrors how you would extract NZ payslip data for Holidays Act remediation, where the underlying entitlement model differs but the need to assemble pay data across periods is the same.

The calculation itself happens in the spreadsheet, not in the extraction. Back-pay is worked out per entitlement (rate multiplied by hours, plus allowances, loadings and penalties) by comparing what was paid against what should have been paid under the relevant award or agreement. The Fair Work Ombudsman publishes the methodology for computing those entitlements; this work sits upstream of it. You assemble the underlying pay data, then apply the methodology in your own spreadsheet.
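To make the shape of that comparison concrete, here is a minimal sketch of a single period for a single employee. It is not the Fair Work Ombudsman's methodology: the function names are hypothetical, the rates and amounts are invented, and the "owed" rate is an input you would supply from the relevant award or agreement.

```python
from decimal import Decimal

ZERO = Decimal("0.00")

def period_total(hours, rate, allowances=ZERO, loadings=ZERO, penalties=ZERO):
    """Rate multiplied by hours, plus allowances, loadings and penalties."""
    return (hours * rate + allowances + loadings + penalties).quantize(
        Decimal("0.01")
    )

def shortfall(paid, owed):
    """Amount underpaid for the period; zero when paid meets or exceeds owed."""
    return max(owed - paid, ZERO)

# Paid side: values as extracted from the payslip (invented figures).
paid = period_total(Decimal("38"), Decimal("24.00"),
                    allowances=Decimal("15.00"))
# Owed side: the same hours at a rate you supply from the award.
owed = period_total(Decimal("38"), Decimal("25.50"),
                    allowances=Decimal("15.00"))

print(paid)                   # 927.00
print(owed)                   # 984.00
print(shortfall(paid, owed))  # 57.00
```

The extraction supplies only the paid side; the owed side and the judgement about which rate applies remain inputs you provide.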

That division is worth being explicit about, because it is exactly where an honest account separates from a generic tool's claims. When you extract payslip data to calculate back pay, the extraction returns the values printed on each payslip, native-typed and tied to their source page, and it can classify or flag components through the prompt. It does not sum across periods, compute back-pay deltas, or decide what an employee was owed. The arithmetic and the entitlement judgement stay with you and your spreadsheet. Treating extraction as a fast, faithful way to assemble the inputs, rather than as something that adjudicates the outcome, is what keeps the resulting figures defensible.
