How to automate Amazon seller reports into a spreadsheet

TL;DR

You can pull Amazon seller reports directly into Google Sheets using Python and the SP-API, with the whole setup taking under two hours. Once the pipeline is running, your spreadsheet refreshes every morning without any manual downloading. You need a free Amazon developer account, Python 3.9+, and a Google Cloud project with the Sheets API enabled.

What You Need Before You Start

An Amazon Seller Central account on the Professional plan (SP-API access is not available on the Individual plan)
SP-API developer access approved at developer.amazon.com (approval takes one to three business days for self-authorised apps)
Your SP-API credentials ready: Client ID, Client Secret, Refresh Token, Marketplace ID
Python 3.9 or newer installed locally
A Google account with Google Sheets API and Google Drive API enabled in Google Cloud Console
A service account JSON key file downloaded from Google Cloud Console
pip packages: python-amazon-sp-api, gspread, gspread-dataframe, pandas, python-dotenv, schedule
Basic comfort running commands in the terminal
Optional: a $4/month cloud VM (DigitalOcean, Hetzner, or Linode) if you want the script running 24/7 without leaving your laptop on

Step 1: Choose Which Reports You Actually Need

Amazon generates dozens of report types and requesting all of them is a fast way to hit rate limits and confuse yourself. Pick your three most important reports before writing any code.

For most sellers, these five are the most useful:

GET_FLAT_FILE_ALL_ORDERS_DATA_BY_ORDER_DATE_GENERAL for order-level data with SKU and buyer details
GET_FBA_MYI_UNSUPPRESSED_INVENTORY_DATA for current FBA stock levels
GET_V2_SETTLEMENT_REPORT_DATA_FLAT_FILE for settlement totals per disbursement cycle
GET_SALES_AND_TRAFFIC_REPORT for unit session percentage and traffic by ASIN
GET_FBA_ESTIMATED_FBA_FEES_TXT_DATA for fee estimates per SKU

Log in to Seller Central and navigate to Reports > Business Reports. Preview each report format before automating it. Note whether the download file is TSV, CSV, or JSON because your parsing code will depend on this. Most flat-file reports are TSV.

Start with just one or two report types. Adding more is straightforward once the pipeline is working.

You should now see the column structure for each report type you plan to automate.

Step 2: Get Your SP-API Credentials

Go to developer.amazon.com and create a developer profile. When prompted for the app type, choose Self-Authorised App. This is for sellers automating their own account, not building a public app.

Once your application is approved, go back to Seller Central and navigate to Apps and Services > Develop Apps. Click Authorise next to your app and complete the OAuth flow. At the end you receive a Refresh Token. Copy it somewhere safe immediately.

Your credential block should look like this:

LWA_APP_ID=amzn1.application-oa2-client.xxxxx
LWA_CLIENT_SECRET=amzn1.oa2-cs.v1.xxxxx
LWA_REFRESH_TOKEN=Atzr|xxxxx
MARKETPLACE_ID=ATVPDKIKX0DER

The ATVPDKIKX0DER value is the US marketplace ID. For UK use A1F83G8C2ARO7P, for Germany use A1PA6795UKMFR9. If you sell across multiple marketplaces, see the SP-API beginner guide on this site for a full marketplace ID reference.

Save these in a .env file at your project root. Never hard-code them directly in a script file.

You should now have a .env file with four populated fields and no errors.

Step 3: Set Up Your Python Environment

Open your terminal and create an isolated environment for this project:

python -m venv amazon-reports-env
source amazon-reports-env/bin/activate
# Windows: amazon-reports-env\Scripts\activate

pip install python-amazon-sp-api gspread gspread-dataframe pandas python-dotenv schedule

python-amazon-sp-api is a community-maintained wrapper that handles SP-API authentication, rate limiting, and report status polling automatically. It saves you several hundred lines of boilerplate around token refreshing and retry logic.

Create your project folder structure:

amazon-reports/
  .env
  service_account.json
  fetch_reports.py
  push_to_sheets.py
  main.py
  .gitignore

Add .env and service_account.json to .gitignore immediately. Both files contain credentials that will get your accounts compromised or revoked if they end up in a public repository.

You should now be able to run python -c "from sp_api.api import Reports" without any import errors.

Step 4: Fetch a Report From the SP-API

Create fetch_reports.py. This script requests the report, polls until Amazon has finished generating it, then downloads the raw content.

import os
import time
from dotenv import load_dotenv
from sp_api.api import Reports
from sp_api.base import Marketplaces

load_dotenv()

credentials = {
    "lwa_app_id": os.getenv("LWA_APP_ID"),
    "lwa_client_secret": os.getenv("LWA_CLIENT_SECRET"),
    "refresh_token": os.getenv("LWA_REFRESH_TOKEN"),
}

def fetch_orders_report(start_date: str, end_date: str) -> str:
    api = Reports(credentials=credentials, marketplace=Marketplaces.US)

    response = api.create_report(
        reportType="GET_FLAT_FILE_ALL_ORDERS_DATA_BY_ORDER_DATE_GENERAL",
        dataStartTime=f"{start_date}T00:00:00Z",
        dataEndTime=f"{end_date}T23:59:59Z",
    )
    report_id = response.payload["reportId"]

    while True:
        status = api.get_report(report_id).payload["processingStatus"]
        if status == "DONE":
            break
        if status == "FATAL":
            raise RuntimeError(f"Report generation failed for ID {report_id}")
        time.sleep(15)

    document_id = api.get_report(report_id).payload["reportDocumentId"]
    doc = api.get_report_document(document_id, decrypt=True)
    return doc.payload["document"]

if __name__ == "__main__":
    raw = fetch_orders_report("2026-05-01", "2026-05-15")
    print(raw[:500])

The time.sleep(15) between polls is intentional. Hammering the status endpoint faster than this triggers throttling.

You should now see the first 500 characters of raw TSV order data printed in your terminal when you run the script.

Step 5: Parse the Report Data With Pandas

The raw text from SP-API is tab-separated and includes columns you will never use. Strip it down to what your spreadsheet actually needs.

Create parse_reports.py:

import pandas as pd
import io

def parse_orders(raw_text: str) -> pd.DataFrame:
    df = pd.read_csv(io.StringIO(raw_text), sep="\t", dtype=str)

    keep = [
        "amazon-order-id",
        "purchase-date",
        "sku",
        "quantity",
        "item-price",
        "order-status",
    ]
    df = df[keep]

    df["purchase-date"] = pd.to_datetime(df["purchase-date"]).dt.strftime("%Y-%m-%d")
    df["item-price"] = pd.to_numeric(df["item-price"], errors="coerce").round(2)
    df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce")

    # Remove cancelled and pending orders from revenue calculations
    df = df[df["order-status"] == "Shipped"]

    return df.reset_index(drop=True)

Filtering to Shipped only is important. Unshipped and cancelled rows are present in the raw export and will inflate your revenue totals if you sum item-price across the whole file.

You should now see a clean DataFrame with six columns when you call parse_orders(raw_text) and print the head.

Step 6: Connect to Google Sheets With gspread

Go to Google Cloud Console, create a project, and enable both the Google Sheets API and the Google Drive API under APIs and Services. Then create a Service Account, assign it the Editor role, and download the JSON key as service_account.json.

Open your target Google Sheet and share it with the service account email address (it looks like your-sa@your-project.iam.gserviceaccount.com). Grant it Editor access. This is the step most people miss the first time.

Create push_to_sheets.py:

import gspread
from gspread_dataframe import set_with_dataframe
import pandas as pd

def push_to_sheet(df: pd.DataFrame, sheet_name: str, tab_title: str):
    gc = gspread.service_account(filename="service_account.json")
    sh = gc.open(sheet_name)

    try:
        ws = sh.worksheet(tab_title)
        ws.clear()
    except gspread.WorksheetNotFound:
        ws = sh.add_worksheet(title=tab_title, rows=10000, cols=20)

    set_with_dataframe(ws, df)
    print(f"Pushed {len(df)} rows to '{tab_title}'")

For more detail on Google Cloud service account setup, the automate Google Sheets with Python post on this site walks through the console steps with screenshots.

You should now see your DataFrame appear in the Google Sheet within a few seconds of calling push_to_sheet().

Step 7: Wire the Pipeline Together in main.py

Connect the fetch, parse, and push steps into one script you can run on demand or on a schedule.

from datetime import datetime, timedelta
from fetch_reports import fetch_orders_report
from parse_reports import parse_orders
from push_to_sheets import push_to_sheet

SHEET_NAME = "Amazon Seller Dashboard"

def run_pipeline():
    today = datetime.utcnow()
    start = (today - timedelta(days=30)).strftime("%Y-%m-%d")
    end = today.strftime("%Y-%m-%d")

    print(f"Fetching orders {start} to {end}...")
    raw = fetch_orders_report(start, end)

    df = parse_orders(raw)
    push_to_sheet(df, SHEET_NAME, "Orders")

    print(f"Pipeline complete. {len(df)} shipped orders written.")

if __name__ == "__main__":
    run_pipeline()

Run it once manually with python main.py before scheduling anything. Watch for gspread.exceptions.APIError: 403 which means the service account does not have Editor access on the sheet. A KeyError on a column name usually means the report type changed its header.

You should now see the Orders tab in your Google Sheet populated with the last 30 days of shipped orders.

Step 8: Schedule the Script to Run Automatically

On Mac or Linux, use cron to trigger the script every morning at 6am:

crontab -e

Add this line, adjusting paths to match your setup:

0 6 * * * /path/to/amazon-reports-env/bin/python /path/to/amazon-reports/main.py >> /path/to/amazon-reports/cron.log 2>&1

On Windows, open Task Scheduler, create a Basic Task, set the trigger to Daily at 6:00am, and point the action to your Python executable with main.py as the argument.

If you want the script running without leaving your laptop on all day, spin up a cheap cloud VM. A Hetzner CX11 or DigitalOcean Basic Droplet runs about four dollars a month. SSH in, clone your project, set up the cron job, and your Google Sheet refreshes every morning regardless of whether your laptop is open.

You should now see entries timestamped in cron.log after each scheduled run confirming the pipeline executed.

Step 9: Add a Summary Tab With Spreadsheet Formulas

Raw data in the Orders tab is only useful if you can see totals at a glance. Add a Summary tab to your sheet and use these formulas to surface the numbers you check every morning.

Total shipped revenue for the pulled date range:

=SUM(Orders!E:E)

Units sold by SKU (put SKU codes in column A of the Summary tab, then in column B):

=SUMIF(Orders!C:C,A2,Orders!D:D)

Top ten SKUs by revenue using QUERY:

=QUERY(Orders!A:F,"SELECT C, SUM(E) GROUP BY C ORDER BY SUM(E) DESC LIMIT 10 LABEL SUM(E) 'Revenue'",1)

Daily revenue trend for a chart:

=QUERY(Orders!A:F,"SELECT B, SUM(E) GROUP BY B ORDER BY B ASC LABEL SUM(E) 'Daily Revenue'",1)

These formulas recalculate automatically the moment Python writes new rows to the Orders tab. Your summary numbers stay current with zero extra work after initial setup.

You should now see a summary table that updates every morning without any manual effort.

Common Mistakes To Avoid

Requesting reports more than once per 30 minutes. Amazon throttles the Reports API aggressively. If you request the same report type repeatedly in a short window, you will start seeing QuotaExceeded errors. Design your schedule around once-daily or once-hourly pulls at most.
Hard-coding credentials in your script. Amazon’s automated credential scanner watches public GitHub repositories. If your Client Secret ends up in a commit, your SP-API access gets revoked within minutes. Always use .env and confirm it is in .gitignore before the first commit.
Forgetting to filter cancelled and pending orders. The orders flat file includes all statuses. Summing item-price across the full file without filtering to Shipped will show you inflated revenue figures that do not match your disbursements.
Using incorrect date format for report requests. SP-API requires ISO 8601 with timezone, meaning 2026-05-01T00:00:00Z. Passing a bare date like 2026-05-01 causes a validation error on several report types and the failure message is not always obvious about the cause.
Clearing the sheet instead of appending when building history. If your goal is a running log of historical data, use ws.append_rows() rather than ws.clear() followed by a full write. Clearing the sheet also destroys any manual notes, colour-coding, or formatting you have applied.
Ignoring the FATAL processing status. Amazon occasionally fails to generate a report for large date ranges or unusual account states. If you only check for DONE, a FATAL status will leave your script polling forever. Always handle the failure case explicitly.

When To Level Up

The Python-plus-cron approach handles a single seller account with a handful of report types comfortably. Once you are managing more than 50 active SKUs, selling across three or more marketplaces, or need to combine SP-API data with advertising API data and supplier invoices, the homegrown script starts to crack.

Managing multiple refresh tokens across accounts, handling API version changes when Amazon deprecates a report type, and building retry logic for transient failures each become their own maintenance burden. That maintenance compounds quickly when you are also running a business.

At that scale, purpose-built tools make more sense. Sellerboard handles P&L automation across FBA fees, ad spend, refunds, and returns without any code. Jungle Scout includes automated reporting built around keyword tracking and inventory forecasting. For no-code pipeline builders, Zapier and Make both offer Amazon Seller Central connectors that can push data to Google Sheets on a schedule with error notifications and conditional logic built in.

Head to the automation category on this site for a current comparison of tools that handle higher-volume seller reporting and multi-account management.

Frequently Asked Questions

Do I need a Professional selling plan to use the SP-API?
Yes. Amazon restricts SP-API access to sellers on the Professional plan. Individual plan sellers can still export reports manually from Seller Central and import them into a spreadsheet, but programmatic API access requires the paid plan at $39.99 per month.

How long does SP-API approval take?
Self-authorised apps for your own account typically get approved within one to three business days. Third-party developer applications, where you are building a tool for other sellers, take longer because Amazon requires use case documentation and sometimes a review call with their partner team.

Can I run this on Windows without a cloud VM?
Yes. Use Windows Task Scheduler instead of cron and point it to your Python executable and main.py. The script itself is platform-agnostic. File paths in your .env will use backslashes instead of forward slashes, which Python handles without issues.

Will my Google Sheets formulas break when Amazon changes a report’s columns?
They can, but the setup here protects you from most changes. By pinning the keep list in your parse step, your sheet always receives the same fixed columns regardless of what Amazon adds. The only risk is if Amazon removes a column you depend on, which is rare and usually announced in advance.

Is there a free way to host this without a cloud VM?
Google Cloud Run and AWS Lambda both have free tiers that comfortably handle a lightweight daily report pull. A Cloud Run job triggered by Cloud Scheduler costs nothing for most sellers doing a single daily fetch. The initial setup is more complex than a VPS but the ongoing cost in most months is zero.

Bottom Line

Automating your Amazon seller reports into a spreadsheet comes down to four moving parts: requesting the report through SP-API, polling until it is ready, parsing the TSV output with pandas, and writing clean rows to a Google Sheet with gspread. A cron job handles the daily trigger. A Summary tab with QUERY and SUMIF formulas turns the raw data into the numbers you actually care about each morning.

The setup takes a couple of hours the first time and almost no maintenance after that. It works well for individual sellers and small teams managing a focused catalogue. Once the complexity grows, specialised tools will serve you better than a handbuilt script. Check the automation category for tool comparisons that match your current volume and budget.