how to clean messy data in Google Sheets (step-by-step)

how to clean messy data in Google Sheets (step-by-step)

garbage in, garbage out. this is the most important rule in data analysis and the one most beginners skip.

if your data has duplicate rows, inconsistent text, blank cells in the wrong places, or dates stored as text, every analysis you run on it will be wrong — quietly, without any error message. pivot tables will double-count. formulas will silently exclude rows. charts will misrepresent trends.

this guide covers the data cleaning steps to run before any analysis in Google Sheets.

why cleaning data first matters

two scenarios that illustrate the cost of skipping this step:

scenario 1: you build a pivot table showing revenue by region. “New York” and “new york” and “NYC” are three different entries in your data — the pivot splits them into three rows instead of one, making New York look like three different small markets instead of your biggest market.

scenario 2: your sales spreadsheet has 200 rows but 12 are duplicates from a bad data export. your total revenue calculation is 6% higher than the real number. every business decision you make from that sheet is based on inflated numbers.

neither error is obvious until you check manually. clean the data first.

step 1: remove duplicate rows

using the built-in tool

  1. select your data range (or click anywhere inside your data)
  2. go to Data → Data Cleanup → Remove Duplicates
  3. choose which columns define a “unique” record
  4. click Remove Duplicates

Google Sheets removes matching rows and tells you how many duplicates it found. for most business datasets (orders, form responses, customer records), choose all columns to catch exact duplicates.

finding duplicates before deleting

if you want to review duplicates first:

  1. add a helper column with =COUNTIF($A$2:$A, A2) (for the ID or key column)
  2. filter for values greater than 1 — those are duplicates
  3. review and delete manually, then remove the helper column

step 2: standardize text entries

inconsistent text is the most common source of bad pivot table results.

TRIM — remove extra spaces

=TRIM(A2) removes leading spaces, trailing spaces, and double spaces between words.

paste this formula in a new column next to your messy data, fill down the column, then copy → Paste Special → Values Only to replace the originals. delete the helper column.

PROPER — standardize capitalization

=PROPER(A2) converts “NEW YORK”, “new york”, and “New York” all to “New York.”

UPPER and LOWER

=UPPER(A2) forces all caps — useful for codes, IDs, or country abbreviations.
=LOWER(A2) forces lowercase — useful for email addresses.

Find & Replace for specific inconsistencies

press Ctrl+H (Cmd+H on Mac) to open Find & Replace. use it to standardize known variants: find “NYC” replace with “New York”, find “US” replace with “United States.”

do Find & Replace on specific columns only — select the column first to avoid replacing text in the wrong place.

SUBSTITUTE — remove unwanted characters

=SUBSTITUTE(A2, "$", "") removes dollar signs from a column of currency text.
=SUBSTITUTE(A2, ",", "") removes commas from numbers formatted as text.

chain SUBSTITUTE functions for multiple characters:
=SUBSTITUTE(SUBSTITUTE(A2, "$", ""), ",", "")

step 3: fix mixed date formats

dates stored as text strings are one of the most frustrating data cleaning problems because cells look like dates but do not behave like them.

how to spot text dates

real dates are right-aligned in Google Sheets by default. text dates are left-aligned. if your date column is left-aligned, the dates are stored as text.

converting text to dates

if the text follows a recognizable format like “01/15/2026” or “January 15 2026”:

  1. add a helper column with =DATEVALUE(A2)
  2. DATEVALUE converts the text string to a date serial number
  3. format the helper column as Date (Format → Number → Date)
  4. copy the helper column → Paste Special → Values Only to replace the originals

if the date format is unusual (e.g., “15-Jan-26” or “2026.01.15”), use SUBSTITUTE to normalize it first:
=DATEVALUE(SUBSTITUTE(A2, ".", "/")) converts “2026.01.15” to a recognizable format first.

step 4: handle blank cells

blank cells cause problems in formulas (AVERAGE skips blanks but COUNT includes them differently depending on the function) and in pivot tables (blanks appear as a separate category).

finding blank cells

use Ctrl+G (or Edit → Find) and look for the option “Blanks only” if available, or use a filter: Data → Create a Filter → click the filter dropdown on the column → uncheck everything except “(Blanks).”

options for blank cells

  • delete the row: if a blank in a key column (like Customer ID) means the record is invalid
  • fill with a placeholder: =IF(A2="", "Unknown", A2) replaces blanks with a standard value
  • fill down: if blanks represent “same as above” — select the blank cells, type the first value, and press Ctrl+Enter (fills all selected cells with that value)

IFERROR for formula errors showing as blank

if blanks are being produced by formula errors (like a VLOOKUP that did not find a match), wrap the formula in IFERROR:
=IFERROR(VLOOKUP(A2, Range, 2, FALSE), "Not Found")

this replaces error cells with a readable value instead of #N/A or #ERROR.

step 5: fix numbers stored as text

a column that looks like numbers but behaves like text is a quiet problem — SUM will return 0, AVERAGE will be wrong, pivot table totals will be blank.

how to spot text numbers

text numbers are left-aligned. real numbers are right-aligned. if your revenue or quantity column is left-aligned, the values are text.

converting text numbers to real numbers

method 1: select the column → Data → Data Cleanup → Convert to Number (if available in your version of Sheets).

method 2: multiply by 1. add a helper column with =A2*1. text that looks like a number converts to a real number. text that is not a number returns an error.

method 3: paste special → values only → in a blank cell type 1 → copy that cell → select your text numbers → Paste Special → Multiply. this converts text to numbers in place.

the pre-analysis cleaning checklist

run these checks in order before building any pivot table or running any analysis:

  1. remove duplicate rows (Data → Data Cleanup → Remove Duplicates)
  2. check for blank cells in key columns (ID, Date, Category fields)
  3. standardize text: run TRIM on text columns, check for capitalization variants
  4. verify date columns are right-aligned (real dates, not text)
  5. verify number columns are right-aligned (real numbers, not text)
  6. check for hidden characters: =LEN(TRIM(A2)) vs =LEN(A2) — if they differ, there are hidden spaces or characters
  7. confirm the first row is headers only — no data in row 1
  8. check that no merged cells exist in the data range (merged cells break pivot tables)

what to do with very messy data

if your data has hundreds of inconsistencies that manual cleaning cannot handle efficiently, consider:

  • Google Sheets Apps Script: JavaScript automation for batch cleaning — loops through every row applying TRIM, PROPER, and standardization rules
  • Python pandas: for large or complex datasets, pandas has robust data cleaning tools — see Python pandas tutorial for non-programmers
  • SQL: if your data comes from a database, clean it at the query level with TRIM(), UPPER(), COALESCE() — see SQL for beginners

for most business spreadsheets under 10,000 rows, the manual Google Sheets steps above handle 95% of cleaning tasks in under 30 minutes.