How to build self-serve analytics at a small startup

TL;DR

You can give your whole team safe, self-serve access to your startup’s data in a single weekend. The core setup takes four to six hours using a free BigQuery sandbox, a thin layer of SQL views, and Metabase’s free community edition. Once it is running, anyone on your team can answer their own data questions without waiting on a developer.

What You Need Before You Start

Before you open a single tool, confirm you have the following in place:

A Google account with a new GCP project (BigQuery free sandbox gives you 10 GB storage and 1 TB of query processing per month at no cost)
Your primary data source credentials (PostgreSQL connection string, MySQL, or export access from Stripe, Shopify, or a similar SaaS)
Admin access to any SaaS tools you want to pull data from, such as Stripe, HubSpot, or Intercom
Docker Desktop (free) if you want to run Metabase locally before committing to a server deployment
Airbyte (free self-hosted) or a free Segment workspace for data ingestion
At least one person who can write basic SQL SELECT statements with WHERE and GROUP BY
Optional: a $6/month DigitalOcean droplet or a similarly priced Hetzner cloud server if you want the setup to run 24/7 from day one

You do not need a dedicated data engineer, and you do not need an enterprise data warehouse contract. Most early-stage startups run this entire stack for under twenty dollars a month, often for free.

Step 1: Define Your Three Core Metrics First

The most common mistake is building dashboards before deciding what decisions those dashboards should drive. Before you touch any tool, write down three questions your team argues about every week. Not ten. Three.

For a typical SaaS startup those three might be: what is our weekly active user count and is it trending up, what is the median time from signup to the first key action, and which acquisition channel produces the best 30-day retention?

Write each question in plain English. Next to it, write the data source that contains the answer. Then write how often you actually need to look at it, whether daily, weekly, or monthly.

Create a Google Sheet with three columns: Question, Source table, and Update frequency. This is your analytics spec. Every dashboard you build should map to at least one row in that sheet.

You should now see a one-page document that prevents scope creep and gives any future hire or contractor a clear brief on what the data stack is supposed to answer.

Step 2: Set Up BigQuery as Your Data Warehouse

BigQuery is the most forgiving starting point for a small startup. The free sandbox requires no credit card, and the SQL dialect is standard enough that switching warehouses later is manageable.

Go to console.cloud.google.com, create a new project named something like yourcompany-analytics, and enable the BigQuery API. In the BigQuery console, click Create dataset and name it raw. This is where source data will land before any transformation.
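
If you would rather script this step than click through the console, the dataset can also be created with a DDL statement in the query editor. This is a minimal sketch; the US location and the description text are assumptions, so adjust them to your project.

```sql
-- Creates the landing dataset for untransformed source data.
-- The location and description below are illustrative assumptions.
CREATE SCHEMA IF NOT EXISTS raw
OPTIONS (
  location = 'US',
  description = 'Untransformed source data loaded by the ingestion tool'
);
```

Keeping statements like this in a text file gives you a record of how the project was set up, which helps if you ever need to recreate it.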

Run this test query to confirm everything is working:

SELECT
  CURRENT_TIMESTAMP() AS now,
  'setup_complete'     AS status

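The result should appear in under two seconds. If you see a billing error, check that you are working in the new project and that the BigQuery console shows the sandbox banner. The sandbox is what you get when a project has no billing account attached; it is not a tier you pick during project creation.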

You should now see a working BigQuery project with one empty dataset called raw and zero dollars charged to any payment method.

Step 3: Load Your First Data Source

Pick your single most important data source and get it into BigQuery. For most startups that is either your PostgreSQL production database or Stripe.

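For a PostgreSQL load, use the open-source tool Airbyte (free self-hosted tier) to set up a connector, and point it at your raw BigQuery dataset. Segment is the better fit when your main source is product event tracking rather than a database. For a Stripe export, Airbyte’s Stripe connector maps charges, customers, and subscriptions to flat tables automatically. A basic self-hosted Airbyte deployment handles startups doing under fifty thousand events per day without issues, but check Airbyte’s current minimum CPU and memory requirements before choosing the cheapest server, since entry-level instances such as the Hetzner CX11 may not meet them.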

Never connect Metabase or any BI tool directly to your production database. Every heavy query competes with real users hitting your app. The warehouse is your buffer.

Once the first sync completes, run a row count to verify the data arrived correctly:

SELECT COUNT(*) AS row_count
FROM raw.stripe_charges
WHERE DATE(created) >= CURRENT_DATE() - 7

You should now see a non-zero row count that roughly matches what your Stripe dashboard shows for the same period.
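
A single total can hide a sync that stopped partway through the week. The sketch below breaks the same check down by day so gaps stand out. It assumes raw.stripe_charges.created is a TIMESTAMP column, which depends on how your connector loads the data.

```sql
-- Daily charge counts for the last 7 days, newest first.
-- Assumes `created` is a TIMESTAMP; if your connector loads it as
-- Unix seconds (INT64), use DATE(TIMESTAMP_SECONDS(created)) instead.
SELECT
  DATE(created) AS charge_date,
  COUNT(*)      AS row_count
FROM raw.stripe_charges
WHERE DATE(created) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
GROUP BY charge_date
ORDER BY charge_date DESC;
```

A missing date or a day with an unusually low count is a hint to re-run the sync before building anything on top of the table.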

Step 4: Write SQL Views That Non-Technical Teammates Can Trust

Raw tables are messy. Column names like amt_in_cents or cust_id_fk confuse anyone who is not the person who wrote the ingestion pipeline. The fix is a thin layer of SQL views that rename columns, cast types correctly, and filter out test data and internal accounts.

Create a second BigQuery dataset called marts. Then write one view per core metric. Here is the weekly active users example:

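CREATE OR REPLACE VIEW marts.weekly_active_users AS
SELECT
  -- WEEK truncates to the preceding Sunday; use WEEK(MONDAY) for Monday-based weeks
  DATE_TRUNC(event_date, WEEK)  AS week_start,
  COUNT(DISTINCT user_id)        AS wau
FROM raw.product_events
WHERE event_name    = 'session_start'
  -- Exclude test and internal accounts here so every dashboard inherits the filter
  AND test_account  = FALSE
  AND internal_user = FALSE
GROUP BY 1
ORDER BY 1 DESC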

Keep each view simple. One job each. No joins spanning more than two tables at this stage. If you want to version-control your views and add documentation later, the data stack for startups guide covers adding dbt to this workflow.
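
As a second illustration, here is a rough sketch of a view for the median time from signup to the first key action. The raw.users table, the signup_at and event_timestamp columns, and the 'project_created' event name are all hypothetical, so substitute whatever your own schema uses.

```sql
CREATE OR REPLACE VIEW marts.weekly_time_to_first_action AS
WITH first_action AS (
  -- Earliest key action per user, excluding test and internal accounts.
  SELECT
    user_id,
    MIN(event_timestamp) AS first_action_at   -- hypothetical TIMESTAMP column
  FROM raw.product_events
  WHERE event_name    = 'project_created'     -- hypothetical key action
    AND test_account  = FALSE
    AND internal_user = FALSE
  GROUP BY user_id
)
SELECT
  DATE_TRUNC(DATE(u.signup_at), WEEK) AS signup_week,
  COUNT(*)                            AS activated_users,
  -- APPROX_QUANTILES(x, 2) returns [min, median, max];
  -- OFFSET(1) picks the approximate median.
  APPROX_QUANTILES(
    TIMESTAMP_DIFF(f.first_action_at, u.signup_at, MINUTE), 2
  )[OFFSET(1)]                        AS median_minutes_to_first_action
FROM raw.users AS u                   -- hypothetical table, one row per user
JOIN first_action AS f USING (user_id)
WHERE f.first_action_at >= u.signup_at
GROUP BY signup_week;
```

The view joins only two tables, in line with the advice above, and returns an approximate median, which is usually precise enough for a weekly trend line.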

You should now see a marts dataset with clean, readable views that return sensible column names and match the totals you would calculate manually in a spreadsheet.

Step 5: Install and Configure Metabase

Metabase Community Edition is free, open-source, and the fastest path to dashboards your non-technical teammates will actually open and trust. Pull the Docker image and start it:

docker pull metabase/metabase
docker run -d -p 3000:3000 --name metabase metabase/metabase

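Open http://localhost:3000. The setup wizard asks for your name, email, and a database connection. Choose Google BigQuery from the database list. You will need to create a service account in Google Cloud Console with two roles: BigQuery Data Viewer and BigQuery Job User. Download the JSON key file and paste its contents into the Metabase connection form. Note that this quick-start container keeps Metabase’s own application data (users, saved questions, dashboards) in an embedded H2 database inside the container, so treat it as a trial and move to an external application database such as PostgreSQL before the team relies on it.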

Point Metabase at your marts dataset, not raw. This is critical. You do not want teammates accidentally querying half-transformed source tables and building charts on top of them.

You should now see the Metabase home screen with your marts views listed under Browse Data, with clean column names visible when you click through.

Step 6: Build Your First Dashboard

Go to New > Dashboard and name it something specific, like “Growth Weekly Review” rather than a vague “Main Dashboard.” Specific names prevent the confusion that happens when every team builds their own “General Dashboard.”

Click Add a question and use the visual query builder to create your first card. For weekly active users, select the weekly_active_users view, choose Line, and set the x-axis to week_start and the y-axis to wau. No SQL required for this step.

For metrics that need custom SQL logic, click SQL editor and paste your query directly. Metabase stores the query alongside the card, so anyone who opens it can see exactly how the number was calculated. This transparency is what makes self-serve analytics trustworthy rather than a black box.
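
For instance, an activation-rate card might be backed by a query along these lines. It is only a sketch: marts.signups and marts.first_key_actions are hypothetical views with one row per user, and the 7-day activation window is an assumption you should replace with your own definition.

```sql
-- Weekly activation rate: share of signups that reach the key action within 7 days.
-- marts.signups (user_id, signup_date) and
-- marts.first_key_actions (user_id, first_action_date) are hypothetical views.
SELECT
  DATE_TRUNC(s.signup_date, WEEK) AS signup_week,
  COUNT(*)                        AS signups,
  COUNTIF(DATE_DIFF(a.first_action_date, s.signup_date, DAY) BETWEEN 0 AND 7) AS activated,
  ROUND(
    SAFE_DIVIDE(
      COUNTIF(DATE_DIFF(a.first_action_date, s.signup_date, DAY) BETWEEN 0 AND 7),
      COUNT(*)
    ),
    3
  )                               AS activation_rate
FROM marts.signups AS s
LEFT JOIN marts.first_key_actions AS a
  ON a.user_id = s.user_id
GROUP BY signup_week
ORDER BY signup_week;
```

Users without a key action come through the LEFT JOIN as NULL and are counted as not activated. The most recent week will read low until its users have had the full seven days, which is worth stating in the card description.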

Add at least three cards: your core growth metric, your primary revenue number, and your main activation rate. Put them in the top row so the most important numbers are visible without scrolling.

You should now see a dashboard with three or more charts that refresh when you reload the page, showing the most recently synced data from BigQuery rather than a manual export.

Step 7: Set Up Roles and Permissions

Go to Admin > Permissions > Data. For your main user group (everyone except admins), set the raw dataset to No self-service and the marts dataset to Unrestricted. Teammates can explore any clean view but cannot touch raw source tables.

Create two groups: Viewers (can see dashboards but cannot edit queries) and Analysts (can build their own questions and save them). Assign your ops, marketing, and customer success teammates to Viewers. Assign yourself and anyone comfortable writing SQL to Analysts.

For any tables containing personally identifiable information, create a third BigQuery dataset called restricted and set it to No self-service for all groups except admins. Never include PII fields in your marts views.

You should now see that a test login using a Viewer account can browse dashboards and drill into underlying data, but the New question button is greyed out and raw tables are not visible.

Step 8: Schedule Automated Data Refreshes

A dashboard showing last week’s data because someone forgot to trigger a sync is worse than no dashboard at all. Set automated refreshes so the data is always current.

In Airbyte or Segment, open your connector settings and set a sync schedule. For most startup metrics, a daily sync at 5 or 6 AM local time is enough. For revenue dashboards that executives open at 9 AM, set the sync to 4 AM to give the pipeline time to complete before anyone looks.

In Metabase, go to Admin > Caching. Enable query caching and set a cache TTL of 3600 seconds (one hour) for most dashboards. This prevents BigQuery from re-running expensive queries every time a teammate opens a new browser tab.

For scheduled delivery, open your dashboard, click the Subscriptions icon, and schedule a weekly summary to your team’s Slack channel or a shared email address. This requires email or Slack to be configured in Metabase’s admin settings first. Now data is both pull (anyone can open the dashboard anytime) and push (everyone gets a weekly summary without remembering to check).

You should now see your BigQuery tables updating overnight automatically, with Metabase dashboards showing current data every morning without any manual intervention.
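
One way to confirm the overnight syncs are really landing is to look at when each raw table was last written. The sketch below reads BigQuery’s per-dataset __TABLES__ metadata table; the idea of treating anything older than about a day as stale is an assumption that fits a daily sync schedule.

```sql
-- Lists each table in the raw dataset with the hours since it was last modified.
-- __TABLES__ is BigQuery dataset metadata; last_modified_time is epoch milliseconds.
SELECT
  table_id,
  TIMESTAMP_MILLIS(last_modified_time) AS last_modified_at,
  TIMESTAMP_DIFF(
    CURRENT_TIMESTAMP(),
    TIMESTAMP_MILLIS(last_modified_time),
    HOUR
  )                                    AS hours_since_update
FROM raw.__TABLES__
ORDER BY hours_since_update DESC;
```

With a daily schedule, a table showing much more than 24 hours since its last update is the first place to look when a dashboard seems out of date.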

Common Mistakes To Avoid

Querying production directly. Every analytical query you run against your production PostgreSQL competes with live user traffic. Always replicate to a warehouse first, even if it is just a nightly export.
Building twenty dashboards in week one. Start with three metrics. Add more only when a specific pending decision requires a new one.
Giving everyone Analyst permissions from day one. Users who can write SQL against your clean views will eventually write queries that bypass them and go straight to raw tables. Ramp permissions slowly.
Skipping the marts view layer entirely. When a column name changes upstream, every dashboard that queries that column directly breaks. A view layer acts as a contract that you can update in one place.
Not documenting what each metric means. If half your team thinks WAU counts all logins and the other half thinks it counts only paid users, the number creates arguments rather than decisions. Use Metabase’s Description field on every saved question.
Forgetting to filter out internal accounts and test users. Do this at the view level, not inside individual dashboards, so the filter is applied consistently everywhere.
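
The view-layer points above can be sketched in a single example. The raw.payments table is hypothetical, as are its created, test_account, and internal_user columns; amt_in_cents and cust_id_fk are the cryptic column names used as examples in Step 4.

```sql
-- Hypothetical source table raw.payments; the view is the contract dashboards rely on.
CREATE OR REPLACE VIEW marts.payments AS
SELECT
  cust_id_fk                   AS customer_id,
  ROUND(amt_in_cents / 100, 2) AS amount_usd,    -- assumes amounts are USD cents
  DATE(created)                AS payment_date   -- assumes a TIMESTAMP column
FROM raw.payments
-- Filtering once here applies the exclusion to every dashboard built on the view.
WHERE test_account  = FALSE
  AND internal_user = FALSE;
```

If the ingestion tool later renames amt_in_cents, only the view definition changes; the dashboards keep reading amount_usd.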

When To Level Up

This stack works well until you hit roughly five data sources, a team of more than twenty people, or queries that regularly take more than thirty seconds to return. At that point the cracks start to appear.
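
To check the thirty-second threshold with data rather than by feel, BigQuery’s job metadata can be queried directly. This sketch assumes your datasets live in the US multi-region (hence `region-us`) and that your account is allowed to list the project’s jobs.

```sql
-- Queries from the last 7 days that ran longer than 30 seconds, slowest first.
-- Assumes the US multi-region; change `region-us` to match your dataset location.
SELECT
  job_id,
  user_email,
  TIMESTAMP_DIFF(end_time, start_time, SECOND) AS runtime_seconds,
  total_bytes_processed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  AND job_type = 'QUERY'
  AND state = 'DONE'
  AND TIMESTAMP_DIFF(end_time, start_time, SECOND) > 30
ORDER BY runtime_seconds DESC
LIMIT 20;
```

An empty result most weeks suggests the stack still has headroom; the same few queries appearing repeatedly point to the views that would benefit first from a transformation layer.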

The first sign is a slow dashboard that teammates quietly stop trusting. The second is someone building a one-off spreadsheet because they could not find what they needed in Metabase, creating a parallel version of the truth. The third is a new data source that your current views do not account for, and nobody has time to update them.

When those signs appear, you are ready for a managed transformation layer like dbt and possibly a proper orchestration tool like Dagster or Prefect. You may also need to evaluate whether Metabase still fits your team or whether a tool with stronger governance and row-level security is a better match.

The comparison guides in /category/data-analysis/ walk through the next tier of tools, including options that scale to hundreds of users. Start with Best BI tools for growing startups and Metabase vs Looker Studio: which fits your team before committing to anything.

Frequently Asked Questions

Do I need a dedicated data engineer to set this up?
No. The stack described here is designed for a technical founder or a product manager who is comfortable with SQL. You can set it up in a weekend and maintain it in under two hours a week once the syncs are automated.

Is BigQuery genuinely free for an early-stage startup?
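Yes, within limits. Google’s free sandbox gives you 10 GB of storage and 1 TB of query processing per month at no cost. One caveat: tables and views in a sandbox project expire after 60 days by default, so plan to attach a billing account before then; the same monthly free allowances continue to apply afterward. Most startups stay inside the free tier for six to twelve months. Check the BigQuery free tier guide for details on what counts toward your monthly quota.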
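
To see roughly how much of the monthly query allowance has been used, something like the following can help. It is a sketch that assumes the US multi-region and permission to list the project’s jobs, and it reports bytes processed, which may differ slightly from what Google counts for billing.

```sql
-- Approximate query volume for the current calendar month, in TiB.
-- Assumes the US multi-region; change `region-us` to match your dataset location.
SELECT
  ROUND(SUM(total_bytes_processed) / POW(1024, 4), 4) AS tib_processed_this_month
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_TRUNC(CURRENT_TIMESTAMP(), MONTH)
  AND job_type = 'QUERY';
```

A value approaching 1 is the cue to look at caching and at which dashboards scan the most data.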

What if all my data currently lives in spreadsheets?
That is a fine starting point. BigQuery can read a Google Sheet directly as an external table, so you do not need a separate ingestion tool. Expose your sheets as tables in the raw dataset, apply the same view layer on top, and connect Metabase. When your data moves to a proper database later, you only need to update the view definition, not rebuild every single dashboard.
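
One way to expose a sheet to BigQuery is an external table defined with DDL. In this sketch the table name, the column list, and the spreadsheet URL are placeholders for illustration, not real values.

```sql
-- Exposes a Google Sheet as a queryable table in the raw dataset.
-- The URL and columns are placeholders; with no range specified, the first tab is read.
CREATE OR REPLACE EXTERNAL TABLE raw.sheet_marketing_spend (
  channel     STRING,
  spend_usd   NUMERIC,
  spend_date  DATE
)
OPTIONS (
  format = 'GOOGLE_SHEETS',
  uris = ['https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID'],
  skip_leading_rows = 1   -- skip the header row
);
```

The sheet is read at query time, so edits show up without a sync. The account that runs the query generally needs view access to the sheet itself, which includes the service account Metabase uses.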

How do I keep personally identifiable information safe?
Create a restricted BigQuery dataset for any table containing names, emails, or payment details. Grant access to admins only. In your marts views, exclude or hash those columns before they reach Metabase. Google Cloud also supports column-level security policies if you need field-level masking for compliance reasons.
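
A minimal sketch of the exclude-or-hash approach follows. The restricted.users table and its columns are hypothetical; the point is that the raw email never appears in the view Metabase can see.

```sql
-- Hypothetical source table restricted.users; only non-identifying fields reach marts.
CREATE OR REPLACE VIEW marts.users AS
SELECT
  user_id,
  -- One-way hash lets analysts count or join on email without reading it.
  TO_HEX(SHA256(LOWER(TRIM(email)))) AS email_hash,
  DATE(signup_at)                    AS signup_date,
  plan_name
FROM restricted.users
WHERE test_account  = FALSE
  AND internal_user = FALSE;
```

Names and payment details are simply left out of the SELECT list. A plain hash is pseudonymization rather than anonymization, so for compliance-sensitive fields the column-level policies mentioned above are the safer route.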

Can non-technical teammates actually use Metabase without writing SQL?
Yes. Metabase’s visual query builder lets users filter, group, and chart data from your marts views with clicks rather than code. The key is that your SQL views do the heavy lifting so the interface stays simple and the results stay trustworthy.

Bottom Line

Building self-serve analytics at a startup does not require a data team, an enterprise contract, or months of setup time. You define three core questions, load your data into BigQuery’s free tier, write a handful of clean SQL views in a marts dataset, connect Metabase Community, set up automated syncs, and lock down permissions so teammates can explore safely. The whole thing fits in a weekend if you stay focused and resist the urge to solve every possible question before you ship the first dashboard. Start with three metrics, document what each one means, and add complexity only when a real decision demands it. When you outgrow this setup, the data analysis tools category has comparison guides for every next step, from managed transformation layers to BI platforms that scale to large teams without requiring a full data engineering org.