Analyze
Paste a sample of rows or summary stats — get hypotheses, sanity checks, suggested visualizations, and SQL-ish questions to ask next (without inventing rows you did not paste).
So we label uncertainty loudly.
Paste a slice of rows or summary statistics — the analyzer describes what is visible in your sample, flags anomalies and outliers worth investigating, proposes 3-5 hypotheses for the full dataset, suggests charts with axes, and ends with an explicit "cannot determine from sample" list. It will not invent company-wide revenue because you pasted ten rows. Posture controls how confident the language gets — exploratory hedges everything, QC focuses on data quality, exec summary stays cautious enough to forward up the chain.
Five inputs that distinguish hypotheses from claims.
Six structured sections sized to the posture you chose.
Inferred types
Column-by-column type inference with flags for ambiguous or mixed-type fields.
From sample only
Counts, ranges, and distributions where the sample size supports it; no false precision.
Outliers + nulls
Specific cells that look weird, with reasoning so you know whether to investigate or ignore.
3-5 next steps
Modeling and analysis directions to consider, each labeled with its uncertainty level.
Visualization picks
Which charts would clarify what — with explicit axis recommendations for each.
Honest gaps
Explicit list of what the sample does not let you answer, so you know what to query next.
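A rough local pass over the same sample gives you something to compare the inferred types and null flags against. The sketch below is illustrative only and says nothing about how the analyzer works internally. The helper names (`classify`, `infer_column_types`), the list of null markers, and the sample rows are all assumptions made up for this example.

```python
import csv
import io

# Assumed spellings of "missing"; adjust to match your own export.
NULL_MARKERS = {"", "na", "n/a", "null", "none", "nan"}


def classify(value: str) -> str:
    """Classify one raw cell as 'null', 'int', 'float', or 'str'."""
    text = value.strip()
    if text.lower() in NULL_MARKERS:
        return "null"
    try:
        int(text)
        return "int"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "str"


def infer_column_types(csv_text: str) -> dict:
    """Return {column: {"type", "nulls", "mixed"}} for a pasted CSV sample."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kinds = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in kinds:
            # Short rows yield None for missing cells; treat those as empty.
            kinds[name].append(classify(row[name] or ""))

    report = {}
    for name, seen in kinds.items():
        non_null = {kind for kind in seen if kind != "null"}
        if not non_null:
            col_type, mixed = "unknown", False
        elif non_null <= {"int", "float"}:
            col_type, mixed = ("float" if "float" in non_null else "int"), False
        elif non_null == {"str"}:
            col_type, mixed = "str", False
        else:
            # Numbers and free text in the same column: flag for review.
            col_type, mixed = "str", True
        report[name] = {"type": col_type, "nulls": seen.count("null"), "mixed": mixed}
    return report


# Hypothetical four-row sample, invented for illustration.
sample = """order_id,amount,region
1001,19.99,EU
1002,N/A,US
1003,twelve,
1004,42,EU
"""

report = infer_column_types(sample)
for column, info in report.items():
    print(column, info)
```

Here `amount` comes back flagged as mixed because it holds both numbers and the word "twelve". That is the kind of column worth checking by hand before trusting any range or distribution computed from it.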
Analysis moments where you need direction, not conclusions.
The same sample can support three different legitimate framings.
Exploratory analysis surfaces hypotheses without claiming that any of them is proven, which suits design partners, internal R&D, and early product discovery. QC posture focuses on what is wrong with the data itself, which suits new pipelines, vendor-supplied datasets, and pre-launch sanity checks. Exec-summary posture keeps statistical language cautious and avoids causal claims, which suits board updates and external readouts. Picking the wrong posture makes the analyst look reckless even when the underlying analysis is correct. Match the posture to the audience, then run the analysis.
Habits that compound across data review work.
No — it ideates and triages. Always verify on complete data in code before making decisions, especially anything involving statistical tests or production metrics.
It is explicitly instructed not to invent numbers. If a number describes the full population (vs your sample), the model marks it as inferred or asks for the underlying data.
Enough to see the shape — 50-200 rows usually beats 5 rows or 5000. Larger samples slow inference without much added insight at this stage.
Best with flat CSV/TSV. For nested JSON, flatten or summarize first; the model can read JSON but reasons better about tabular shapes.
They are starting points, not best-of-class visualizations. Use them as direction, then build with your real chart library and full data.
Use the default reasoning-capable text models for analysis quality. Switch to deeper models for complex multi-table or multi-hypothesis explorations.
Pick exec-summary posture. The model defaults to hedged phrasing and avoids causal language even when the sample seems to support a claim.
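The advice above on sample size and nested JSON can be sketched as a small preparation step: flatten each record into dotted column names, draw a reproducible random sample in the 50-200 row range, and emit CSV text to paste. This is a minimal sketch using only the Python standard library. The function names (`flatten`, `sample_as_csv`), the default of 100 rows, the dotted naming scheme, and the example records are assumptions for illustration, not part of the tool.

```python
import csv
import io
import json
import random


def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dotted column names; lists are kept as JSON strings."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def sample_as_csv(records: list, k: int = 100, seed: int = 0) -> str:
    """Flatten records, draw up to k at random, and return CSV text ready to paste."""
    rng = random.Random(seed)  # fixed seed so the same sample can be redrawn later
    rows = [flatten(record) for record in records]
    chosen = rng.sample(rows, min(k, len(rows)))
    columns = sorted({name for row in chosen for name in row})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(chosen)
    return out.getvalue()


# Hypothetical nested records, invented for illustration.
records = [
    {"id": i, "user": {"country": "DE", "plan": {"tier": "pro"}}, "tags": ["a", "b"]}
    for i in range(1000)
]

csv_text = sample_as_csv(records, k=100)
print(csv_text.splitlines()[0])  # id,tags,user.country,user.plan.tier
```

A random draw is used here rather than the first 100 rows because exports are often sorted by date or ID, and a head-of-file slice can hide the variation you want the sample to show. Whether that holds for your data is something to check yourself.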
Know where to look first.
Turn raw dumps into a structured QC checklist before you burn an afternoon (or a GPU hour) on the wrong question. Use the analyzer to find the question worth asking, then go answer it properly in your real analysis environment.