Recipe: Data analysis with project-state
#recipe #data-analysis #tutorial #milestones #decision-log
David Olsson
Data analysis projects have a shape that most PM tools handle badly. The work isn't a linear task list — it's an iterative cycle of data acquisition, cleaning, exploration, modelling, and insight delivery, with multiple stakeholders who want different things from the same analysis. project-state handles this well because it's structured around milestones and stakeholder reporting, not task boards.
Here's how to adapt it for a data analysis engagement.
How data analysis maps to project-state concepts
| Data analysis concept | project-state concept |
|---|---|
| Analysis phases (acquire, clean, explore, model, deliver) | Phase preset |
| Dataset versions, model iterations | Milestones + technical_progress notes |
| Client, analyst team, exec sponsor | Stakeholder groups |
| Weekly analysis brief, final report | Reporting matrix entries |
| Scope change (new data source, new question) | Change register |
| Key analytical decisions (model choice, exclusion logic) | Decision log |
| Published findings, methodology notes | Document index |
Step 1: Scaffold with a custom phase preset
ask claude: "scaffold a new v2 project, kind: research, phases: data-acquisition, data-cleaning, exploratory-analysis, modelling, insight-delivery"
Define gate criteria for each phase:
phases:
  - name: data-acquisition
    gate_criteria:
      - all source datasets received and stored
      - data dictionary documented
      - access permissions confirmed for all team members
  - name: data-cleaning
    gate_criteria:
      - null/missing value audit complete
      - outlier policy documented and applied
      - cleaning log committed to project docs
      - "clean dataset version locked (document index entry: status=approved)"
  - name: exploratory-analysis
    gate_criteria:
      - EDA summary document approved by analyst lead
      - key hypotheses documented as decisions
      - at least one stakeholder review of preliminary findings
  - name: modelling
    gate_criteria:
      - model selection decision logged
      - validation approach documented
      - baseline model milestone complete
  - name: insight-delivery
    gate_criteria:
      - final report milestone complete
      - client review meeting conducted
      - all deliverables in document index (status=delivered)
These gate criteria become the checklist the agent evaluates when you ask "can we advance the phase?"
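For example, at the end of data cleaning:
ask claude: "can we advance from data-cleaning to exploratory-analysis? check each gate criterion and tell me what's missing"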
Step 2: Set up stakeholders and the reporting matrix
A typical data analysis project has three stakeholder groups:
- Analyst team — the people doing the work. They need internal status: what's blocked, what decisions are pending, what the current model state is.
- Client / sponsor — the people who commissioned the analysis. They need progress updates and access to the findings as they emerge.
- Exec / decision-maker — the end consumer of insights. They need a clean, concise view of findings and recommendations, not methodology.
entries:
  - stakeholder_group: analyst_team
    report_type: internal_status
    cadence: weekly
    format: slack_message
    surface: slack
    channel: "#analysis-[project-name]"
  - stakeholder_group: client
    report_type: progress_update
    cadence: biweekly
    format: email_draft
    surface: gmail
  - stakeholder_group: exec_sponsor
    report_type: findings_brief
    cadence: on_milestone
    trigger_milestones: ["eda-complete", "modelling-complete", "final-report"]
    format: email_draft
    surface: gmail
The on_milestone cadence is key here — the exec sponsor doesn't need weekly noise, just signal when something significant lands.
Step 3: Define milestones around analytical outputs, not tasks
Milestones in data analysis should be analytical outputs, not work activities. "Clean dataset" not "clean the data". "EDA complete" not "run exploratory analysis".
ask claude: "add milestones:
- Clean dataset v1, due [date], owner: data engineer, definition of done: clean dataset file versioned and documented in project docs
- EDA summary, due [date], owner: lead analyst, definition of done: EDA document approved by team
- Baseline model, due [date], owner: ML engineer, definition of done: baseline results documented with evaluation metrics
- Model v1, due [date], owner: ML engineer, definition of done: model validated, assumptions documented
- Final report, due [date], owner: project lead, definition of done: report delivered and accepted by client"
The technical_progress note on each milestone is where the analytical narrative lives:
ask claude: "update milestone clean-dataset-v1: 70% complete, technical progress: missing value treatment complete for main tables, working on date normalization across three source systems which have inconsistent timezone handling"
This note feeds directly into the next round of status reports: the analyst team brief carries the full technical detail, while the client update stays at summary level.
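Behind that one-line update, the stored milestone record might look something like this (a sketch; the field names are assumptions, not the documented schema):
# Illustrative milestone record — field names assumed.
milestones:
  - id: clean-dataset-v1
    title: Clean dataset v1
    owner: data-engineer
    percent_complete: 70
    technical_progress: >
      Missing value treatment complete for main tables; working on
      date normalization across three source systems with
      inconsistent timezone handling.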
Step 4: Log analytical decisions
Data analysis is full of decisions that need to be traceable: why a particular exclusion criterion was applied, why one model was chosen over another, why an outlier was treated a certain way. Log them as they happen:
ask claude: "log a decision: excluding records with NULL in [field] rather than imputing, rationale: imputation would introduce systematic bias in the low-income cohort, decided by: analyst team, date: today"
ask claude: "log a decision: using XGBoost rather than logistic regression, rationale: non-linear interactions between [var1] and [var2] were significant in EDA, decided by: ML lead, approved by: client"
When the client asks "why did you exclude those records?" three months later, the decision is in the log with full rationale, not lost in a Slack thread.
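Each of these becomes a structured, dated record. A sketch of how the first might be stored (field names are assumptions):
decisions:
  - id: D-004               # illustrative id
    summary: Exclude records with NULL in [field] rather than imputing
    rationale: Imputation would introduce systematic bias in the low-income cohort
    decided_by: analyst_team
    date: "[date]"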
Step 5: Use the change register for scope changes
Scope changes in data analysis are common and dangerous. A new data source mid-project. A new question the client wants answered. A change in the target variable definition. These are material changes that need to be logged and approved.
ask claude: "log a change: client wants to add [new_datasource] to the analysis pipeline, classify it"
The change register classifies it (material — this expands scope and timeline) and creates a change record. The next status report to the client mentions it as a pending change request. Nothing moves until the change is approved and logged.
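A sketch of the resulting change record, again with assumed field names:
changes:
  - id: CR-002              # illustrative id
    summary: Add [new_datasource] to the analysis pipeline
    classification: material
    impact: expands analysis scope and timeline
    status: pending_approval
    requested_by: client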
Step 6: Deliver findings through the document index
As deliverables are produced — EDA summaries, model documentation, final reports — register them in the document index:
ask claude: "add document: EDA Summary v1.2, type: analytical-report, file: docs/eda-summary-v1.2.pdf, status: under-review, description: exploratory analysis covering [scope], author: [name]"
The document index tracks the approval lifecycle: draft → under-review → approved → delivered. Phase gate criteria can check document status — "can't advance to modelling until EDA Summary is approved."
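The stored entry for the EDA summary might look like this (a sketch; field names are assumptions):
documents:
  - title: EDA Summary v1.2
    type: analytical-report
    file: docs/eda-summary-v1.2.pdf
    status: under-review    # draft -> under-review -> approved -> delivered
    description: Exploratory analysis covering [scope]
    author: "[name]"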
The result
A data analysis project running on project-state has:
- Full decision traceability from day one
- Automatic status reports that don't require manual preparation
- Phase gates that enforce analytical rigor before advancing
- A change register that catches scope creep
- Stakeholder-appropriate reporting: analyst brief, client update, exec findings brief
- A document index that tracks every deliverable through its approval lifecycle
The analyst team focuses on the analysis. The system handles the reporting.