Scheduling & Monitoring

Automate syncs, monitor data quality, detect schema changes, and scan for PII — all built in.

Overview

Dango's scheduling and monitoring features cover three pillars:

Scheduling — Automate your data pipeline with cron-based schedules. Define when sources sync, when dbt runs, and get notified when things go wrong. Schedules persist across restarts and recover missed runs automatically.

Monitoring — Track key metrics after every sync and compare them against historical baselines. When a metric changes more than expected, Dango identifies which dimensions drove the change and sends an alert.

Governance — Protect data quality with automatic schema drift detection and PII scanning. Breaking schema changes block dbt to prevent silent failures. PII findings flag columns that may contain personal information.
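To make the Monitoring pillar more concrete, here is a minimal sketch of a baseline comparison with a dimension drill-down. It is not Dango's implementation: the function names, the use of a simple mean as the baseline, and the 20% default threshold are all assumptions chosen for illustration.

```python
import math
from statistics import mean


def percent_change(current: float, baseline: float) -> float:
    """Relative change versus the baseline, as a fraction (0.25 means +25%)."""
    if baseline == 0:
        # Avoid division by zero: any non-zero value is an unbounded change.
        return 0.0 if current == 0 else math.copysign(math.inf, current)
    return (current - baseline) / abs(baseline)


def evaluate_monitor(history: list, current: float, threshold: float = 0.2) -> dict:
    """Compare the latest value with the mean of past values (assumed baseline)."""
    baseline = mean(history)
    change = percent_change(current, baseline)
    return {"baseline": baseline, "change": change, "alert": abs(change) > threshold}


def top_drivers(before: dict, after: dict, n: int = 3) -> list:
    """Rank dimension values by the absolute size of their change."""
    keys = set(before) | set(after)
    deltas = {k: after.get(k, 0) - before.get(k, 0) for k in keys}
    return sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]


# Hypothetical data: daily order counts, then a per-region breakdown.
result = evaluate_monitor(history=[100, 110, 90], current=130)
drivers = top_drivers({"US": 60, "EU": 40}, {"US": 62, "EU": 68})
```

In this sketch the alert fires because the value moved 30% against a baseline of 100, and the drill-down ranks `EU` first because it accounts for most of the increase.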

How It All Fits Together

flowchart LR
    A[Scheduled Sync] --> B[Schema Drift Check]
    B --> C{Breaking?}
    C -->|No| D[dbt Transform]
    C -->|Yes| E[Block dbt + Alert]
    D --> F[PII Scan]
    D --> G[Monitoring Metrics]
    F --> H{PII Found?}
    H -->|Yes| I[Webhook Alert]
    G --> J{Threshold Exceeded?}
    J -->|Yes| I
    J -->|No| K[Record as Normal]

After a sync completes (scheduled or manual), Dango runs a post-sync pipeline:

1. Schema drift detection compares the new schema against the baseline
2. dbt transformation runs (unless blocked by breaking drift)
3. PII scanning checks for personal information in newly synced data
4. Monitoring metrics evaluate configured monitors against historical baselines
5. Webhook notifications fire for any alerts or governance events
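The ordering above can be sketched as a small function. This is an illustration only: `run_post_sync` and `PostSyncResult` are hypothetical names, the inputs are precomputed stand-ins for the real checks, and the choice to skip the PII scan and monitoring when dbt is blocked follows the flowchart rather than any confirmed behavior.

```python
from dataclasses import dataclass, field


@dataclass
class PostSyncResult:
    steps_run: list = field(default_factory=list)
    alerts: list = field(default_factory=list)


def run_post_sync(breaking_drift: bool, pii_columns: list, monitor_alerts: list) -> PostSyncResult:
    """Walk the post-sync steps in order, recording what ran and what alerted."""
    result = PostSyncResult()

    # Step 1: the drift check always runs first.
    result.steps_run.append("schema_drift_check")
    if breaking_drift:
        # Breaking drift blocks dbt; assumed here to skip downstream steps too.
        result.alerts.append("schema_drift: breaking change, dbt blocked")
    else:
        # Steps 2-4: dbt, then the PII scan and monitors on the new data.
        result.steps_run.append("dbt_transform")
        result.steps_run.append("pii_scan")
        result.alerts += [f"pii: {column}" for column in pii_columns]
        result.steps_run.append("monitoring")
        result.alerts += [f"monitor: {name}" for name in monitor_alerts]

    # Step 5: notifications fire only when there is something to report.
    if result.alerts:
        result.steps_run.append("webhook_notifications")
    return result


clean = run_post_sync(breaking_drift=False, pii_columns=[], monitor_alerts=[])
blocked = run_post_sync(breaking_drift=True, pii_columns=["email"], monitor_alerts=[])
```

Here `clean` records the drift check, dbt, the PII scan, and monitoring with no alerts, while `blocked` records only the drift check and a notification.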

Capabilities at a Glance

| Feature | What It Does | Guide |
|---|---|---|
| Scheduled Syncs | Cron-based automation for source syncs and dbt runs | Scheduled Syncs |
| Webhook Notifications | Slack and HTTP alerts for sync events and governance findings | Webhook Notifications |
| Monitoring Metrics | Automated metric tracking with baseline comparison | Monitoring Metrics |
| Configuring Monitors | Define metrics, thresholds, and drill-down dimensions | Configuring Monitors |
| Schema Drift | Detect breaking and additive schema changes | Schema Drift |
| PII Scanning | Find email addresses, phone numbers, and other PII in your data | PII Scanning |
| Data Catalog | Browse models, columns, lineage, and profiling stats | Data Catalog |
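As a rough picture of what the PII Scanning row describes, the sketch below flags string columns whose values look like email addresses or phone numbers. The regular expressions, the `scan_columns` helper, and the 50% match ratio are illustrative assumptions, not Dango's actual detection rules.

```python
import re

# Deliberately simple patterns; real PII detection needs more care.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def scan_columns(rows: list, min_match_ratio: float = 0.5) -> dict:
    """Return {column: [pii_types]} for string columns where enough values match."""
    findings = {}
    columns = list(rows[0].keys()) if rows else []
    for column in columns:
        # Only string values are inspected; other types are skipped.
        values = [row[column] for row in rows if isinstance(row.get(column), str)]
        if not values:
            continue
        for label, pattern in PATTERNS.items():
            hits = sum(1 for value in values if pattern.search(value))
            if hits / len(values) >= min_match_ratio:
                findings.setdefault(column, []).append(label)
    return findings


# Hypothetical sample of synced rows.
rows = [
    {"id": 1, "contact": "ana@example.com", "note": "called twice", "tel": "+1 (555) 010-9999"},
    {"id": 2, "contact": "bo@example.org", "note": "ok", "tel": "555-010-1234"},
]
findings = scan_columns(rows)
```

Using a match ratio rather than a single hit is one way to flag columns that "may contain" personal information without alerting on a stray value.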

Section Guides

Scheduled Syncs

Set up cron schedules to automate source syncs and dbt runs with retry, timeout, and missed-run recovery.

Set up schedules

Webhook Notifications

Get Slack or HTTP alerts when syncs complete, fail, go stale, or when governance events are detected.

Configure webhooks

Monitoring Metrics

Understand how Dango compares metric values against baselines and detects trends.

Learn about metrics

Configuring Monitors

Define what to measure, set thresholds, and configure drill-down analysis in `monitors.yml`.

Configure monitors

Schema Drift

Detect when source schemas change and protect dbt models from breaking silently.

Understand schema drift

PII Scanning

Automatically find personally identifiable information in your synced data.

Scan for PII

Data Catalog

Browse models, columns, lineage graphs, and profiling statistics in one place.

Explore the catalog

Quick Start

Set up a schedule:
```
dango schedule add
```

Add webhook notifications (edit `.dango/schedules.yml`):

notifications:
  webhooks:
    - name: slack_alerts
      url: "https://hooks.slack.com/services/T.../B.../xxx"
      format: slack
  on_failure: true
  on_success: false

Run monitors manually to see results:
```
dango monitor run
```
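To illustrate how the `on_failure` and `on_success` flags in the notification settings might gate alerts, here is a hedged sketch. The helper names, the message wording, and the generic JSON body are assumptions. The only external fact relied on is that Slack incoming webhooks accept a JSON body with a `text` field. Nothing is sent over the network.

```python
import json

# Mirrors the Quick Start YAML as a plain dict (the URL is the doc's placeholder).
notifications = {
    "webhooks": [
        {
            "name": "slack_alerts",
            "url": "https://hooks.slack.com/services/T.../B.../xxx",
            "format": "slack",
        }
    ],
    "on_failure": True,
    "on_success": False,
}


def should_notify(config: dict, succeeded: bool) -> bool:
    """Pick the flag that matches the sync outcome (missing flags assumed False)."""
    key = "on_success" if succeeded else "on_failure"
    return bool(config.get(key, False))


def build_payloads(config: dict, source: str, succeeded: bool) -> list:
    """Return (url, json_body) pairs without performing any HTTP requests."""
    if not should_notify(config, succeeded):
        return []
    status = "succeeded" if succeeded else "failed"
    payloads = []
    for hook in config.get("webhooks", []):
        if hook.get("format") == "slack":
            body = {"text": f"Sync for {source} {status}"}
        else:
            # Hypothetical shape for plain HTTP webhooks.
            body = {"source": source, "status": status}
        payloads.append((hook["url"], json.dumps(body)))
    return payloads


on_fail = build_payloads(notifications, "orders", succeeded=False)
on_ok = build_payloads(notifications, "orders", succeeded=True)
```

With the Quick Start settings, a failed sync yields one Slack payload and a successful sync yields none.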

Cloud vs Local

Locally, schedules run only while `dango start` is running. In cloud deployments, scheduling is always on: schedules run 24/7 via systemd and survive reboots automatically.

What Runs When?

Understanding the order of operations helps you configure schedules and monitors effectively:

| Phase | What Happens | Configurable? |
|---|---|---|
| 1. Source sync | dlt pulls data from configured sources into DuckDB | Schedule type: `sync` or `sync_only` |
| 2. Schema drift check | Compares new schema against saved baseline | Automatic; no config needed |
| 3. dbt transformation | Runs `dbt build` on your models | Schedule type: `sync` (included) or `dbt` (standalone) |
| 4. PII scan | Analyzes string columns for personal information | Automatic; runs after sync |
| 5. Monitoring | Evaluates configured monitors against baselines | Requires `monitors.yml` |
| 6. Notifications | Sends webhooks for any alerts | Requires webhooks in `schedules.yml` |

Phases 2–6 are post-sync hooks

These run automatically after every sync (scheduled or manual). You don't need to configure them separately — they're part of the sync pipeline.
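Phase 2, the schema drift check, can be pictured as a diff between two column maps. The sketch below is an assumption-heavy illustration: it treats removed columns and changed types as breaking and added columns as additive, which is a common convention but is not confirmed here as Dango's exact rule. `classify_drift` is a hypothetical helper.

```python
def classify_drift(baseline: dict, current: dict) -> dict:
    """Diff two {column: type} maps and label the result.

    Assumed rule: removed or retyped columns are breaking,
    newly added columns are additive.
    """
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    retyped = sorted(
        column
        for column in set(baseline) & set(current)
        if baseline[column] != current[column]
    )
    return {
        "added": added,
        "removed": removed,
        "retyped": retyped,
        "breaking": bool(removed or retyped),
    }


# Hypothetical saved baseline and two possible post-sync schemas.
baseline = {"id": "BIGINT", "email": "VARCHAR"}
additive = classify_drift(baseline, {"id": "BIGINT", "email": "VARCHAR", "plan": "VARCHAR"})
breaking = classify_drift(baseline, {"id": "VARCHAR"})
```

Under this rule the first diff would let dbt proceed, while the second would block it.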

Next Steps

Deployment — deploy to the cloud for 24/7 scheduling
CLI Schedule Commands — full CLI reference for schedule management
Local vs Cloud — understand the differences between local and cloud operation