Scheduling & Monitoring
Automate syncs, monitor data quality, detect schema changes, and scan for PII — all built in.
Overview
Dango's scheduling and monitoring features cover three pillars:
Scheduling: Automate your data pipeline with cron-based schedules. Define when sources sync and when dbt runs, and get notified when a run fails or goes stale. Schedules persist across restarts and recover missed runs automatically.
Monitoring: Track key metrics after every sync and compare them against historical baselines. When a metric changes by more than its configured threshold, Dango identifies which dimensions drove the change and sends an alert.
Governance: Protect data quality with automatic schema drift detection and PII scanning. Breaking schema changes block dbt to prevent silent failures. PII findings flag columns that may contain personal information.
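To make the baseline comparison described under Monitoring more concrete, here is a minimal Python sketch. It is not Dango's implementation: the function names, the use of a simple mean as the baseline, the 20% default threshold, and the "largest absolute delta" drill-down are all assumptions chosen for illustration.

```python
from statistics import mean


def percent_change(current: float, baseline: float) -> float:
    """Relative change versus the baseline, as a fraction (0.25 means +25%)."""
    if baseline == 0:
        return 0.0 if current == 0 else float("inf")
    return (current - baseline) / abs(baseline)


def evaluate_monitor(history: list[float], current: float, threshold: float = 0.2) -> dict:
    """Compare the current value to the mean of past values (assumed baseline)."""
    baseline = mean(history)
    change = percent_change(current, baseline)
    return {"baseline": baseline, "change": change, "alert": abs(change) > threshold}


def top_drivers(baseline_by_dim: dict[str, float],
                current_by_dim: dict[str, float],
                limit: int = 3) -> list[tuple[str, float]]:
    """Rank dimension values by how much they moved in absolute terms."""
    keys = set(baseline_by_dim) | set(current_by_dim)
    deltas = {k: current_by_dim.get(k, 0.0) - baseline_by_dim.get(k, 0.0) for k in keys}
    return sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True)[:limit]


# Hypothetical example: an order count that jumped, driven mostly by one region.
result = evaluate_monitor(history=[100.0, 110.0, 90.0], current=130.0)
drivers = top_drivers(
    {"US": 60.0, "EU": 30.0, "APAC": 10.0},
    {"US": 62.0, "EU": 58.0, "APAC": 10.0},
    limit=2,
)
```

In this hypothetical data the metric is 30% above its baseline, so the alert flag is set, and the drill-down ranks EU first because it accounts for most of the increase.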
How It All Fits Together
flowchart LR
A[Scheduled Sync] --> B[Schema Drift Check]
B --> C{Breaking?}
C -->|No| D[dbt Transform]
C -->|Yes| E[Block dbt + Alert]
D --> F[PII Scan]
D --> G[Monitoring Metrics]
F --> H{PII Found?}
H -->|Yes| I[Webhook Alert]
G --> J{Threshold Exceeded?}
J -->|Yes| I
J -->|No| K[Record as Normal]
After a scheduled sync completes, Dango runs a post-sync pipeline:
- Schema drift detection compares the new schema against the baseline
- dbt transformation runs (unless blocked by breaking drift)
- PII scanning checks for personal information in newly synced data
- Monitoring metrics evaluate configured monitors against historical baselines
- Webhook notifications fire for any alerts or governance events
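The ordering above can be sketched as a small Python function. This is an illustration under stated assumptions, not Dango's code: `classify_drift`, `run_post_sync`, and the callables passed in are hypothetical names, a change is treated as breaking when a column is removed or changes type, and the sketch assumes PII scanning and monitoring still run when dbt is blocked, which this page does not specify.

```python
from typing import Callable

Schema = dict[str, str]  # column name -> type name


def classify_drift(baseline: Schema, current: Schema) -> str:
    """Return 'breaking', 'additive', or 'none' (assumed definitions)."""
    removed = baseline.keys() - current.keys()
    retyped = {c for c in baseline.keys() & current.keys() if baseline[c] != current[c]}
    if removed or retyped:
        return "breaking"
    if current.keys() - baseline.keys():
        return "additive"
    return "none"


def run_post_sync(baseline: Schema,
                  current: Schema,
                  dbt: Callable[[], None],
                  pii_scan: Callable[[], list[str]],
                  monitors: Callable[[], list[str]],
                  notify: Callable[[str], None]) -> list[str]:
    """Run the post-sync steps in order and return the names of steps executed."""
    executed = ["schema_drift"]
    alerts: list[str] = []

    if classify_drift(baseline, current) == "breaking":
        # Breaking drift blocks dbt and raises an alert instead.
        alerts.append("breaking schema drift: dbt blocked")
    else:
        dbt()
        executed.append("dbt")

    alerts += pii_scan()   # each step returns a list of alert messages
    executed.append("pii_scan")
    alerts += monitors()
    executed.append("monitors")

    for message in alerts:  # notifications fire last, once per alert
        notify(message)
    return executed


# Hypothetical run: the 'email' column disappeared from the source.
sent: list[str] = []
executed = run_post_sync(
    baseline={"id": "BIGINT", "email": "VARCHAR"},
    current={"id": "BIGINT"},
    dbt=lambda: None,
    pii_scan=lambda: [],
    monitors=lambda: [],
    notify=sent.append,
)
```

Here the removed column counts as breaking drift, so `executed` omits the dbt step and a single alert is passed to `notify`.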
Capabilities at a Glance
| Feature | What It Does | Guide |
|---|---|---|
| Scheduled Syncs | Cron-based automation for source syncs and dbt runs | Scheduled Syncs |
| Webhook Notifications | Slack and HTTP alerts for sync events and governance findings | Webhook Notifications |
| Monitoring Metrics | Automated metric tracking with baseline comparison | Monitoring Metrics |
| Configuring Monitors | Define metrics, thresholds, and drill-down dimensions | Configuring Monitors |
| Schema Drift | Detect breaking and additive schema changes | Schema Drift |
| PII Scanning | Find email addresses, phone numbers, and other PII in your data | PII Scanning |
| Data Catalog | Browse models, columns, lineage, and profiling stats | Data Catalog |
Section Guides
- Scheduled Syncs: Set up cron schedules to automate source syncs and dbt runs with retry, timeout, and missed-run recovery.
- Webhook Notifications: Get Slack or HTTP alerts when syncs complete, fail, go stale, or when governance events are detected.
- Monitoring Metrics: Understand how Dango compares metric values against baselines and detects trends.
- Configuring Monitors: Define what to measure, set thresholds, and configure drill-down analysis in monitors.yml.
- Schema Drift: Detect when source schemas change and protect dbt models from breaking silently.
- PII Scanning: Automatically find personally identifiable information in your synced data.
- Data Catalog: Browse models, columns, lineage graphs, and profiling statistics in one place.
Quick Start
1. Set up a schedule.
2. Add webhook notifications by editing .dango/schedules.yml.
3. Run monitors manually to see results.
Cloud vs Local
Locally, schedules run only while the dango start process is running. In cloud deployments, scheduling is always on: schedules run 24/7 via systemd and survive reboots automatically.
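Because local schedules can miss runs while the scheduler is not running, it may help to see how missed-run detection could work. The sketch below is an assumption-laden illustration, not Dango's recovery logic: it handles only a fixed daily run time rather than a full cron expression, uses naive datetimes, and the function name `missed_daily_runs` is hypothetical.

```python
from datetime import datetime, time, timedelta


def missed_daily_runs(last_run: datetime, now: datetime, at: time) -> list[datetime]:
    """List the daily scheduled times that fall after last_run and up to now."""
    candidate = datetime.combine(last_run.date(), at)
    if candidate <= last_run:
        # Today's slot already ran (or passed), so start from tomorrow's slot.
        candidate += timedelta(days=1)

    missed = []
    while candidate <= now:
        missed.append(candidate)
        candidate += timedelta(days=1)
    return missed


# Hypothetical example: a 02:00 daily sync, with the machine off for two nights.
missed = missed_daily_runs(
    last_run=datetime(2024, 3, 1, 2, 0),
    now=datetime(2024, 3, 3, 9, 30),
    at=time(2, 0),
)
needs_recovery = bool(missed)
```

In this example two scheduled slots were skipped. A scheduler might reasonably run a single catch-up sync rather than one per missed slot, but that policy is a design choice this sketch does not settle.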
What Runs When?
Understanding the order of operations helps you configure schedules and monitors effectively:
| Phase | What Happens | Configurable? |
|---|---|---|
| 1. Source sync | dlt pulls data from configured sources into DuckDB | Schedule type: sync or sync_only |
| 2. Schema drift check | Compares new schema against saved baseline | Automatic — no config needed |
| 3. dbt transformation | Runs dbt run + dbt test on your models | Schedule type: sync (included) or dbt (standalone) |
| 4. PII scan | Analyzes string columns for personal information | Automatic — runs after sync |
| 5. Monitoring | Evaluates configured monitors against baselines | Requires monitors.yml |
| 6. Notifications | Sends webhooks for any alerts | Requires webhooks in schedules.yml |
Phases 2–6 are post-sync hooks
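These phases run automatically after every sync, scheduled or manual, as part of the sync pipeline, so you don't need to schedule them separately. Monitoring and notifications only take effect once monitors.yml and webhooks are configured, as noted in the table.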
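As an illustration of what phase 4 might involve, the following Python sketch scans string columns for values that look like email addresses or phone numbers. It is a simplified assumption, not Dango's detector: the regular expressions, the `scan_columns` name, and the 50% match ratio are hypothetical, and patterns this loose will produce false positives (for example, some date strings match the phone pattern).

```python
import re

# Deliberately simple patterns; real PII detection needs more care.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def scan_columns(rows: list[dict], min_ratio: float = 0.5) -> dict[str, str]:
    """Flag columns where at least min_ratio of non-empty strings match a pattern."""
    values_by_col: dict[str, list[str]] = {}
    for row in rows:
        for col, val in row.items():
            # Only string values are considered; numbers and None are skipped.
            if isinstance(val, str) and val.strip():
                values_by_col.setdefault(col, []).append(val.strip())

    findings: dict[str, str] = {}
    for col, values in values_by_col.items():
        for label, pattern in PATTERNS.items():
            hits = sum(1 for v in values if pattern.fullmatch(v))
            if hits / len(values) >= min_ratio:
                findings[col] = label
                break
    return findings


# Hypothetical sample of newly synced rows.
rows = [
    {"id": 1, "contact": "ana@example.com", "mobile": "+1 415-555-0100", "note": "ok"},
    {"id": 2, "contact": "bo@example.org", "mobile": "020 7946 0958", "note": "late"},
]
findings = scan_columns(rows)
```

With this sample, `contact` is flagged as email and `mobile` as phone, while `id` and `note` are left alone. A finding here means only that a column may contain personal information and deserves review.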
Next Steps
- Deployment — deploy to the cloud for 24/7 scheduling
- CLI Schedule Commands — full CLI reference for schedule management
- Local vs Cloud — understand the differences between local and cloud operation