Sync Modes¶
Understand how Dango loads data from your sources — incremental by default, with full refresh and date range options when you need them.
Overview¶
Dango supports three sync modes that control how data is fetched from sources and loaded into DuckDB:
| Mode | Command | Behavior | Use Case |
|---|---|---|---|
| Incremental | dango sync | Load only new/changed data since last sync | Daily operations (default) |
| Full Refresh | dango sync --full-refresh | Drop existing data, reload everything | Schema changes, data corruption |
| Date Range | dango sync --since 2026-01-01 | Load data within a specific time window | Backfills, gap filling |
Incremental Sync¶
Incremental sync is the default mode. It loads only data that has changed since the last successful sync, keeping sync times short and API usage low.
How It Works¶
dlt tracks sync state automatically — each pipeline remembers its last cursor position (e.g., the most recent updated_at timestamp or page offset). On the next sync, dlt resumes from where it left off.
# First sync: loads all historical data
dango sync stripe_prod
# Subsequent syncs: loads only new/changed records
dango sync stripe_prod
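The cursor idea can be illustrated with a minimal Python sketch. This is not dlt's API or Dango's implementation; `load_incremental`, the `state` dict, and the `updated_at` records are hypothetical names chosen for the example.

```python
from typing import Iterable


def load_incremental(records: Iterable[dict], state: dict) -> list[dict]:
    """Return only records newer than the saved cursor, then advance the cursor."""
    last_cursor = state.get("last_updated_at")  # None on the first sync
    fresh = [
        r for r in records
        if last_cursor is None or r["updated_at"] > last_cursor
    ]
    if fresh:
        # Remember the newest timestamp seen so the next sync resumes from it.
        state["last_updated_at"] = max(r["updated_at"] for r in fresh)
    return fresh


# Hypothetical source rows; ISO-8601 strings of equal length sort chronologically.
rows = [
    {"id": 1, "updated_at": "2026-01-01T00:00:00"},
    {"id": 2, "updated_at": "2026-01-02T00:00:00"},
]
state: dict = {}

first = load_incremental(rows, state)    # first sync: all historical rows
rows.append({"id": 3, "updated_at": "2026-01-03T00:00:00"})
second = load_incremental(rows, state)   # later sync: only the new row
```

The second call returns only the record added after the first sync, because the saved cursor filters out everything already loaded.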
Lookback Window¶
Some sources support a lookback window that re-fetches recent data to catch late-arriving records. For example, Google Ads attribution data can update for up to 90 days after the initial click (depending on your attribution model and conversion action settings).
When a source has lookback_days configured in the registry, incremental syncs automatically extend the fetch window back by that many days. Dango applies the registry value transparently, so no additional configuration is needed on your side.
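As a rough sketch of what extending the fetch window means, the start of the window can be computed by subtracting the lookback from the last cursor. The function name `fetch_window_start` is hypothetical, and the sketch assumes the cursor is a calendar date; the actual behavior depends on the source.

```python
from datetime import date, timedelta


def fetch_window_start(last_cursor: date, lookback_days: int = 0) -> date:
    """Move the start of the fetch window back by the lookback period."""
    return last_cursor - timedelta(days=lookback_days)


# With a 90-day lookback, a cursor at 2026-03-31 re-fetches from 2025-12-31.
start = fetch_window_start(date(2026, 3, 31), lookback_days=90)

# Without a lookback, the sync resumes exactly at the cursor.
no_lookback = fetch_window_start(date(2026, 3, 31))
```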
Local File Incremental¶
For local_files sources, incremental sync uses file metadata tracking instead of API cursors:
- Dango maintains a _dango_file_metadata table tracking every loaded file
- On each sync, files are classified as new, updated, unchanged, or deleted
- Only new and updated files are loaded
- Deleted files are soft-deleted (marked with _dango_deleted = true)
See Local Files for details.
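The four-way classification can be sketched as a comparison between what was tracked at the last sync and what is on disk now. This is an illustration only: `classify_files` is a hypothetical name, and comparing a per-file fingerprint (for example a content hash or modification time) is an assumption, since the text above does not say how changes are detected.

```python
def classify_files(tracked: dict[str, str], on_disk: dict[str, str]) -> dict[str, list[str]]:
    """Compare tracked fingerprints (path -> fingerprint) with the current ones."""
    result: dict[str, list[str]] = {
        "new": [], "updated": [], "unchanged": [], "deleted": [],
    }
    for path, fingerprint in on_disk.items():
        if path not in tracked:
            result["new"].append(path)          # never loaded before
        elif tracked[path] != fingerprint:
            result["updated"].append(path)      # content changed since last load
        else:
            result["unchanged"].append(path)    # skip on this sync
    # Files that were tracked but no longer exist would be soft-deleted.
    result["deleted"] = [p for p in tracked if p not in on_disk]
    return result


# Hypothetical metadata from the previous sync and the current directory scan.
tracked = {"a.csv": "h1", "b.csv": "h2", "c.csv": "h3"}
on_disk = {"a.csv": "h1", "b.csv": "h2-changed", "d.csv": "h4"}

plan = classify_files(tracked, on_disk)
to_load = plan["new"] + plan["updated"]   # only these files are loaded
```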
Full Refresh¶
Full refresh drops all existing data for a source and reloads everything from scratch. Use this when incremental state becomes invalid, for example after a schema change or data corruption.
# Full refresh with confirmation prompt
dango sync sales_data --full-refresh
# Skip confirmation
dango sync sales_data --full-refresh --yes
Safety Guard Rails¶
Full refresh includes several protections:
- Confirmation prompt: requires explicit confirmation before proceeding (bypass with --yes)
- State backup: dlt pipeline state is backed up before the refresh starts
- Row count anomaly detection: if the refreshed data has significantly fewer rows than before, Dango warns you and keeps the state backup for recovery
- Automatic restore on failure: if the sync fails mid-refresh, the backed-up state is restored so subsequent incremental syncs work correctly
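The backup, anomaly check, and restore steps can be sketched as follows. This is an illustration under stated assumptions, not Dango's code: `full_refresh`, `is_row_count_anomaly`, and the in-memory `state` dict are hypothetical, and the 50% drop threshold is invented for the example because the threshold for "significantly fewer rows" is not specified here.

```python
import copy
from typing import Callable


def is_row_count_anomaly(rows_before: int, rows_after: int,
                         max_drop_ratio: float = 0.5) -> bool:
    """Flag a refresh that lost more than max_drop_ratio of its rows (assumed threshold)."""
    if rows_before <= 0:
        return False  # nothing to compare against
    return (rows_before - rows_after) / rows_before > max_drop_ratio


def full_refresh(state: dict, rows_before: int,
                 reload: Callable[[dict], int]) -> dict:
    backup = copy.deepcopy(state)       # back up state before the refresh starts
    try:
        state.clear()                   # discard incremental state
        rows_after = reload(state)      # reload everything; returns the new row count
    except Exception:
        state.clear()
        state.update(backup)            # restore the backup if the refresh fails
        raise
    anomaly = is_row_count_anomaly(rows_before, rows_after)
    return {"rows": rows_after, "anomaly": anomaly, "keep_backup": anomaly}


def reload_ok(state: dict) -> int:
    state["cursor"] = "2026-02-01"
    return 400


def reload_fails(state: dict) -> int:
    raise RuntimeError("source unavailable")


good_state = {"cursor": "2026-01-01"}
report = full_refresh(good_state, rows_before=1000, reload=reload_ok)

bad_state = {"cursor": "2026-01-01"}
try:
    full_refresh(bad_state, rows_before=1000, reload=reload_fails)
except RuntimeError:
    pass  # bad_state now holds the restored backup
```

In the first run the row count falls from 1000 to 400, so the sketch flags an anomaly and keeps the backup. In the second run the reload raises, and the original cursor is restored.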
Full refresh reloads all data
For large sources (e.g., Stripe with years of transaction history), a full refresh may take significantly longer than an incremental sync and consume more API quota. Use --dry-run first to see what would happen.
Write Disposition¶
Some dlt sources always use replace write disposition internally — meaning every sync is effectively a full refresh regardless of the --full-refresh flag. Dango detects this automatically and notes it in the sync output.
Sources known to use replace mode include Stripe, Jira, Asana, Airtable, Notion, and GitHub. For these sources, incremental behavior comes from dlt's state tracking (fetching only new pages), not from the write disposition.
Date Range Sync¶
Date range sync lets you load data for a specific time period. This is useful for backfills, gap filling, and testing.
Flags¶
| Flag | Format | Description |
|---|---|---|
| --since | YYYY-MM-DD | Start date (inclusive) |
| --until | YYYY-MM-DD | End date (inclusive) |
| --backfill | Nd, Nw, Nm | Relative duration from today |
# Load data from a specific date forward
dango sync ga4_data --since 2026-01-01
# Load a specific date range
dango sync ga4_data --since 2026-01-01 --until 2026-03-31
# Backfill last 30 days
dango sync ga4_data --backfill 30d
# Backfill last 2 weeks
dango sync ga4_data --backfill 2w
Backfill Durations¶
The --backfill flag accepts these suffixes:
| Suffix | Meaning | Example |
|---|---|---|
| d | Days | 30d = last 30 days |
| w | Weeks | 2w = last 14 days |
| m | Months | 1m = last 30 days |
Mutual exclusivity
--backfill cannot be combined with --since or --until. Use one approach or the other.
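The suffix table and the mutual-exclusivity rule can be expressed in a short sketch. The helper names `backfill_to_days` and `resolve_window` are hypothetical, and the assumption that a backfill window ends today follows from the "relative duration from today" wording rather than from Dango's source.

```python
import re
from datetime import date, timedelta
from typing import Optional

# Days per suffix, following the table above (1m is treated as 30 days).
DAYS_PER_UNIT = {"d": 1, "w": 7, "m": 30}


def backfill_to_days(value: str) -> int:
    """Convert a duration such as '30d', '2w', or '1m' into a number of days."""
    match = re.fullmatch(r"(\d+)([dwm])", value)
    if match is None:
        raise ValueError(f"invalid backfill duration: {value!r}")
    return int(match.group(1)) * DAYS_PER_UNIT[match.group(2)]


def resolve_window(today: date,
                   since: Optional[date] = None,
                   until: Optional[date] = None,
                   backfill: Optional[str] = None) -> tuple[Optional[date], Optional[date]]:
    """Return the (start, end) dates implied by the flags."""
    if backfill is not None and (since is not None or until is not None):
        raise ValueError("--backfill cannot be combined with --since or --until")
    if backfill is not None:
        return today - timedelta(days=backfill_to_days(backfill)), today
    return since, until


today = date(2026, 3, 31)
two_weeks = resolve_window(today, backfill="2w")
explicit = resolve_window(today, since=date(2026, 1, 1), until=date(2026, 3, 31))
```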
Gap Fill Detection¶
When you provide a --since date earlier than the earliest data in your warehouse, Dango logs a notice indicating a gap fill operation. This helps you track when historical data is being loaded.
CLI Flags Reference¶
All flags available on dango sync:
| Flag | Type | Description |
|---|---|---|
| SOURCE_NAME | Positional | Sync a specific source (e.g., dango sync stripe_prod) |
| --full-refresh | Flag | Drop existing data and reload from scratch |
| --since YYYY-MM-DD | Option | Start date for date range sync |
| --until YYYY-MM-DD | Option | End date for date range sync |
| --backfill Nd\|Nw\|Nm | Option | Relative backfill duration |
| --dry-run | Flag | Preview what would be synced without executing |
| --allow-schema-changes | Flag | Allow schema evolution for file sources (add columns, treat missing as NULL) |
| --limit N | Option | Limit rows per source (for development/testing) |
| --yes / -y | Flag | Skip confirmation prompts |
Dry Run¶
Use --dry-run to preview a sync without making changes:
dango sync --dry-run
This shows:
- Which sources would be synced
- Sync options (full refresh, date range, row limit)
- Any warnings (disabled sources, missing credentials)
No data is fetched or written.
Single-Writer Lock¶
DuckDB supports only one writer process at a time. Dango enforces this with a lock file at .dango/state/dbt.lock.
When a sync is running:
- Other dango sync commands wait up to 5 minutes for the lock (queued)
- The lock is released when the sync completes (success or failure)
- If the process crashes, the stale lock is detected and cleaned up on the next sync
$ dango sync orders
⏳ Another sync is running. Waiting for lock (up to 5 minutes)...
✓ Lock acquired. Starting sync.
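The wait-then-acquire behavior with stale lock cleanup can be sketched in Python. This is an illustration, not Dango's implementation: storing the holder's PID in the lock file, the `acquire_lock` and `release_lock` names, and the temporary lock path are all assumptions, and the liveness check relies on POSIX signal semantics.

```python
import os
import tempfile
import time


def pid_alive(pid: int) -> bool:
    """Check whether a process exists (POSIX: signal 0 sends nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def acquire_lock(path: str, timeout: float = 300.0, poll: float = 1.0) -> bool:
    """Try to create the lock file, waiting up to `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation atomic: it fails if the file already exists.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            try:
                with open(path) as f:
                    holder = int(f.read().strip() or 0)
            except (FileNotFoundError, ValueError):
                holder = 0  # lock vanished or is still being written
            if holder and not pid_alive(holder):
                try:
                    os.remove(path)  # stale lock left by a crashed process
                except FileNotFoundError:
                    pass
                continue
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)
        else:
            with os.fdopen(fd, "w") as f:
                f.write(str(os.getpid()))
            return True


def release_lock(path: str) -> None:
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


# Hypothetical lock path in a temporary directory.
lock_path = os.path.join(tempfile.mkdtemp(), "sync.lock")
got_first = acquire_lock(lock_path, timeout=0)
got_second = acquire_lock(lock_path, timeout=0)  # still held by this live process
release_lock(lock_path)
```

The default timeout of 300 seconds mirrors the 5-minute wait described above; the demo passes `timeout=0` so the second attempt gives up immediately instead of waiting.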
Parallel source syncs
Sources within a single dango sync run are synced sequentially (one at a time) due to the single-writer constraint. To sync faster, sync individual sources in separate terminal sessions — they'll queue automatically.
Troubleshooting¶
"Lock file exists" or sync appears stuck¶
Another sync process holds the lock. Wait for it to complete, or check for a stale lock:
# Check if a sync is actually running
ps aux | grep "dango sync"
# If no sync is running, the lock is stale — the next sync will clean it up
dango sync my_source
Full refresh loaded fewer rows¶
Dango detected a potential data loss anomaly. The state backup is preserved at ~/.dlt/pipelines/<pipeline_name>_backup_<timestamp>/. If the refresh was intentional (e.g., the source API now has less data), no action is needed. If unexpected, investigate the source.
Date range flags ignored¶
Not all sources support date range filtering. If a source doesn't accept start_date/end_date parameters, Dango warns you and proceeds with a normal incremental sync.
Sync takes much longer than expected¶
- Check if the source uses replace write disposition (every sync reloads all data)
- Use --limit N during development to cap row counts
- For API sources, check rate limits in the provider's dashboard
Related Pages¶
- Adding Sources — set up a new source and run your first sync
- Deduplication — how duplicate records are handled across strategies
- Local Files — file-specific sync behavior and metadata tracking
- DuckDB & Single-Writer — why only one write process is allowed