Source & Sync¶

Add, manage, and sync data sources from the command line.

Overview¶

The dango source commands manage your data source configurations, while dango sync loads data from those sources into your DuckDB warehouse.

Commands covered on this page:

dango source add — Add a data source (interactive wizard)
dango source list — List configured sources
dango source remove — Remove a source
dango source edit — Open config in editor
dango sync — Load data from sources

Adding Sources¶

dango source add¶

Launch the interactive source wizard.

dango source add

The wizard supports 27+ sources across 9 categories:

Category	Sources
Marketing & Analytics	Facebook Ads, Google Ads, Google Sheets, Google Analytics, and more
Business & CRM	HubSpot, Salesforce, Zendesk, Jira, and more
E-commerce & Payment	Stripe
Files & Storage	Notion, Email Inbox
Databases	MongoDB
Streaming	Kafka, Kinesis
Development	GitHub
Communication	Slack
Local & Custom	CSV, REST API

Wizard flow:

Select source type from the categorized list
Provide a unique source name
Configure credentials (environment variable or secrets file)
Select resources/tables to sync
Set start date for incremental loading
Test connection (automatic)
Save configuration

Select source type:
  1. CSV files
  2. dlt Native (advanced)
  3. REST API
  4. Google Sheets
  ...

Source name (unique identifier): stripe_payments
API Key stored in environment variable: STRIPE_API_KEY
Start date (YYYY-MM-DD) [2024-01-01]: 2024-06-01

Testing connection...
✓ Connected to Stripe API
✓ Source added to .dango/sources.yml

Tip

After adding a source, run dango sync <source_name> to load data, then dango generate to create staging models.
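The follow-up steps in the tip can be chained in a small shell function. This is a minimal sketch, not part of dango: the helper name load_new_source is hypothetical, and it calls only the commands named above.

```shell
# Hypothetical helper (not part of dango): sync one newly added source,
# then generate staging models. Stops at the first failing step.
load_new_source() {
  name="${1:-}"
  if [ -z "$name" ]; then
    echo "usage: load_new_source SOURCE_NAME" >&2
    return 2
  fi
  dango sync "$name" && dango generate
}
```

For example, load_new_source stripe_payments runs the sync first and generates models only if the sync succeeds.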

Listing Sources¶

dango source list¶

List all configured data sources with their status and last sync time.

dango source list [OPTIONS]

Option	Description
`--enabled-only`	Show only enabled sources

Example:

dango source list

Configured Sources:

  ● stripe_payments (stripe) - Enabled
      Last synced: 2024-12-09 12:34:56
      Tables: charges, customers, subscriptions

  ● orders_csv (csv) - Enabled
      Last synced: 2024-12-08 18:45:12
      File: data/orders.csv

  ○ old_hubspot (hubspot) - Disabled
      Last synced: 2024-12-04 14:22:10

Filter to enabled only:

dango source list --enabled-only

Removing Sources¶

dango source remove¶

Remove a data source from configuration.

dango source remove SOURCE_NAME [OPTIONS]

Parameter	Type	Description
`SOURCE_NAME`	positional, required	Name of source to remove
`-y`, `--yes`	flag	Skip confirmation prompt

dango source remove my_csv
dango source remove my_csv --yes

Warning

This removes the source from .dango/sources.yml. It does not delete data already loaded into DuckDB. Use dango db clean to remove orphaned tables.
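Removing a source and cleaning up its tables can be combined into one step. The sketch below assumes that dango db status and dango db clean behave as described in the Troubleshooting section of this page; the helper name remove_source_and_tables is hypothetical.

```shell
# Hypothetical helper (not part of dango): remove a source from the
# configuration, list orphaned tables, then remove them.
# Whether "dango db clean" asks for its own confirmation is not covered here.
remove_source_and_tables() {
  name="${1:-}"
  if [ -z "$name" ]; then
    echo "usage: remove_source_and_tables SOURCE_NAME" >&2
    return 2
  fi
  dango source remove "$name" --yes || return 1
  dango db status || return 1   # shows which tables are now orphaned
  dango db clean
}
```

Because --yes skips the confirmation prompt for the removal, this is best reserved for sources you are sure you no longer need.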

Editing Sources¶

dango source edit¶

Open .dango/sources.yml in your default editor ($EDITOR).

dango source edit [NAME]

Parameter	Type	Description
`NAME`	positional, optional	Hints at the section to focus on

dango source edit          # Edit full sources.yml
dango source edit chess    # Edit with focus hint

Tip

Set the EDITOR environment variable to your preferred editor: export EDITOR=vim

Syncing Data¶

Web UI alternative

You can also trigger syncs from the Sources page in the Web UI. Navigate to http://localhost:8800/sources and click Sync Now on any source. See Web UI — Sources.

dango sync¶

Load data from all sources (or a specific source) into DuckDB.

dango sync [SOURCE_NAME] [OPTIONS]

Arguments:

Parameter	Type	Description
`SOURCE_NAME`	positional, optional	Sync only this source

Options:

Option	Type	Description
`--source TEXT`	deprecated	Use positional `SOURCE_NAME` instead
`--since TEXT`	date	Start date for incremental loading (YYYY-MM-DD)
`--until TEXT`	date	End date for incremental loading (YYYY-MM-DD)
`--backfill TEXT`	duration	Backfill duration (e.g. `7d`, `2w`, `1m`)
`--limit INTEGER`	number	Limit rows per source (dev testing)
`--full-refresh`	flag	Drop existing data and reload from scratch
`--dry-run`	flag	Show what would be synced without executing
`--allow-schema-changes`	flag	Allow CSV schema changes (add columns, NULL for missing)
`-y`, `--yes`	flag	Skip confirmation prompts

What happens during sync:

Reads .dango/sources.yml for enabled sources
For each source: connects, fetches data (incremental or full), writes to DuckDB raw layer
Updates metadata (_dlt_loads table)
Logs progress and results

Sync All Sources¶

dango sync

Sync Specific Source¶

dango sync stripe_payments

Incremental Loading¶

By default, syncs use incremental loading and fetch only data that is new since the last sync. Override the start date, the end date, or both:

dango sync --since 2024-01-01
dango sync --until 2024-06-30
dango sync --since 2024-01-01 --until 2024-06-30
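A date window is easy to mistype, so it can help to validate both dates before calling dango. The following is a sketch under stated assumptions: is_iso_date and sync_window are hypothetical helper names, and the check covers only the YYYY-MM-DD shape and the ordering of the two dates, not calendar validity.

```shell
# Hypothetical helpers (not part of dango).

# Succeeds if the argument has the shape YYYY-MM-DD.
# This checks digits and hyphens only; 2024-13-45 would still pass.
is_iso_date() {
  case "${1:-}" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Runs "dango sync --since SINCE --until UNTIL" after validating the window.
sync_window() {
  since="${1:-}"
  until_date="${2:-}"
  if ! is_iso_date "$since" || ! is_iso_date "$until_date"; then
    echo "dates must be YYYY-MM-DD" >&2
    return 2
  fi
  # With hyphens removed, zero-padded dates compare correctly as integers.
  since_num="$(printf '%s' "$since" | tr -d '-')"
  until_num="$(printf '%s' "$until_date" | tr -d '-')"
  if [ "$since_num" -gt "$until_num" ]; then
    echo "--since must not be after --until" >&2
    return 2
  fi
  dango sync --since "$since" --until "$until_date"
}
```

For example, sync_window 2024-01-01 2024-06-30 runs the sync, while swapping the two dates returns an error before dango is invoked.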

Backfill¶

Backfill a relative time range:

dango sync --backfill 7d      # Last 7 days
dango sync --backfill 2w      # Last 2 weeks
dango sync --backfill 1m      # Last 1 month
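To see roughly which calendar date a duration reaches back to, the duration can be converted with the date command. This sketch assumes GNU date (the -d option) and assumes that d, w, and m mean days, weeks, and months; dango's own interpretation may differ, and backfill_start is a hypothetical helper name.

```shell
# Hypothetical helper (not part of dango): print the date that a duration
# such as 7d, 2w, or 1m reaches back to, counted from today.
# Assumes GNU date; the unit meanings are an assumption, not dango's spec.
backfill_start() {
  spec="${1:-}"
  count="${spec%?}"         # everything except the last character
  unit="${spec#"$count"}"   # the last character
  case "$count" in
    ''|*[!0-9]*) echo "invalid duration: $spec" >&2; return 2 ;;
  esac
  case "$unit" in
    d) word=days ;;
    w) word=weeks ;;
    m) word=months ;;
    *) echo "invalid unit in: $spec" >&2; return 2 ;;
  esac
  date -d "$count $word ago" +%F
}
```

For example, backfill_start 2w prints the date two weeks before today in YYYY-MM-DD form.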

Full Refresh¶

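Drop existing data and reload from scratch. As with any sync, omitting SOURCE_NAME applies the refresh to all sources:

dango sync --full-refresh                     # All sources
dango sync stripe_payments --full-refresh     # One source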

Warning

Full refresh deletes existing data before reloading. For large sources, this can take significant time and consume API quota.
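Because a full refresh is destructive, one option is to wrap it so that it cannot run without an explicit source name. This is a sketch only; full_refresh_source is a hypothetical helper name.

```shell
# Hypothetical guard (not part of dango): allow a full refresh only when a
# single source is named, so an empty argument never refreshes everything.
full_refresh_source() {
  name="${1:-}"
  if [ -z "$name" ]; then
    echo "refusing full refresh without SOURCE_NAME" >&2
    return 2
  fi
  dango sync "$name" --full-refresh
}
```

The wrapper deliberately does not pass --yes, so any confirmation prompt dango shows stays in place.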

Dev Mode¶

Limit rows per source for quick testing:

dango sync --limit 1000

Dry Run¶

Preview what would be synced without executing:

dango sync --dry-run
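A dry run pairs naturally with a confirmation step. The sketch below shows the plan first and runs the real sync only after an explicit "y"; preview_then_sync is a hypothetical helper name, and the same arguments are passed to both invocations.

```shell
# Hypothetical helper (not part of dango): show the dry-run plan, then run
# the real sync only if the user confirms. Arguments such as a source name
# or --since are forwarded unchanged to both calls.
preview_then_sync() {
  dango sync "$@" --dry-run || return 1
  printf 'Proceed with sync? [y/N] '
  read -r answer || return 1
  case "$answer" in
    y|Y) dango sync "$@" ;;
    *)   echo "aborted"; return 1 ;;
  esac
}
```

For example, preview_then_sync stripe_payments --since 2024-01-01 previews and then syncs the same window.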

CSV Schema Changes¶

Allow schema changes when syncing CSV files (new columns added, missing columns treated as NULL):

dango sync --allow-schema-changes

Common Workflows¶

Initial Setup¶

# Add a source
dango source add

# Sync data
dango sync stripe_payments

# Generate staging models
dango generate --all

# Run transformations
dango run

Daily Operations¶

# Sync all sources and run transformations
dango sync && dango run

# Sync with backfill for missed data
dango sync --backfill 3d && dango run
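For unattended runs (for example from cron), the daily commands can be wrapped with basic logging. This is a sketch under assumptions: log_step and daily_sync are hypothetical names, and --yes is passed on the assumption that no prompt should block an unattended job.

```shell
# Hypothetical wrapper (not part of dango) for unattended daily runs.

# Print a timestamped log line.
log_step() {
  printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
}

# Sync all sources, optionally with a backfill duration such as 3d,
# then run transformations. Stops at the first failing step.
daily_sync() {
  backfill="${1:-}"
  log_step "sync started"
  if [ -n "$backfill" ]; then
    dango sync --yes --backfill "$backfill"
  else
    dango sync --yes
  fi || { log_step "sync failed"; return 1; }
  dango run || { log_step "run failed"; return 1; }
  log_step "done"
}
```

Calling daily_sync with no argument performs a plain incremental sync; daily_sync 3d adds a three-day backfill.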

Schema Changes¶

When a source adds new columns:

# For CSV: allow new columns
dango sync my_csv --allow-schema-changes

# Regenerate staging models to pick up new columns
dango generate --all

Troubleshooting¶

Sync fails with connection error

Check that credentials are correctly set in your environment or in .dlt/secrets.toml. For OAuth sources, run dango oauth check; for API-key sources, verify that the key is exported in the shell that runs dango.
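A variable that is set but not exported is invisible to child processes such as dango, which is a common cause of this error. The sketch below reports such variables; check_exported is a hypothetical helper name, and it covers environment variables only, not values stored in .dlt/secrets.toml.

```shell
# Hypothetical check (not part of dango): report which of the named
# variables are not exported. printenv runs as a child process, so it sees
# exactly the variables that dango would see.
check_exported() {
  missing=0
  for var in "$@"; do
    if ! printenv "$var" >/dev/null; then
      echo "not exported: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_exported STRIPE_API_KEY && dango sync stripe_payments runs the sync only when the key is visible to child processes.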

Sync is slow

Use --limit for development. For production, ensure incremental loading is configured (check start_date in source config). Avoid --full-refresh unless necessary.

Orphaned tables after removing a source

Run dango db status to see orphaned tables, then dango db clean to remove them.

CSV schema changed

Use --allow-schema-changes to handle added or removed columns. New columns get NULL for historical rows; columns missing from the file keep their existing data and are treated as NULL for newly loaded rows.

CLI Reference — Quick reference for all commands
Transform & Model — dbt transformations after syncing
Schedule Commands — Automate sync schedules
OAuth Commands — Set up OAuth for Google/Facebook sources