Data Sources¶

Connect to APIs, databases, and local files through Dango's unified data ingestion layer.

Overview¶

Dango supports 33 data sources through dlt (data load tool). Whether you're working with local CSV files, cloud APIs, or existing databases, Dango provides a unified configuration interface.

Wizard vs Manual Sources

Wizard-enabled sources (25 sources): Add them with the interactive dango source add wizard, which handles authentication, configuration, and validation automatically.

Manual sources: Configure directly in sources.yml using dlt_native for any dlt verified source.

See the Source Catalog for the complete list of all 33 sources.

Source categories at a glance:

Local Files — CSV, JSON, JSONL, Parquet from your filesystem
OAuth Sources — Google Sheets, GA4, Google Ads, Facebook Ads (browser-based auth)
API Key Sources — Stripe, HubSpot, Salesforce, GitHub, Slack, and more
Database Sources — PostgreSQL, MongoDB, and others via dlt
REST API — Connect to any API with JSON responses
Custom Sources — Build integrations with Python and dlt

For dlt Users¶

If you're already familiar with dlt (data load tool), here's how Dango relates:

Dango wraps dlt with:

YAML configuration instead of Python scripts
Automatic dbt staging model generation
Unified CLI (dango sync) for all sources
Web UI for monitoring and management

What stays the same:

Credentials in .dlt/secrets.toml (same format)
All dlt verified sources available via dlt_native
Standard dlt decorators (@dlt.source, @dlt.resource)

When to use what:

Scenario	Use
Standard sources (Stripe, Google Sheets, etc.)	Dango wizard or YAML config
Custom API with simple logic	Dango `dlt_native` + Python file
Complex pipelines, custom destinations	Pure dlt (Dango not needed)

Learn more:

Custom Sources — "dlt vs. Dango Workflow" comparison
Database Sources — "How This Differs from Standard dlt" table
dlt Documentation — Official dlt docs for advanced topics

Quick Start¶

Add Your First Source¶

Choose your source type and follow the guide:

Local Files | OAuth (Google Sheets) | Database | Custom API

# Recommended: Use the wizard
dango source add
# Select "File Import (CSV, JSON, Parquet)" and follow prompts

Or configure manually in .dango/sources.yml:

sources:
  - name: sales_data
    type: local_files
    enabled: true
    local_files:
      directory: data/uploads/sales_data
      file_pattern: "*.csv"

Then copy files and sync:

cp my_sales.csv data/uploads/sales_data/
dango sync sales_data

Learn more →

# Interactive setup
dango source add
# Select "Google Sheets" from the list
# Follow OAuth flow in browser

# Sync
dango sync my_sheets

Learn more →

# Configure .dlt/secrets.toml
[sources.sql_database]
credentials = "postgresql://user:pass@host:5432/db"

# Edit .dango/sources.yml
sources:
  - name: my_postgres
    type: dlt_native
    dlt_native:
      source_module: sql_database
      source_function: sql_database
      function_kwargs:
        schema: "public"

# Sync
dango sync my_postgres

Learn more →

# custom_sources/my_api.py
import dlt
import requests

@dlt.source
def my_api():
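    @dlt.resource(name="data")
    def get_data():
        # Fetch the endpoint and fail the sync on HTTP errors
        response = requests.get("https://api.example.com/data", timeout=30)
        response.raise_for_status()
        yield response.json()
    return [get_data()]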

# .dango/sources.yml
sources:
  - name: my_api
    type: dlt_native
    dlt_native:
      source_module: my_api
      source_function: my_api

Learn more →

Source Type Guides¶

Local Files

Load CSV, JSON, JSONL, and Parquet files with automatic schema detection and incremental sync.
- 5 supported formats
- File change tracking
- Schema evolution support

Local Files Guide

OAuth Sources

Connect to cloud services using OAuth 2.0 authentication.
- Google Sheets, GA4, Google Ads, Facebook Ads
- Automatic token management
- Browser-based authentication

OAuth Sources Guide

Database Sources

Connect to PostgreSQL, MySQL, SQL Server via dlt.
- Full table or incremental loading
- SSL/TLS support

Database Sources Guide

Custom Sources

Build custom integrations using Python and dlt.
- REST APIs
- Web scraping
- Custom data formats

Custom Sources Guide

Source Catalog

Complete catalog of all 33 supported data sources.
- Source types and auth methods
- Configuration examples
- Sync behavior details

Source Catalog

Adding Sources

Step-by-step wizard walkthrough and manual YAML configuration.

Adding Sources

Sync Modes

Incremental loading, full refresh, and date range syncs.

Sync Modes

Deduplication

Four strategies for handling duplicate records in your data.

Deduplication

Common Workflows¶

Adding a New Source¶

1. Choose a source type based on your data
2. Run the wizard with dango source add (or edit sources.yml manually)
3. Configure credentials (OAuth flow, API key in .env, or connection string)
4. Sync with dango sync <name>
5. Verify in Metabase or with dango db query

Managing Multiple Sources¶

# .dango/sources.yml
version: '1.0'
sources:
  # Production Stripe data
  - name: stripe_prod
    type: stripe
    enabled: true
    stripe:
      stripe_secret_key_env: STRIPE_PROD_API_KEY

  # Google Sheets for manual data
  - name: manual_overrides
    type: google_sheets
    enabled: true

  # PostgreSQL analytics database
  - name: analytics_db
    type: dlt_native
    enabled: true
    dlt_native:
      source_module: sql_database
      source_function: sql_database

  # Local CSV exports
  - name: finance_reports
    type: local_files
    enabled: true
    local_files:
      directory: data/uploads/finance_reports
      file_pattern: "*.csv"

Sync All Sources¶

# Sync all enabled sources
dango sync

# Sync specific source
dango sync stripe_prod

# List all sources
dango source list
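
For scheduled jobs, the commands above can be driven from a script. The following is a minimal sketch, assuming dango is on the PATH and that it exits with a non-zero code when a sync fails (that exit-code behavior is an assumption, not something stated on this page). The helper names build_sync_command and sync_each are hypothetical.

```python
import subprocess


def build_sync_command(source_name=None):
    """Return the argv list for `dango sync`, optionally for one source."""
    command = ["dango", "sync"]
    if source_name:
        command.append(source_name)
    return command


def sync_each(source_names, runner=subprocess.run):
    """Sync sources one at a time and return the names that failed.

    Assumption: the CLI signals failure with a non-zero exit code.
    """
    failed = []
    for name in source_names:
        result = runner(build_sync_command(name))
        if result.returncode != 0:
            failed.append(name)
    return failed


# Example call (runs the real CLI, so it is left commented out here):
# failed = sync_each(["stripe_prod", "finance_reports"])
```

Syncing one source at a time keeps a single failing source from hiding the status of the others. The runner parameter exists only so the loop can be exercised without invoking the CLI.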

Data Flow¶

Understanding how data flows from sources to your warehouse:

graph LR
    A[Data Source] --> B[dlt]
    B --> C[Raw Layer]
    C --> D[DuckDB]
    D --> E[dbt Staging]
    E --> F[dbt Marts]
    F --> G[Metabase]

    style A fill:#e1f5ff
    style B fill:#fff3e0
    style C fill:#f3e5f5
    style D fill:#e8f5e9
    style E fill:#fff9c4
    style F fill:#ffebee
    style G fill:#e0f2f1

Source — External API, database, or file
dlt — Fetches and normalizes data
Raw Layer — Source data as-loaded in DuckDB
Staging — Clean starting point (auto-generated by Dango)
Marts — Business logic (custom SQL models you write)
Metabase — Dashboards and queries

Learn more about data layers →

Source Configuration¶

sources.yml Structure¶

version: '1.0'
sources:
  - name: unique_source_name      # Identifier (lowercase, underscores)
    type: local_files              # Source type
    enabled: true                  # Toggle sync
    description: "Optional description"
    local_files:                   # Type-specific config
      directory: data/uploads/unique_source_name
      file_pattern: "*.csv"

Common Parameters¶

Parameter	Required	Description
`name`	Yes	Unique identifier for this source
`type`	Yes	Source type from the catalog
`enabled`	No	Whether to include in sync (default: `true`)
`description`	No	Human-readable description
`deduplication`	No	Strategy: `none`, `latest_only`, `append_only`, `scd_type2`
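
The rules in this table can be checked before running a sync. Below is a minimal sketch that works on one already-parsed sources entry (a plain dict). It is not Dango's own validation: check_source is a hypothetical helper, the name pattern is one reading of "lowercase, underscores", and it assumes deduplication is given as a plain string.

```python
import re

# Strategies listed in the parameter table above.
ALLOWED_DEDUPLICATION = {"none", "latest_only", "append_only", "scd_type2"}

# Assumed reading of "lowercase, underscores": letters, digits, underscores.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def check_source(entry):
    """Return a list of problems found in one parsed `sources` entry."""
    problems = []
    for key in ("name", "type"):
        if not entry.get(key):
            problems.append(f"missing required parameter: {key}")
    name = entry.get("name")
    if isinstance(name, str) and not NAME_PATTERN.match(name):
        problems.append(f"name {name!r} should be lowercase with underscores")
    if not isinstance(entry.get("enabled", True), bool):
        problems.append("enabled must be true or false")
    strategy = entry.get("deduplication")
    if strategy is not None and strategy not in ALLOWED_DEDUPLICATION:
        problems.append(f"unknown deduplication strategy: {strategy}")
    return problems
```

An empty list means the entry passed these checks. The dango validate command remains the authoritative check.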

Credentials Management¶

Never commit credentials! Use one of these methods:

Recommended: .env file (persists across sessions)

# Create or edit .env file (gitignored by default)
echo 'MY_API_KEY=your-key-here' >> .env

Or .dlt/secrets.toml (gitignored credential storage)

[sources.stripe]
api_key = "sk_live_..."

Or environment variables (current session only)

export MY_API_KEY="your-key-here"
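
To illustrate how a variable name such as MY_API_KEY could be resolved at run time, here is a minimal sketch that checks the process environment first and then a .env file. The lookup order is an assumption rather than documented Dango behavior, parse_env_text and resolve_secret are hypothetical helpers, and the parser handles only simple KEY=value lines.

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_secret(name, env_path=".env", environ=None):
    """Look up `name` in the environment, then in the .env file."""
    environ = os.environ if environ is None else environ
    if name in environ:
        return environ[name]
    path = Path(env_path)
    if path.is_file():
        file_values = parse_env_text(path.read_text(encoding="utf-8"))
        if name in file_values:
            return file_values[name]
    raise KeyError(f"{name} is not set in the environment or {env_path}")
```

Raising an error when the name is missing, instead of returning an empty string, makes a misconfigured key fail early rather than as an authentication error later.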

Testing Status¶

Source Type	Status	Notes
Local Files	Production-ready	CSV, JSON, JSONL, Parquet
Stripe	Production-ready	All resources supported
Google Sheets	Production-ready	OAuth flow verified
Google Analytics 4	Production-ready	OAuth flow verified
Facebook Ads	Production-ready	OAuth flow verified
Google Ads	Production-ready	OAuth flow verified
HubSpot	Production-ready	Contacts, companies, deals, tickets
GitHub	Production-ready	Issues, PRs, commits
Salesforce	Tested	Service account auth
Slack	Tested	Channels, messages, users
PostgreSQL	Tested	Full table and incremental
MongoDB	Tested	Collections with filtering
REST API	Tested	Generic API connector
dlt_native	Tested	Registry bypass for any dlt source
Coming Soon sources	Pending	Shopify, Matomo, Jira, Asana, Strapi, Personio

Best Practices¶

1. Use Descriptive Names¶

# Good
- name: stripe_production_payments
- name: marketing_facebook_ads
- name: finance_google_sheets

# Avoid
- name: source1
- name: data

2. Enable Only What You Need¶

Disable unused sources to speed up sync:

- name: old_source
  enabled: false  # Keeps config but skips sync
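
The effect of the flag can be sketched in a few lines of Python. The config is written as a plain dict here to avoid a YAML dependency, and enabled_source_names is a hypothetical helper, not part of Dango.

```python
def enabled_source_names(config):
    """Return the names of sources that a full sync would include.

    `enabled` defaults to true when omitted, as listed under Common Parameters.
    """
    return [
        source["name"]
        for source in config.get("sources", [])
        if source.get("enabled", True)
    ]


config = {
    "version": "1.0",
    "sources": [
        {"name": "stripe_prod", "type": "stripe", "enabled": True},
        {"name": "old_source", "type": "local_files", "enabled": False},
        {"name": "finance_reports", "type": "local_files"},
    ],
}
print(enabled_source_names(config))  # ['stripe_prod', 'finance_reports']
```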

3. Document Your Sources¶

- name: crm_export
  type: local_files
  description: "Weekly CRM export from sales team, updated every Monday"
  local_files:
    directory: data/uploads/crm_export
    file_pattern: "*.csv"

4. Use Incremental When Possible¶

Incremental sync is the default — it loads only new and changed data. See Sync Modes for details on when to use full refresh instead.
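
To make the difference concrete, here is a minimal sketch of cursor-based incremental loading in plain Python. It illustrates the general technique only and is not Dango's or dlt's implementation; the updated_at cursor field and the incremental_batch helper are hypothetical.

```python
def incremental_batch(rows, last_cursor, cursor_field="updated_at"):
    """Return rows newer than `last_cursor` and the advanced cursor.

    With last_cursor=None every row is returned, like a full refresh.
    """
    if last_cursor is None:
        new_rows = list(rows)
    else:
        new_rows = [row for row in rows if row[cursor_field] > last_cursor]
    cursor_values = [row[cursor_field] for row in new_rows]
    # Keep the old cursor when nothing new arrived.
    next_cursor = max(cursor_values, default=last_cursor)
    return new_rows, next_cursor


rows = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-05"},
    {"id": 3, "updated_at": "2024-01-09"},
]
first_load, cursor = incremental_batch(rows, None)            # all three rows
second_load, cursor = incremental_batch(rows, "2024-01-05")   # only id 3
```

ISO-formatted date strings compare correctly as text, which is why the sketch can use plain string comparison for the cursor.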

5. Monitor Source Health¶

# Validate all sources
dango validate

# List configured sources
dango source list

Troubleshooting¶

Source Not Syncing¶

Check enabled: true in sources.yml
Verify credentials in .env or .dlt/secrets.toml
Run dango validate to see errors
Check network connectivity

Authentication Failures¶

API keys: Verify the key has not expired and has the required permissions
OAuth: Re-authenticate with dango oauth refresh <source_type>
Database: Test connection outside Dango
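
One way to test a database connection outside Dango is a plain TCP reachability check against the host in the connection string. The sketch below uses only the Python standard library; host_and_port and can_connect are hypothetical helpers, and the default port of 5432 assumes PostgreSQL. It checks network reachability only, not credentials.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(connection_string, default_port=5432):
    """Extract host and port from a URL-style connection string."""
    parts = urlsplit(connection_string)
    if not parts.hostname:
        raise ValueError("connection string has no host")
    return parts.hostname, parts.port or default_port


def can_connect(connection_string, timeout=5.0):
    """Return True if a TCP connection to the database host succeeds."""
    host, port = host_and_port(connection_string)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (opens a real network connection, so it is commented out):
# can_connect("postgresql://user:pass@host:5432/db")
```

If this check passes but the sync still fails, the problem is more likely the username, password, or database permissions than the network.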

Schema Mismatches¶

When APIs change:

1. Run dango sync (schema auto-updates for API sources, staging models regenerated)
2. Update custom dbt models if needed

File schema changes

For local file sources, schema is fixed on first load. Use --allow-schema-changes to add new columns, or --full-refresh to reload with a new schema. See Local Files.
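
The choice between the two flags can be sketched as a comparison of column names. This is an illustration under stated assumptions, not Dango's logic: compare_columns is a hypothetical helper, and treating removed or renamed columns as a case for --full-refresh is an assumption drawn from the description above.

```python
def compare_columns(stored_columns, incoming_columns):
    """Classify how an incoming file header differs from the stored schema."""
    stored, incoming = set(stored_columns), set(incoming_columns)
    added = sorted(incoming - stored)
    removed = sorted(stored - incoming)
    if not added and not removed:
        action = "sync as usual"
    elif added and not removed:
        # Only new columns: the additive flag is enough.
        action = "--allow-schema-changes"
    else:
        # Assumption: removed or renamed columns call for a reload.
        action = "--full-refresh"
    return {"added": added, "removed": removed, "action": action}
```

For example, a stored schema of id and amount compared with a new file containing id, amount, and currency reports currency as added and suggests --allow-schema-changes.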

Performance Issues¶

Use incremental loading for large datasets (default behavior)
Sync sources individually rather than all at once
Check API rate limits
Use --limit N during development to cap row counts

Next Steps¶

Adding Sources

Step-by-step wizard walkthrough for your first source.

Adding Sources

Source Catalog

Explore all 33 supported data source types.

Source Catalog

Transformations

Transform your loaded data with dbt.

Transformations