
Source Registry

Complete reference for all data source types supported by Dango.


Overview

Dango supports 33 data sources across 10 categories, using 5 authentication types. Wizard-enabled sources can be added interactively with dango source add; any source can also be configured manually as YAML in .dango/sources.yml.

  • 25 wizard-enabled sources can be configured interactively
  • 8 wizard-disabled sources require manual YAML configuration or the dlt_native bypass
  • Sources without a dedicated Pydantic config model use generic_config: dict in YAML

Source Summary

Source Type Key Category Auth Wizard Incremental
File Import (CSV, JSON, Parquet) local_files Local & Custom None ✅ ✅
REST API (Generic) rest_api Local & Custom API Key ✅ ✅
dlt Native Source (Advanced) dlt_native Local & Custom None ✅ --
CSV Files csv Local & Custom None -- ✅
Files & Cloud Storage filesystem Local & Custom None -- --
Google Sheets google_sheets Marketing & Analytics OAuth ✅ --
Facebook Ads facebook_ads Marketing & Analytics OAuth ✅ ✅
Google Analytics (GA4) google_analytics Marketing & Analytics OAuth ✅ ✅
Google Ads google_ads Marketing & Analytics OAuth ✅ --
Mux mux Marketing & Analytics API Key ✅ --
Airtable airtable Marketing & Analytics API Key ✅ --
Matomo Analytics matomo Marketing & Analytics API Key -- ✅
HubSpot hubspot Business & CRM API Key ✅ ✅
Salesforce salesforce Business & CRM Service Account ✅ ✅
Zendesk zendesk Business & CRM Basic ✅ ✅
Pipedrive pipedrive Business & CRM API Key ✅ ✅
Freshdesk freshdesk Business & CRM API Key ✅ ✅
Workable workable Business & CRM API Key ✅ ✅
Jira jira Business & CRM Basic -- --
Asana asana Business & CRM API Key -- ✅
Stripe stripe E-commerce & Payment API Key ✅ ✅
Shopify shopify E-commerce & Payment OAuth -- ✅
Notion notion Files & Storage API Key ✅ --
Email Inbox (IMAP) inbox Files & Storage Basic ✅ ✅
MongoDB mongodb Databases Basic ✅ ✅
PostgreSQL postgres Databases Basic ✅ ✅
GitHub github Development API Key ✅ ✅
Slack slack Communication API Key ✅ ✅
Apache Kafka kafka Streaming None ✅ ✅
Amazon Kinesis kinesis Streaming Service Account ✅ ✅
Chess.com chess Other None ✅ --
Strapi strapi Other API Key -- --
Personio personio Other API Key -- ✅

Sources by Category

Local & Custom

File Import (local_files) ✅

Load CSV, JSON, JSONL, or Parquet files from a directory. All matching files are combined into a single raw table. On re-sync, new or modified files are loaded and deleted files are removed.

sources:
  - name: sales_data
    type: local_files
    local_files:
      directory: data/uploads/sales
      file_pattern: "*.csv"
Field Type Default Description
directory path -- Directory containing files (required)
file_pattern string "*" Glob pattern for files to load
notes string -- Notes about how to regenerate files

REST API (rest_api) ✅

Connect to any REST API with configurable authentication (bearer, API key, basic, OAuth2 client credentials, custom header).

sources:
  - name: custom_api
    type: rest_api
    rest_api:
      base_url: https://api.example.com/v1
      auth_type: bearer
      auth_token_env: API_TOKEN
      endpoints:
        - path: /users
        - path: /orders
          params:
            limit: 100
Field Type Default Description
base_url string -- Base URL for API (required)
endpoints list[dict] -- Endpoint definitions (required)
auth_type string "bearer" Auth type: bearer, api_key, basic, oauth2_client_credentials, custom_header, none
auth_token_env string -- Env var with auth token/key
api_key_name string -- Header or query param name for API key auth
api_key_location string -- Where to send API key: "header" or "query"
basic_username_env string -- Env var for HTTP Basic username
basic_password_env string -- Env var for HTTP Basic password
access_token_url string -- OAuth2 token endpoint URL
client_id_env string -- Env var for OAuth2 client ID
client_secret_env string -- Env var for OAuth2 client secret
auth_header_name string -- Custom auth header name (e.g., X-Shopify-Access-Token)
headers dict -- Additional request headers
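
The example above uses bearer authentication. As a sketch of two other auth types, built only from the fields in the table, the following configures one source that sends an API key in a header and another that uses OAuth2 client credentials. The source names, URLs, env var names, and the X-API-Key header name are placeholders chosen for illustration, not values Dango requires.

```yaml
sources:
  # API key sent as a request header
  - name: weather_api
    type: rest_api
    rest_api:
      base_url: https://api.example.com/v2
      auth_type: api_key
      auth_token_env: WEATHER_API_KEY   # env var holding the key (placeholder name)
      api_key_name: X-API-Key           # header name (placeholder)
      api_key_location: header
      endpoints:
        - path: /forecasts

  # OAuth2 client credentials
  - name: partner_api
    type: rest_api
    rest_api:
      base_url: https://partner.example.com/api
      auth_type: oauth2_client_credentials
      access_token_url: https://partner.example.com/oauth/token
      client_id_env: PARTNER_CLIENT_ID          # placeholder env var names
      client_secret_env: PARTNER_CLIENT_SECRET
      endpoints:
        - path: /accounts
```

Setting api_key_location to query instead would send the key as a query parameter under the name given in api_key_name.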

dlt Native Source (dlt_native) ✅

Use any dlt verified source or custom source not in Dango's registry. For advanced users.

sources:
  - name: hubspot_crm
    type: dlt_native
    dlt_native:
      source_module: hubspot
      source_function: hubspot
      function_kwargs:
        api_key: "env:HUBSPOT_API_KEY"
Field Type Default Description
source_module string -- Python module name (required)
source_function string -- Function name to call (required)
function_kwargs dict {} Arguments passed to the source function
pipeline_name string source name Custom pipeline name
dataset_name string source name Custom dataset name

CSV Files (csv)

Hidden source

The csv type is hidden in the wizard. Use local_files instead, which supports CSV plus JSON, JSONL, and Parquet formats.

Field Type Default Description
directory path -- Directory containing CSV files (required)
file_pattern string "*.csv" Glob pattern for files
notes string -- Regeneration notes

Files & Cloud Storage (filesystem)

Hidden source

The filesystem type is hidden in the wizard. Use local_files for local files or filesystem with manual YAML for cloud storage (S3, GCS, Azure).


Marketing & Analytics

Google Sheets (google_sheets) ✅

Load data from Google Sheets (one or more tabs). Requires OAuth setup via dango oauth setup google.

sources:
  - name: budgets
    type: google_sheets
    google_sheets:
      spreadsheet_url_or_id: https://docs.google.com/spreadsheets/d/1abc...
      range_names:
        - Monthly Budget
        - Quarterly Forecast
      deduplication: latest_only
Field Type Default Description
spreadsheet_url_or_id string -- Spreadsheet URL or ID (required)
range_names list[string] -- Sheet/tab names to load (required)
deduplication enum latest_only Dedup strategy: none, latest_only, append_only, scd_type2

Facebook Ads (facebook_ads) ✅

Load ad campaigns, ads, creatives, leads, and daily performance metrics.

sources:
  - name: facebook_marketing
    type: facebook_ads
    facebook_ads:
      account_id: act_123456789
      access_token_env: FB_ACCESS_TOKEN
      initial_load_past_days: 30
Field Type Default Description
account_id string -- Facebook Ads Account ID with act_ prefix (required)
access_token_env string FB_ACCESS_TOKEN Env var with access token
initial_load_past_days integer 30 Historical days to load on first sync
start_date date -- Start date (YYYY-MM-DD)
resources list[string] all Resources to sync

Default resources: campaigns, ads, ad_sets, facebook_insights

Available resources: campaigns, ads, ad_sets, ad_creatives, leads, facebook_insights
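
To sync something other than the default resources, list them explicitly. A minimal sketch, assuming resources is set inside the facebook_ads block like the other fields in the table; the source name and account ID are placeholders.

```yaml
sources:
  - name: facebook_leads
    type: facebook_ads
    facebook_ads:
      account_id: act_123456789        # placeholder ID, with the act_ prefix
      access_token_env: FB_ACCESS_TOKEN
      resources:                       # replaces the default resource list
        - campaigns
        - ad_creatives
        - leads
```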

Google Analytics (google_analytics) ✅

Load website analytics data from Google Analytics 4. Supports custom report queries.

sources:
  - name: website_analytics
    type: google_analytics
    google_analytics:
      property_id: "123456789"
      credentials_env: GOOGLE_CREDENTIALS
      start_date: "2024-01-01"
Field Type Default Description
property_id string -- GA4 property ID (required)
credentials_env string GOOGLE_CREDENTIALS Env var with credentials
start_date string -- Start date (YYYY-MM-DD or relative like 90daysAgo)

Google Ads (google_ads) ✅

Load daily performance metrics from Google Ads via GAQL queries. Includes 5 default queries (campaign stats, ad group stats, keyword stats, ad stats, search term stats).

Field Type Default Description
property_id string -- Google Ads customer ID (required)
credentials_env string GOOGLE_CREDENTIALS Env var with credentials
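
For Google Ads, a minimal configuration might look like the following. This is a sketch that assumes the google_ads block follows the same nesting as the other sources on this page; it uses only the two fields from the table, and the source name and customer ID are placeholders.

```yaml
sources:
  - name: google_ads_performance
    type: google_ads
    google_ads:
      property_id: "1234567890"            # Google Ads customer ID (placeholder)
      credentials_env: GOOGLE_CREDENTIALS  # default env var name from the table
```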

Mux (mux) ✅

Load video analytics data from Mux.

Field Type Default Description
generic_config dict -- See generic_config

Airtable (airtable) ✅

Load tables from Airtable bases.

Field Type Default Description
generic_config dict -- See generic_config

Matomo Analytics (matomo)

Wizard disabled

Disabled because Matomo passes the auth token as a GET parameter, which is a security risk. Configure manually with dlt_native.


Business & CRM

HubSpot (hubspot) ✅

Load contacts, companies, deals, and tickets from HubSpot CRM.

sources:
  - name: hubspot_crm
    type: hubspot
    hubspot:
      api_key_env: HUBSPOT_API_KEY
      resources:
        - contacts
        - companies
        - deals
Field Type Default Description
api_key_env string HUBSPOT_API_KEY Env var with API key
resources list[string] ["contacts", "companies", "deals", "tickets"] Resources to sync

Available resources: contacts, companies, deals, tickets, products, quotes, owners, properties, pipelines_deal, pipelines_ticket

Salesforce (salesforce) ✅

Load data from Salesforce CRM using service account authentication.

Field Type Default Description
resources list[string] all Resources to sync

Default resources: account, contact, lead, opportunity, campaign

Available resources: account, contact, lead, opportunity, campaign, task, event, sf_user, user_role, product_2
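
A sketch of selecting a subset of Salesforce resources, assuming the resources field sits inside a salesforce block in the same way as for HubSpot. Credential settings are not listed in the table above, so they are omitted here rather than guessed; the source name is a placeholder.

```yaml
sources:
  - name: salesforce_crm
    type: salesforce
    salesforce:
      resources:          # names taken from the available resources list
        - account
        - contact
        - opportunity
        - task
```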

Zendesk (zendesk) ✅

Load support tickets, users, and chat data from Zendesk Support.

Field Type Default Description
generic_config dict -- See generic_config

Default resources: tickets, ticket_fields

Available resources: tickets, ticket_fields, ticket_events, ticket_metric_events

Pipedrive (pipedrive) ✅

Load deals, contacts, and activities from Pipedrive CRM.

Field Type Default Description
generic_config dict -- See generic_config

Freshdesk (freshdesk) ✅

Load support tickets, agents, and companies from Freshdesk.

Field Type Default Description
generic_config dict -- See generic_config

Workable (workable) ✅

Load candidates, jobs, and events from Workable ATS.

Field Type Default Description
generic_config dict -- See generic_config

Jira (jira)

Wizard disabled

Disabled due to endpoint issues in the dlt source. Configure manually with dlt_native.
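
A manual dlt_native configuration for Jira could look roughly like this. It is a sketch only: the module and function names and the subdomain, email, and api_token arguments are assumptions based on the dlt verified Jira source and should be checked against the version installed. The source name, email address, and JIRA_API_TOKEN env var are placeholders; the env: prefix is the same one used in the dlt_native example earlier on this page.

```yaml
sources:
  - name: jira_issues
    type: dlt_native
    dlt_native:
      source_module: jira          # assumed module name of the dlt Jira source
      source_function: jira        # assumed source function name
      function_kwargs:
        subdomain: mycompany       # placeholder
        email: user@example.com    # placeholder
        api_token: "env:JIRA_API_TOKEN"
```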

Asana (asana)

Wizard disabled

Disabled because the Asana SDK was removed from the dlt source. Configure manually with dlt_native.


E-commerce & Payment

Stripe (stripe) ✅

Load payment data from Stripe (charges, customers, subscriptions, etc.).

sources:
  - name: stripe_payments
    type: stripe
    stripe:
      stripe_secret_key_env: STRIPE_API_KEY
      endpoints:
        - charges
        - customers
        - invoices
      start_date: "2024-01-01"
Field Type Default Description
stripe_secret_key_env string STRIPE_API_KEY Env var with Stripe secret key
endpoints list[string] all Specific endpoints to sync
start_date date -- Start date (YYYY-MM-DD)
end_date date -- End date (YYYY-MM-DD)

Shopify (shopify)

Wizard disabled

Disabled because Shopify requires Authorization Code Grant OAuth 2.0, which needs a dedicated dango oauth shopify provider (not yet implemented).


Files & Storage

Notion (notion) ✅

Load pages and databases from Notion.

Field Type Default Description
generic_config dict -- See generic_config

Email Inbox (inbox) ✅

Read messages and attachments from an email inbox via IMAP.

Field Type Default Description
generic_config dict -- See generic_config

Databases

MongoDB (mongodb) ✅

Load collections from MongoDB databases with incremental support.

Field Type Default Description
generic_config dict -- See generic_config

PostgreSQL (postgres) ✅

Load tables from PostgreSQL databases with schema filtering.

Field Type Default Description
generic_config dict -- See generic_config

Streaming

Apache Kafka (kafka) ✅

Extract messages from Kafka topics.

Field Type Default Description
generic_config dict -- See generic_config

Amazon Kinesis (kinesis) ✅

Read messages from Kinesis streams.

Field Type Default Description
generic_config dict -- See generic_config

Development

GitHub (github) ✅

Load repository data, issues, pull requests, and commits from GitHub.

sources:
  - name: my_repo
    type: github
    github:
      access_token_env: GITHUB_ACCESS_TOKEN
      owner: my-org
      name: my-repo
Field Type Default Description
access_token_env string GITHUB_ACCESS_TOKEN Env var with personal access token
owner string -- GitHub username or organization (required)
name string -- Repository name (required)

Communication

Slack (slack) ✅

Load messages, channels, and user data from Slack.

sources:
  - name: slack_data
    type: slack
    slack:
      access_token_env: SLACK_ACCESS_TOKEN
      selected_channels:
        - C01234ABCDE
Field Type Default Description
access_token_env string SLACK_ACCESS_TOKEN Env var with Slack bot token
selected_channels list[string] all Channel IDs to sync
start_date date -- Start date for message history

Other

Chess.com (chess) ✅

Load player profiles and games from Chess.com API. No authentication required.

Strapi (strapi)

Wizard disabled

Untested; requires a Strapi instance running in Docker.

Personio (personio)

Wizard disabled

Enterprise-only API.


generic_config

Sources without a dedicated Pydantic configuration model use the generic_config field. This applies to 21+ sources including Zendesk, Pipedrive, Freshdesk, Workable, Airtable, Mux, Notion, Inbox, MongoDB, PostgreSQL, Kafka, Kinesis, and others.

The generic_config field accepts any key-value pairs that the underlying dlt source function expects:

sources:
  - name: my_zendesk
    type: zendesk
    generic_config:
      subdomain: mycompany
      email: [email protected]

Refer to the dlt documentation for source-specific parameters.


Capabilities Matrix

Capability Description Sources
Performance Metrics Source provides built-in analytics/metrics Facebook Ads, Google Analytics, Google Ads, Matomo, Mux
Date Range Supports start_date/end_date filtering Stripe, Shopify, Google Analytics, Google Ads, Zendesk, Workable, Slack, Mux
Incremental Supports incremental loading (only new/changed data) CSV, Local Files, REST API, Facebook Ads, Google Analytics, HubSpot, Salesforce, Zendesk, Pipedrive, Freshdesk, Workable, Stripe, Shopify, Slack, GitHub, Inbox, MongoDB, PostgreSQL, Kafka, Kinesis, Asana, Matomo, Chess, Personio
Custom Queries Supports user-defined queries or report definitions dlt Native, REST API, Google Analytics, Google Ads, Matomo

Authentication Types

Auth Type Description Sources
None No authentication required CSV, Local Files, dlt Native, Filesystem, Kafka, Chess
API Key API key passed via environment variable REST API, Stripe, HubSpot, GitHub, Slack, Pipedrive, Freshdesk, Workable, Airtable, Mux, Matomo, Notion, Asana, Strapi, Personio
OAuth OAuth 2.0 flow via dango oauth <provider> Google Sheets, Facebook Ads, Google Analytics, Google Ads, Shopify
Basic Username/password or token authentication Zendesk, Jira, Inbox, MongoDB, PostgreSQL
Service Account Service account credentials (JSON key file) Salesforce, Kinesis

Common Source Fields

These fields are available on every source regardless of type:

Field Type Default Description
name string -- Unique source name (required, lowercase alphanumeric + underscore)
type SourceType -- Source type key (required)
enabled boolean true Whether to include in syncs
description string -- Human-readable description
tags list[string] [] Metadata tags for organization
lookback_days integer -- Re-load this many days on incremental sync (ignored on full refresh)
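
As an illustration, the common fields can be combined with a type-specific block. The sketch below reuses the Stripe example from this page and assumes the common fields sit at the same level as name and type; the description, tags, and lookback_days values are placeholders.

```yaml
sources:
  - name: stripe_payments          # lowercase alphanumeric + underscore
    type: stripe
    enabled: true                  # set to false to skip this source in syncs
    description: Charges and invoices from the production Stripe account
    tags:
      - finance
      - payments
    lookback_days: 3               # re-load the last 3 days on incremental sync
    stripe:
      stripe_secret_key_env: STRIPE_API_KEY
      endpoints:
        - charges
        - invoices
```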