REST API

Connect any REST API as a data source with configurable authentication, pagination, and endpoint mapping.


Overview

The REST API source type connects Dango to any HTTP-based API. Use it for APIs that aren't covered by a built-in source — internal services, niche SaaS tools, or any service with a REST interface.

When to use REST API:

  • The API you need isn't in the Source Catalog
  • You want a no-code/low-code setup (no Python required)
  • The API returns JSON responses

When to use Custom Sources instead:

  • You need complex data transformations during ingestion
  • The API requires non-standard authentication flows
  • You need to call multiple dependent endpoints in sequence

Managing this source in the Web UI

After setup, manage this source from the Sources page in the Web UI (http://localhost:8800/sources). Trigger syncs, view history, and monitor status without using the CLI. See Web UI — Sources.


Prerequisites

  • The API's base URL (e.g., https://api.example.com/v2)
  • Authentication credentials (API key, username/password, or OAuth2 client credentials)
  • Knowledge of the endpoints you want to sync

Setup

The wizard walks you through every step — base URL, auth, endpoints, pagination, and a live test:

dango source add
? Select a data source: REST API
? Source name: acme
? Base URL (e.g., https://api.example.com): https://api.example.com/v2
? Authentication method: Bearer Token
? Environment variable for bearer token [ACME_API_TOKEN]: ACME_API_TOKEN
? Add custom headers? No
? Endpoint path (e.g., /orders): /orders
? Resource name (table name in DuckDB) [orders]: orders
? Add query parameters? No
? Pagination type: Auto-detect (recommended)
? Test this endpoint? Yes
  ✓ 200 OK — 50 records found
? Data path [data.orders] (blank=auto-detect): data.orders
? Primary key field (default: id): id
  ✓ Added: /orders → orders
? Add another endpoint? No

The wizard creates your configuration, tests each endpoint, and suggests the data_selector (JSON path to your results array) based on the API response.

Via Configuration File

Edit .dango/sources.yml:

version: '1.0'
sources:
  - name: acme
    type: rest_api
    enabled: true
    description: My REST API data
    rest_api:
      base_url: https://api.example.com/v2
      auth_type: bearer
      auth_token_env: ACME_API_TOKEN
      endpoints:
        - path: /orders
          name: orders
          data_selector: data.orders
        - path: /customers
          name: customers
          data_selector: data.customers

Store credentials in .env:

# .env (gitignored)
ACME_API_TOKEN=your_token_here

Set credentials on the remote server:

dango remote env set ACME_API_TOKEN your_token_here

First Sync

dango sync acme

Authentication Types

Dango supports 6 authentication methods. Choose the one that matches your API's requirements.

Bearer Token

The most common method. Sends an Authorization: Bearer <token> header.

rest_api:
  base_url: https://api.example.com
  auth_type: bearer
  auth_token_env: MY_API_TOKEN
  endpoints:
    - path: /data
      name: data

Store the token in .env:

# .env
MY_API_TOKEN=your_bearer_token_here

API Key (Header or Query)

Sends the API key as a custom header or query parameter.

rest_api:
  base_url: https://api.example.com
  auth_type: api_key
  auth_token_env: MY_API_KEY
  api_key_name: X-API-Key
  api_key_location: header
  endpoints:
    - path: /data
      name: data

Sends: X-API-Key: your_key in the request header.

rest_api:
  base_url: https://api.example.com
  auth_type: api_key
  auth_token_env: MY_API_KEY
  api_key_name: api_key
  api_key_location: query
  endpoints:
    - path: /data
      name: data

Appends: ?api_key=your_key to the URL.

# .env
MY_API_KEY=your_api_key_here
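
To make the difference between the two api_key_location values concrete, here is a minimal Python sketch. The function apply_api_key is hypothetical and is not part of Dango; it only illustrates where the key ends up in each mode.

```python
from urllib.parse import urlencode


def apply_api_key(url: str, key: str, name: str, location: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) with the API key placed in a header or the query string.

    Hypothetical helper for illustration; Dango's internals may differ.
    """
    if location == "header":
        # The key travels in a request header, the URL is untouched.
        return url, {name: key}
    if location == "query":
        # The key is appended to the query string, URL-encoded.
        separator = "&" if "?" in url else "?"
        return f"{url}{separator}{urlencode({name: key})}", {}
    raise ValueError(f"unsupported api_key_location: {location!r}")


header_mode = apply_api_key("https://api.example.com/data", "your_key", "X-API-Key", "header")
query_mode = apply_api_key("https://api.example.com/data", "your_key", "api_key", "query")
```

Query-string keys tend to show up in server logs and proxies, so header placement is usually the safer choice when the API accepts both.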

HTTP Basic

Username and password sent as a standard HTTP Basic auth header.

rest_api:
  base_url: https://api.example.com
  auth_type: basic
  basic_username_env: MY_API_USERNAME
  basic_password_env: MY_API_PASSWORD
  endpoints:
    - path: /data
      name: data

Store the credentials in .env:

# .env
MY_API_USERNAME=your_username
MY_API_PASSWORD=your_password
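
For reference, the standard HTTP Basic header is the Base64 encoding of username:password. The sketch below shows that encoding in Python; basic_auth_header is a hypothetical name used only for illustration, not a Dango function.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the Authorization header used by HTTP Basic auth (RFC 7617)."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


header = basic_auth_header("your_username", "your_password")
```

Base64 is an encoding, not encryption, so Basic auth should only be used over HTTPS.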

OAuth2 Client Credentials

For APIs using OAuth2 Client Credentials Grant (machine-to-machine). Dango fetches an access token automatically from the token endpoint.

rest_api:
  base_url: https://api.example.com
  auth_type: oauth2_client_credentials
  access_token_url: https://auth.example.com/oauth/token
  client_id_env: MY_API_CLIENT_ID
  client_secret_env: MY_API_CLIENT_SECRET
  endpoints:
    - path: /data
      name: data

Store the client credentials in .env:

# .env
MY_API_CLIENT_ID=your_client_id
MY_API_CLIENT_SECRET=your_client_secret

Not all OAuth2 APIs support Client Credentials

Some APIs (e.g., Shopify) require Authorization Code Grant, which involves a browser-based login flow. Client Credentials only works for APIs that support machine-to-machine authentication. If you get authentication errors, check whether the API requires a different OAuth2 flow.
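
To check by hand whether a token endpoint accepts Client Credentials, it helps to know what the token request looks like. The sketch below builds (but does not send) the request described in RFC 6749, section 4.4. It is an illustration only: build_token_request is a hypothetical helper, and some providers expect the client ID and secret in an HTTP Basic header rather than in the form body.

```python
import urllib.parse
import urllib.request


def build_token_request(access_token_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build a Client Credentials token request without sending it."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return urllib.request.Request(
        access_token_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_token_request("https://auth.example.com/oauth/token", "your_client_id", "your_client_secret")
# Sending it with urllib.request.urlopen(request) should return JSON containing an access_token.
```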

Custom Header Token

For APIs that use a non-standard header name (e.g., X-Shopify-Access-Token, X-Auth-Token).

rest_api:
  base_url: https://mystore.myshopify.com/admin/api/2024-01
  auth_type: custom_header
  auth_header_name: X-Shopify-Access-Token
  auth_token_env: SHOPIFY_ACCESS_TOKEN
  endpoints:
    - path: /orders.json
      name: orders
      data_selector: orders

Store the token in .env:

# .env
SHOPIFY_ACCESS_TOKEN=shpat_your_token_here

No Authentication

For public APIs that don't require authentication.

rest_api:
  base_url: https://jsonplaceholder.typicode.com
  auth_type: none
  endpoints:
    - path: /posts
      name: posts

Custom Headers (Any Auth Type)

Add extra headers to every request, regardless of auth type. Useful for API versioning, content negotiation, or additional authentication headers.

rest_api:
  base_url: https://api.github.com
  auth_type: bearer
  auth_token_env: GITHUB_TOKEN
  headers:
    Accept: application/vnd.github.v3+json
    X-Custom-Header: my-value
  endpoints:
    - path: /user/repos
      name: repos

Header values can reference environment variables with ${VAR_NAME} syntax:

headers:
  X-Workspace-ID: ${MY_WORKSPACE_ID}
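
The ${VAR_NAME} substitution can be pictured as a simple pattern replacement. The Python sketch below is an approximation, not Dango's implementation: expand_env is a hypothetical helper, and raising an error for an unset variable is an assumption about the desired behavior.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a typical environment variable identifier.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace every ${VAR_NAME} in value with the variable's content."""
    source = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in source:
            # Assumption: fail loudly rather than send an empty header.
            raise KeyError(f"environment variable {name} is not set")
        return source[name]

    return _VAR_PATTERN.sub(replace, value)


workspace_header = expand_env("${MY_WORKSPACE_ID}", {"MY_WORKSPACE_ID": "ws_123"})
```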

Pagination Types

Most APIs return data in pages. Dango supports 6 pagination strategies.

Auto-Detect (Recommended)

Omit the paginator field entirely. dlt, the data-loading library Dango uses for this source type, inspects the response headers and body to pick a pagination strategy automatically.

endpoints:
  - path: /orders
    name: orders
    # No paginator — dlt auto-detects

Start with auto-detect

Auto-detect works for most APIs (GitHub, Stripe, HubSpot, etc.). Only specify a paginator if auto-detect fails or returns incomplete data.

Link Header

Used by APIs that return a Link header with rel="next" (GitHub, Shopify, and many other REST APIs). The URL in that header is requested as the next page.

endpoints:
  - path: /repos
    name: repos
    paginator: header_link
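
To see what the header_link strategy relies on, the sketch below extracts the rel="next" URL from a Link header value. It is a simplified Python illustration (next_page_url is a hypothetical helper) that assumes rel is the first parameter after each URL, which holds for GitHub-style headers but not for every valid Link header.

```python
import re
from typing import Optional

# Simplified: expects entries shaped like <url>; rel="name"
_LINK_PATTERN = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next", or None when there is no next page."""
    if not link_header:
        return None
    for url, rel in _LINK_PATTERN.findall(link_header):
        if "next" in rel.split():
            return url
    return None


example = (
    '<https://api.github.com/user/repos?page=2>; rel="next", '
    '<https://api.github.com/user/repos?page=5>; rel="last"'
)
following = next_page_url(example)
```

If a curl -I call against the endpoint shows no Link header at all, this strategy will stop after the first page.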

Page Number

Increments a page parameter (?page=1, ?page=2, etc.).

endpoints:
  - path: /items
    name: items
    paginator:
      type: page_number
      page_param: page       # Default: "page"
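
The loop behind page-number pagination can be sketched as follows. This is a Python illustration under stated assumptions: iter_pages and fake_fetch are hypothetical names, the in-memory PAGES dictionary stands in for the API, and stopping on the first empty page is one common convention (some APIs report a total page count instead).

```python
from typing import Callable, Iterator

# Stand-in for an API: page number -> records on that page.
PAGES = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}


def fake_fetch(params: dict) -> list:
    """Pretend to call the endpoint with the given query parameters."""
    return PAGES.get(params["page"], [])


def iter_pages(fetch: Callable[[dict], list], page_param: str = "page", start: int = 1) -> Iterator[dict]:
    """Request page after page until an empty page comes back."""
    page = start
    while True:
        records = fetch({page_param: page})
        if not records:
            return
        yield from records
        page += 1


all_records = list(iter_pages(fake_fetch))
```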

Cursor-Based

Uses a cursor/token from the response to fetch the next page (Stripe, Slack, GraphQL APIs).

endpoints:
  - path: /events
    name: events
    paginator:
      type: cursor
      cursor_path: next_cursor    # Default: "next"

The cursor_path is the JSON path in the response body that contains the next page cursor.
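
As an illustration of how cursor_path is used, the Python sketch below reads the cursor from each response and passes it to the next request. The names read_path, iter_cursor, and the RESPONSES stand-in are hypothetical, and the assumption that records live under a "data" key is made only for this example.

```python
from typing import Any, Callable, Iterator, Optional

# Stand-in for an API: cursor sent -> response body received.
RESPONSES = {
    None: {"data": [{"id": 1}], "next_cursor": "abc"},
    "abc": {"data": [{"id": 2}], "next_cursor": None},
}


def read_path(body: Any, path: str) -> Any:
    """Follow a dotted path such as 'meta.next' through nested dicts."""
    current = body
    for key in path.split("."):
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current


def iter_cursor(fetch: Callable[[Optional[str]], dict], cursor_path: str = "next_cursor") -> Iterator[dict]:
    """Keep requesting until the response no longer carries a cursor."""
    cursor = None
    while True:
        body = fetch(cursor)
        yield from body["data"]
        cursor = read_path(body, cursor_path)
        if not cursor:
            return


events = list(iter_cursor(lambda cursor: RESPONSES[cursor]))
```

If cursor_path points at the wrong key, read_path returns None on the first response and the loop ends after one page, which matches the "only the first page is returned" symptom.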

Offset-Based

Uses offset and limit parameters (?offset=0&limit=100, ?offset=100&limit=100, etc.).

endpoints:
  - path: /records
    name: records
    paginator:
      type: offset
      limit: 100               # Default: 100
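
Offset pagination amounts to the loop sketched below in Python. The names iter_offset and fake_fetch are hypothetical, the ROWS list stands in for the API, and stopping when a page comes back shorter than limit is an assumption (some APIs return a total count instead).

```python
from typing import Callable, Iterator

# Stand-in for an API holding 250 rows.
ROWS = [{"id": i} for i in range(250)]


def fake_fetch(offset: int, limit: int) -> list:
    """Pretend to call ?offset=<offset>&limit=<limit>."""
    return ROWS[offset:offset + limit]


def iter_offset(fetch: Callable[[int, int], list], limit: int = 100) -> Iterator[dict]:
    """Advance the offset by limit until a short page signals the end."""
    offset = 0
    while True:
        records = fetch(offset, limit)
        yield from records
        if len(records) < limit:
            return
        offset += limit


records = list(iter_offset(fake_fetch, limit=100))
```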

None (Single Page)

For endpoints that return all data in a single response.

endpoints:
  - path: /config
    name: config
    paginator: single_page

Endpoint Configuration

Each endpoint defines one API path to sync. The data from each endpoint becomes a separate table in DuckDB.

Fields

| Field | Required | Default | Description |
|---|---|---|---|
| path | Yes | | API endpoint path (e.g., /orders) |
| name | Yes | Derived from path | Table name in DuckDB |
| data_selector | No | Auto-detected | JSON path to the results array |
| primary_key | No | id | Field used for merge/deduplication |
| params | No | | Query parameters as key-value pairs |
| paginator | No | Auto-detect | Pagination strategy (see above) |
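
The role of primary_key in merge/deduplication can be pictured as an upsert: rows that share a key replace the stored row, and new keys are appended. The Python sketch below is conceptual only. merge_by_key is a hypothetical function, and how the merge is actually executed in DuckDB is not covered here.

```python
def merge_by_key(existing: list, incoming: list, primary_key: str = "id") -> list:
    """Upsert incoming rows into existing rows, matching on primary_key."""
    merged = {row[primary_key]: row for row in existing}
    for row in incoming:
        # Same key: the newer row wins. New key: the row is added.
        merged[row[primary_key]] = row
    return list(merged.values())


stored = [{"id": 1, "total": 10}, {"id": 2, "total": 20}]
synced = [{"id": 2, "total": 25}, {"id": 3, "total": 30}]
result = merge_by_key(stored, synced)
```

If the API's identifier is not called id (for example order_id), set primary_key accordingly, otherwise repeated syncs cannot match rows to each other.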

Query Parameters

Add static query parameters to every request for an endpoint:

endpoints:
  - path: /orders
    name: orders
    params:
      status: active
      sort: created_at
      per_page: "100"
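
To preview the URL an endpoint configuration should produce, the Python sketch below joins base_url, path, and params. It is an approximation for checking your own configuration: build_url is a hypothetical helper, and plain string joining is an assumption about how the path is resolved against the base URL.

```python
from typing import Optional
from urllib.parse import urlencode


def build_url(base_url: str, path: str, params: Optional[dict] = None) -> str:
    """Join base URL and endpoint path, then append encoded query parameters."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    if params:
        url += "?" + urlencode(params)
    return url


url = build_url(
    "https://api.example.com/v2",
    "/orders",
    {"status": "active", "sort": "created_at", "per_page": "100"},
)
```

The sketch deliberately avoids urllib.parse.urljoin, which would treat /orders as an absolute path and drop the /v2 segment of the base URL.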

Data Path Detection

The data_selector tells Dango where the actual records are inside the JSON response. Many APIs wrap results in a container:

{
  "status": "ok",
  "data": {
    "orders": [
      {"id": 1, "total": 99.99},
      {"id": 2, "total": 49.50}
    ]
  },
  "meta": {"page": 1, "total": 42}
}

For this response, set data_selector: data.orders to extract the orders array.

When to set it:

  • API wraps results in an envelope (e.g., {"data": [...]}) — set to data
  • API nests results deeper (e.g., {"response": {"items": [...]}}) — set to response.items
  • API returns a bare array [{...}, {...}] — leave blank (auto-detected)
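
The cases above can be reproduced with a few lines of Python. This sketch handles dotted keys only, and select_records is a hypothetical helper; the selector used during a real sync may accept richer path syntax.

```python
import json
from typing import Any

RESPONSE = json.loads(
    """
    {
      "status": "ok",
      "data": {"orders": [{"id": 1, "total": 99.99}, {"id": 2, "total": 49.50}]},
      "meta": {"page": 1, "total": 42}
    }
    """
)


def select_records(response: Any, data_selector: str = "") -> list:
    """Walk a dotted selector and return the list of records it points to."""
    current = response
    if data_selector:
        for key in data_selector.split("."):
            current = current[key]
    if not isinstance(current, list):
        raise TypeError(f"{data_selector or '<root>'} is not an array of records")
    return current


orders = select_records(RESPONSE, "data.orders")
bare = select_records([{"id": 1}, {"id": 2}])  # blank selector: the root is the array
```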

Use the wizard test

When you use the wizard and test an endpoint, Dango inspects the response and suggests the correct data_selector. Accept the suggestion or override it.


Configuration Reference

Complete annotated YAML example:

version: '1.0'
sources:
  - name: my_api
    type: rest_api
    enabled: true
    description: My REST API data source
    rest_api:
      # Required
      base_url: https://api.example.com/v2

      # Authentication (pick one auth_type)
      auth_type: bearer                    # bearer | api_key | basic | oauth2_client_credentials | custom_header | none
      auth_token_env: MY_API_TOKEN         # For bearer, api_key, custom_header
      # api_key_name: X-API-Key           # For api_key: header/param name
      # api_key_location: header           # For api_key: "header" or "query"
      # basic_username_env: MY_USER        # For basic
      # basic_password_env: MY_PASS        # For basic
      # access_token_url: https://...      # For oauth2_client_credentials
      # client_id_env: MY_CLIENT_ID        # For oauth2_client_credentials
      # client_secret_env: MY_SECRET       # For oauth2_client_credentials
      # auth_header_name: X-Custom-Auth    # For custom_header

      # Optional: extra headers on every request
      headers:
        Accept: application/json
        X-Custom: ${MY_ENV_VAR}            # Env var reference

      # Endpoints (at least one required)
      endpoints:
        - path: /orders
          name: orders
          data_selector: data.orders       # JSON path to results array
          primary_key: order_id            # Default: "id"
          params:
            status: active
          paginator:
            type: page_number
            page_param: page

        - path: /customers
          name: customers
          data_selector: data
          paginator: header_link           # String shorthand

        - path: /config
          name: config
          paginator: single_page           # No pagination

Examples

Example 1: JSONPlaceholder (No Auth, No Pagination)

The simplest possible REST API source — a public API with no authentication.

version: '1.0'
sources:
  - name: jsonplaceholder
    type: rest_api
    enabled: true
    description: JSONPlaceholder test API
    rest_api:
      base_url: https://jsonplaceholder.typicode.com
      auth_type: none
      endpoints:
        - path: /posts
          name: posts
        - path: /users
          name: users
        - path: /comments
          name: comments

No .env file needed. Sync with:

dango sync jsonplaceholder

Tables created: raw_jsonplaceholder.posts, raw_jsonplaceholder.users, raw_jsonplaceholder.comments

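Example 2: GitHub API (Bearer Token, Link Header Pagination)

GitHub authenticates with a bearer token and paginates through the Link response header.

version: '1.0'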
sources:
  - name: github
    type: rest_api
    enabled: true
    description: GitHub repository data
    rest_api:
      base_url: https://api.github.com
      auth_type: bearer
      auth_token_env: GITHUB_TOKEN
      headers:
        Accept: application/vnd.github.v3+json
      endpoints:
        - path: /user/repos
          name: repos
          paginator: header_link
          params:
            per_page: "100"
            sort: updated
        - path: /user/starred
          name: starred_repos
          paginator: header_link
          params:
            per_page: "100"

Store the token in .env:

# .env
GITHUB_TOKEN=ghp_your_personal_access_token

Tables created: raw_github.repos, raw_github.starred_repos

Creating a GitHub token
  1. Go to Settings > Developer settings > Personal access tokens > Fine-grained tokens
  2. Click Generate new token
  3. Select the repositories and permissions you need
  4. Copy the token (fine-grained tokens start with github_pat_; classic tokens start with ghp_)

Example 3: Shopify API (Custom Header Auth, Data Selector)

Shopify uses a custom authentication header and wraps responses in a container object.

version: '1.0'
sources:
  - name: shopify
    type: rest_api
    enabled: true
    description: Shopify store data
    rest_api:
      base_url: https://mystore.myshopify.com/admin/api/2024-01
      auth_type: custom_header
      auth_header_name: X-Shopify-Access-Token
      auth_token_env: SHOPIFY_ACCESS_TOKEN
      endpoints:
        - path: /orders.json
          name: orders
          data_selector: orders
          paginator: header_link
          params:
            status: any
            limit: "250"
        - path: /products.json
          name: products
          data_selector: products
          paginator: header_link
          params:
            limit: "250"
        - path: /customers.json
          name: customers
          data_selector: customers
          paginator: header_link
          params:
            limit: "250"

Store the token in .env:

# .env
SHOPIFY_ACCESS_TOKEN=shpat_your_access_token

Tables created: raw_shopify.orders, raw_shopify.products, raw_shopify.customers

Shopify data_selector

Shopify wraps responses like {"orders": [...]}. The data_selector: orders extracts the array from the wrapper object.


Common Issues

Pagination Not Working / Missing Data

Symptoms: Only the first page of results is returned.

Solutions:

  1. Try a specific paginator instead of auto-detect. Check the API docs for how pagination works.
  2. For page_number, verify the correct page_param name (some APIs use p, pageNo, etc.).
  3. For cursor, check the correct cursor_path in the response body.

401 Unauthorized / 403 Forbidden

  • Verify your credentials in .env are correct
  • Check that the token/key hasn't expired
  • For API keys, verify the key has access to the endpoints you configured
  • For OAuth2, verify the access_token_url is correct

Empty Data / No Records

  • Wrong data_selector: Test the endpoint manually (e.g., with curl) and check the JSON structure. Set data_selector to the correct path.
  • Wrong endpoint path: Verify the path is correct relative to the base_url.
  • Query parameters filtering too aggressively: Remove params temporarily to test.

data_selector Troubleshooting

If the API response looks like:

{"result": {"items": [{"id": 1}, {"id": 2}]}, "count": 2}

Set data_selector: result.items.

If the API returns a bare array:

[{"id": 1}, {"id": 2}]

Leave data_selector blank — dlt detects this automatically.
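
When the right path is not obvious, a small script can list every location in a saved response that holds an array of objects. The Python sketch below is a troubleshooting aid under stated assumptions: candidate_selectors is a hypothetical helper and is not how the wizard or dlt detects the path.

```python
from typing import Any


def candidate_selectors(node: Any, prefix: str = "") -> list:
    """Return dotted paths of all non-empty arrays of objects in a JSON value.

    An empty string in the result means the response itself is a bare array.
    """
    found = []
    if isinstance(node, list):
        if node and all(isinstance(item, dict) for item in node):
            found.append(prefix)
    elif isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else key
            found.extend(candidate_selectors(value, child))
    return found


wrapped = candidate_selectors({"result": {"items": [{"id": 1}, {"id": 2}]}, "count": 2})
bare = candidate_selectors([{"id": 1}, {"id": 2}])
```

A result with several candidates means the response contains more than one array; pick the one that holds the records you want as table rows.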

Rate Limiting (429 Errors)

Most APIs enforce rate limits. dlt includes built-in retry logic with exponential backoff for 429 responses. If you still hit limits:

  • Reduce the number of endpoints synced at once
  • Raise per_page or limit (up to the API's maximum) so each sync needs fewer requests
  • Increase time between syncs in your schedule
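
For intuition about what exponential backoff does, here is a generic Python sketch. It does not reproduce dlt's retry settings: call_with_backoff, max_retries, and base_delay are hypothetical names and values chosen for illustration.

```python
import time
from typing import Any, Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, Any]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Retry while the status is 429, doubling the wait after each attempt."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return status, body  # still rate limited after the final attempt


# Simulated API: two rate-limited responses, then success.
responses = iter([(429, None), (429, None), (200, "ok")])
waits = []
outcome = call_with_backoff(lambda: next(responses), sleep=waits.append)
```

Many APIs also send a Retry-After header with 429 responses; when present, waiting for that duration is usually better than a fixed schedule.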

Next Steps