tessl/pypi-apache-airflow-providers-hashicorp

Apache Airflow provider package for HashiCorp Vault integration, enabling secret management and authentication within Airflow workflows.

Secrets Backend

VaultBackend retrieves Airflow connections, variables, and configuration values from HashiCorp Vault. It plugs into Airflow's secrets backend mechanism, so DAGs and operators resolve secrets stored in Vault without any change to their code.

Capabilities

VaultBackend Class

Secrets backend implementation that retrieves Airflow secrets from HashiCorp Vault with configurable path mapping and authentication.

class VaultBackend(BaseSecretsBackend, LoggingMixin):
    def __init__(
        self,
        connections_path: str | None = "connections",
        variables_path: str | None = "variables",
        config_path: str | None = "config",
        url: str | None = None,
        auth_type: str = "token",
        auth_mount_point: str | None = None,
        mount_point: str | None = "secret",
        kv_engine_version: int = 2,
        token: str | None = None,
        token_path: str | None = None,
        username: str | None = None,
        password: str | None = None,
        key_id: str | None = None,
        secret_id: str | None = None,
        role_id: str | None = None,
        assume_role_kwargs: dict | None = None,
        region: str | None = None,
        kubernetes_role: str | None = None,
        kubernetes_jwt_path: str = "/var/run/secrets/kubernetes.io/serviceaccount/token",
        gcp_key_path: str | None = None,
        gcp_keyfile_dict: dict | None = None,
        gcp_scopes: str | None = None,
        azure_tenant_id: str | None = None,
        azure_resource: str | None = None,
        radius_host: str | None = None,
        radius_secret: str | None = None,
        radius_port: int | None = None,
        **kwargs
    ):
        """
        Initialize VaultBackend with path and authentication configuration.

        Parameters:
        - connections_path: Vault path for Airflow connections (default: "connections")
        - variables_path: Vault path for Airflow variables (default: "variables")
        - config_path: Vault path for Airflow configurations (default: "config")
        - url: Base URL for Vault instance
        - auth_type: Authentication method (default: "token")
        - auth_mount_point: Mount point for authentication method
        - mount_point: Secret engine mount point (default: "secret")
        - kv_engine_version: KV engine version (1 or 2, default: 2)
        - token: Authentication token (for token/github auth)
        - token_path: Path to token file (for token/github auth)
        - username: Username (for ldap/userpass auth)
        - password: Password (for ldap/userpass auth)
        - key_id: Key ID (for aws_iam/azure auth)
        - secret_id: Secret ID (for approle/aws_iam/azure auth)
        - role_id: Role ID (for approle/aws_iam auth)
        - assume_role_kwargs: AWS assume role parameters
        - region: AWS region for STS API calls
        - kubernetes_role: Kubernetes authentication role
        - kubernetes_jwt_path: Path to Kubernetes JWT token
        - gcp_key_path: Path to GCP service account key file
        - gcp_keyfile_dict: GCP keyfile parameters as dictionary
        - gcp_scopes: OAuth2 scopes for GCP authentication
        - azure_tenant_id: Azure AD tenant ID
        - azure_resource: Azure application URL
        - radius_host: RADIUS server host
        - radius_secret: RADIUS shared secret
        - radius_port: RADIUS server port
        """

Connection Retrieval

Retrieve Airflow connection objects from Vault with automatic deserialization.

def get_connection(self, conn_id: str) -> Connection | None:
    """
    Retrieve Airflow connection from Vault.

    Parameters:
    - conn_id: Connection identifier

    Returns:
    Connection | None: Airflow Connection object or None if not found
    """

def get_response(self, conn_id: str) -> dict | None:
    """
    Get raw response dictionary for connection from Vault.

    Parameters:
    - conn_id: Connection identifier

    Returns:
    dict | None: Raw connection data or None if not found
    """

Variable Management

Access Airflow variables stored in Vault as key-value pairs.

def get_variable(self, key: str) -> str | None:
    """
    Retrieve Airflow variable from Vault.

    Parameters:
    - key: Variable key name

    Returns:
    str | None: Variable value as string or None if not found
    """

Configuration Access

Retrieve Airflow configuration values from Vault for dynamic configuration management.

def get_config(self, key: str) -> str | None:
    """
    Retrieve Airflow configuration value from Vault.

    Parameters:
    - key: Configuration key name

    Returns:
    str | None: Configuration value as string or None if not found
    """

Configuration Examples

Airflow Configuration

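Configure VaultBackend in airflow.cfg. When backend_kwargs spans several lines, every continuation line, including the closing brace, must be indented so that the config parser treats it as part of the value:

[secrets]
backend = airflow.providers.hashicorp.secrets.vault.VaultBackend
backend_kwargs = {
    "connections_path": "connections",
    "variables_path": "variables",
    "url": "http://127.0.0.1:8200",
    "mount_point": "secret",
    "auth_type": "token",
    "token": "your-vault-token"
    }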
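
The same settings can also be supplied through environment variables, since Airflow reads any option from a variable named AIRFLOW__{SECTION}__{KEY}. The sketch below builds the two variables for the [secrets] section and serializes backend_kwargs to single-line JSON. The token_path value is a hypothetical file location, used here so the token itself stays out of the environment.

```python
import json

backend_kwargs = {
    "connections_path": "connections",
    "variables_path": "variables",
    "url": "http://127.0.0.1:8200",
    "mount_point": "secret",
    "auth_type": "token",
    "token_path": "/path/to/vault-token",  # hypothetical path to a token file
}

env = {
    "AIRFLOW__SECRETS__BACKEND": "airflow.providers.hashicorp.secrets.vault.VaultBackend",
    # json.dumps produces single-line JSON, which is safe to export as-is.
    "AIRFLOW__SECRETS__BACKEND_KWARGS": json.dumps(backend_kwargs),
}

for name, value in env.items():
    print(f"export {name}='{value}'")
```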

Advanced Authentication

[secrets]
backend = airflow.providers.hashicorp.secrets.vault.VaultBackend
backend_kwargs = {
    "connections_path": "airflow/connections",
    "variables_path": "airflow/variables",
    "config_path": "airflow/config",
    "url": "https://vault.company.com:8200",
    "mount_point": "kv",
    "kv_engine_version": 2,
    "auth_type": "kubernetes",
    "kubernetes_role": "airflow-prod",
    "kubernetes_jwt_path": "/var/run/secrets/kubernetes.io/serviceaccount/token"
    }
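
A multi-line backend_kwargs value is easy to get subtly wrong, so it can help to check it before deploying. The sketch below parses an airflow.cfg-style string with Python's standard configparser and decodes the JSON. It assumes Airflow's own parser treats continuation lines the same way as the standard library class it builds on; load_backend_kwargs is a hypothetical helper, not an Airflow function.

```python
import configparser
import json

CFG_TEXT = """\
[secrets]
backend = airflow.providers.hashicorp.secrets.vault.VaultBackend
backend_kwargs = {
    "connections_path": "airflow/connections",
    "url": "https://vault.company.com:8200",
    "auth_type": "kubernetes",
    "kubernetes_role": "airflow-prod"
    }
"""


def load_backend_kwargs(cfg_text: str) -> dict:
    """Parse an airflow.cfg-style string and decode [secrets] backend_kwargs."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return json.loads(parser["secrets"]["backend_kwargs"])


kwargs = load_backend_kwargs(CFG_TEXT)
print(kwargs["auth_type"], kwargs["kubernetes_role"])

# An unindented closing brace is no longer a continuation line; the parser
# reads it as a malformed option and rejects the file.
try:
    load_backend_kwargs(CFG_TEXT.replace("\n    }", "\n}"))
except configparser.ParsingError as exc:
    print("rejected:", type(exc).__name__)
```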

Custom Path Configuration

backend_kwargs = {
    "connections_path": "prod/airflow/connections",
    "variables_path": "prod/airflow/variables",
    "config_path": "prod/airflow/config",
    "url": "https://vault.example.com",
    "mount_point": "secret-v2",
    "auth_type": "approle",
    "role_id": "12345678-1234-1234-1234-123456789012",
    "secret_id": "abcdef12-3456-7890-abcd-ef1234567890"
}

Vault Secret Structure

Connection Secrets

Store Airflow connections in Vault with standard connection attributes:

{
  "conn_type": "postgres",
  "host": "db.example.com",
  "login": "airflow_user",
  "password": "secure_password",
  "schema": "airflow_db",
  "port": 5432,
  "extra": "{\"sslmode\": \"require\"}"
}

Vault path: {mount_point}/{connections_path}/{conn_id}

Variable Secrets

Store variables as simple key-value pairs:

{
  "value": "production_api_key_12345"
}

Vault path: {mount_point}/{variables_path}/{variable_key}

Configuration Secrets

Store configuration values for dynamic Airflow configuration:

{
  "value": "INFO"
}

Vault path: {mount_point}/{config_path}/{config_key}

Usage in DAGs

Once the backend is configured, DAG code needs no changes; connection IDs and variable keys are resolved through Vault:

from datetime import datetime

from airflow import DAG
from airflow.models import Variable
from airflow.providers.postgres.operators.postgres import PostgresOperator

# dag_id and start_date are illustrative values
with DAG(dag_id='vault_secrets_example', start_date=datetime(2024, 1, 1)) as dag:
    # Connection is looked up in Vault when the task runs
    postgres_task = PostgresOperator(
        task_id='run_query',
        postgres_conn_id='postgres_default',  # Retrieved from Vault
        sql='SELECT * FROM users;',
    )

# Variable is looked up in Vault; at module level this runs on every DAG parse
api_key = Variable.get('api_key')  # Retrieved from Vault

Path Resolution

The backend constructs Vault paths using the pattern: {mount_point}/{path_type}/{identifier}

Examples:

  • Connection postgres_default: secret/connections/postgres_default
  • Variable api_key: secret/variables/api_key
  • Config logging_level: secret/config/logging_level

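Set mount_point to None to supply the mount point per secret instead of globally: each connection ID, variable key, or config key must then carry the mount point as its first path segment.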
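
The path pattern above can be sketched as a small pure function. This is an illustration of the documented pattern, not the provider's code; resolve_path is a hypothetical helper, and the handling of a None mount point (taking the first segment of the identifier as the mount) is an assumption.

```python
from typing import Optional, Tuple


def resolve_path(mount_point: Optional[str], base_path: str, identifier: str) -> Tuple[str, str]:
    """Return (mount point, path below the mount) for one secret lookup."""
    if mount_point is None:
        # Assumption: with no configured mount point, the identifier itself
        # starts with the mount, e.g. "kv/api_key".
        mount_point, sep, identifier = identifier.partition("/")
        if not sep:
            raise ValueError("identifier must start with '<mount_point>/' when mount_point is None")
    return mount_point, f"{base_path}/{identifier}"


mount, path = resolve_path("secret", "connections", "postgres_default")
print(f"{mount}/{path}")  # secret/connections/postgres_default

mount, path = resolve_path(None, "variables", "kv/api_key")
print(f"{mount}/{path}")  # kv/variables/api_key
```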

Types

# External types used in API signatures
from airflow.models.connection import Connection
from airflow.secrets import BaseSecretsBackend
from airflow.utils.log.logging_mixin import LoggingMixin

Install with Tessl CLI

npx tessl i tessl/pypi-apache-airflow-providers-hashicorp
