Apache Airflow provider package for HashiCorp Vault integration, enabling secret management and authentication within Airflow workflows.
VaultBackend retrieves Airflow connections, variables, and configuration values from HashiCorp Vault. It plugs into Airflow's secrets backend mechanism, so DAGs and operators can read secrets stored in Vault without code changes.
Secrets backend implementation that retrieves Airflow secrets from HashiCorp Vault with configurable path mapping and authentication.
class VaultBackend(BaseSecretsBackend, LoggingMixin):
    def __init__(
        self,
        connections_path: str | None = "connections",
        variables_path: str | None = "variables",
        config_path: str | None = "config",
        url: str | None = None,
        auth_type: str = "token",
        auth_mount_point: str | None = None,
        mount_point: str | None = "secret",
        kv_engine_version: int = 2,
        token: str | None = None,
        token_path: str | None = None,
        username: str | None = None,
        password: str | None = None,
        key_id: str | None = None,
        secret_id: str | None = None,
        role_id: str | None = None,
        assume_role_kwargs: dict | None = None,
        region: str | None = None,
        kubernetes_role: str | None = None,
        kubernetes_jwt_path: str = "/var/run/secrets/kubernetes.io/serviceaccount/token",
        gcp_key_path: str | None = None,
        gcp_keyfile_dict: dict | None = None,
        gcp_scopes: str | None = None,
        azure_tenant_id: str | None = None,
        azure_resource: str | None = None,
        radius_host: str | None = None,
        radius_secret: str | None = None,
        radius_port: int | None = None,
        **kwargs
    ):
        """
        Initialize VaultBackend with path and authentication configuration.
        Parameters:
        - connections_path: Vault path for Airflow connections (default: "connections")
        - variables_path: Vault path for Airflow variables (default: "variables")
        - config_path: Vault path for Airflow configurations (default: "config")
        - url: Base URL for Vault instance
        - auth_type: Authentication method (default: "token")
        - auth_mount_point: Mount point for authentication method
        - mount_point: Secret engine mount point (default: "secret")
        - kv_engine_version: KV engine version (1 or 2, default: 2)
        - token: Authentication token (for token/github auth)
        - token_path: Path to token file (for token/github auth)
        - username: Username (for ldap/userpass auth)
        - password: Password (for ldap/userpass auth)
        - key_id: Key ID (for aws_iam/azure auth)
        - secret_id: Secret ID (for approle/aws_iam/azure auth)
        - role_id: Role ID (for approle/aws_iam auth)
        - assume_role_kwargs: AWS assume role parameters
        - region: AWS region for STS API calls
        - kubernetes_role: Kubernetes authentication role
        - kubernetes_jwt_path: Path to Kubernetes JWT token
        - gcp_key_path: Path to GCP service account key file
        - gcp_keyfile_dict: GCP keyfile parameters as dictionary
        - gcp_scopes: OAuth2 scopes for GCP authentication
        - azure_tenant_id: Azure AD tenant ID
        - azure_resource: Azure application URL
        - radius_host: RADIUS server host
        - radius_secret: RADIUS shared secret
        - radius_port: RADIUS server port
        """

Retrieve Airflow connection objects from Vault with automatic deserialization.
def get_connection(self, conn_id: str) -> Connection | None:
    """
    Retrieve Airflow connection from Vault.
    Parameters:
    - conn_id: Connection identifier
    Returns:
    Connection | None: Airflow Connection object or None if not found
    """

def get_response(self, conn_id: str) -> dict | None:
    """
    Get raw response dictionary for connection from Vault.
    Parameters:
    - conn_id: Connection identifier
    Returns:
    dict | None: Raw connection data or None if not found
    """

Access Airflow variables stored in Vault as key-value pairs.
def get_variable(self, key: str) -> str | None:
    """
    Retrieve Airflow variable from Vault.
    Parameters:
    - key: Variable key name
    Returns:
    str | None: Variable value as string or None if not found
    """

Retrieve Airflow configuration values from Vault for dynamic configuration management.
def get_config(self, key: str) -> str | None:
    """
    Retrieve Airflow configuration value from Vault.
    Parameters:
    - key: Configuration key name
    Returns:
    str | None: Configuration value as string or None if not found
    """

Configure VaultBackend in airflow.cfg:
[secrets]
backend = airflow.providers.hashicorp.secrets.vault.VaultBackend
backend_kwargs = {
    "connections_path": "connections",
    "variables_path": "variables",
    "url": "http://127.0.0.1:8200",
    "mount_point": "secret",
    "auth_type": "token",
    "token": "your-vault-token"
    }

[secrets]
backend = airflow.providers.hashicorp.secrets.vault.VaultBackend
backend_kwargs = {
    "connections_path": "airflow/connections",
    "variables_path": "airflow/variables",
    "config_path": "airflow/config",
    "url": "https://vault.company.com:8200",
    "mount_point": "kv",
    "kv_engine_version": 2,
    "auth_type": "kubernetes",
    "kubernetes_role": "airflow-prod",
    "kubernetes_jwt_path": "/var/run/secrets/kubernetes.io/serviceaccount/token"
    }

backend_kwargs = {
    "connections_path": "prod/airflow/connections",
    "variables_path": "prod/airflow/variables",
    "config_path": "prod/airflow/config",
    "url": "https://vault.example.com",
    "mount_point": "secret-v2",
    "auth_type": "approle",
    "role_id": "12345678-1234-1234-1234-123456789012",
    "secret_id": "abcdef12-3456-7890-abcd-ef1234567890"
    }

Store Airflow connections in Vault with standard connection attributes:
{
    "conn_type": "postgres",
    "host": "db.example.com",
    "login": "airflow_user",
    "password": "secure_password",
    "schema": "airflow_db",
    "port": 5432,
    "extra": "{\"sslmode\": \"require\"}"
}

Vault path: {mount_point}/{connections_path}/{conn_id}
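To make the shape of a stored connection concrete, here is a minimal sketch of how the key-value pairs of such a secret could map onto connection fields. `ConnectionFields` and `fields_from_secret` are hypothetical names used only for illustration; they stand in for Airflow's `Connection` class and are not the provider's implementation.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any


@dataclass
class ConnectionFields:
    """Hypothetical stand-in for Airflow's Connection, for illustration only."""

    conn_id: str
    conn_type: str | None = None
    host: str | None = None
    login: str | None = None
    password: str | None = None
    schema: str | None = None
    port: int | None = None
    extra: str | None = None

    def extra_dict(self) -> dict[str, Any]:
        """Decode the JSON-encoded `extra` string; empty dict if unset."""
        return json.loads(self.extra) if self.extra else {}


def fields_from_secret(conn_id: str, secret: dict[str, Any]) -> ConnectionFields:
    """Map the key-value pairs of a Vault secret onto connection fields."""
    return ConnectionFields(conn_id=conn_id, **secret)


# The secret shown above, as it would look after JSON decoding.
secret = {
    "conn_type": "postgres",
    "host": "db.example.com",
    "login": "airflow_user",
    "password": "secure_password",
    "schema": "airflow_db",
    "port": 5432,
    "extra": "{\"sslmode\": \"require\"}",
}

conn = fields_from_secret("postgres_default", secret)
```

Note that `extra` is itself a JSON string nested inside the secret, which is why it needs a second decoding step before its keys (here `sslmode`) can be read.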
Store variables as simple key-value pairs:
{
    "value": "production_api_key_12345"
}

Vault path: {mount_point}/{variables_path}/{variable_key}
Store configuration values for dynamic Airflow configuration:
{
    "value": "INFO"
}

Vault path: {mount_point}/{config_path}/{config_key}
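Variables and configuration values share the same single-key layout, so reading either one amounts to pulling the `value` entry out of the secret. The sketch below illustrates that step under the assumption that a missing secret is represented as `None`; `value_from_secret` is a hypothetical helper, not a function of the provider.

```python
from __future__ import annotations

from typing import Any


def value_from_secret(secret: dict[str, Any] | None) -> str | None:
    """Return the "value" entry of a secret, or None if the secret or key is absent."""
    if secret is None:
        return None
    return secret.get("value")


# A variable and a configuration entry, stored in the layout shown above.
api_key = value_from_secret({"value": "production_api_key_12345"})
log_level = value_from_secret({"value": "INFO"})
missing = value_from_secret(None)
```

Returning `None` for a missing secret matches the `str | None` return type documented for `get_variable` and `get_config`.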
Once configured, secrets are automatically retrieved without code changes:
from airflow import DAG
from airflow.providers.postgres.operators.postgres import PostgresOperator
from airflow.models import Variable

# Connection automatically retrieved from Vault
postgres_task = PostgresOperator(
    task_id='run_query',
    postgres_conn_id='postgres_default',  # Retrieved from Vault
    sql='SELECT * FROM users;'
)

# Variable automatically retrieved from Vault
api_key = Variable.get('api_key')  # Retrieved from Vault

The backend constructs Vault paths using the pattern:
{mount_point}/{path_type}/{identifier}
Examples:
postgres_default: secret/connections/postgres_default
api_key: secret/variables/api_key
logging_level: secret/config/logging_level

Set mount_point to None to disable the mount point prefix and use full paths directly.
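The path pattern can be sketched as a small helper. This is a simplified illustration of the documented pattern, not the provider's code: `build_secret_path` is a hypothetical name, and treating `mount_point=None` as "omit the prefix" is an assumption based on the sentence above.

```python
from __future__ import annotations


def build_secret_path(mount_point: str | None, base_path: str, identifier: str) -> str:
    """Join the non-empty segments as {mount_point}/{path_type}/{identifier}."""
    parts = [mount_point, base_path, identifier]
    # Skip a missing mount point and trim stray slashes so segments join cleanly.
    return "/".join(part.strip("/") for part in parts if part)


conn_path = build_secret_path("secret", "connections", "postgres_default")
var_path = build_secret_path("secret", "variables", "api_key")
no_mount = build_secret_path(None, "airflow/connections", "postgres_default")
```

With the default settings, `conn_path` and `var_path` reproduce the first two examples listed above; `no_mount` shows the path without a mount point prefix.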
# External types used in API signatures
from airflow.models.connection import Connection
from airflow.secrets import BaseSecretsBackend
from airflow.utils.log.logging_mixin import LoggingMixin

Install with Tessl CLI
npx tessl i tessl/pypi-apache-airflow-providers-hashicorp