A curated collection of Agent Skills for working with dbt, to help AI agents understand and execute dbt workflows more effectively.
65
82%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Risky
Do not use without reviewing
Systematically diagnose and resolve dbt Cloud job failures using available MCP tools, CLI commands, and data investigation.
Not for: Local dbt development errors - use the skill using-dbt-for-analytics-engineering instead
Never modify a test to make it pass without understanding why it's failing.
A failing test is evidence of a problem. Changing the test to pass hides the problem. Investigate the root cause first.
| You're Thinking... | Reality |
|---|---|
| "Just make the test pass" | The test is telling you something is wrong. Investigate first. |
| "There's a board meeting in 2 hours" | Rushing to a fix without diagnosis creates bigger problems. |
| "We've already spent 2 days on this" | Sunk cost doesn't justify skipping proper diagnosis. |
| "I'll just update the accepted values" | Are the new values valid business data or bugs? Verify first. |
| "It's probably just a flaky test" | "Flaky" means there's an overall issue. Find it. We don't allow flaky tests to stay. |
flowchart TD
A[Job failure reported] --> B{MCP Admin API available?}
B -->|yes| C[Use list_jobs_runs to get history]
B -->|no| D[Ask user for logs and run_results.json]
C --> E[Use get_job_run_error for details]
D --> F[Classify error type]
E --> F
F --> G{Error type?}
G -->|Infrastructure| H[Check warehouse, connections, timeouts]
G -->|Code/Compilation| I[Check git history for recent changes]
G -->|Data/Test Failure| J[Use discovering-data skill to investigate]
H --> K{Root cause found?}
I --> K
J --> K
K -->|yes| L[Create branch, implement fix]
K -->|no| M[Create findings document]
L --> N[Add test - prefer unit test]
N --> O[Create PR with explanation]
M --> P[Document what was checked and next steps]Use these tools first - they provide the most comprehensive data:
| Tool | Purpose |
|---|---|
list_jobs_runs | Get recent run history, identify patterns |
get_job_run_error | Get detailed error message and context |
# Example: Get recent runs for job 12345
list_jobs_runs(job_id=12345, limit=10)
# Example: Get error details for specific run
get_job_run_error(run_id=67890)Ask the user to provide these artifacts:
run_results.json - contains execution status for each nodeTo get the run_results.json, generate the artifact URL for the user:
https://<DBT_ENDPOINT>/api/v2/accounts/<ACCOUNT_ID>/runs/<RUN_ID>/artifacts/run_results.json?step=<STEP_NUMBER>Where:
<DBT_ENDPOINT> - The dbt Cloud endpoint. e.g
cloud.getdbt.com for the US multi-tenant platform (there are other endpoints for other regions)ACCOUNT_PREFIX.us1.dbt.com for the cell-based platforms (there are different cell endpoints for different regions and cloud providers)<ACCOUNT_ID> - The dbt Cloud account ID<RUN_ID> - The failed job run ID<STEP_NUMBER> - The step that failed (e.g., if step 4 failed, use ?step=4)Example request:
"I don't have access to the dbt MCP server. Could you provide:
- The debug logs from dbt Cloud (Job Run → Logs → Download)
- The run_results.json - open this URL and copy/paste or upload the contents:
https://cloud.getdbt.com/api/v2/accounts/12345/runs/67890/artifacts/run_results.json?step=4
| Error Type | Indicators | Primary Investigation |
|---|---|---|
| Infrastructure | Connection timeout, warehouse error, permissions | Check warehouse status, connection settings |
| Code/Compilation | Undefined macro, syntax error, parsing error | Check git history for recent changes, use LSP tools |
| Data/Test Failure | Test failed with N results, schema mismatch | Use discovering-data skill to query actual data |
Check git history for recent changes:
If you're not in the dbt project directory, use the dbt MCP server to find the repository:
# Get project details including repository URL and project subdirectory
get_project_details(project_id=<project_id>)The response includes:
repository - The git repository URLdbt_project_subdirectory - Optional subfolder where the dbt project lives (e.g., dbt/, transform/analytics/)Then either:
gh CLI if it's on GitHubgit clone <repo_url> /tmp/dbt-investigationImportant: If the project is in a subfolder, navigate to it after cloning:
cd /tmp/dbt-investigation/<project_subdirectory>Once in the project directory:
git log --oneline -20
git diff HEAD~5..HEAD -- models/ macros/Use the CLI and LSP tools from the dbt MCP server or use the dbt CLI to check for errors:
If the dbt MCP server is available, use its tools:
# CLI tools
mcp__dbt_parse() # Check for parsing errors
mcp__dbt_list_models() # With selectos and `+` for finding models dependencies
mcp__dbt_compile(models="failing_model") # Check compilation
# LSP tools
mcp__dbt_get_column_lineage() # Check column lineageOtherwise, use the dbt CLI directly:
dbt parse # Check for parsing errors
dbt list --select +failing_model # Check for models upstream of the failing model
dbt compile --select failing_model # Check compilationSearch for the error pattern:
Use the discovering-data skill to investigate the actual data.
Get the test SQL
dbt compile --select project_name.folder1.folder2.test_unique_name --output jsonthe full path for the test can be found with a dbt ls --resource-type test command
Query the failing test's underlying data:
dbt show --inline "<query_from_the_test_SQL>" --output jsonCompare to recent git changes:
Create a new branch:
git checkout -b fix/job-failure-<description>Implement the fix addressing the actual root cause
Add a test to prevent recurrence:
unit_tests:
- name: test_status_mapping
model: orders
given:
- input: ref('stg_orders')
rows:
- {status_code: 1, expected_status: 'pending'}
- {status_code: 2, expected_status: 'shipped'}
expect:
rows:
- {status: 'pending'}
- {status: 'shipped'}Create a PR with:
Do not guess. Create a findings document.
Create docs/investigations/job-failure-<date>.md:
# Job Failure Investigation: <Job Name>
**Date:** YYYY-MM-DD
**Job ID:** <id>
**Status:** Unresolved
## Summary
Brief description of the failure and symptoms.
## What Was Checked
### Tools Used
- [ ] list_jobs_runs - findings
- [ ] get_job_run_error - findings
- [ ] git history - findings
- [ ] Data investigation - findings
### Hypotheses Tested
| Hypothesis | Evidence | Result |
|------------|----------|--------|
| Recent code change | No changes to affected models in 7 days | Ruled out |
## Patterns Observed
- Failures occur between 2-4 AM (peak load time?)
- Always fails on model X
## Suggested Next Steps
1. [ ] Check the data ingestion process to see if new data was added
2. [ ] Check if a new version of dbt or of the dbt adapter was released
## Related Resources
- Link to job run logs
- Link to relevant documentationCommit this document to the repository so findings aren't lost.
| Task | Tool/Command |
|---|---|
| Get job run history | list_jobs_runs (MCP) |
| Get detailed error | get_job_run_error (MCP) |
| Check recent git changes | git log --oneline -20 |
| Parse project | dbt parse |
| Compile specific model | dbt compile --select model_name |
| Query data | dbt show --inline "SELECT ..." --output json |
| Run specific test | dbt test --select test_name |
Modifying tests to pass without investigation
Skipping git history review
Not documenting when unresolved
Making best-guess fixes under pressure
Ignoring data investigation for test failures
evals
skills
adding-dbt-unit-test
references
answering-natural-language-questions-with-dbt
building-dbt-semantic-layer
configuring-dbt-mcp-server
fetching-dbt-docs
scripts
migrating-dbt-core-to-fusion
running-dbt-commands
troubleshooting-dbt-job-errors
using-dbt-for-analytics-engineering