Migrate an MLflow ResponsesAgent from Databricks Model Serving to Databricks Apps. Use when: (1) User wants to migrate from Model Serving to Apps, (2) User has a ResponsesAgent with predict()/predict_stream() methods, (3) User wants to convert to @invoke/@stream decorators.
This guide instructs LLM coding agents how to migrate an MLflow ResponsesAgent from Databricks Model Serving to Databricks Apps.
Goal: Migrate an agent deployed on Databricks Model Serving (using ResponsesAgent with predict()/predict_stream()) to Databricks Apps (using MLflow GenAI Server with @invoke/@stream decorators).
Key Transformation:
`predict()` and `predict_stream()` methods on a class → `@invoke` and `@stream` decorators (sync or async, based on user preference)

Deliverables: After migration is complete, you will have:
<working-directory>/
├── original_mlflow_model/ # Downloaded artifacts from Model Serving
│ ├── MLmodel
│ ├── code/
│ │ └── agent.py
│ ├── input_example.json
│ └── requirements.txt
│
└── <app-name>/ # New Databricks App (ready to deploy)
├── agent_server/
│ ├── agent.py # Migrated agent code
│ └── ...
├── databricks.yml # Bundle config with resources
├── pyproject.toml
├── uv.lock
└── ...
`<app-name>` is the name the user provides at the start of the migration. It is used as both the directory name and the Databricks App name at deploy time.
Before doing anything else, ask the user three questions. Use the AskUserQuestion tool to collect all answers at once so the user is only prompted once; Claude can then execute the rest of the migration autonomously.
Questions to ask:
1. Which profile? Run `databricks auth profiles` first to list available profiles and their workspaces, then present the options to the user.
2. Async or sync migration?
   - Async: converts the agent's I/O to `await`/`async for`, enabling higher concurrency on smaller compute — no more threads sitting idle while waiting for LLM responses or long-running tool calls.
   - Sync: keeps the original logic from the ResponsesAgent class and wraps it with `@invoke`/`@stream` decorators. Simpler migration, but each request blocks a thread while waiting for I/O.
3. What should the app be named?

Store the answers as:

- `<profile>` — used for ALL databricks CLI commands throughout the migration (via `--profile <profile>`)
- `<app-name>` — used as both the directory name for the migrated app AND the app name when deploying with `databricks bundle deploy`
- `<async>` — yes or no; determines whether to convert the agent code to async or keep it synchronous

After receiving the user's answers, validate the selected profile:
databricks current-user me --profile <profile>

If this fails with an authentication error, prompt the user to re-authenticate:

databricks auth login --profile <profile>

Important: Remember to include `--profile <profile>` on every `databricks` CLI command throughout the migration.
Copy all scaffold files from the current working directory into a new directory named <app-name>/. Exclude instruction files (AGENTS.md, CLAUDE.md), hidden directories (.claude/, .git/), and any migration artifacts (e.g., original_mlflow_model/, .migration-venv/). Do NOT search for or copy scaffold files from other directories or templates — everything you need is right here.
All subsequent migration steps operate inside the <app-name>/ directory.
Note: The `agent_server/agent.py` scaffold is intentionally framework-agnostic — it contains the `@invoke`/`@stream` decorator pattern with TODO placeholders. Step 3 (Migrate the Agent Code) will replace these placeholders with the actual agent logic from the original Model Serving endpoint.
Create a task list to track progress. This helps the user follow along and see what's completed, in progress, and pending.
User tip: Press `Ctrl+T` to toggle the task list view in your terminal. The display shows up to 10 tasks at a time with status indicators.
Create the following tasks using the TaskCreate tool:
| Task | Description |
|---|---|
| Authenticate to Databricks | Verify Databricks CLI authentication and validate the selected profile |
| Download original agent artifacts | Download the MLflow model artifacts from the Model Serving endpoint |
| Analyze and understand agent code | Examine the original agent code, identify tools, resources, and dependencies |
| Migrate agent code to Apps format | Transform ResponsesAgent class to @invoke/@stream decorated functions |
| Set up and configure the app | Install dependencies, run quickstart, configure environment |
| Test agent locally | Start local server and verify the agent works correctly |
| Deploy to Databricks Apps | Configure databricks.yml resources and deploy with Databricks Asset Bundles |
| Test deployed app | Verify the deployed app responds correctly |
Update task status as you progress:
- Mark each task `in_progress` when starting a step
- Mark it `completed` when finished

Task: Mark "Authenticate to Databricks" as `completed`. Mark "Download original agent artifacts" as `in_progress`.

Note: The `<profile>` and `<app-name>` values were collected from the user in the "Before You Begin" section. Use them throughout.
Download the original agent code from the Model Serving endpoint. This requires an environment with MLflow installed to access the model artifacts — the `uv run` command below provides one ephemerally.
If you have a serving endpoint name, extract the model details:
# Get endpoint info (remember to include --profile if using non-default)
databricks serving-endpoints get <endpoint-name> --profile <profile> --output json

Look for `served_entities[0].entity_name` (model name) and `entity_version` in the response. Find the entity with 100% traffic in `traffic_config.routes`.
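If you'd rather extract these fields programmatically, here is a small sketch. It assumes you redirected the CLI output to `endpoint.json`; the exact nesting can vary by API version, so adjust the keys to what you actually see:

```python
import json

with open("endpoint.json") as f:  # output of: databricks serving-endpoints get ... > endpoint.json
    endpoint = json.load(f)

config = endpoint.get("config", endpoint)  # some responses nest these fields under "config"

# Entity serving the traffic
entity = config["served_entities"][0]
print("model:", entity["entity_name"], "version:", entity["entity_version"])

# Route receiving 100% of traffic, if more than one entity is served
routes = config.get("traffic_config", {}).get("routes", [])
full_route = next((r for r in routes if r.get("traffic_percentage") == 100), None)
if full_route:
    print("100% traffic route:", full_route.get("served_model_name"))
```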
Use uv run --with to download artifacts without creating a separate virtual environment. The mlflow[databricks] extra includes boto3 for Unity Catalog artifact access:
DATABRICKS_CONFIG_PROFILE=<profile> uv run --no-project \
--with "mlflow[databricks]>=2.15.0" \
--with "databricks-sdk>=0.30.0" \
python3 << 'EOF'
import mlflow
mlflow.set_tracking_uri("databricks")
# Replace with actual values from step 1.1
MODEL_NAME = "<model-name>"
VERSION = "<version>"
print(f"Downloading model: models:/{MODEL_NAME}/{VERSION}")
mlflow.artifacts.download_artifacts(
artifact_uri=f"models:/{MODEL_NAME}/{VERSION}",
dst_path="./original_mlflow_model"
)
print("Download complete! Artifacts saved to ./original_mlflow_model")
EOF

Check that the key files exist and understand the full structure:
# List all downloaded files recursively
find ./original_mlflow_model -type f | head -50
# Check for MLmodel file (contains resource requirements)
cat ./original_mlflow_model/MLmodel
# Check for input example (useful for testing)
cat ./original_mlflow_model/input_example.json 2>/dev/null

Examine the /code folder — it contains all code dependencies logged via code_paths=["..."]:
# List all code files
ls -la ./original_mlflow_model/code/
# The main agent is typically agent.py, but there may be additional modules
find ./original_mlflow_model/code -name "*.py" -type f

Examine the /artifacts folder (if present) — it contains artifacts logged via artifacts={...}:
# Check for artifacts folder
ls -la ./original_mlflow_model/artifacts/ 2>/dev/null
# List all artifacts
find ./original_mlflow_model/artifacts -type f 2>/dev/null

Important: Take note of ALL files in /code and /artifacts. You will need to copy these to the migrated app and ensure imports still work correctly.
After successful download, you should have:
./original_mlflow_model/
├── MLmodel # Model metadata and resource requirements
├── code/ # Code logged via code_paths=["..."]
│ ├── agent.py # Main agent implementation
│ ├── utils.py # (optional) Helper modules
│ ├── tools.py # (optional) Custom tool definitions
│ └── ... # Any other code dependencies
├── artifacts/ # (optional) Artifacts logged via artifacts={...}
│ ├── config.yaml # (optional) Configuration files
│ ├── prompts/ # (optional) Prompt templates
│ └── ... # Any other artifacts (data files, etc.)
├── input_example.json # Sample request for testing
├── requirements.txt # Original dependencies
└── ...

Key files:

- `code/agent.py` — Contains the ResponsesAgent class with predict() and predict_stream() methods
- `code/*.py` — Any additional Python modules the agent imports
- `MLmodel` — Contains the resources section listing required Databricks resources
- `artifacts/` — Any configuration files, prompts, or data files the agent uses
- `input_example.json` — Use this to test the migrated agent

"Unable to import necessary dependencies to access model version files in Unity Catalog"
This means boto3 is missing. Ensure you're using mlflow[databricks] (not just mlflow) in the --with flag — the [databricks] extra includes boto3.
"INVALID_PARAMETER_VALUE" or authentication errors Re-authenticate with Databricks (include profile if non-default):
databricks auth login --profile <profile>

Wrong workspace / Model not found: Make sure you're using the correct profile that corresponds to the workspace where the model is deployed:
# List profiles to see which workspace each points to
databricks auth profiles
# Verify you can access the workspace
databricks current-user me --profile <profile>
# List models in that workspace
databricks registered-models list --profile <profile>
databricks model-versions list --name "<model-name>" --profile <profile>

Task: Mark "Download original agent artifacts" as `completed`. Mark "Analyze and understand agent code" as `in_progress`.
In both cases, the ResponsesAgent class is replaced with decorated functions. The difference is whether those functions are async or sync.
Model Serving (OLD):
from mlflow.pyfunc import ResponsesAgent, ResponsesAgentRequest, ResponsesAgentResponse
class MyAgent(ResponsesAgent):
def predict(self, request: ResponsesAgentRequest, params=None) -> ResponsesAgentResponse:
# Synchronous implementation
...
return ResponsesAgentResponse(output=outputs)
def predict_stream(self, request: ResponsesAgentRequest, params=None):
# Synchronous generator
for chunk in ...:
yield ResponsesAgentStreamEvent(...)

Apps — Async (if `<async>` = yes):
from typing import AsyncGenerator
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
ResponsesAgentRequest,
ResponsesAgentResponse,
ResponsesAgentStreamEvent,
)
@invoke()
async def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
# Async implementation - typically calls streaming() and collects results
outputs = [
event.item
async for event in streaming(request)
if event.type == "response.output_item.done"
]
return ResponsesAgentResponse(output=outputs)
@stream()
async def streaming(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
# Async generator
async for event in ...:
yield event

Apps — Sync (if `<async>` = no):
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
ResponsesAgentRequest,
ResponsesAgentResponse,
ResponsesAgentStreamEvent,
)
@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
# Same sync logic from original predict(), extracted from the class
...
return ResponsesAgentResponse(output=outputs)
@stream()
def streaming(request: ResponsesAgentRequest):
# Same sync generator from original predict_stream(), extracted from the class
for chunk in ...:
yield ResponsesAgentStreamEvent(...)

| Aspect | Model Serving | Apps (async) | Apps (sync) |
|---|---|---|---|
| Structure | class MyAgent(ResponsesAgent) | Decorated functions | Decorated functions |
| Functions | def predict() / def predict_stream() | async def with await | def (same as original) |
| Streaming | Sync generator (yield) | Async generator (async for / yield) | Sync generator (yield) |
| Server | MLflow Model Server | MLflow GenAI Server (FastAPI) | MLflow GenAI Server (FastAPI) |
| Deployment | databricks_agents.deploy() | databricks bundle deploy + bundle run | databricks bundle deploy + bundle run |
Async conversion (if `<async>` = yes): Skip this section if the user chose synchronous migration. The sync path keeps all original I/O calls as-is.
All I/O operations must be converted to async:
# OLD (sync)
response = client.chat(messages)
# NEW (async)
response = await client.achat(messages)
# OLD (sync iteration)
for chunk in stream:
yield chunk
# NEW (async iteration)
async for chunk in stream:
yield chunk

Task: Mark "Analyze and understand agent code" as `completed`. Mark "Migrate agent code to Apps format" as `in_progress`.
The original MLflow model may contain multiple code files and artifacts that need to be migrated.
Copy all code files from /code to agent_server/:
# Copy all Python files from original code folder
cp ./original_mlflow_model/code/*.py ./<app-name>/agent_server/
# If there are subdirectories with code, copy those too
# cp -r ./original_mlflow_model/code/submodule ./<app-name>/agent_server/Copy artifacts (if present):
# Create an artifacts directory in the migrated app if needed
mkdir -p ./<app-name>/agent_server/artifacts
# Copy all artifacts
cp -r ./original_mlflow_model/artifacts/* ./<app-name>/agent_server/artifacts/ 2>/dev/null || true

Fix import paths after copying:
When code files are moved, imports may break. Check and update imports in all copied files:
# BEFORE (if files were in different locations):
from code.utils import helper_function
from artifacts.prompts import SYSTEM_PROMPT
# AFTER (files are now in agent_server/):
from agent_server.utils import helper_function
# Or if in same directory:
from .utils import helper_function
# For artifacts, update file paths:
# BEFORE:
with open("artifacts/config.yaml") as f:
# AFTER:
import os
config_path = os.path.join(os.path.dirname(__file__), "artifacts", "config.yaml")
with open(config_path) as f:

Important: Review each copied file and ensure all imports resolve correctly. The most common issues are:
- Relative imports that assumed a different directory structure
- Hardcoded file paths to artifacts
- Missing `__init__.py` files for package imports

A quick import smoke test (below) catches these early.
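A minimal sketch, run from the `<app-name>` directory — the module names listed are illustrative, so include whatever you actually copied:

```python
# Fails fast with a traceback if any copied module still has a broken import.
import importlib

for module in ("agent_server.agent",):  # add e.g. "agent_server.utils", "agent_server.tools"
    importlib.import_module(module)
    print(f"import OK: {module}")
```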
From the original agent code, identify and preserve:
- The LLM endpoint name (e.g., `databricks-claude-sonnet-4-5`)
- The system prompt
- Any custom tools and special pre/post-processing logic

The approach depends on whether the user chose async or sync migration.
Sync migration (if `<async>` = no): This is the minimal-changes path. Extract the logic from the ResponsesAgent class, wrap it with @invoke/@stream decorators, and keep all code synchronous.
Edit <app-name>/agent_server/agent.py:
from mlflow.genai.agent_server import invoke, stream
from mlflow.types.responses import (
ResponsesAgentRequest,
ResponsesAgentResponse,
ResponsesAgentStreamEvent,
)
# Move any class __init__ or class-level setup to module level
# e.g., client initialization, tool setup, etc.
@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
# Paste the body of the original predict() method here
# Remove 'self.' references — replace with module-level variables
# Remove 'params' parameter (not used in Apps)
...
return ResponsesAgentResponse(output=outputs)
@stream()
def streaming(request: ResponsesAgentRequest):
# Paste the body of the original predict_stream() method here
# Remove 'self.' references — replace with module-level variables
# Remove 'params' parameter (not used in Apps)
for chunk in ...:
yield ResponsesAgentStreamEvent(...)

Key changes from class to functions:

- Remove the `class MyAgent(ResponsesAgent):` wrapper
- Remove the `self` parameter from all methods
- Move `__init__` logic (client creation, tool setup) to module-level code
- Replace `self.some_attribute` with module-level variables
- Add the `@invoke()` decorator to the non-streaming function
- Add the `@stream()` decorator to the streaming function

Keep all other code as-is — no need to convert sync calls to async, no need to change `for` to `async for`, no need to add `await`.
Async migration (if `<async>` = yes): This path converts all I/O operations to async for higher concurrency. More changes are required, but the result is a more efficient server.
Edit <app-name>/agent_server/agent.py:
Update the LLM endpoint:
LLM_ENDPOINT_NAME = "<your-endpoint-from-original>"

Update the system prompt:

SYSTEM_PROMPT = """<your-system-prompt-from-original>"""

Add your custom tools: If your original agent had custom tools, add them:
from langchain_core.tools import tool
@tool
async def my_custom_tool(arg: str) -> str:
"""Tool description."""
# Your tool logic (make async if needed)
return result

Convert all I/O to async:

- `def predict()` → `async def non_streaming()`
- `def predict_stream()` → `async def streaming()`
- `client.chat()` → `await client.achat()`
- `for chunk in stream:` → `async for chunk in stream:`
- Blocking calls → their `await`-able async equivalents (see the sketch below for dependencies with no async API)

Preserve any special logic: Migrate any custom preprocessing, postprocessing, or business logic from the original agent.
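If a dependency exposes no async API at all, one common fallback (our suggestion, not part of the original scaffold) is `asyncio.to_thread`, which offloads the blocking call to a worker thread so the event loop stays free:

```python
import asyncio
import time

class LegacyClient:
    """Stand-in for a hypothetical sync-only dependency."""
    def search(self, query: str) -> str:
        time.sleep(1)  # simulates blocking I/O
        return f"results for {query}"

legacy_client = LegacyClient()

async def search_async(query: str) -> str:
    # Runs the blocking call in a worker thread; the event loop keeps serving other requests.
    return await asyncio.to_thread(legacy_client.search, query)
```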
If original uses checkpointer (short-term memory):
- Use a Lakebase-backed checkpointer (`AsyncCheckpointSaver` if async, or the sync equivalent if sync)
- Set `LAKEBASE_INSTANCE_NAME` in `.env`
- Pass the thread ID via `request.custom_inputs` or `request.context.conversation_id` (see the sketch below)

If original uses store (long-term memory):

- Use a Lakebase-backed store (`AsyncDatabricksStore` if async, or the sync equivalent if sync)
- Set `LAKEBASE_INSTANCE_NAME` in `.env`
- Pass the user ID via `request.custom_inputs` or `request.context.user_id`
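For illustration only — if the original agent follows LangGraph's checkpointer convention, the thread ID typically flows from the request into the run config like this (field names are assumptions; match them to your original agent):

```python
# Illustrative only: adapt to your agent's actual memory wiring.
from mlflow.types.responses import ResponsesAgentRequest

def build_run_config(request: ResponsesAgentRequest) -> dict:
    # Prefer an explicit thread_id from custom_inputs; fall back to the context field.
    custom = request.custom_inputs or {}
    thread_id = custom.get("thread_id") or getattr(request.context, "conversation_id", None)
    return {"configurable": {"thread_id": thread_id}}
```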
completed. Mark "Set up and configure the app" asin_progress.
Before installing dependencies, ensure a README file exists (hatchling requires one):
# Create a minimal README if one doesn't exist
if [ ! -f "README.md" ]; then
echo "# Migrated Agent App" > README.md
fi

Install dependencies:

cd <app-name>
uv sync
uv lock

IMPORTANT: The `uv.lock` file must be committed to version control. Databricks Apps detects `pyproject.toml` + `uv.lock` (with no `requirements.txt`) and uses `uv` to install dependencies. If a `requirements.txt` exists, it takes priority and `uv.lock` is ignored.
Run the quickstart script to set up your local environment. This is the recommended way to configure the app, as it handles all necessary setup automatically.
uv run quickstart

This script will:

- Create `.env` with the necessary environment variables
- Create the MLflow experiment the app needs (see note below)

Important: The quickstart script creates the MLflow experiment that the app needs for logging traces and models. This experiment will be added as a resource when deploying the app.
If there are issues with the quickstart script, refer to the manual setup in section 4.5.
If you need to manually configure the environment or add additional variables, edit .env:
# Databricks authentication
DATABRICKS_CONFIG_PROFILE=<your-profile>
# MLflow experiment (created by quickstart, or create manually)
MLFLOW_EXPERIMENT_ID=<experiment-id>
# Example: Lakebase for stateful agents
LAKEBASE_INSTANCE_NAME=<your-lakebase-instance>
# Example: Custom API keys
MY_API_KEY=<value>

To manually create an MLflow experiment:

databricks experiments create-experiment "/Users/<your-username>/<app-name>" --profile <profile>

Task: Mark "Set up and configure the app" as `completed`. Mark "Test agent locally" as `in_progress`.
Test your migrated agent locally before deploying to Databricks Apps. This helps catch configuration issues early and ensures the agent works correctly.
After the quickstart setup is complete, start the agent server and chat app locally:
cd <app-name>
uv run start-appWait for the server to start. You should see output indicating the server is running on http://localhost:8000.
Note: If you only need the API endpoint (without the chat UI), you can run `uv run start-server` instead.
The original model artifacts include an input_example.json file that contains a sample request. Use this to verify your migrated agent produces the same behavior. If no valid sample request exists, construct one by inspecting the agent's code.
# Check the original input example (from the <app-name> directory)
cat ../original_mlflow_model/input_example.json

Example content:
{"input": [{"role": "user", "content": "What is an LLM agent?"}], "custom_inputs": {"thread_id": "example-thread-123"}}Test your local server with this input:
# Test with the original input example
curl -X POST http://localhost:8000/invocations \
-H "Content-Type: application/json" \
-d "$(cat ../original_mlflow_model/input_example.json)"# Non-streaming
curl -X POST http://localhost:8000/invocations \
-H "Content-Type: application/json" \
-d '{"input": [{"role": "user", "content": "Hello!"}]}'
# Streaming
curl -X POST http://localhost:8000/invocations \
-H "Content-Type: application/json" \
-d '{"input": [{"role": "user", "content": "Hello!"}], "stream": true}'# With thread_id for short-term memory
curl -X POST http://localhost:8000/invocations \
-H "Content-Type: application/json" \
-d '{"input": [{"role": "user", "content": "Hi"}], "custom_inputs": {"thread_id": "test-123"}}'
# With user_id for long-term memory
curl -X POST http://localhost:8000/invocations \
-H "Content-Type: application/json" \
-d '{"input": [{"role": "user", "content": "Hi"}], "custom_inputs": {"user_id": "user@example.com"}}'Before proceeding to deployment, ensure:
Before proceeding to deployment, ensure the non-streaming and streaming requests above both return valid responses.

Note: Only proceed to Step 6 (Deploy) after confirming the agent works correctly locally.
Task: Mark "Test agent locally" as
completed. Mark "Deploy to Databricks Apps" asin_progress.
This step uses Databricks Asset Bundles (DAB) to deploy. The scaffold includes a databricks.yml that you need to update with the app name and resources from the original model.
The original model's MLmodel file contains a resources section that lists all Databricks resources the agent needs access to. Check ../original_mlflow_model/MLmodel (or ./original_mlflow_model/MLmodel if you're in the parent directory) for content like:
resources:
api_version: '1'
databricks:
lakebase:
- name: lakebase
serving_endpoint:
- name: databricks-claude-sonnet-4-5
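To enumerate the resources programmatically instead of eyeballing the YAML, a short sketch (assumes PyYAML is available, e.g., via `uv run --with pyyaml`, and that the resources section matches the structure shown above — the exact layout can vary by MLflow version):

```python
import yaml

with open("../original_mlflow_model/MLmodel") as f:  # adjust the path to where you downloaded the artifacts
    mlmodel = yaml.safe_load(f)

# Walk resource types (serving_endpoint, lakebase, ...) and their entries.
for rtype, entries in mlmodel.get("resources", {}).get("databricks", {}).items():
    for entry in entries:
        print(f"{rtype}: {entry}")
```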
Update databricks.yml with Resources

The scaffold includes a databricks.yml with the experiment resource pre-configured. You need to:

1. Set the app name to `<app-name>` (the name provided by the user) in both the `resources.apps.agent_migration.name` field and the `targets.prod.resources.apps.agent_migration.name` field.
2. Add each resource listed in the original MLmodel file to the `resources.apps.agent_migration.resources` list.

Resource Type Mapping (MLmodel → databricks.yml):
| MLmodel Resource | databricks.yml Resource | Key Fields |
|---|---|---|
serving_endpoint | serving_endpoint | name, permission (CAN_QUERY) |
lakebase | database | database_name: databricks_postgres, instance_name, permission (CAN_CONNECT_AND_CREATE) |
vector_search_index | uc_securable | securable_full_name, securable_type: TABLE, permission: SELECT |
function | uc_securable | securable_full_name, securable_type: FUNCTION, permission: EXECUTE |
table | uc_securable | securable_full_name, securable_type: TABLE, permission: SELECT |
uc_connection | uc_securable | securable_full_name, securable_type: CONNECTION, permission: USE_CONNECTION |
sql_warehouse | sql_warehouse | id, permission (CAN_USE) |
genie_space | genie_space | space_id, permission (CAN_RUN) |
Note: The `experiment` resource is already configured in the scaffold `databricks.yml` and is automatically created by the bundle. You do not need to add it manually.
Example: databricks.yml for an agent with a serving endpoint and UC function:
resources:
experiments:
agent_migration_experiment:
name: /Users/${workspace.current_user.userName}/${bundle.name}-${bundle.target}
apps:
agent_migration:
name: "<app-name>" # Update to user's app name
description: "Migrated agent from Model Serving to Databricks Apps"
source_code_path: ./
resources:
- name: 'experiment'
experiment:
experiment_id: "${resources.experiments.agent_migration_experiment.id}"
permission: 'CAN_MANAGE'
- name: 'serving-endpoint'
serving_endpoint:
name: 'databricks-claude-sonnet-4-5'
permission: 'CAN_QUERY'
- name: 'python-exec'
uc_securable:
securable_full_name: 'system.ai.python_exec'
securable_type: 'FUNCTION'
permission: 'EXECUTE'
targets:
prod:
resources:
apps:
agent_migration:
name: "<app-name>" # Same name for productionExample: Adding Lakebase resources (for stateful agents):
- name: 'database'
database:
database_name: 'databricks_postgres'
instance_name: 'lakebase'
permission: 'CAN_CONNECT_AND_CREATE'

From inside the <app-name> directory, validate, deploy, and run:
# 1. Validate bundle configuration (catches errors before deploy)
databricks bundle validate --profile <profile>
# 2. Deploy the bundle (creates/updates resources, uploads files)
databricks bundle deploy --profile <profile>
# 3. Run the app (starts/restarts with uploaded source code) - REQUIRED!
databricks bundle run agent_migration --profile <profile>

Important: `bundle deploy` only uploads files and configures resources. `bundle run` is required to actually start/restart the app with the new code. If you only run `deploy`, the app will continue running old code!
Task: Mark "Deploy to Databricks Apps" as
completed. Mark "Test deployed app" asin_progress.
# Get the app URL
APP_URL=$(databricks apps get <app-name> --profile <profile> --output json | jq -r '.url')
# Get OAuth token
TOKEN=$(databricks auth token --profile <profile> | jq -r .access_token)
# Query the app
curl -X POST ${APP_URL}/invocations \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"input": [{"role": "user", "content": "Hello!"}]}'Once the deployed app responds successfully:
Task: Mark "Test deployed app" as
completed. Migration complete!
If you encounter issues during deployment, refer to the deploy skill for detailed guidance.
Debug commands:
# Validate bundle configuration
databricks bundle validate --profile <profile>
# View app logs
databricks apps logs <app-name> --profile <profile> --follow
# Check app status
databricks apps get <app-name> --profile <profile> --output json | jq '{app_status, compute_status}'
# Get app URL
databricks apps get <app-name> --profile <profile> --output json | jq -r '.url'"App already exists" error:
If databricks bundle deploy fails because the app already exists, refer to the deploy skill for instructions on binding an existing app to the bundle.
<app-name>/
├── agent_server/
│ ├── __init__.py
│ ├── agent.py # Main agent logic - THIS IS WHERE YOU MIGRATE TO
│ ├── start_server.py # FastAPI server setup
│ ├── utils.py # Helper utilities
│ └── evaluate_agent.py # Agent evaluation
├── scripts/
│ ├── __init__.py
│ ├── quickstart.py # Setup script
│ └── start_app.py # App startup
├── databricks.yml # Databricks Asset Bundle configuration (resources, config, targets)
├── pyproject.toml # Dependencies (for local dev with uv)
├── uv.lock # Lock file for reproducible deploys (must be committed)
├── .env.example # Environment template
└── README.md

IMPORTANT: The `uv.lock` file must be committed to version control. Databricks Apps detects `pyproject.toml` + `uv.lock` (with no `requirements.txt`) and uses `uv` for fully reproducible installs.
Original:
class ChatAgent(ResponsesAgent):
def predict(self, request, params=None):
messages = to_chat_completions_input(request.input)
response = self.llm.invoke(messages)
return ResponsesAgentResponse(output=[...])

Migrated (sync):
llm = ... # Move class-level init to module level
@invoke()
def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
messages = to_chat_completions_input(request.input)
response = llm.invoke(messages)
return ResponsesAgentResponse(output=[...])
@stream()
def streaming(request: ResponsesAgentRequest):
# Original predict_stream() body, with self. removed
...

Migrated (async):
@invoke()
async def non_streaming(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
outputs = [e.item async for e in streaming(request) if e.type == "response.output_item.done"]
return ResponsesAgentResponse(output=outputs)
@stream()
async def streaming(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
messages = {"messages": to_chat_completions_input([i.model_dump() for i in request.input])}
agent = await init_agent()
async for event in process_agent_astream_events(agent.astream(messages, stream_mode=["updates", "messages"])):
yield event

Sync: Keep tools as-is from the original code.
Async: Migrate tools to async LangChain tools:
from langchain_core.tools import tool
@tool
async def search_docs(query: str) -> str:
"""Search the documentation."""
results = await vector_store.asimilarity_search(query)
return format_results(results)

Example: initializing an agent with async MCP tools:

from langchain.agents import create_agent
from databricks_langchain import ChatDatabricks
async def init_agent():
tools = await mcp_client.get_tools() # MCP tools are async
model = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)
return create_agent(model=model, tools=tools, system_prompt=SYSTEM_PROMPT)

Common fixes:

- `uv sync` — reinstall dependencies
- `databricks auth login` — re-authenticate
- Use `await client.achat()`, not `client.chat()`
- Use `async for` instead of `for` when iterating async generators