CtrlK
BlogDocsLog inGet started
Tessl Logo

domino-ai-gateway

Access external LLM providers through Domino AI Gateway - a secure proxy with centralized API key management, usage monitoring, and compliance. Supports OpenAI, AWS Bedrock, Azure OpenAI, Anthropic, and more. Use when calling LLMs from Domino, configuring AI Gateway endpoints, or monitoring LLM usage and costs.

61

Quality

71%

Does it follow best practices?

Impact

No eval scenarios have been run

SecuritybySnyk

Risky

Do not use without reviewing

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/ai-gateway/SKILL.md
SKILL.md
Quality
Evals
Security

Domino AI Gateway Skill

Description

This skill helps users work with Domino AI Gateway - a secure proxy for accessing external Large Language Model (LLM) providers with centralized management, monitoring, and compliance.

Activation

Activate this skill when users want to:

  • Access LLM providers (OpenAI, AWS Bedrock, etc.) in Domino
  • Configure AI Gateway endpoints
  • Monitor LLM usage and costs
  • Understand secure API key management
  • Use LLMs in workspaces and jobs

What is AI Gateway?

Domino AI Gateway provides:

  • Secure LLM Access: Proxy to external LLM providers
  • Centralized API Keys: Keys stored securely, never exposed to users
  • Usage Monitoring: Track all LLM interactions
  • Access Control: Granular permissions per endpoint
  • Audit Logging: Compliance-ready logs

Supported LLM Providers

ProviderModels
OpenAIGPT-4, GPT-4 Turbo, GPT-3.5
AWS BedrockClaude, Titan, Llama 2
Azure OpenAIGPT-4, GPT-3.5
AnthropicClaude 3, Claude 2
Google Vertex AIPaLM, Gemini
CohereCommand, Embed

Creating an AI Gateway Endpoint

Via Domino UI

  1. Go to Endpoints > Gateway LLMs
  2. Click Create Endpoint
  3. Configure:
    • Name: Endpoint name (e.g., openai-gpt4)
    • Provider: Select LLM provider
    • Model: Specific model to use
    • API Key: Provider API key (stored securely)
    • Access: Who can use this endpoint
  4. Click Create

Via API

# Create endpoint via Domino API
import requests, os

TOKEN = requests.get("http://localhost:8899/access-token").text.strip()
BASE = os.environ["DOMINO_API_HOST"]

response = requests.post(
    f"{BASE}/api/aigateway/v1/endpoints",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "name": "openai-gpt4",
        "provider": "openai",
        "model": "gpt-4",
        "providerApiKey": "sk-..."
    }
)

Using AI Gateway in Code

OpenAI-Compatible Interface

AI Gateway provides an OpenAI-compatible interface:

from openai import OpenAI

# Configure client to use AI Gateway
client = OpenAI(
    api_key="not-needed",  # Handled by AI Gateway
    base_url="https://your-domino.com/api/aigateway/v1/openai"
)

# Use like standard OpenAI
response = client.chat.completions.create(
    model="openai-gpt4",  # Your endpoint name
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ]
)

print(response.choices[0].message.content)

With LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="openai-gpt4",  # Endpoint name
    openai_api_key="not-needed",
    openai_api_base="https://your-domino.com/api/aigateway/v1/openai"
)

response = llm.invoke("What is machine learning?")
print(response.content)

Direct API Call

import requests, os

TOKEN = requests.get("http://localhost:8899/access-token").text.strip()
BASE = os.environ["DOMINO_API_HOST"]

response = requests.post(
    f"{BASE}/api/aigateway/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {TOKEN}",
    },
    json={
        "model": "openai-gpt4",
        "messages": [{"role": "user", "content": "Hello!"}]
    }
)

result = response.json()
print(result["choices"][0]["message"]["content"])

Access Control

Endpoint Permissions

Configure who can use each endpoint:

  • Everyone: All Domino users
  • Specific Users: Named individuals
  • Organizations: Specific Domino organizations

Setting Permissions

  1. Go to endpoint settings
  2. Click Access Control
  3. Add users or organizations
  4. Save changes

Monitoring and Logging

View Usage

  1. Go to Endpoints > Gateway LLMs
  2. Click on endpoint name
  3. View metrics:
    • Request count
    • Token usage
    • Response times
    • Error rates

Download Logs

# Via UI: Endpoints > Gateway LLMs > Download logs

# Logs include:
# - Timestamp
# - User
# - Model
# - Input/Output tokens
# - Response time
# - Status

Log Format

{
  "timestamp": "2024-01-15T10:30:00Z",
  "user": "user@company.com",
  "endpoint": "openai-gpt4",
  "model": "gpt-4",
  "inputTokens": 150,
  "outputTokens": 200,
  "durationMs": 1500,
  "status": "success"
}

Cost Management

Track Costs

AI Gateway tracks token usage per:

  • User
  • Project
  • Endpoint
  • Time period

Set Limits (Admin)

Admins can configure:

  • Token limits per user/project
  • Request rate limits
  • Cost alerts

Best Practices

1. Use Endpoint Names Consistently

# Define endpoint once
LLM_ENDPOINT = "production-gpt4"

# Use throughout code
response = client.chat.completions.create(
    model=LLM_ENDPOINT,
    messages=[...]
)

2. Handle Rate Limits

import time
from openai import RateLimitError

def call_llm_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="openai-gpt4",
                messages=messages
            )
        except RateLimitError:
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
            else:
                raise

3. Log Important Calls

import logging

logger = logging.getLogger(__name__)

def query_llm(prompt):
    logger.info(f"Querying LLM with prompt length: {len(prompt)}")
    response = client.chat.completions.create(
        model="openai-gpt4",
        messages=[{"role": "user", "content": prompt}]
    )
    logger.info(f"Response tokens: {response.usage.total_tokens}")
    return response.choices[0].message.content

4. Use Streaming for Long Responses

# Streaming response
stream = client.chat.completions.create(
    model="openai-gpt4",
    messages=[{"role": "user", "content": "Write a long story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Security

API Key Management

  • Keys stored in Domino's secure vault
  • Never exposed to end users
  • Rotatable without code changes

Data Privacy

  • Requests logged for audit
  • PII can be masked (admin config)
  • Data retention policies configurable

Troubleshooting

Authentication Error

Error: 401 Unauthorized
  • Verify Domino API key is valid
  • Check endpoint access permissions
  • Ensure correct base URL

Rate Limit Exceeded

Error: 429 Too Many Requests
  • Implement retry logic
  • Contact admin for limit increase
  • Use multiple endpoints for load distribution

Model Not Found

Error: Model 'model-name' not found
  • Verify endpoint name is correct
  • Check endpoint exists and is active
  • Confirm you have access to the endpoint

API Reference

Before writing or verifying any API call, use the cluster swagger to confirm current endpoint paths and field names. Use public docs for workflow context and field explanations.

Get the cluster base URL: $DOMINO_API_HOST (injected by Domino into every workspace, job, and app).

Fetch the swagger spec:

# No authentication required for the public API spec
curl "$DOMINO_API_HOST/assets/public-api.json"
# Browser UI: $DOMINO_API_HOST/assets/lib/swagger-ui/index.html?url=/assets/public-api.json#/

Public docs (workflow context and field explanations):

  • API Guide
  • AI Gateway
  • Monitor AI Gateway LLM logs
Repository
dominodatalab/domino-claude-plugin
Last updated
Created

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.