tessl/pypi-together

Python client for Together's Cloud Platform providing comprehensive AI model APIs

Workspace: tessl
Visibility: Public
Describes: pkg:pypi/together@1.5.x

To install, run:

npx @tessl/cli install tessl/pypi-together@1.5.0

Together Python API Library

The Together Python API Library is the official Python client for Together's AI platform, providing comprehensive access to state-of-the-art AI models through synchronous and asynchronous interfaces. It enables developers to integrate chat completions, text completions, image generation, embeddings, reranking, audio processing, batch inference, and model fine-tuning capabilities into their Python 3.10+ applications.

Package Information

  • Package Name: together
  • Language: Python
  • Installation: pip install together
  • Documentation: https://docs.together.ai/

Core Imports

from together import Together, AsyncTogether

For legacy compatibility:

from together import Complete, AsyncComplete, Completion
from together import Embeddings, Files, Finetune, Image, Models

Import specific components:

from together import resources, types, error
from together.types import ChatCompletionRequest, CompletionResponse

Basic Usage

from together import Together

# Initialize client (API key from TOGETHER_API_KEY env var or pass directly)
client = Together(api_key="your-api-key")

# Chat completion with text
response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Explain machine learning"}],
    max_tokens=500,
    temperature=0.7
)
print(response.choices[0].message.content)

# Text completion
response = client.completions.create(
    model="codellama/CodeLlama-34b-Python-hf",
    prompt="def fibonacci(n):",
    max_tokens=200,
    temperature=0.2
)
print(response.choices[0].text)

# Generate embeddings
response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-8k-retrieval",
    input=["Machine learning is amazing", "AI will transform everything"]
)
embeddings = [data.embedding for data in response.data]

# Generate image
response = client.images.generate(
    prompt="futuristic cityscape at sunset",
    model="stabilityai/stable-diffusion-xl-base-1.0",
    n=1,
    steps=20
)
image_data = response.data[0].b64_json

Architecture

The Together library follows a resource-based architecture where each AI capability is organized into separate resource classes:

  • Client Classes: Together (sync) and AsyncTogether (async) serve as the main entry points
  • Resource Classes: Each API endpoint area (chat, completions, embeddings, etc.) has dedicated resource classes
  • Type System: Comprehensive type definitions for all requests, responses, and data structures
  • Streaming Support: Native streaming capabilities for real-time response processing
  • Error Handling: Structured exception hierarchy for robust error management
  • Legacy API: Backward compatibility with deprecated interfaces

The dual sync/async design enables flexible integration patterns, from simple synchronous scripts to high-performance asynchronous applications handling concurrent requests.

Capabilities

Chat Completions

Advanced conversational AI with support for multi-modal inputs including text, images, and video. Includes streaming, async operations, and comprehensive configuration options.

def create(
    model: str,
    messages: List[dict],
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    stream: bool = False,
    **kwargs
) -> ChatCompletionResponse: ...
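
Setting stream=True switches the call to incremental delivery. A minimal sketch (the chunk/delta shape follows the library's streaming responses; the model name is illustrative):

from together import Together

client = Together()

# Stream tokens as they are generated instead of waiting for the full reply
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Write a haiku about the ocean"}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta; content can be None on some chunks
    print(chunk.choices[0].delta.content or "", end="", flush=True)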

Text Completions

Raw text completion for code generation, creative writing, and general text completion tasks with streaming and batch processing support.

def create(
    model: str,
    prompt: str,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    stream: bool = False,
    **kwargs
) -> CompletionResponse: ...

Embeddings

High-dimensional vector representations of text for semantic search, clustering, classification, and similarity analysis with various embedding models.

def create(
    model: str,
    input: Union[str, List[str]],
    **kwargs
) -> EmbeddingResponse: ...
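
A typical downstream step is similarity scoring. The sketch below computes cosine similarity between two embeddings using only the standard library (model name as in the examples above):

import math
from together import Together

client = Together()

response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-8k-retrieval",
    input=["How do I reset my password?", "Steps to change an account password"],
)
a, b = (item.embedding for item in response.data)

# Cosine similarity: dot(a, b) / (|a| * |b|)
dot = sum(x * y for x, y in zip(a, b))
norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
print(f"cosine similarity: {dot / norm:.3f}")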

Image Generation

AI-powered image synthesis from text prompts with support for different models, resolutions, and generation parameters.

def generate(
    prompt: str,
    model: str,
    n: int = 1,
    steps: Optional[int] = None,
    width: Optional[int] = None,
    height: Optional[int] = None,
    **kwargs
) -> ImageResponse: ...
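
Generated images arrive base64-encoded (see b64_json in the basic usage example), so saving one to disk is a short decode step. A sketch with an illustrative prompt and filename:

import base64
from together import Together

client = Together()

response = client.images.generate(
    prompt="watercolor of a lighthouse at dawn",
    model="stabilityai/stable-diffusion-xl-base-1.0",
    n=1,
)

# Decode the base64 payload and write it out as an image file
with open("lighthouse.png", "wb") as f:
    f.write(base64.b64decode(response.data[0].b64_json))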

File Management

File upload, listing, retrieval, and deletion operations for fine-tuning datasets and batch processing workflows.

def upload(file: str, purpose: Optional[str] = None) -> FileResponse: ...
def list() -> FileList: ...
def retrieve(id: str) -> FileObject: ...
def retrieve_content(id: str) -> str: ...
def delete(id: str) -> FileDeleteResponse: ...
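
A sketch of a round trip through these operations, assuming a local dataset.jsonl and that FileList exposes its entries via a data attribute:

from together import Together

client = Together()

# Upload a JSONL dataset intended for fine-tuning (purpose value is illustrative)
uploaded = client.files.upload(file="dataset.jsonl", purpose="fine-tune")
print(uploaded.id)

# List everything in the account, then remove the file we just added
for f in client.files.list().data:
    print(f.id)

client.files.delete(uploaded.id)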

Models

Discovery and information retrieval for available AI models across different categories and capabilities.

def list() -> List[ModelObject]: ...
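
For example, to print every available model id (assuming each ModelObject exposes an id field):

from together import Together

client = Together()

# models.list() returns the full catalog; filter or inspect as needed
for model in client.models.list():
    print(model.id)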

Fine-tuning

Custom model training with supervised fine-tuning and direct preference optimization, including job management and model downloading.

def create(
    training_file: str,
    model: str,
    n_epochs: Optional[int] = None,
    batch_size: Optional[Union[str, int]] = None,
    learning_rate: Optional[float] = None,
    **kwargs
) -> FinetuneResponse: ...
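
A sketch of launching a supervised fine-tuning job; the base model name and hyperparameters are illustrative, and the training file is uploaded first so its id can be referenced:

from together import Together

client = Together()

# Upload training data, then start the job against the uploaded file id
train = client.files.upload(file="train.jsonl", purpose="fine-tune")

job = client.fine_tuning.create(
    training_file=train.id,
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Reference",
    n_epochs=3,
    learning_rate=1e-5,
)
# The job exposes an id and status for later polling (assumed fields)
print(job.id, job.status)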

Reranking

Document relevance scoring and reordering for improved search and retrieval results with specialized reranking models.

def create(
    model: str,
    query: str,
    documents: List[str],
    top_n: Optional[int] = None,
    **kwargs
) -> RerankResponse: ...
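
A sketch with an illustrative reranking model; results reference the input documents by index, highest relevance first:

from together import Together

client = Together()

response = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",  # illustrative reranking model
    query="What is the capital of France?",
    documents=[
        "Paris is the capital and largest city of France.",
        "Berlin is the capital of Germany.",
        "France is known for its wine and cheese.",
    ],
    top_n=2,
)

# Each result points back into the documents list by index
for result in response.results:
    print(result.index, result.relevance_score)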

Audio Processing

Speech synthesis, transcription, and translation capabilities supporting multiple languages and audio formats.

# Speech synthesis
def create(
    model: str,
    input: str,
    voice: str,
    response_format: Optional[str] = None,
    **kwargs
) -> bytes: ...

# Audio transcription
def create(
    file: str,
    model: str,
    language: Optional[str] = None,
    **kwargs
) -> AudioTranscriptionResponse: ...
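
A speech-synthesis sketch following the bytes return type shown above; the model and voice names are illustrative, and the resource path (client.audio.speech) is an assumption:

from together import Together

client = Together()

# Synthesize speech and write the raw audio bytes to disk
audio = client.audio.speech.create(
    model="cartesia/sonic",    # illustrative TTS model
    input="Welcome to Together AI.",
    voice="calm lady",         # illustrative voice name
)
with open("welcome.mp3", "wb") as f:
    f.write(audio)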

Batch Processing

Large-scale inference jobs with 24-hour turnaround time for processing thousands of requests efficiently and cost-effectively.

def create_batch(
    file_id: str,
    endpoint: str,
    **kwargs
) -> BatchJob: ...
def get_batch(id: str) -> BatchJob: ...
def list_batches() -> List[BatchJob]: ...
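
A sketch of the workflow: upload a JSONL file of requests, start the batch against an endpoint, and poll for completion (the purpose value is assumed):

from together import Together

client = Together()

# Each line of requests.jsonl is one request in the batch input format
batch_file = client.files.upload(file="requests.jsonl", purpose="batch-api")

job = client.batches.create_batch(
    file_id=batch_file.id,
    endpoint="/v1/chat/completions",
)
print(job.id)

# Poll later; completed jobs reference an output file with the results
job = client.batches.get_batch(job.id)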

Evaluation

Model performance evaluation with standardized metrics and comparison capabilities for assessing AI model quality and capabilities.

def create(
    model: str,
    evaluation_type: str,
    dataset: str,
    **kwargs
) -> EvaluationCreateResponse: ...
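
A minimal sketch following the signature above; the evaluation type and dataset id are purely illustrative placeholders, not verified values:

from together import Together

client = Together()

evaluation = client.evaluation.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    evaluation_type="classify",   # hypothetical evaluation type
    dataset="file-abc123",        # hypothetical uploaded dataset id
)
print(evaluation)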

Dedicated Endpoint Management

Infrastructure management for deploying and scaling AI models on dedicated compute resources with autoscaling, hardware optimization, and performance monitoring.

def create(
    *,
    model: str,
    hardware: str,
    min_replicas: int,
    max_replicas: int,
    display_name: Optional[str] = None,
    disable_prompt_cache: bool = False,
    disable_speculative_decoding: bool = False,
    state: Literal["STARTED", "STOPPED"] = "STARTED",
    inactive_timeout: Optional[int] = None
) -> DedicatedEndpoint: ...

def list(type: Optional[Literal["dedicated", "serverless"]] = None) -> List[ListEndpoint]: ...
def get(endpoint_id: str) -> DedicatedEndpoint: ...
def update(endpoint_id: str, **kwargs) -> DedicatedEndpoint: ...
def delete(endpoint_id: str) -> None: ...
def list_hardware(model: Optional[str] = None) -> List[HardwareWithStatus]: ...
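
A deployment sketch: check hardware availability for a model, create an endpoint, then tear it down. The hardware id is illustrative, and these calls are assumed to live on client.endpoints:

from together import Together

client = Together()

# See which hardware configurations can serve the model
for hw in client.endpoints.list_hardware(model="meta-llama/Llama-3.2-3B-Instruct-Turbo"):
    print(hw)

endpoint = client.endpoints.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    hardware="1x_nvidia_a100_80gb_sxm",   # illustrative hardware id
    min_replicas=1,
    max_replicas=2,
    display_name="docs-example-endpoint",
)
print(endpoint.id)

# Delete the endpoint once finished to stop incurring costs
client.endpoints.delete(endpoint.id)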

Code Interpreter

Interactive code execution environment for running Python scripts with file upload support, session persistence, and comprehensive output capture.

def run(
    code: str,
    language: Literal["python"],
    session_id: Optional[str] = None,
    files: Optional[List[Dict[str, Any]]] = None
) -> ExecuteResponse: ...
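
A sketch of running a snippet and reading its captured output; the shape of the returned outputs (type/data pairs under result.data.outputs) is assumed:

from together import Together

client = Together()

result = client.code_interpreter.run(
    code="print(sum(range(10)))",
    language="python",
)

# Outputs are captured per stream (stdout, stderr, ...)
for output in result.data.outputs:
    print(output.type, output.data)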

Asynchronous Usage

All capabilities support asynchronous operations through the AsyncTogether client:

class AsyncTogether:
    completions: AsyncCompletions
    chat: AsyncChat
    embeddings: AsyncEmbeddings
    files: AsyncFiles
    images: AsyncImages
    models: AsyncModels
    fine_tuning: AsyncFineTuning
    rerank: AsyncRerank
    audio: AsyncAudio
    batches: AsyncBatches
    evaluation: AsyncEvaluation
    code_interpreter: CodeInterpreter
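
For example, several chat requests can run concurrently with asyncio.gather (model name as in the earlier examples):

import asyncio
from together import AsyncTogether

async_client = AsyncTogether()

async def ask(question: str) -> str:
    response = await async_client.chat.completions.create(
        model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

async def main() -> None:
    # Issue both requests concurrently instead of one after the other
    answers = await asyncio.gather(
        ask("What is a vector database?"),
        ask("Explain beam search in one paragraph."),
    )
    for answer in answers:
        print(answer)

asyncio.run(main())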

Core Types

Client Configuration

class Together:
    def __init__(
        self,
        api_key: Optional[str] = None,
        base_url: Optional[str] = None,
        timeout: Optional[float] = None, 
        max_retries: Optional[int] = None,
        supplied_headers: Optional[Dict[str, str]] = None
    ): ...

class AsyncTogether:
    def __init__(
        self,
        api_key: Optional[str] = None,
        base_url: Optional[str] = None,
        timeout: Optional[float] = None,
        max_retries: Optional[int] = None, 
        supplied_headers: Optional[Dict[str, str]] = None
    ): ...
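
All constructor parameters are optional; a sketch of explicit configuration (the header name is illustrative):

from together import Together

client = Together(
    api_key="your-api-key",    # falls back to the TOGETHER_API_KEY env var
    timeout=60.0,              # seconds before a request times out
    max_retries=3,             # retries on transient failures
    supplied_headers={"X-Request-Source": "docs-example"},
)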

Request Base Types

class TogetherRequest:
    """Base request type for Together API operations"""
    pass

class TogetherClient:
    """Core HTTP client for API communication"""
    def __init__(
        self,
        api_key: str,
        base_url: str,
        timeout: float,
        max_retries: int,
        supplied_headers: Optional[Dict[str, str]] = None
    ): ...

CLI Interface

The library includes a comprehensive command-line interface accessible via the together command:

# Chat completions
together chat.completions \
  --message "user" "Explain quantum computing" \
  --model "meta-llama/Llama-3.2-3B-Instruct-Turbo"

# Text completions  
together completions \
  "def merge_sort(arr):" \
  --model "codellama/CodeLlama-34b-Python-hf" \
  --max-tokens 200

# Image generation
together images generate \
  "abstract art with vibrant colors" \
  --model "stabilityai/stable-diffusion-xl-base-1.0" \
  --n 2

# File operations
together files upload dataset.jsonl
together files list

# Model information
together models list

Error Handling

The library provides structured error handling through the together.error module:

class AuthenticationError(Exception):
    """Raised when API key is missing or invalid"""
    pass

class APIError(Exception):
    """Base class for API-related errors"""
    pass

class RateLimitError(APIError):
    """Raised when rate limits are exceeded"""
    pass
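
In practice these map onto try/except blocks around API calls; note that RateLimitError is caught before its APIError base class:

from together import Together
from together.error import APIError, AuthenticationError, RateLimitError

client = Together()

try:
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
        messages=[{"role": "user", "content": "Hello"}],
    )
except AuthenticationError:
    print("Missing or invalid key: set TOGETHER_API_KEY or pass api_key.")
except RateLimitError:
    print("Rate limited; retry with exponential backoff.")
except APIError as exc:
    print(f"API error: {exc}")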

Legacy API

Deprecated classes are available for backward compatibility:

class Complete:
    """Legacy completion interface (deprecated)"""
    pass

class AsyncComplete:
    """Legacy async completion interface (deprecated)"""
    pass

class Completion:
    """Legacy completion result class (deprecated)"""
    pass