
tessl/pypi-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Overall score: 69% (Evaluation: 69%)

1.33x agent success when using this tile

docs/chat-completions.md

Chat Completions

Conversational AI interface supporting chat templates, tool calling, and multi-turn conversations with proper message formatting and context management. Provides OpenAI-compatible chat completion functionality.

Capabilities

Chat Interface

Generate responses in conversational format with support for system messages, user messages, assistant messages, and advanced features like tool calling.

def chat(
    self,
    messages: List[ChatCompletionMessageParam],
    chat_template: Optional[str] = None,
    chat_template_content_format: ChatTemplateContentFormatOption = "auto",
    add_generation_prompt: bool = True,
    continue_final_message: bool = False,
    tools: Optional[List[ChatCompletionToolParam]] = None,
    documents: Optional[List[ChatCompletionDocumentParam]] = None,
    mm_processor_kwargs: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> List[ChatCompletionOutput]:
    """
    Generate chat completions from conversation messages.

    Parameters:
    - messages: List of conversation messages with roles and content
    - chat_template: Custom chat template for message formatting
    - chat_template_content_format: Content format handling ("auto", "string", "openai")
    - add_generation_prompt: Whether to add generation prompt
    - continue_final_message: Continue from the last assistant message
    - tools: Available tools for function calling
    - documents: Context documents for retrieval
    - mm_processor_kwargs: Multimodal processing arguments

    Returns:
    List of ChatCompletionOutput objects with generated responses
    """

Usage Examples

Basic Chat Conversation

from vllm import LLM

llm = LLM(model="microsoft/DialoGPT-medium")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

response = llm.chat(messages)
print(response[0].message.content)
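The `tools` parameter accepts tool definitions in the OpenAI function-calling schema. A hedged sketch: `get_weather` is a hypothetical tool invented for illustration, and actually receiving tool calls requires a model and chat template with tool support:

```python
# Hypothetical tool definition in the OpenAI function-calling schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What is the weather in Paris?"}]

# Pass the tool definitions alongside the conversation; commented out
# because it needs a tool-capable model loaded:
# outputs = llm.chat(messages, tools=tools)
```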

Types

class ChatCompletionMessageParam:
    role: str  # "system", "user", "assistant"
    content: str  # Message content

class ChatCompletionOutput:
    message: ChatMessage  # Generated response
    finish_reason: Optional[str]  # Completion reason
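A short sketch of consuming these types. The `SimpleNamespace` objects are stand-ins mirroring only the fields documented above, not real engine outputs:

```python
from types import SimpleNamespace

def collect_texts(outputs):
    """Gather generated text, flagging completions cut off by the token limit."""
    texts = []
    for out in outputs:
        if out.finish_reason == "length":
            print("warning: completion truncated; consider a higher token limit")
        texts.append(out.message.content)
    return texts

# Stand-ins mirroring the ChatCompletionOutput shape documented above:
fake = SimpleNamespace(
    message=SimpleNamespace(content="Paris."), finish_reason="stop"
)
print(collect_texts([fake]))  # → ['Paris.']
```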

Install with Tessl CLI

npx tessl i tessl/pypi-vllm

docs/
- async-inference.md
- chat-completions.md
- configuration.md
- index.md
- parameters-types.md
- text-classification.md
- text-embeddings.md
- text-generation.md
- text-scoring.md

tile.json