A high-throughput and memory-efficient inference and serving engine for LLMs
Conversational AI interface supporting chat templates, tool calling, and multi-turn conversations with proper message formatting and context management. Provides OpenAI-compatible chat completion functionality.
Generates responses in a conversational format, with support for system, user, and assistant messages, plus advanced features such as tool calling.
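A quick sketch of the multi-turn flow described above. The model replies are illustrative, and the `llm.chat` calls are commented out because they require a loaded model:

```python
# Multi-turn context management sketch (hypothetical assistant reply;
# chat() calls commented out since they need a loaded vLLM model).
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Name a prime number."},
]
# first = llm.chat(messages)

# Append the assistant reply plus the next user turn so the model
# sees the full conversation history on the next call:
messages.append({"role": "assistant", "content": "7 is a prime number."})
messages.append({"role": "user", "content": "Name a larger one."})
# second = llm.chat(messages)
```

The caller owns the history: each `chat()` call receives the full message list, so earlier turns must be appended explicitly.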
def chat(
    self,
    messages: List[ChatCompletionMessageParam],
    chat_template: Optional[str] = None,
    chat_template_content_format: ChatTemplateContentFormatOption = "auto",
    add_generation_prompt: bool = True,
    continue_final_message: bool = False,
    tools: Optional[List[ChatCompletionToolParam]] = None,
    documents: Optional[List[ChatCompletionDocumentParam]] = None,
    mm_processor_kwargs: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> List[ChatCompletionOutput]:
"""
Generate chat completions from conversation messages.
Parameters:
- messages: List of conversation messages with roles and content
- chat_template: Custom chat template for message formatting
- chat_template_content_format: Content format handling ("auto", "string", "openai")
- add_generation_prompt: Whether to add generation prompt
- continue_final_message: Continue from the last assistant message
- tools: Available tools for function calling
- documents: Context documents for retrieval
- mm_processor_kwargs: Multimodal processing arguments
Returns:
List of ChatCompletionOutput objects with generated responses
"""from vllm import LLM
llm = LLM(model="microsoft/DialoGPT-medium")
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the capital of France?"},
]
response = llm.chat(messages)
print(response[0].message.content)

class ChatCompletionMessageParam:
    role: str  # "system", "user", "assistant"
    content: str  # Message content
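Since `chat()` accepts OpenAI-style tool schemas via the `tools` parameter, here is a hedged sketch of a function-calling request. The tool name and schema are made up for illustration, and the call itself is commented out because it needs a tool-capable model:

```python
# Illustrative OpenAI-style tool schema for the `tools` parameter.
# "get_weather" and its fields are hypothetical examples.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
messages = [{"role": "user", "content": "What's the weather in Paris?"}]
# response = llm.chat(messages, tools=tools)  # requires a tool-capable model
```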

class ChatCompletionOutput:
    message: ChatMessage  # Generated response
    finish_reason: Optional[str]  # Completion reason

Install with Tessl CLI
npx tessl i tessl/pypi-vllm