A high-throughput and memory-efficient inference and serving engine for LLMs
Overall score: 69%
Evaluation: 69%
↑ 1.33x agent success when using this tile
{
"context": "This criteria evaluates how well the engineer uses vLLM's chat-based generation API to implement a multi-turn conversational system. The focus is on proper usage of the LLM class, the chat() method, message formatting, and sampling parameter configuration.",
"type": "weighted_checklist",
"checklist": [
{
"name": "LLM Initialization",
"description": "Correctly initializes the vLLM LLM class with an appropriate chat/instruction-tuned model (e.g., using model parameter).",
"max_score": 15
},
{
"name": "Chat Method Usage",
"description": "Uses the LLM.chat() method (not generate()) to process conversational input, which is the appropriate method for chat-based interactions.",
"max_score": 25
},
{
"name": "Message Format",
"description": "Correctly formats messages as a list of dictionaries with 'role' and 'content' keys, supporting system, user, and assistant roles.",
"max_score": 20
},
{
"name": "SamplingParams Configuration",
"description": "Creates and uses a SamplingParams object to configure generation parameters (max_tokens, temperature), passing it to the chat() method.",
"max_score": 20
},
{
"name": "Response Extraction",
"description": "Correctly extracts the generated text from the RequestOutput object returned by chat(), accessing the outputs attribute and text content.",
"max_score": 15
},
{
"name": "Multi-Turn Handling",
"description": "Properly handles multi-turn conversations by passing the entire message history to chat(), allowing the model to maintain context.",
"max_score": 5
}
]
}
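A minimal sketch of the flow this checklist scores, using vLLM's LLM.chat() API. The model name and prompts are illustrative placeholders, not part of the evaluation:

from vllm import LLM, SamplingParams

# LLM Initialization: load a chat/instruction-tuned model via the model parameter.
# (meta-llama/Llama-3.1-8B-Instruct is a placeholder; any instruct model works.)
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

# SamplingParams Configuration: set max_tokens and temperature once,
# then pass the same object to each chat() call.
params = SamplingParams(temperature=0.7, max_tokens=256)

# Message Format: a list of dicts with "role" and "content" keys,
# supporting system, user, and assistant roles.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is vLLM?"},
]

# Chat Method Usage: chat() (not generate()) applies the model's chat template.
outputs = llm.chat(messages, sampling_params=params)

# Response Extraction: chat() returns a list of RequestOutput objects;
# the generated text lives under .outputs[0].text.
reply = outputs[0].outputs[0].text

# Multi-Turn Handling: append the assistant reply and the next user turn,
# then pass the entire history back so the model keeps context.
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "How does it batch requests?"})
outputs = llm.chat(messages, sampling_params=params)
print(outputs[0].outputs[0].text)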
Install with Tessl CLI

npx tessl i tessl/pypi-vllmdocs
evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10