A high-throughput and memory-efficient inference and serving engine for LLMs
Overall score: 69%
Evaluation — 69%
↑ 1.33x agent success when using this tile
{
"context": "This criteria evaluates how well the engineer uses vLLM's custom attention mechanism capabilities to implement a benchmarking tool. The focus is specifically on proper usage of attention backend configuration, LLM initialization with backend parameters, and execution of inference with different attention implementations.",
"type": "weighted_checklist",
"checklist": [
{
"name": "LLM Class Import",
"description": "Correctly imports the LLM class from vllm package",
"max_score": 5
},
{
"name": "SamplingParams Import",
"description": "Correctly imports SamplingParams class from vllm package for controlling generation behavior",
"max_score": 5
},
{
"name": "LLM Initialization",
"description": "Properly initializes LLM instances with the model parameter in the __init__ or run_with_backend methods",
"max_score": 15
},
{
"name": "Attention Backend Configuration",
"description": "Correctly passes the attention_backend parameter when initializing the LLM class (e.g., LLM(model=..., attention_backend=...))",
"max_score": 25
},
{
"name": "Default Backend Handling",
"description": "Properly handles the case when attention_backend is None, allowing vLLM to use its default backend",
"max_score": 10
},
{
"name": "Text Generation",
"description": "Uses the LLM.generate() method to perform text generation with the configured backend",
"max_score": 15
},
{
"name": "SamplingParams Usage",
"description": "Creates and uses SamplingParams objects to control generation parameters like max_tokens and temperature",
"max_score": 15
},
{
"name": "Output Extraction",
"description": "Correctly extracts generated text from RequestOutput objects returned by LLM.generate() (accessing outputs[0].text or similar)",
"max_score": 10
}
]
}
Install with Tessl CLI
npx tessl i tessl/pypi-vllmdocs
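The checklist above can be sketched as a small benchmarking helper. This is a hedged sketch, not a graded solution: it assumes the `attention_backend` keyword on `LLM(...)` exactly as the checklist describes it (in stock vLLM the backend is more commonly selected via the `VLLM_ATTENTION_BACKEND` environment variable), and the `run_prompts` function name is illustrative.

```python
def build_llm_kwargs(model, attention_backend=None):
    """Build keyword arguments for LLM() initialization.

    When attention_backend is None it is omitted entirely, letting
    vLLM fall back to its default backend (per the "Default Backend
    Handling" checklist item).
    """
    kwargs = {"model": model}
    if attention_backend is not None:
        # Assumed keyword from the checklist, e.g.
        # LLM(model=..., attention_backend="FLASH_ATTN")
        kwargs["attention_backend"] = attention_backend
    return kwargs


def run_prompts(model, prompts, attention_backend=None, max_tokens=64):
    """Illustrative end-to-end run covering the remaining checklist items."""
    # Imports are local so build_llm_kwargs stays importable without vLLM.
    from vllm import LLM, SamplingParams

    llm = LLM(**build_llm_kwargs(model, attention_backend))
    # SamplingParams controls generation (temperature, max_tokens, ...).
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate(prompts, params)
    # llm.generate() returns RequestOutput objects; each holds a list of
    # candidate completions, and the generated text lives at outputs[0].text.
    return [out.outputs[0].text for out in outputs]
```

The kwargs helper is split out so the None-handling logic is testable without loading a model; the generation path simply threads those kwargs into `LLM(...)` and extracts `outputs[0].text` from each result.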
evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10