A high-throughput and memory-efficient inference and serving engine for LLMs
Overall score: 69%
Evaluation — 69%
↑ 1.33x agent success when using this tile
{
"context": "This evaluation assesses how well an engineer uses vLLM's multi-modal capabilities to implement an image description service. The focus is on proper initialization of vision-language models, correct formatting of multi-modal prompts, and appropriate use of vLLM's inference APIs for processing images with text.",
"type": "weighted_checklist",
"checklist": [
{
"name": "LLM initialization",
"description": "Correctly initializes an LLM instance with a vision-language model name (e.g., using vllm.LLM class with a model parameter set to a vision-language model like 'llava-hf/llava-1.5-7b-hf' or similar)",
"max_score": 20
},
{
"name": "Multi-modal prompt format",
"description": "Uses the correct multi-modal prompt format with both 'prompt' and 'multi_modal_data' keys in a dictionary structure (e.g., {'prompt': text, 'multi_modal_data': {'image': image_data}})",
"max_score": 25
},
{
"name": "Image loading",
"description": "Properly loads image data from file paths for use with vLLM (e.g., using PIL/Pillow to load images, or vLLM's MediaIO abstraction)",
"max_score": 15
},
{
"name": "Single image processing",
"description": "Correctly uses LLM.generate() or LLM.chat() method to process single image inputs with text prompts and returns the generated text description",
"max_score": 20
},
{
"name": "Multiple image handling",
"description": "Correctly formats and processes multiple images in a single request using vLLM's multi-modal capabilities (e.g., passing multiple images in multi_modal_data)",
"max_score": 15
},
{
"name": "Error handling",
"description": "Implements appropriate error handling for invalid image paths (e.g., checking file existence before processing, raising FileNotFoundError as specified in the API)",
"max_score": 5
}
]
}
Install with Tessl CLI
npx tessl i tessl/pypi-vllm
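The checklist in the evaluation above can be illustrated with a minimal Python sketch. It is not the reference solution for this scenario. The helper names (`check_image_path`, `load_image`, `build_request`, `make_llm`, `describe`) are hypothetical, the `USER: <image>\n...\nASSISTANT:` template is the one commonly used with LLaVA 1.5 and is assumed here, and passing `limit_mm_per_prompt` to allow several images per prompt assumes a vLLM version that supports it. Pillow and vLLM are imported lazily so the path check and request builder can be exercised without a GPU.

```python
from pathlib import Path
from typing import Any


def check_image_path(path: str) -> Path:
    """Return the path as a Path, raising FileNotFoundError if it is not a file."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    return p


def load_image(path: str) -> Any:
    """Load an image as RGB with Pillow after validating the path."""
    from PIL import Image  # lazy import; requires Pillow

    with Image.open(check_image_path(path)) as img:
        return img.convert("RGB")


def build_request(prompt: str, images: list[Any]) -> dict[str, Any]:
    """Build a dict with the 'prompt' and 'multi_modal_data' keys from the checklist."""
    if not images:
        raise ValueError("at least one image is required")
    # One image is passed directly; several images are passed as a list.
    image_data = images[0] if len(images) == 1 else list(images)
    return {"prompt": prompt, "multi_modal_data": {"image": image_data}}


def make_llm(max_images: int = 1) -> Any:
    """Create a vision-language LLM (assumed model name, taken from the checklist)."""
    from vllm import LLM  # lazy import; requires vLLM and suitable hardware

    return LLM(
        model="llava-hf/llava-1.5-7b-hf",
        limit_mm_per_prompt={"image": max_images},
    )


def describe(llm: Any, image_paths: list[str],
             question: str = "What is shown in this image?") -> str:
    """Generate a text description for one or more images."""
    from vllm import SamplingParams

    images = [load_image(p) for p in image_paths]
    # Assumed LLaVA 1.5 template: one <image> placeholder per image.
    placeholders = "\n".join("<image>" for _ in images)
    prompt = f"USER: {placeholders}\n{question}\nASSISTANT:"
    outputs = llm.generate(
        build_request(prompt, images),
        sampling_params=SamplingParams(max_tokens=128),
    )
    return outputs[0].outputs[0].text
```

A caller would typically create the model once with `make_llm(max_images=2)` and reuse it across calls to `describe`, since loading the weights is the expensive step. Whether a given model produces useful output for several images in one prompt depends on the model; LLaVA 1.5 is used here only because the checklist names it.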