tessl/pypi-vllm

tessl install tessl/pypi-vllm@0.10.0

A high-throughput and memory-efficient inference and serving engine for LLMs

Agent Success

Agent success rate when using this tile

69%

Improvement

Agent success rate improvement when using this tile compared to baseline

1.33x

Baseline

Agent success rate without this tile

52%

{
  "context": "This evaluation assesses how well an engineer uses vLLM's multi-modal capabilities to implement an image description service. The focus is on proper initialization of vision-language models, correct formatting of multi-modal prompts, and appropriate use of vLLM's inference APIs for processing images with text.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "LLM initialization",
      "description": "Correctly initializes an LLM instance with a vision-language model name (e.g., using vllm.LLM class with a model parameter set to a vision-language model like 'llava-hf/llava-1.5-7b-hf' or similar)",
      "max_score": 20
    },
    {
      "name": "Multi-modal prompt format",
      "description": "Uses the correct multi-modal prompt format with both 'prompt' and 'multi_modal_data' keys in a dictionary structure (e.g., {'prompt': text, 'multi_modal_data': {'image': image_data}})",
      "max_score": 25
    },
    {
      "name": "Image loading",
      "description": "Properly loads image data from file paths for use with vLLM (e.g., using PIL/Pillow to load images, or vLLM's MediaIO abstraction)",
      "max_score": 15
    },
    {
      "name": "Single image processing",
      "description": "Correctly uses LLM.generate() or LLM.chat() method to process single image inputs with text prompts and returns the generated text description",
      "max_score": 20
    },
    {
      "name": "Multiple image handling",
      "description": "Correctly formats and processes multiple images in a single request using vLLM's multi-modal capabilities (e.g., passing multiple images in multi_modal_data)",
      "max_score": 15
    },
    {
      "name": "Error handling",
      "description": "Implements appropriate error handling for invalid image paths (e.g., checking file existence before processing, raising FileNotFoundError as specified in the API)",
      "max_score": 5
    }
  ]
}
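The checklist items above (initialization, prompt format, image loading, single and multiple image handling, error handling) can be sketched together as follows. This is a minimal illustration, not a reference solution for the scenario. The helper names `load_image`, `build_request`, and `describe` are hypothetical. The "USER: <image>\n...\nASSISTANT:" template assumes a LLaVA-1.5 style model, and whether a given model accepts several images per prompt depends on the model. Running `describe` requires vLLM, Pillow, and suitable GPU hardware.

```python
from pathlib import Path


def load_image(path):
    """Load an image with Pillow, raising FileNotFoundError for a missing path."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Image not found: {p}")
    from PIL import Image  # imported lazily so the path check works without Pillow
    return Image.open(p).convert("RGB")


def build_request(question, images):
    """Build a multi-modal prompt dict with 'prompt' and 'multi_modal_data' keys.

    Assumption: LLaVA-1.5 style prompting, one "<image>" placeholder per image.
    """
    if not images:
        raise ValueError("at least one image is required")
    placeholders = "\n".join("<image>" for _ in images)
    prompt = f"USER: {placeholders}\n{question}\nASSISTANT:"
    # A single image is passed directly; several images are passed as a list.
    image_data = images[0] if len(images) == 1 else list(images)
    return {"prompt": prompt, "multi_modal_data": {"image": image_data}}


def describe(image_paths, question="Describe this image.",
             model="llava-hf/llava-1.5-7b-hf"):
    """Return generated text for the given image file(s)."""
    images = [load_image(p) for p in image_paths]  # fails fast on invalid paths
    from vllm import LLM, SamplingParams  # heavy import, deferred until needed
    # limit_mm_per_prompt raises the per-prompt image limit for multi-image requests.
    llm = LLM(model=model, limit_mm_per_prompt={"image": len(images)})
    outputs = llm.generate(
        build_request(question, images),
        sampling_params=SamplingParams(max_tokens=128),
    )
    return outputs[0].outputs[0].text
```

Calling `describe(["photo.jpg"])` covers the single-image case, and passing two or more paths covers the multiple-image case. Constructing the `LLM` inside `describe` keeps the sketch short. A real service would likely create it once and reuse it across requests.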

evals/scenario-9/rubric.json
