tessl install tessl/pypi-vllm@0.10.0

A high-throughput and memory-efficient inference and serving engine for LLMs
Agent Success: 69% (agent success rate when using this tile)
Improvement: 1.33x (agent success rate improvement compared to baseline)
Baseline: 52% (agent success rate without this tile)
{
  "context": "This evaluation assesses how effectively the engineer uses vLLM's model loading and initialization capabilities to implement a model configuration service with different deployment modes. The focus is on proper use of the LLM class constructor and its configuration parameters.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "LLM Class Usage",
      "description": "Uses the vLLM LLM class to initialize models. The solution should import and instantiate the LLM class with an appropriate model parameter.",
      "max_score": 20
    },
    {
      "name": "GPU Memory Configuration",
      "description": "Correctly uses the gpu_memory_utilization parameter in the LLM constructor. Local mode should set it to 0.5; production mode should set it to 0.9.",
      "max_score": 25
    },
    {
      "name": "Load Format Specification",
      "description": "Properly uses the load_format parameter to specify the safetensors format for testing mode. Should pass load_format='safetensors' or load_format='auto' to the LLM constructor.",
      "max_score": 20
    },
    {
      "name": "Model Path Handling",
      "description": "Correctly passes the model_path value to the LLM constructor's model parameter for all modes, supporting both local paths and HuggingFace identifiers.",
      "max_score": 20
    },
    {
      "name": "Error Handling",
      "description": "Implements proper validation to raise ValueError for invalid mode values before attempting model initialization.",
      "max_score": 15
    }
  ]
}
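A minimal sketch of a solution the checklist would score well is shown below. The helper name `build_llm_kwargs` and the mode strings (`"local"`, `"production"`, `"testing"`) are assumptions inferred from the checklist, not part of the task statement; the constructor arguments (`model`, `gpu_memory_utilization`, `load_format`) are real vLLM `LLM` parameters. Separating configuration from instantiation keeps the mode validation testable without a GPU:

```python
# Hypothetical sketch of the model configuration service described by the
# checklist. Mode names and the helper name are assumptions; the keyword
# arguments match vLLM's LLM constructor.

def build_llm_kwargs(mode: str, model_path: str) -> dict:
    """Map a deployment mode to vLLM LLM constructor arguments.

    model_path may be a local directory or a HuggingFace model id;
    vLLM's `model` parameter accepts both.
    """
    # Validate the mode before any model initialization (Error Handling item).
    if mode == "local":
        # Local mode: reserve half of GPU memory.
        return {"model": model_path, "gpu_memory_utilization": 0.5}
    if mode == "production":
        # Production mode: use most of the GPU.
        return {"model": model_path, "gpu_memory_utilization": 0.9}
    if mode == "testing":
        # Testing mode: load weights from safetensors files.
        return {"model": model_path, "load_format": "safetensors"}
    raise ValueError(f"invalid mode: {mode!r}")


# Actual instantiation (requires vLLM installed and a GPU):
#     from vllm import LLM
#     llm = LLM(**build_llm_kwargs("production", "facebook/opt-125m"))
```

Because the kwargs builder is pure Python, the ValueError path and the per-mode parameter values can be unit-tested even on machines where vLLM itself cannot load a model.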