tessl/pypi-vllm

tessl install tessl/pypi-vllm@0.10.0

A high-throughput and memory-efficient inference and serving engine for LLMs

Agent Success

Agent success rate when using this tile

69%

Improvement

Agent success rate improvement when using this tile compared to baseline

1.33x

Baseline

Agent success rate without this tile

52%
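
The improvement figure appears to be the ratio of the two success rates above. The page does not state the formula, so the ratio is an assumption, and the variable names below are illustrative. A quick check:

```python
# Success rates as reported on this page (rounded percentages).
with_tile = 0.69
baseline = 0.52

# Assumed formula: improvement = rate with tile / rate without tile.
improvement = with_tile / baseline
print(f"{improvement:.2f}x")
```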

{
  "context": "This evaluation assesses how well the engineer uses vLLM's synchronous LLMEngine and asynchronous AsyncLLMEngine for fine-grained request management and streaming inference. The focus is on proper usage of engine-level APIs rather than high-level LLM class methods.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "LLMEngine Initialization",
      "description": "Uses LLMEngine class (or EngineArgs with LLMEngine.from_engine_args()) to initialize the synchronous engine with appropriate model configuration",
      "max_score": 15
    },
    {
      "name": "add_request() Usage",
      "description": "Correctly uses LLMEngine.add_request() method to add text generation requests with request IDs and sampling parameters",
      "max_score": 15
    },
    {
      "name": "step() Execution",
      "description": "Implements step-by-step execution using LLMEngine.step() method in a loop to process requests incrementally",
      "max_score": 15
    },
    {
      "name": "has_unfinished_requests() Check",
      "description": "Uses LLMEngine.has_unfinished_requests() method to determine when all requests are complete",
      "max_score": 10
    },
    {
      "name": "abort_request() Implementation",
      "description": "Correctly calls LLMEngine.abort_request() to cancel a specific request by its ID",
      "max_score": 10
    },
    {
      "name": "AsyncLLMEngine Initialization",
      "description": "Uses AsyncLLMEngine class (or AsyncEngineArgs with AsyncLLMEngine.from_engine_args()) to initialize the asynchronous engine",
      "max_score": 10
    },
    {
      "name": "Async Generator Pattern",
      "description": "Implements async generator function using 'async def' and 'yield' to stream results incrementally from AsyncLLMEngine.generate()",
      "max_score": 15
    },
    {
      "name": "AsyncLLMEngine.generate() Streaming",
      "description": "Uses AsyncLLMEngine.generate() method with async iteration (async for) to process streaming outputs from the async engine",
      "max_score": 10
    }
  ]
}
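
As a rough illustration of the pattern the checklist describes, here is a minimal sketch of a step loop for the synchronous engine and an async generator for the asynchronous one. The helper names (`drain_engine`, `stream_text`, `build_sync_engine`) are hypothetical and not part of vLLM. The helpers only assume the engine exposes the methods named in the rubric (`abort_request`, `has_unfinished_requests`, `step`, `generate`) and that each output carries `request_id`, `finished`, and `outputs[0].text`, as vLLM's `RequestOutput` does.

```python
from typing import Any, AsyncIterator, Dict, Optional


def build_sync_engine(model: str) -> Any:
    """Create a synchronous engine; vllm is imported lazily so the helpers
    below can be used without it installed."""
    from vllm import EngineArgs, LLMEngine
    return LLMEngine.from_engine_args(EngineArgs(model=model))


def drain_engine(engine: Any, abort_id: Optional[str] = None) -> Dict[str, str]:
    """Step the engine until no requests remain.

    Returns the final text of each finished request, keyed by request ID.
    If abort_id is given, that request is cancelled before stepping.
    """
    if abort_id is not None:
        engine.abort_request(abort_id)
    finished: Dict[str, str] = {}
    while engine.has_unfinished_requests():
        # Each step() call runs one scheduling iteration and returns the
        # outputs that changed during it.
        for out in engine.step():
            if out.finished:
                finished[out.request_id] = out.outputs[0].text
    return finished


async def stream_text(
    engine: Any, prompt: str, params: Any, request_id: str
) -> AsyncIterator[str]:
    """Async generator that yields the text so far as the engine streams it."""
    async for out in engine.generate(prompt, params, request_id):
        yield out.outputs[0].text
```

With a real engine, requests would be queued before draining, for example `engine.add_request("req-0", "Hello", SamplingParams(max_tokens=16))` followed by `drain_engine(engine)`. An `AsyncLLMEngine` built with `AsyncLLMEngine.from_engine_args(AsyncEngineArgs(model=...))` could be passed to `stream_text` in the same way.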

evals/scenario-1/rubric.json
