tessl/pypi-vllm

tessl install tessl/pypi-vllm@0.10.0

A high-throughput and memory-efficient inference and serving engine for LLMs

Agent Success

Agent success rate when using this tile

69%

Improvement

Agent success rate improvement when using this tile compared to baseline

1.33x

Baseline

Agent success rate without this tile

52%

{
  "context": "These criteria evaluate how well the engineer uses vLLM's beam search and advanced sampling features to generate text while exploring multiple candidates. The focus is on proper usage of vLLM's LLM class, its beam_search method, and SamplingParams configuration for controlling generation behavior.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "LLM Initialization",
      "description": "Correctly initializes vLLM's LLM class with the specified model name. The implementation should use vllm.LLM(model=...) to load the model.",
      "max_score": 10
    },
    {
      "name": "Beam Search Method",
      "description": "Uses the beam_search() method (not generate()) to explore multiple candidate sequences. The implementation should call LLM.beam_search() with appropriate parameters.",
      "max_score": 25
    },
    {
      "name": "Beam Width Configuration",
      "description": "Correctly configures beam_width to control the number of parallel candidate paths explored; beam_width should be set from the num_candidates parameter.",
      "max_score": 15
    },
    {
      "name": "Length Penalty",
      "description": "Properly applies the length_penalty parameter in beam_search() to prevent bias toward shorter sequences and to implement length normalization.",
      "max_score": 15
    },
    {
      "name": "Max Tokens Control",
      "description": "Uses the max_tokens parameter to limit the length of generated sequences to the specified maximum.",
      "max_score": 10
    },
    {
      "name": "Temperature Parameter",
      "description": "Supports the temperature parameter via SamplingParams to control randomness in generation when using parallel sampling strategies.",
      "max_score": 10
    },
    {
      "name": "Vocabulary Restriction",
      "description": "Implements token filtering using SamplingParams with allowed_token_ids or logit_bias to restrict generation to specific vocabulary tokens.",
      "max_score": 10
    },
    {
      "name": "Output Processing",
      "description": "Correctly extracts and returns the best candidate sequence and its score from the beam search results. Should access the appropriate fields of the object returned by beam_search().",
      "max_score": 5
    }
  ]
}
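
The "Length Penalty" and "Output Processing" items can be illustrated without loading a model. The sketch below is a standalone illustration, not vLLM's API: the `Candidate` class and both helper functions are hypothetical names, and the scoring rule (cumulative log-probability divided by length raised to `length_penalty`) is one common form of length normalization, assumed here for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for one finished beam (not a vLLM type)."""
    text: str
    num_tokens: int
    cum_logprob: float  # sum of per-token log-probabilities, always <= 0


def normalized_score(candidate: Candidate, length_penalty: float = 1.0) -> float:
    """Divide the cumulative log-probability by num_tokens ** length_penalty.

    With length_penalty = 0 the raw log-probability is used, which favors
    short sequences; larger values reduce that bias.
    """
    return candidate.cum_logprob / (candidate.num_tokens ** length_penalty)


def best_candidate(candidates: list[Candidate], length_penalty: float = 1.0) -> Candidate:
    """Return the candidate with the highest length-normalized score."""
    return max(candidates, key=lambda c: normalized_score(c, length_penalty))


# Two made-up candidates: the longer one has a lower total log-probability
# but a better per-token average.
shorter = Candidate("Yes.", num_tokens=2, cum_logprob=-1.0)
longer = Candidate("Yes, that is correct.", num_tokens=6, cum_logprob=-2.4)

no_penalty_pick = best_candidate([shorter, longer], length_penalty=0.0)
penalty_pick = best_candidate([shorter, longer], length_penalty=1.0)
```

Without normalization the shorter candidate is selected because its raw log-probability is higher; with `length_penalty=1.0` the comparison is per token and the longer candidate is selected instead.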
