
tessl/pypi-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Quality: Pending (Does it follow best practices?)

Impact: 69%, 1.32x (average score across 10 eval scenarios)

Security by Snyk: Pending (the risk profile of this skill)


evals/scenario-7/criteria.json

{
  "context": "These criteria evaluate how effectively the engineer uses vLLM's LoRA (Low-Rank Adaptation) capabilities to implement a multi-adapter text generation service. The focus is on correct usage of vLLM's adapter configuration, initialization parameters, and request-time adapter specification.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "LLM Initialization",
      "description": "Uses vLLM's LLM class to initialize the model with enable_lora parameter set appropriately to enable LoRA support",
      "max_score": 15
    },
    {
      "name": "Max LoRAs Configuration",
      "description": "Configures max_loras parameter during LLM initialization to control the maximum number of concurrent LoRA adapters",
      "max_score": 15
    },
    {
      "name": "Max LoRA Rank",
      "description": "Configures max_lora_rank parameter during LLM initialization to set the maximum rank for LoRA adapters",
      "max_score": 15
    },
    {
      "name": "LoRA Request Object",
      "description": "Uses vLLM's LoRARequest class to create adapter request objects with lora_name and lora_path parameters",
      "max_score": 25
    },
    {
      "name": "Adapter in Generate",
      "description": "Passes the LoRARequest object to the generate() method using the lora_request parameter to apply the adapter during generation",
      "max_score": 20
    },
    {
      "name": "Base Model Generation",
      "description": "Correctly handles base model generation by calling generate() without a lora_request parameter when no adapter is specified",
      "max_score": 10
    }
  ]
}
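
A minimal sketch of code that would touch each checklist item above, offered as an illustration rather than a reference solution. The helper names (`lora_engine_kwargs`, `generate_text`, `demo`), the model name, the adapter path, and the default limits are all hypothetical. The sketch assumes a vLLM release in which `LoRARequest` accepts `lora_name`, `lora_int_id`, and `lora_path`; older releases may name the path argument differently.

```python
from typing import Any, Optional, Sequence


def lora_engine_kwargs(model: str, max_loras: int = 4, max_lora_rank: int = 16) -> dict:
    """Keyword arguments for vllm.LLM covering the three initialization items.

    The default values here are illustrative assumptions, not recommendations.
    """
    return {
        "model": model,
        "enable_lora": True,             # LLM Initialization
        "max_loras": max_loras,          # Max LoRAs Configuration
        "max_lora_rank": max_lora_rank,  # Max LoRA Rank
    }


def generate_text(
    llm: Any,
    prompts: Sequence[str],
    adapter_name: Optional[str] = None,
    adapter_path: Optional[str] = None,
    adapter_id: int = 1,
):
    """Generate with an adapter when one is named, otherwise with the base model."""
    if adapter_name is None:
        # Base Model Generation: call generate() without lora_request.
        return llm.generate(prompts)

    # Imported lazily so the base-model path does not need the LoRA module.
    from vllm.lora.request import LoRARequest

    # LoRA Request Object: name and path, plus a unique integer id per adapter.
    request = LoRARequest(
        lora_name=adapter_name,
        lora_int_id=adapter_id,
        lora_path=adapter_path,
    )
    # Adapter in Generate: pass the request through the lora_request parameter.
    return llm.generate(prompts, lora_request=request)


def demo() -> None:
    """Hypothetical end-to-end usage; requires vLLM, a GPU, and real paths."""
    from vllm import LLM

    llm = LLM(**lora_engine_kwargs("hypothetical/base-model"))
    base_outputs = generate_text(llm, ["Hello"])
    adapted_outputs = generate_text(
        llm, ["Hello"], adapter_name="sql", adapter_path="/path/to/sql-adapter"
    )
    for output in list(base_outputs) + list(adapted_outputs):
        print(output.outputs[0].text)
```

`demo()` is deliberately not called at import time. Keeping the engine arguments in a plain dictionary and passing the engine in as a parameter lets the adapter-routing logic be exercised with a stub object before any model is loaded.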
