tessl/pypi-vllm

tessl install tessl/pypi-vllm@0.10.0

A high-throughput and memory-efficient inference and serving engine for LLMs

  • Agent Success: 69% (agent success rate when using this tile)
  • Improvement: 1.33x (agent success rate improvement when using this tile compared to baseline)
  • Baseline: 52% (agent success rate without this tile)

evals/scenario-8/task.md

Multi-Turn Conversation System

Overview

Build a conversational AI system that processes multi-turn dialogues with role-based messages and maintains conversation context across multiple interactions.

Requirements

Implement a function generate_chat_response(conversation, max_tokens=100, temperature=0.7) that:

  1. Multi-Turn Dialogue Processing: Accepts a list of messages with roles (system, user, or assistant) and generates the next response.

  2. Role-Based Messages: Properly handles three types of message roles:

    • System messages that provide instructions or context
    • User messages representing user inputs
    • Assistant messages representing previous AI responses
  3. Conversation Context: Uses the entire conversation history to generate contextually appropriate responses.

  4. Model Configuration: Supports configuration parameters:

    • max_tokens: Maximum tokens to generate (default: 100)
    • temperature: Sampling temperature for randomness control (default: 0.7)
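
For illustration, a caller could override these defaults as in the sketch below; the conversation variable follows the input format defined in the next section:

response = generate_chat_response(
    conversation,       # list of {"role": ..., "content": ...} messages
    max_tokens=150,     # allow a longer reply than the 100-token default
    temperature=0.2,    # lower temperature for more deterministic output
)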

Input Format

Your system should accept conversations in the following format:

conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello! Can you help me?"},
    {"role": "assistant", "content": "Of course! I'd be happy to help."},
    {"role": "user", "content": "What's the capital of France?"}
]

Expected Behavior

  • Process the entire conversation history to generate the next response
  • Apply the system message as context for all responses
  • Generate responses that are contextually aware of previous turns
  • Support configurable parameters like temperature and max tokens
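
As a reference for how the full history, including the system message, reaches the model: a chat-capable tokenizer can render the message list into a single prompt with its chat template. The sketch below assumes the Hugging Face tokenizer for the Llama 3 Instruct model; vLLM performs an equivalent step internally when given role-based messages.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Render system + user + assistant turns into one prompt string, ending with
# the assistant header so the model generates the next assistant turn.
prompt = tokenizer.apply_chat_template(
    conversation,
    tokenize=False,
    add_generation_prompt=True,
)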

Implementation Notes

  • Use a model suitable for chat/instruction following (e.g., "meta-llama/Meta-Llama-3-8B-Instruct" or similar)
  • Handle conversations with varying numbers of turns
  • Return generated text as the response
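
A minimal sketch of one possible implementation, assuming vLLM's offline LLM.chat API and the Llama 3 Instruct model suggested above; the engine is created once because model loading is expensive:

from vllm import LLM, SamplingParams

# Engine is created once at import time and reused across calls.
_llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")

def generate_chat_response(conversation, max_tokens=100, temperature=0.7):
    """Generate the next assistant turn for a role-based conversation."""
    sampling_params = SamplingParams(temperature=temperature, max_tokens=max_tokens)
    # LLM.chat applies the model's chat template to the whole message history,
    # so the system prompt and all previous turns shape the generated reply.
    outputs = _llm.chat(conversation, sampling_params=sampling_params)
    return outputs[0].outputs[0].text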

Test Cases

Test Case 1: Basic Multi-Turn Conversation { .test }

File: tests/test_chat.py { .test-file }

def test_basic_conversation():
    """Test basic multi-turn conversation handling."""
    conversation = [
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "What is 2 + 2?"},
        {"role": "assistant", "content": "2 + 2 equals 4."},
        {"role": "user", "content": "Now multiply that by 3."}
    ]

    response = generate_chat_response(conversation)

    # Verify a non-empty string response is returned
    assert response is not None
    assert len(response) > 0
    assert isinstance(response, str)

Test Case 2: System Prompt Influence { .test }

File: tests/test_chat.py { .test-file }

def test_system_prompt():
    """Test that system prompt influences response style."""
    conversation_formal = [
        {"role": "system", "content": "Respond in a very formal, professional manner."},
        {"role": "user", "content": "Hi there"}
    ]

    conversation_casual = [
        {"role": "system", "content": "Respond in a casual, friendly manner."},
        {"role": "user", "content": "Hi there"}
    ]

    response_formal = generate_chat_response(conversation_formal)
    response_casual = generate_chat_response(conversation_casual)

    # Verify both generate responses
    assert response_formal is not None
    assert response_casual is not None
    assert len(response_formal) > 0
    assert len(response_casual) > 0

Test Case 3: Single User Message { .test }

File: tests/test_chat.py { .test-file }

def test_single_message():
    """Test handling of a single user message."""
    conversation = [
        {"role": "user", "content": "Tell me a fun fact."}
    ]

    response = generate_chat_response(conversation)

    assert response is not None
    assert len(response) > 0
    assert isinstance(response, str)

Dependencies { .dependencies }

vllm { .dependency }

Provides high-performance LLM inference capabilities for chat-based text generation.

Version: 0.10.0
Workspace: tessl
Visibility: Public
Describes: pkg:pypi/vllm@0.10.x (PyPI)
tile.json