Workspace: tessl
Visibility: Public
Describes: pkg:pypi/modal@1.1.x (tile.json)

tessl/pypi-modal

tessl install tessl/pypi-modal@1.1.0

Python client library for Modal, a serverless cloud computing platform that enables developers to run Python code in the cloud with on-demand access to compute resources.

Agent Success: 85% (agent success rate when using this tile)

Improvement: 1.6x (improvement in agent success rate compared to baseline)

Baseline: 53% (agent success rate without this tile)

evals/scenario-5/task.md

Distributed Log Aggregator

Build a distributed log aggregation system that processes log files from multiple sources in parallel and outputs consolidated results in real-time.

Background

Your team needs a system to process large volumes of log files stored across multiple directories. The system should analyze logs for error patterns, generate statistics, and display progress information during processing. Since processing is distributed across multiple remote workers, visibility into what each worker is doing is crucial for debugging and monitoring.

Requirements

1. Log Processing Function

Create a function process_log_file(file_path: str) -> dict that:

  • Reads a log file from the given path
  • Counts total lines, error lines (containing "ERROR"), and warning lines (containing "WARNING")
  • Returns a dictionary with keys: file_name, total_lines, errors, warnings
  • Prints progress messages during processing to help with monitoring
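The counting logic above can be sketched in plain Python. This is only the per-file body; in the full solution it would run inside a Modal function on a remote worker (the app/decorator wiring is omitted here), and the progress `print` is what gets streamed back locally for monitoring.

```python
import os
import tempfile

def process_log_file(file_path: str) -> dict:
    """Count total, ERROR, and WARNING lines in one log file."""
    total = errors = warnings = 0
    with open(file_path, "r", errors="replace") as f:
        for line in f:
            total += 1
            if "ERROR" in line:
                errors += 1
            if "WARNING" in line:
                warnings += 1
    # Progress output; on Modal, remote stdout is streamed to the local terminal.
    print(f"processed {file_path}: {total} lines")
    return {
        "file_name": os.path.basename(file_path),
        "total_lines": total,
        "errors": errors,
        "warnings": warnings,
    }

# Quick demonstration on a throwaway file
with tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False) as f:
    f.write("INFO ok\nERROR boom\nWARNING hot\n")
    demo_path = f.name
demo = process_log_file(demo_path)
os.unlink(demo_path)
```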

2. Batch Processor

Create a function process_log_batch(file_paths: list[str]) -> list[dict] that:

  • Takes a list of file paths to process
  • Processes each file using process_log_file
  • Returns a list of results for all files
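A minimal local sketch of the batch processor, with Requirement 1's counting logic inlined so the snippet is self-contained. The comment notes how the same shape would map onto Modal's parallel `Function.map`; that call is an assumption drawn from Modal's documented API and is not exercised here.

```python
import os
import tempfile

def process_log_file(file_path: str) -> dict:
    # Minimal counting logic from Requirement 1, inlined for self-containment.
    with open(file_path, errors="replace") as f:
        lines = f.readlines()
    return {
        "file_name": os.path.basename(file_path),
        "total_lines": len(lines),
        "errors": sum("ERROR" in l for l in lines),
        "warnings": sum("WARNING" in l for l in lines),
    }

def process_log_batch(file_paths: list[str]) -> list[dict]:
    # Sequential form; with Modal the same shape becomes roughly
    # list(process_log_file.map(file_paths)) to fan files out to workers.
    return [process_log_file(p) for p in file_paths]

# Demo with two throwaway files
paths = []
for text in ("ERROR a\n", "WARNING b\nINFO c\n"):
    with tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False) as f:
        f.write(text)
        paths.append(f.name)
batch = process_log_batch(paths)
for p in paths:
    os.unlink(p)
```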

3. Main Entry Point

Create a CLI entry point that:

  • Takes a directory path as a command-line argument
  • Finds all .log files in the directory (recursively)
  • Processes files in parallel using the batch processor
  • Ensures that output from remote workers is visible locally for monitoring
  • Prints a final summary showing total files processed, total errors, and total warnings
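Putting the pieces together, a local sketch of the entry point. The `process_log_batch` body here is a sequential stand-in for Requirement 2's processor; in a Modal solution `main` would typically be a `@app.local_entrypoint()`, which also keeps remote workers' stdout visible locally.

```python
import os
import tempfile
from pathlib import Path

def process_log_batch(file_paths):
    # Stand-in for the Requirement 2 batch processor; in the real system this
    # would fan out to remote workers (e.g. via Modal's Function.map).
    results = []
    for p in file_paths:
        lines = open(p, errors="replace").readlines()
        results.append({
            "file_name": os.path.basename(p),
            "total_lines": len(lines),
            "errors": sum("ERROR" in l for l in lines),
            "warnings": sum("WARNING" in l for l in lines),
        })
    return results

def main(directory: str) -> dict:
    # Recursively find all .log files, process them, and print a summary.
    paths = sorted(str(p) for p in Path(directory).rglob("*.log"))
    results = process_log_batch(paths)
    summary = {
        "files": len(results),
        "errors": sum(r["errors"] for r in results),
        "warnings": sum(r["warnings"] for r in results),
    }
    print(f"Processed {summary['files']} files: "
          f"{summary['errors']} errors, {summary['warnings']} warnings")
    return summary

# Demo against a throwaway directory tree
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "sub").mkdir()
    (Path(d) / "a.log").write_text("ERROR x\nINFO y\n")
    (Path(d) / "sub" / "b.log").write_text("WARNING z\n")
    demo_summary = main(d)
```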

Test Cases

@test "processes single log file correctly"

```python
# test_log_processor.py
from log_aggregator import process_log_file
import tempfile
import os

def test_process_single_file():
    # Create a test log file
    with tempfile.NamedTemporaryFile(mode='w', suffix='.log', delete=False) as f:
        f.write("2024-01-01 10:00:00 INFO Application started\n")
        f.write("2024-01-01 10:01:00 ERROR Database connection failed\n")
        f.write("2024-01-01 10:02:00 WARNING Memory usage high\n")
        f.write("2024-01-01 10:03:00 INFO Processing request\n")
        f.write("2024-01-01 10:04:00 ERROR Invalid user input\n")
        temp_path = f.name

    try:
        result = process_log_file(temp_path)
        assert result['total_lines'] == 5
        assert result['errors'] == 2
        assert result['warnings'] == 1
        assert temp_path.endswith(result['file_name'])
    finally:
        os.unlink(temp_path)
```

@test "processes multiple files in batch"

```python
# test_log_processor.py
from log_aggregator import process_log_batch
import tempfile
import os

def test_process_batch():
    # Create multiple test log files
    temp_files = []
    for i in range(3):
        with tempfile.NamedTemporaryFile(mode='w', suffix='.log', delete=False) as f:
            f.write(f"2024-01-01 10:00:00 INFO File {i}\n")
            f.write("2024-01-01 10:01:00 ERROR Test error\n")
            temp_files.append(f.name)

    try:
        results = process_log_batch(temp_files)
        assert len(results) == 3
        assert all(r['total_lines'] == 2 for r in results)
        assert all(r['errors'] == 1 for r in results)
    finally:
        for path in temp_files:
            os.unlink(path)
```

Dependencies { .dependencies }

modal { .dependency }

Provides serverless compute infrastructure for distributed processing.

Implementation Notes

  • The system should process files on remote compute infrastructure
  • Progress output from remote workers must be visible locally
  • Use appropriate parallel processing patterns for efficiency
  • Handle file I/O errors gracefully
  • The solution should be production-ready and handle edge cases
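For the graceful-I/O note above, one possible shape is an error-tolerant variant of the per-file function, so one unreadable file cannot crash a whole distributed batch. The `process_log_file_safe` name and the `failed` key are illustrative, not part of the spec.

```python
import os

def process_log_file_safe(file_path: str) -> dict:
    # Hypothetical error-tolerant variant: an unreadable file yields a
    # zeroed result tagged with the failure reason instead of raising.
    base = {"file_name": os.path.basename(file_path),
            "total_lines": 0, "errors": 0, "warnings": 0}
    try:
        with open(file_path, errors="replace") as f:
            for line in f:
                base["total_lines"] += 1
                base["errors"] += "ERROR" in line
                base["warnings"] += "WARNING" in line
    except OSError as exc:
        base["failed"] = str(exc)
    return base

# A missing file produces a tagged result rather than an exception.
missing = process_log_file_safe("/nonexistent/path/app.log")
```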