CtrlK
BlogDocsLog inGet started
Tessl Logo

klingai-performance-tuning

Optimize Kling AI for speed, quality, and cost efficiency. Use when improving generation times or finding optimal settings. Trigger with phrases like 'klingai performance', 'kling ai optimize', 'faster klingai', 'klingai quality settings'.

59

Quality

70%

Does it follow best practices?

Impact

No eval scenarios have been run

SecuritybySnyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/saas-packs/klingai-pack/skills/klingai-performance-tuning/SKILL.md
SKILL.md
Quality
Evals
Security

Kling AI Performance Tuning

Overview

Optimize video generation for your use case by choosing the right model, mode, and parameters. Covers benchmarking, speed vs. quality trade-offs, connection pooling, and caching strategies.

Speed vs. Quality Matrix

Config~Gen TimeQualityCredits (5s)Best For
v2.5-turbo + standard30-60sGood10Drafts, iteration
v2-master + standard60-90sHigh10Production previews
v2.6 + standard60-120sHighest10Quality-sensitive
v2.6 + professional120-300sHighest+35Final output
v2.6 + prof + audio180-400sHighest+200Full production

Benchmarking Tool

import time, requests, json

def benchmark_model(prompt: str, model: str, mode: str = "standard",
                    runs: int = 3) -> dict:
    """Benchmark generation time for a model/mode combination."""
    times = []

    for i in range(runs):
        start = time.monotonic()

        # Submit
        r = requests.post(f"{BASE}/videos/text2video", headers=get_headers(), json={
            "model_name": model, "prompt": prompt, "duration": "5", "mode": mode,
        }).json()
        task_id = r["data"]["task_id"]

        # Poll
        while True:
            time.sleep(10)
            result = requests.get(
                f"{BASE}/videos/text2video/{task_id}", headers=get_headers()
            ).json()
            if result["data"]["task_status"] in ("succeed", "failed"):
                break

        elapsed = time.monotonic() - start
        times.append(elapsed)
        print(f"  Run {i+1}/{runs}: {elapsed:.1f}s ({result['data']['task_status']})")

    return {
        "model": model,
        "mode": mode,
        "avg_sec": round(sum(times) / len(times), 1),
        "min_sec": round(min(times), 1),
        "max_sec": round(max(times), 1),
        "runs": runs,
    }

# Compare models
prompt = "A waterfall in a tropical forest, cinematic"
for model in ["kling-v2-5-turbo", "kling-v2-master", "kling-v2-6"]:
    result = benchmark_model(prompt, model, runs=2)
    print(f"{model}: avg={result['avg_sec']}s, min={result['min_sec']}s")

Connection Pooling

import requests

# Without pooling: new TCP connection per request (slow)
# With pooling: reuse connections (fast)

session = requests.Session()
adapter = requests.adapters.HTTPAdapter(
    pool_connections=5,     # number of connection pools
    pool_maxsize=10,        # max connections per pool
    max_retries=3,          # auto-retry on connection errors
)
session.mount("https://", adapter)

# Use session instead of requests directly
response = session.post(f"{BASE}/videos/text2video", headers=get_headers(), json=body)

Prompt Optimization

Prompts that generate faster:

TechniqueWhy It Helps
Clear single subjectLess complexity to resolve
Specify camera angleReduces ambiguity
Avoid conflicting styles"realistic anime" confuses the model
Keep under 200 wordsShorter prompts process faster
Use negative promptsRemoves processing of unwanted elements
# Slow prompt (vague, conflicting)
slow = "A scene with many things happening, realistic but also artistic"

# Fast prompt (specific, clear)
fast = "A single red fox walking through snow, side view, natural lighting, 4K"

Caching Strategy

import hashlib

class PromptCache:
    """Cache results to avoid regenerating identical videos."""

    def __init__(self):
        self._cache = {}

    def _key(self, prompt: str, model: str, duration: int, mode: str) -> str:
        raw = f"{prompt}|{model}|{duration}|{mode}"
        return hashlib.sha256(raw.encode()).hexdigest()[:16]

    def get(self, prompt, model, duration, mode):
        key = self._key(prompt, model, duration, mode)
        return self._cache.get(key)

    def set(self, prompt, model, duration, mode, video_url):
        key = self._key(prompt, model, duration, mode)
        self._cache[key] = {
            "url": video_url,
            "cached_at": time.time(),
        }

cache = PromptCache()

def generate_with_cache(prompt, model="kling-v2-master", duration=5, mode="standard"):
    cached = cache.get(prompt, model, duration, mode)
    if cached:
        print(f"Cache hit: {cached['url']}")
        return cached["url"]

    # Generate
    result = client.text_to_video(prompt, model=model, duration=duration, mode=mode)
    url = result["videos"][0]["url"]
    cache.set(prompt, model, duration, mode, url)
    return url

Optimization Checklist

  • Use kling-v2-5-turbo for iteration, v2-6 for final
  • Use standard mode until final render
  • Connection pooling via requests.Session()
  • Cache identical prompt+param combinations
  • Prompt: specific, single subject, < 200 words
  • Batch submissions paced at 2-3s intervals
  • Use callback_url instead of polling
  • Download videos async (don't block on CDN download)

Resources

  • Model Catalog
  • Developer Portal
Repository
jeremylongshore/claude-code-plugins-plus-skills
Last updated
Created

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.