
groq-hello-world

Create a minimal working Groq chat completion example. Use when starting a new Groq integration, testing your setup, or learning basic Groq API patterns. Trigger with phrases like "groq hello world", "groq example", "groq quick start", "simple groq code".

Groq Hello World

Overview

Build a minimal chat completion with Groq's LPU inference API. Groq uses an OpenAI-compatible endpoint, so the API shape is familiar -- but responses arrive 10-50x faster than GPU-based providers.
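
Because the endpoint is OpenAI-compatible, the same request shape works over plain HTTP as well. A minimal sketch using Node 18+'s built-in `fetch` (the `buildChatBody` helper name is just for illustration):

```typescript
// Groq's endpoint accepts the same JSON payload as the OpenAI chat API.
const GROQ_URL = "https://api.groq.com/openai/v1/chat/completions";

// Build the JSON body for a single-turn chat request.
function buildChatBody(model: string, userMessage: string): string {
  return JSON.stringify({
    model,
    messages: [{ role: "user", content: userMessage }],
  });
}

async function helloGroq(): Promise<string> {
  const res = await fetch(GROQ_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.GROQ_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: buildChatBody("llama-3.3-70b-versatile", "Hello!"),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

In practice the SDK examples below are more convenient; this form is mainly useful for environments where installing the SDK isn't an option.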

Prerequisites

  • groq-sdk installed (npm install groq-sdk)
  • GROQ_API_KEY environment variable set
  • Completed groq-install-auth setup
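
A quick way to set and sanity-check the key in your shell (the value shown is a placeholder, not a real key):

```shell
# Export the key for the current shell session (placeholder value).
export GROQ_API_KEY="your-key-here"

# Confirm it is set without printing the secret itself.
echo "${GROQ_API_KEY:+GROQ_API_KEY is set}"
```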

Instructions

Step 1: Basic Chat Completion (TypeScript)

import Groq from "groq-sdk";

const groq = new Groq();

async function main() {
  const completion = await groq.chat.completions.create({
    model: "llama-3.3-70b-versatile",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "What is Groq's LPU and why is it fast?" },
    ],
  });

  console.log(completion.choices[0].message.content);
  console.log(`Tokens: ${completion.usage?.total_tokens}`);
}

main().catch(console.error);

Step 2: Streaming Response

// Reuses the `groq` client created in Step 1.
async function streamExample() {
  const stream = await groq.chat.completions.create({
    model: "llama-3.3-70b-versatile",
    messages: [
      { role: "user", content: "Explain quantum computing in 3 sentences." },
    ],
    stream: true,
  });

  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || "";
    process.stdout.write(content);
  }
  console.log(); // newline
}

Step 3: Python Equivalent

from groq import Groq

client = Groq()

completion = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Groq's LPU and why is it fast?"},
    ],
)

print(completion.choices[0].message.content)
print(f"Tokens: {completion.usage.total_tokens}")

Step 4: Try Different Models

// Speed tier -- fastest responses (~560 tok/s)
const fast = await groq.chat.completions.create({
  model: "llama-3.1-8b-instant",
  messages: [{ role: "user", content: "Hello!" }],
});

// Quality tier -- best reasoning (~280 tok/s)
const quality = await groq.chat.completions.create({
  model: "llama-3.3-70b-versatile",
  messages: [{ role: "user", content: "Explain monads in Haskell." }],
});

// Vision tier -- multimodal understanding
const vision = await groq.chat.completions.create({
  model: "meta-llama/llama-4-scout-17b-16e-instruct",
  messages: [{
    role: "user",
    content: [
      { type: "text", text: "Describe this image." },
      { type: "image_url", image_url: { url: "https://example.com/photo.jpg" } },
    ],
  }],
});
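
The three tiers above can be captured in a small lookup helper (a convenience sketch; the model IDs come from the examples above, and the tier names are arbitrary):

```typescript
// Map a coarse task tier to a Groq model ID.
type Tier = "speed" | "quality" | "vision";

const MODELS: Record<Tier, string> = {
  speed: "llama-3.1-8b-instant",
  quality: "llama-3.3-70b-versatile",
  vision: "meta-llama/llama-4-scout-17b-16e-instruct",
};

function modelFor(tier: Tier): string {
  return MODELS[tier];
}

console.log(modelFor("speed")); // llama-3.1-8b-instant
```

Centralizing the IDs this way means a deprecated model only needs updating in one place.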

Available Models (Current)

| Model ID | Params | Context | Speed | Best For |
| --- | --- | --- | --- | --- |
| llama-3.1-8b-instant | 8B | 128K | ~560 tok/s | Classification, extraction, fast tasks |
| llama-3.3-70b-versatile | 70B | 128K | ~280 tok/s | General purpose, reasoning, code |
| llama-3.3-70b-specdec | 70B | 128K | Faster | Same quality, speculative decoding |
| meta-llama/llama-4-scout-17b-16e-instruct | 17Bx16E | 128K | ~460 tok/s | Vision, multimodal |
| meta-llama/llama-4-maverick-17b-128e-instruct | 17Bx128E | 128K | -- | Best multimodal quality |

Response Structure

interface ChatCompletion {
  id: string;                    // "chatcmpl-xxx"
  object: "chat.completion";
  created: number;               // Unix timestamp
  model: string;                 // Actual model used
  choices: [{
    index: number;
    message: { role: "assistant"; content: string };
    finish_reason: "stop" | "length" | "tool_calls";
  }];
  usage: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
    queue_time: number;          // Groq-specific: seconds in queue
    prompt_time: number;         // Groq-specific: seconds for prompt
    completion_time: number;     // Groq-specific: seconds for completion
    total_time: number;          // Groq-specific: total processing seconds
  };
}
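
The Groq-specific timing fields make it straightforward to measure observed throughput. A small helper built on the usage fields shown above:

```typescript
// Observed generation throughput in tokens/second, derived from Groq's
// usage.completion_tokens and usage.completion_time fields.
function tokensPerSecond(usage: {
  completion_tokens: number;
  completion_time: number;
}): number {
  if (usage.completion_time <= 0) return 0; // guard against divide-by-zero
  return usage.completion_tokens / usage.completion_time;
}

// Example with representative numbers: 140 tokens generated in 0.5 s.
console.log(tokensPerSecond({ completion_tokens: 140, completion_time: 0.5 })); // 280
```

Logging this per request is a cheap way to verify you are actually seeing the advertised speed tier.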

Error Handling

| Error | Cause | Solution |
| --- | --- | --- |
| 401 Invalid API Key | Key not set or invalid | Check GROQ_API_KEY env var |
| model_not_found | Typo in model ID or deprecated model | Check model list at console.groq.com/docs/models |
| 429 Rate limit | Free tier: 30 RPM on large models | Wait for retry-after header value |
| context_length_exceeded | Prompt + max_tokens > model context | Reduce prompt size or set lower max_tokens |
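
For 429s, a retry wrapper that honors retry-after is a common pattern. This is a sketch: the `err.status` and `err.headers` fields are assumptions about the thrown error's shape, so verify them against your groq-sdk version before relying on this.

```typescript
// Retry an async call on 429, waiting for the retry-after value when the
// error carries one, otherwise falling back to exponential backoff.
async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      // Rethrow anything that is not a rate limit, or once retries are spent.
      if (err?.status !== 429 || attempt >= maxRetries) throw err;
      const retryAfter = Number(err?.headers?.["retry-after"]);
      const delayMs =
        Number.isFinite(retryAfter) && retryAfter > 0
          ? retryAfter * 1000 // server-suggested wait, in seconds
          : 2 ** attempt * 1000; // exponential fallback: 1s, 2s, 4s...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Usage: wrap any call from the steps above, e.g. `await withRetry(() => groq.chat.completions.create(params))`.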

Resources

  • Groq Text Generation Docs
  • Groq Models Reference
  • Groq API Reference

Next Steps

Proceed to groq-local-dev-loop for development workflow setup.

Repository
jeremylongshore/claude-code-plugins-plus-skills