
tessl/maven-dev-langchain4j--langchain4j-github-models

This package provides a deprecated integration module that lets Java applications use GitHub Models through the LangChain4j framework. It offers chat models (both synchronous and streaming), embedding models, and support for AI Services with tool integration, JSON schema responses, and responsible-AI features. The module wraps the Azure AI Inference SDK to provide a unified API for the language models hosted on GitHub Models, covering chat completion, embedding generation, and content-filter management. As of version 1.10.0 the module is deprecated and scheduled for removal; users are advised to migrate to the langchain4j-openai-official module for better functionality and integration. The library is designed as a reusable, foundational component for LLM-powered Java applications that need GitHub-hosted AI models, offering builder patterns for configuration, proxy options, custom timeouts, and model service versioning.


docs/configuration/builder-configuration.md

Builder Configuration

Common configuration patterns for all model builders.

Required Configuration

All model builders require these two parameters:

GitHub Token

.gitHubToken(String gitHubToken)

GitHub personal access token for authentication.

Best Practice: Load it from an environment variable; never hardcode it

.gitHubToken(System.getenv("GITHUB_TOKEN"))
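Since System.getenv returns null when the variable is unset, a missing token can surface later as a confusing authentication error. A small fail-fast check (a hypothetical helper, not part of the library) makes the misconfiguration obvious at startup:

```java
// Hypothetical helper: fail fast with a clear message when a required
// environment variable is missing, instead of passing null to the builder.
public final class EnvTokens {
    public static String require(String name) {
        String value = System.getenv(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException(name + " environment variable is not set");
        }
        return value;
    }
}
```

Then configure the builder with `.gitHubToken(EnvTokens.require("GITHUB_TOKEN"))`.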

Model Name

// String
.modelName(String modelName)

// Enum (type-safe)
.modelName(GitHubModelsChatModelName modelName)
.modelName(GitHubModelsEmbeddingModelName modelName)

Model identifier to use.

Best Practice: Use enum constants for type safety and IDE autocomplete

// Chat models
.modelName(GitHubModelsChatModelName.GPT_4_O)

// Embedding models
.modelName(GitHubModelsEmbeddingModelName.TEXT_EMBEDDING_3_SMALL)

Endpoint Configuration

Endpoint URL

.endpoint(String endpoint)

API endpoint. Default: https://models.inference.ai.azure.com

When to customize:

  • Using different Azure region
  • Using private/custom deployment
  • Testing against mock server

.endpoint("https://custom-endpoint.example.com")

Service Version

.serviceVersion(ModelServiceVersion serviceVersion)

Azure AI service API version. Uses latest if not specified.

When to customize:

  • Need specific API version for compatibility
  • Testing against specific version

import com.azure.ai.inference.ModelServiceVersion;

.serviceVersion(ModelServiceVersion.V2024_05_01)

Sampling Parameters

Control model behavior and output characteristics.

Temperature

.temperature(Double temperature)

Controls randomness. Range: 0.0-2.0

  • 0.0-0.3: Deterministic, focused, factual
  • 0.4-0.7: Balanced (default for most use cases)
  • 0.8-1.5: Creative, varied, exploratory
  • 1.6-2.0: Highly creative, unpredictable

// Deterministic
.temperature(0.0)

// Balanced
.temperature(0.7)

// Creative
.temperature(1.2)

Top P (Nucleus Sampling)

.topP(Double topP)

Alternative to temperature. Range: 0.0-1.0

  • 0.1-0.5: Conservative, high confidence
  • 0.6-0.9: Balanced
  • 0.9-1.0: Diverse

.topP(0.9)

Note: Use temperature OR topP, not both

Max Tokens

.maxTokens(Integer maxTokens)

Maximum tokens in response.

// Short responses
.maxTokens(100)

// Medium responses
.maxTokens(500)

// Long responses
.maxTokens(2000)

Presence Penalty

.presencePenalty(Double presencePenalty)

Penalizes tokens that have already appeared in the text, regardless of how often. Range: -2.0 to 2.0

  • Positive values: Encourage new topics
  • Negative values: Stay on topic

.presencePenalty(0.6)  // Encourage diversity

Frequency Penalty

.frequencyPenalty(Double frequencyPenalty)

Penalizes tokens in proportion to how often they have already appeared. Range: -2.0 to 2.0

  • Positive values: Reduce repetition
  • Negative values: Allow repetition

.frequencyPenalty(0.3)  // Reduce repetition

Seed

.seed(Long seed)

Random seed for reproducible generation (best-effort: identical requests with the same seed should produce the same or very similar outputs).

.seed(12345L)

Use case: Reproducible outputs for testing

Stop Sequences

.stop(List<String> stop)

Stop generation at these sequences.

.stop(Arrays.asList("\n\n", "END", "---"))

Response Format (Chat Only)

JSON Object Response

.responseFormat(ChatCompletionsResponseFormat responseFormat)

Control output format.

import com.azure.ai.inference.models.ChatCompletionsResponseFormatJsonObject;

.responseFormat(new ChatCompletionsResponseFormatJsonObject())

Strict JSON Schema

.strictJsonSchema(boolean strictJsonSchema)

Enable strict JSON schema validation.

.strictJsonSchema(true)

Note: Only available for synchronous chat model, not streaming

Embedding Configuration

Custom Dimensions

.dimensions(Integer dimensions)

Custom embedding dimension (if model supports).

// Reduce from default 3072 to 512
.dimensions(512)

Supported models:

  • TEXT_EMBEDDING_3_SMALL (default: 1536)
  • TEXT_EMBEDDING_3_LARGE (default: 3072)

Not supported:

  • Cohere models (fixed at 1024)
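As an illustration of what a reduced dimension means: for the text-embedding-3 family, OpenAI documents that a shortened embedding is approximately the full vector truncated to the first N components and re-normalized to unit length. The sketch below shows that operation; it is illustrative of the assumed server-side behavior, not the library's actual code (the library simply forwards the `dimensions` parameter to the API):

```java
import java.util.Arrays;

// Sketch of Matryoshka-style dimension reduction: truncate the vector,
// then re-normalize the result to unit length.
public final class DimensionReduction {
    public static float[] truncateAndNormalize(float[] vector, int dimensions) {
        float[] cut = Arrays.copyOf(vector, dimensions);
        double norm = 0.0;
        for (float v : cut) {
            norm += (double) v * v;
        }
        norm = Math.sqrt(norm);
        for (int i = 0; i < cut.length; i++) {
            cut[i] = (float) (cut[i] / norm);
        }
        return cut;
    }
}
```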

Configuration Examples

Minimal Configuration

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(System.getenv("GITHUB_TOKEN"))
    .modelName("gpt-4o")
    .build();
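Once built, the model can be invoked through LangChain4j's ChatModel API. A usage sketch (the chat(String) convenience overload performs a network call, so a valid GITHUB_TOKEN is assumed at runtime):

```java
import dev.langchain4j.model.github.GitHubModelsChatModel;

public class MinimalChatExample {
    public static void main(String[] args) {
        GitHubModelsChatModel model = GitHubModelsChatModel.builder()
                .gitHubToken(System.getenv("GITHUB_TOKEN"))
                .modelName("gpt-4o")
                .build();

        // chat(String) sends a single user message and returns the reply text.
        System.out.println(model.chat("What is the capital of France?"));
    }
}
```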

Production Configuration

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(System.getenv("GITHUB_TOKEN"))
    .modelName(GitHubModelsChatModelName.GPT_4_O)
    .temperature(0.7)
    .maxTokens(1000)
    .timeout(Duration.ofSeconds(60))
    .maxRetries(3)
    .build();

Creative Writing Configuration

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(System.getenv("GITHUB_TOKEN"))
    .modelName("gpt-4o")
    .temperature(1.2)
    .presencePenalty(0.6)
    .frequencyPenalty(0.3)
    .maxTokens(2000)
    .build();

Deterministic Configuration

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(System.getenv("GITHUB_TOKEN"))
    .modelName("gpt-4o")
    .temperature(0.0)
    .seed(12345L)
    .maxTokens(500)
    .build();

JSON Output Configuration

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(System.getenv("GITHUB_TOKEN"))
    .modelName("gpt-4o")
    .responseFormat(new ChatCompletionsResponseFormatJsonObject())
    .strictJsonSchema(true)
    .temperature(0.3)
    .build();

Streaming Configuration

GitHubModelsStreamingChatModel model = GitHubModelsStreamingChatModel.builder()
    .gitHubToken(System.getenv("GITHUB_TOKEN"))
    .modelName("gpt-4o-mini")
    .temperature(0.8)
    .maxTokens(2000)
    .timeout(Duration.ofSeconds(90))
    .build();

Embedding Configuration

GitHubModelsEmbeddingModel model = GitHubModelsEmbeddingModel.builder()
    .gitHubToken(System.getenv("GITHUB_TOKEN"))
    .modelName(GitHubModelsEmbeddingModelName.TEXT_EMBEDDING_3_LARGE)
    .dimensions(512)
    .timeout(Duration.ofSeconds(30))
    .build();
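Usage sketch for the embedding model built above (a network call; assumes a valid GITHUB_TOKEN). EmbeddingModel.embed(String) returns a Response wrapping the embedding vector:

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.output.Response;

// Uses the `model` built above; embed(String) performs a network call.
Response<Embedding> response = model.embed("Hello, world");
float[] vector = response.content().vector();
System.out.println("Dimensions: " + vector.length);  // expected to match .dimensions(512)
```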

Configuration Best Practices

Use Environment Variables

// ✅ Good
.gitHubToken(System.getenv("GITHUB_TOKEN"))

// ❌ Bad
.gitHubToken("ghp_hardcoded_token_12345")

Use Enum Constants

// ✅ Good - type-safe, autocomplete
.modelName(GitHubModelsChatModelName.GPT_4_O)

// ⚠️ OK - flexible but error-prone
.modelName("gpt-4o")

Choose Appropriate Timeouts

// Fast, simple tasks
.timeout(Duration.ofSeconds(30))

// Standard tasks
.timeout(Duration.ofSeconds(60))

// Complex, long responses
.timeout(Duration.ofSeconds(120))

// Streaming
.timeout(Duration.ofSeconds(90))

Set Reasonable Retry Counts

// Development
.maxRetries(1)

// Production
.maxRetries(3)

// Critical operations
.maxRetries(5)

Reuse Model Instances

// ✅ Good - create once, reuse
private final GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(System.getenv("GITHUB_TOKEN"))
    .modelName("gpt-4o")
    .build();

public String chat(String message) {
    return model.chat(message);
}

// ❌ Bad - creates new instance every call
public String chat(String message) {
    GitHubModelsChatModel model = GitHubModelsChatModel.builder()
        .gitHubToken(System.getenv("GITHUB_TOKEN"))
        .modelName("gpt-4o")
        .build();
    return model.chat(message);
}

See Also

  • Authentication
  • Network Configuration
  • Advanced Configuration

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-github-models
