tessl/maven-dev-langchain4j--langchain4j-github-models

This package provides the langchain4j-github-models integration module, which lets Java applications use GitHub Models through the LangChain4j framework. It offers chat models (both synchronous and streaming), embedding models, and support for AI services with tool integration, JSON schema responses, and responsible AI features. The module wraps the Azure AI Inference SDK behind a unified API for the language models hosted on GitHub Models, covering chat completion, embedding generation, and content-filter handling. As of version 1.10.0 the module is deprecated and scheduled for removal; users should migrate to the langchain4j-openai-official module for enhanced functionality and better integration. The library is designed as a reusable foundation for LLM-powered Java applications that target GitHub-hosted models, offering builder-pattern configuration along with proxy options, custom timeouts, and model service versioning.
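Assuming a standard Maven build, the module is published under the dev.langchain4j group; pin the version to the LangChain4j release you use (the snippet below leaves it as a placeholder):

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-github-models</artifactId>
    <version><!-- match your LangChain4j version --></version>
</dependency>
```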


docs/quick-reference.md

Quick Reference

Instant code snippets for common tasks with langchain4j-github-models.

Chat Completion

Basic synchronous chat

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(System.getenv("GITHUB_TOKEN"))
    .modelName(GitHubModelsChatModelName.GPT_4_O)
    .build();

ChatResponse response = model.chat(ChatRequest.builder()
    .messages(UserMessage.from("What is the capital of France?"))
    .build());
String answer = response.aiMessage().text();

Chat with system message

ChatResponse response = model.chat(ChatRequest.builder()
    .messages(
        SystemMessage.from("You are a helpful assistant."),
        UserMessage.from("Hello"))
    .build());

Chat with tool calling

ToolSpecification tool = ToolSpecification.builder()
    .name("get_weather")
    .description("Get weather for a location")
    .parameters(JsonObjectSchema.builder()
        .addStringProperty("location", "City name")
        .required("location")
        .build())
    .build();

ChatResponse response = model.chat(ChatRequest.builder()
    .messages(UserMessage.from("What's the weather in Paris?"))
    .parameters(ChatRequestParameters.builder()
        .toolSpecifications(tool)
        .build())
    .build());

if (response.aiMessage().hasToolExecutionRequests()) {
    for (ToolExecutionRequest req : response.aiMessage().toolExecutionRequests()) {
        String toolName = req.name();
        String args = req.arguments();
    }
}
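The loop above only inspects the requested calls. To complete the round trip, execute each tool yourself and send the result back as a ToolExecutionResultMessage. A sketch continuing from the response above, where getWeather is a hypothetical helper that parses the JSON arguments and returns a result string:

```java
List<ChatMessage> messages = new ArrayList<>();
messages.add(UserMessage.from("What's the weather in Paris?"));
messages.add(response.aiMessage());

for (ToolExecutionRequest req : response.aiMessage().toolExecutionRequests()) {
    String result = getWeather(req.arguments()); // hypothetical tool executor
    messages.add(ToolExecutionResultMessage.from(req, result));
}

// Second round trip: the model turns the tool result into a final answer
ChatResponse followUp = model.chat(ChatRequest.builder()
    .messages(messages)
    .build());
String answer = followUp.aiMessage().text();
```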

JSON response format

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(token)
    .modelName("gpt-4o")
    .responseFormat(new ChatCompletionsResponseFormatJsonObject())
    .strictJsonSchema(true)
    .build();

Streaming

Basic streaming

GitHubModelsStreamingChatModel model = GitHubModelsStreamingChatModel.builder()
    .gitHubToken(token)
    .modelName("gpt-4o")
    .build();

model.chat(request, new StreamingChatResponseHandler() {
    public void onPartialResponse(String token) {
        System.out.print(token);
    }
    public void onCompleteResponse(ChatResponse response) {
        System.out.println("\nDone: " + response.metadata().finishReason());
    }
    public void onError(Throwable error) {
        System.err.println("Error: " + error.getMessage());
    }
});

Accumulate streaming response

StringBuilder fullResponse = new StringBuilder();

model.chat(request, new StreamingChatResponseHandler() {
    public void onPartialResponse(String token) {
        fullResponse.append(token);
    }
    public void onCompleteResponse(ChatResponse response) {
        String complete = fullResponse.toString();
    }
    public void onError(Throwable error) { }
});

Handle content filtering

model.chat(request, new StreamingChatResponseHandler() {
    public void onPartialResponse(String token) { }
    public void onCompleteResponse(ChatResponse response) {
        if (response.metadata().finishReason() == FinishReason.CONTENT_FILTER) {
            System.out.println("Content was filtered");
        }
    }
    public void onError(Throwable error) { }
});

Embeddings

Generate embeddings for text segments

GitHubModelsEmbeddingModel model = GitHubModelsEmbeddingModel.builder()
    .gitHubToken(token)
    .modelName(GitHubModelsEmbeddingModelName.TEXT_EMBEDDING_3_SMALL)
    .build();

List<TextSegment> segments = Arrays.asList(
    TextSegment.from("First text"),
    TextSegment.from("Second text")
);

Response<List<Embedding>> response = model.embedAll(segments);
List<Embedding> embeddings = response.content();

for (Embedding emb : embeddings) {
    float[] vector = emb.vector();
    int dim = emb.dimension();
}

Custom embedding dimensions

GitHubModelsEmbeddingModel model = GitHubModelsEmbeddingModel.builder()
    .gitHubToken(token)
    .modelName("text-embedding-3-large")
    .dimensions(512)
    .build();

Process large batches (auto-splits into batches of 16)

List<TextSegment> manySegments = loadSegments(); // e.g., 100 segments
Response<List<Embedding>> response = model.embedAll(manySegments);
// Internally: 7 requests (16+16+16+16+16+16+4)
// Returns: All 100 embeddings
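The request count in the comment above is just ceiling division over the fixed batch size of 16; a minimal sketch of the arithmetic:

```java
public class BatchMath {
    static final int MAX_BATCH_SIZE = 16;

    // Number of requests embedAll issues for n segments (ceiling division)
    public static int batchCount(int n) {
        return (n + MAX_BATCH_SIZE - 1) / MAX_BATCH_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(batchCount(100)); // 7 requests: 16+16+16+16+16+16+4
    }
}
```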

Configuration

Set sampling parameters

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(token)
    .modelName("gpt-4o")
    .temperature(0.7)
    .maxTokens(1000)
    .topP(0.9)
    .presencePenalty(0.6)
    .frequencyPenalty(0.3)
    .seed(12345L)
    .stop(Arrays.asList("\n\n", "END"))
    .build();

Configure network settings

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(token)
    .modelName("gpt-4o")
    .timeout(Duration.ofSeconds(60))
    .maxRetries(3)
    .endpoint("https://custom-endpoint.com")
    .build();

Use proxy

ProxyOptions proxy = new ProxyOptions(
    ProxyOptions.Type.HTTP,
    new InetSocketAddress("proxy.example.com", 8080)
);

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(token)
    .modelName("gpt-4o")
    .proxyOptions(proxy)
    .build();

Enable request logging

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(token)
    .modelName("gpt-4o")
    .logRequestsAndResponses(true)
    .build();

Custom headers

Map<String, String> headers = new HashMap<>();
headers.put("X-Custom-Header", "value");

GitHubModelsChatModel model = GitHubModelsChatModel.builder()
    .gitHubToken(token)
    .modelName("gpt-4o")
    .customHeaders(headers)
    .build();

Model Selection

Common chat models

// Latest GPT-4
.modelName(GitHubModelsChatModelName.GPT_4_O)
.modelName(GitHubModelsChatModelName.GPT_4_O_MINI)

// Microsoft Phi
.modelName(GitHubModelsChatModelName.PHI_3_5_MINI_INSTRUCT)
.modelName(GitHubModelsChatModelName.PHI_3_5_VISION_INSTRUCT)

// Meta Llama
.modelName(GitHubModelsChatModelName.META_LLAMA_3_1_405B_INSTRUCT)
.modelName(GitHubModelsChatModelName.META_LLAMA_3_1_8B_INSTRUCT)

// Mistral
.modelName(GitHubModelsChatModelName.MISTRAL_LARGE)

Common embedding models

// OpenAI embeddings
.modelName(GitHubModelsEmbeddingModelName.TEXT_EMBEDDING_3_SMALL)  // 1536 dim
.modelName(GitHubModelsEmbeddingModelName.TEXT_EMBEDDING_3_LARGE)  // 3072 dim

// Cohere embeddings
.modelName(GitHubModelsEmbeddingModelName.COHERE_EMBED_V3_ENGLISH)        // 1024 dim
.modelName(GitHubModelsEmbeddingModelName.COHERE_EMBED_V3_MULTILINGUAL)   // 1024 dim

Error Handling

Handle HTTP errors

try {
    ChatResponse response = model.chat(request);
} catch (HttpResponseException e) {
    System.err.println("HTTP " + e.getResponse().getStatusCode());
    System.err.println(e.getMessage());
}

Check for content filtering

ChatResponse response = model.chat(request);
if (response.metadata().finishReason() == FinishReason.CONTENT_FILTER) {
    String filterMsg = response.aiMessage().text();
    // Handle filtered content
}

Check token usage

ChatResponse response = model.chat(request);
TokenUsage usage = response.metadata().tokenUsage();
System.out.println("Input: " + usage.inputTokenCount());
System.out.println("Output: " + usage.outputTokenCount());
System.out.println("Total: " + usage.totalTokenCount());

Semantic Search Pattern

// Create embedding model
GitHubModelsEmbeddingModel model = GitHubModelsEmbeddingModel.builder()
    .gitHubToken(token)
    .modelName(GitHubModelsEmbeddingModelName.TEXT_EMBEDDING_3_SMALL)
    .build();

// Embed documents
List<TextSegment> documents = Arrays.asList(
    TextSegment.from("Paris is the capital of France"),
    TextSegment.from("Berlin is the capital of Germany")
);
List<Embedding> docEmbeddings = model.embedAll(documents).content();

// Embed query
Embedding queryEmb = model.embed("What is the capital of France?").content();

// Calculate similarity
for (int i = 0; i < docEmbeddings.size(); i++) {
    double similarity = cosineSimilarity(queryEmb, docEmbeddings.get(i));
    System.out.println("Doc " + i + ": " + similarity);
}

Cosine Similarity Helper

public static double cosineSimilarity(Embedding a, Embedding b) {
    float[] vectorA = a.vector();
    float[] vectorB = b.vector();

    double dotProduct = 0.0;
    double normA = 0.0;
    double normB = 0.0;

    for (int i = 0; i < vectorA.length; i++) {
        dotProduct += vectorA[i] * vectorB[i];
        normA += vectorA[i] * vectorA[i];
        normB += vectorB[i] * vectorB[i];
    }

    return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
}
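To sanity-check the helper, the same arithmetic on raw float arrays (identical directions score 1.0, orthogonal vectors 0.0):

```java
public class CosineCheck {
    // Same math as the helper above, operating on raw float arrays
    public static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        System.out.println(cosine(new float[]{1, 0}, new float[]{1, 0})); // 1.0
        System.out.println(cosine(new float[]{1, 0}, new float[]{0, 1})); // 0.0
    }
}
```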

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-github-models
