Tessl Tile for pypi/together@1.5.0

# Embeddings

Text embedding generation for semantic search, clustering, classification, and similarity analysis. Supports multiple embedding models optimized for different use cases including retrieval, classification, and general-purpose semantic understanding.

## Capabilities

### Text Embeddings

Generate high-dimensional vector representations of text for semantic processing.

```python { .api }
def create(
    model: str,
    input: Union[str, List[str]],
    **kwargs
) -> EmbeddingResponse:
    """
    Create embeddings for input text.

    Args:
        model: Embedding model identifier
        input: Text string or list of strings to embed

    Returns:
        EmbeddingResponse with vector embeddings
    """
```

### Async Embeddings

Asynchronous embedding generation for concurrent processing.

```python { .api }
async def create(
    model: str,
    input: Union[str, List[str]],
    **kwargs
) -> EmbeddingResponse:
    """
    Asynchronously create text embeddings.

    Returns:
        EmbeddingResponse with vector embeddings
    """
```

## Usage Examples

### Basic Embedding Generation

```python
from together import Together

# The client reads the API key from the TOGETHER_API_KEY environment variable
client = Together()

response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-8k-retrieval",
    input="Machine learning is transforming technology"
)

# A single input string yields exactly one entry in response.data
embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")
```

### Batch Embedding Processing

```python
# Reuses the `client` created in the basic example
texts = [
    "Artificial intelligence and machine learning",
    "Deep learning neural networks",
    "Natural language processing",
    "Computer vision applications",
    "Reinforcement learning algorithms"
]

# Passing a list embeds every text in a single request
response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-8k-retrieval",
    input=texts
)

embeddings = [data.embedding for data in response.data]
print(f"Generated {len(embeddings)} embeddings")
print(f"Each embedding has {len(embeddings[0])} dimensions")
```
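
When a list is too long to send in one request, it can be split into chunks and embedded one chunk at a time. The following is a minimal sketch, not part of the Together SDK: `chunked`, `embed_in_chunks`, and the `embed_batch` callable are hypothetical names, and the default `chunk_size` of 64 is an arbitrary assumption rather than a documented limit.

```python
from typing import Callable, List, Sequence


def chunked(items: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Split items into consecutive chunks of at most chunk_size elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [list(items[i:i + chunk_size]) for i in range(0, len(items), chunk_size)]


def embed_in_chunks(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    chunk_size: int = 64,  # assumed value, not a documented request limit
) -> List[List[float]]:
    """Embed texts chunk by chunk, preserving the input order.

    embed_batch is any callable that maps a list of strings to one vector
    per string, for example a thin wrapper around client.embeddings.create.
    """
    vectors: List[List[float]] = []
    for chunk in chunked(texts, chunk_size):
        vectors.extend(embed_batch(chunk))
    return vectors
```

With the client from the basic example, `embed_batch` could be a small function that calls `client.embeddings.create(model=..., input=chunk)` and returns `[data.embedding for data in response.data]`.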

### Semantic Similarity

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def calculate_similarity(text1: str, text2: str, model: str) -> float:
    # Embed both texts in one request so they share the same model call
    response = client.embeddings.create(
        model=model,
        input=[text1, text2]
    )

    embeddings = [data.embedding for data in response.data]
    # cosine_similarity expects 2-D inputs and returns a 1x1 matrix here
    similarity = cosine_similarity([embeddings[0]], [embeddings[1]])[0][0]
    return float(similarity)

similarity_score = calculate_similarity(
    "I love programming in Python",
    "Python is my favorite programming language",
    "togethercomputer/m2-bert-80M-8k-retrieval"
)
print(f"Similarity score: {similarity_score:.4f}")
```
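
If pulling in scikit-learn for a single metric is undesirable, cosine similarity is simple to compute directly: the dot product of the two vectors divided by the product of their lengths. The sketch below uses only the standard library; `cosine_similarity_pure` is an illustrative name, not an SDK function.

```python
import math
from typing import Sequence


def cosine_similarity_pure(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The function accepts the plain `List[float]` values found in `response.data[i].embedding`, so no conversion to NumPy arrays is needed.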

### Document Retrieval System

```python
from typing import List

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def create_document_embeddings(documents: List[str], model: str) -> List[List[float]]:
    """Create embeddings for a collection of documents."""
    response = client.embeddings.create(model=model, input=documents)
    return [data.embedding for data in response.data]

def find_most_similar(query: str, documents: List[str], model: str, top_k: int = 3):
    """Find most similar documents to a query."""
    # Create embeddings for query and documents in a single request
    all_texts = [query] + documents
    response = client.embeddings.create(model=model, input=all_texts)
    embeddings = [data.embedding for data in response.data]

    query_embedding = embeddings[0]
    doc_embeddings = embeddings[1:]

    # Calculate similarities between the query and every document
    similarities = cosine_similarity([query_embedding], doc_embeddings)[0]

    # Get top-k most similar documents (argsort is ascending, so reverse it)
    top_indices = np.argsort(similarities)[::-1][:top_k]

    results = []
    for idx in top_indices:
        results.append({
            'document': documents[idx],
            'similarity': float(similarities[idx]),
            'index': int(idx)
        })

    return results

# Example usage
documents = [
    "Python is a versatile programming language",
    "Machine learning models require large datasets",
    "Web development with JavaScript frameworks",
    "Database design and optimization techniques",
    "Cloud computing and distributed systems"
]

query = "What programming languages are popular?"
results = find_most_similar(
    query,
    documents,
    "togethercomputer/m2-bert-80M-8k-retrieval"
)

for result in results:
    print(f"Similarity: {result['similarity']:.4f} - {result['document']}")
```
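
`find_most_similar` re-embeds every document on each call. When the document set is fixed, one option is to embed the documents once, store the unit-length vectors, and only embed the query at search time. The sketch below shows that idea with a hypothetical `EmbeddingIndex` class (not part of the SDK) that works on precomputed vectors, such as those returned by `create_document_embeddings`.

```python
import math
from typing import List, Sequence, Tuple


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class EmbeddingIndex:
    """Hypothetical in-memory index over precomputed document embeddings."""

    def __init__(self, doc_vectors: Sequence[Sequence[float]]):
        # Normalize once up front; searches then need only dot products
        self._unit_vectors = [_normalize(v) for v in doc_vectors]

    def search(self, query_vector: Sequence[float], top_k: int = 3) -> List[Tuple[int, float]]:
        """Return (document index, cosine similarity) pairs, best match first."""
        q = _normalize(query_vector)
        scores = [sum(a * b for a, b in zip(q, d)) for d in self._unit_vectors]
        ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
        return ranked[:top_k]
```

The returned indices refer to positions in the original document list, so `documents[i]` recovers the matching text.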

### Async Embedding Processing

```python
import asyncio
from together import AsyncTogether

async def process_embeddings_async():
    client = AsyncTogether()

    text_batches = [
        ["AI research and development", "Machine learning applications"],
        ["Data science methodologies", "Statistical analysis techniques"],
        ["Neural network architectures", "Deep learning frameworks"]
    ]

    tasks = [
        client.embeddings.create(
            model="togethercomputer/m2-bert-80M-8k-retrieval",
            input=batch
        )
        for batch in text_batches
    ]

    responses = await asyncio.gather(*tasks)

    all_embeddings = []
    for response in responses:
        batch_embeddings = [data.embedding for data in response.data]
        all_embeddings.extend(batch_embeddings)

    print(f"Generated {len(all_embeddings)} embeddings asynchronously")
    return all_embeddings

embeddings = asyncio.run(process_embeddings_async())
```
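
`asyncio.gather` starts every request at once, which may exceed rate limits when there are many batches. One common way to cap the number of in-flight requests is an `asyncio.Semaphore`. The sketch below is a generic helper, not an SDK feature: `gather_bounded` and `worker` are hypothetical names, and the default `max_concurrency` of 4 is an arbitrary assumption.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def gather_bounded(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    max_concurrency: int = 4,  # assumed value; tune to your own rate limits
) -> List[R]:
    """Run worker on every item with at most max_concurrency calls in flight.

    Results are returned in the same order as items.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item: T) -> R:
        # Each call waits here until a slot is free
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(*(run_one(item) for item in items))
```

In the example above, `worker` could be an `async` function that takes one batch and awaits `client.embeddings.create(model=..., input=batch)`.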

## Types

### Request Types

```python { .api }
class EmbeddingRequest:
    model: str
    input: Union[str, List[str]]
```

### Response Types

```python { .api }
class EmbeddingResponse:
    object: str
    data: List[EmbeddingData]
    model: str
    usage: EmbeddingUsage

class EmbeddingData:
    object: str
    embedding: List[float]
    index: int

class EmbeddingUsage:
    prompt_tokens: int
    total_tokens: int
```
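
Each `EmbeddingData` item carries an `index` field. Assuming it identifies the position of the corresponding input text (the field name suggests this, but this document does not state it), a defensive way to pair vectors with inputs is to sort by `index` rather than rely on list order. `embeddings_in_input_order` below is a hypothetical helper, not part of the SDK.

```python
from typing import Any, List


def embeddings_in_input_order(response: Any) -> List[List[float]]:
    """Return the vectors from response.data sorted by each item's index field.

    Assumes index is the position of the matching input text.
    """
    ordered = sorted(response.data, key=lambda item: item.index)
    return [list(item.embedding) for item in ordered]
```

The helper only reads the `data`, `index`, and `embedding` attributes listed in the response types above, so it works with any object that exposes them.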

## Supported Models

Common embedding models available through Together:

- `togethercomputer/m2-bert-80M-8k-retrieval` - Optimized for retrieval tasks
- `togethercomputer/m2-bert-80M-32k-retrieval` - Extended context retrieval
- `WhereIsAI/UAE-Large-V1` - General-purpose embedding model
- `BAAI/bge-large-en-v1.5` - High-quality English embeddings
- `BAAI/bge-base-en-v1.5` - Balanced performance and efficiency
