tessl/npm-tensorflow-models--universal-sentence-encoder

Universal Sentence Encoder for generating text embeddings using TensorFlow.js

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: npmpkg:npm/@tensorflow-models/universal-sentence-encoder@1.3.x

To install, run `npx @tessl/cli install tessl/npm-tensorflow-models--universal-sentence-encoder@1.3.0`

# Universal Sentence Encoder

The Universal Sentence Encoder provides TensorFlow.js implementations for converting text into high-dimensional embeddings. It includes both the standard USE model that generates 512-dimensional embeddings for general text similarity and clustering tasks, and the USE QnA model that creates 100-dimensional embeddings specifically optimized for question-answering applications.

## Package Information

- **Package Name**: @tensorflow-models/universal-sentence-encoder
- **Package Type**: npm
- **Language**: TypeScript
- **Installation**: `npm install @tensorflow/tfjs @tensorflow-models/universal-sentence-encoder`

## Core Imports

```typescript
import * as use from '@tensorflow-models/universal-sentence-encoder';
```

For CommonJS:

```javascript
const use = require('@tensorflow-models/universal-sentence-encoder');
```

## Basic Usage

```typescript
// Importing @tensorflow/tfjs registers a TensorFlow.js backend for the model to run on
import '@tensorflow/tfjs';
import * as use from '@tensorflow-models/universal-sentence-encoder';

// Load the model
const model = await use.load();

// Embed sentences
const sentences = [
  'Hello.',
  'How are you?'
];

const embeddings = await model.embed(sentences);
// embeddings is a 2D tensor with shape [2, 512]
embeddings.print();
```

## Architecture

Universal Sentence Encoder is built around several key components:

- **Main USE Model**: Generates 512-dimensional embeddings using the Transformer architecture
- **USE QnA Model**: Generates specialized 100-dimensional embeddings for question-answering tasks
- **Tokenizer**: SentencePiece tokenization over an 8k word-piece vocabulary, backed by a Trie data structure
- **Model Loading**: Supports custom model and vocabulary URLs for flexibility
- **TensorFlow.js Integration**: Built on tfjs-converter and tfjs-core for browser and Node.js compatibility

## Capabilities

### Standard Text Embeddings

Core Universal Sentence Encoder functionality for generating 512-dimensional embeddings from text. Ideal for semantic similarity, clustering, and general NLP tasks.

```typescript { .api }
function load(config?: LoadConfig): Promise<UniversalSentenceEncoder>;

interface LoadConfig {
  modelUrl?: string;
  vocabUrl?: string;
}

class UniversalSentenceEncoder {
  embed(inputs: string[] | string): Promise<tf.Tensor2D>;
}
```

[Standard Embeddings](./standard-embeddings.md)
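
Semantic similarity between two sentences is commonly measured as the cosine similarity of their embeddings. The following is a minimal sketch, not part of the package: `cosineSimilarity` and `similarityMatrix` are hypothetical helper names, and they operate on plain `number[]` rows such as those obtained by calling `array()` on the tensor returned by `embed`.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Length mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Guard against zero vectors to avoid dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Pairwise similarity matrix for a batch of embeddings (one row per sentence).
function similarityMatrix(rows: number[][]): number[][] {
  return rows.map((a) => rows.map((b) => cosineSimilarity(a, b)));
}

// Hypothetical usage with a loaded model (not executed here):
//   const embeddings = await model.embed(['Hello.', 'How are you?']);
//   const rows = await embeddings.array(); // number[][] with shape [2, 512]
//   embeddings.dispose();                  // release the tensor's memory
//   console.log(similarityMatrix(rows));
```

Converting to plain arrays keeps the sketch simple. For large batches it would likely be cheaper to stay in TensorFlow.js and compute the matrix with tensor operations.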

### Question-Answering Embeddings

Specialized Universal Sentence Encoder for question-answering applications, generating 100-dimensional embeddings optimized for matching questions with answers.

```typescript { .api }
function loadQnA(): Promise<UniversalSentenceEncoderQnA>;

class UniversalSentenceEncoderQnA {
  embed(input: ModelInput): ModelOutput;
}

interface ModelInput {
  queries: string[];
  responses: string[];
  contexts?: string[];
}

interface ModelOutput {
  queryEmbedding: tf.Tensor;
  responseEmbedding: tf.Tensor;
}
```

[Question-Answering](./question-answering.md)
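
One way to match a question with candidate answers is to score each response embedding against the query embedding with a dot product and sort by score. The sketch below assumes the two tensors in `ModelOutput` have already been converted to plain arrays. `dotProduct`, `rankResponses`, and `RankedResponse` are hypothetical names introduced for illustration, not package exports.

```typescript
// Dot product of two embedding vectors of equal length.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Length mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

interface RankedResponse {
  index: number; // position of the response in the original `responses` array
  score: number; // dot product with the query embedding
}

// Score every response against one query and return them best-first.
function rankResponses(query: number[], responses: number[][]): RankedResponse[] {
  return responses
    .map((response, index) => ({ index, score: dotProduct(query, response) }))
    .sort((x, y) => y.score - x.score);
}

// Hypothetical usage with a loaded QnA model (not executed here):
//   const output = model.embed({ queries: ['...'], responses: ['...', '...'] });
//   const queries = (await output.queryEmbedding.array()) as number[][];
//   const responses = (await output.responseEmbedding.array()) as number[][];
//   const ranked = rankResponses(queries[0], responses);
```

The `index` field lets the caller map each score back to the original response string.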

### Text Tokenization

Independent tokenizer functionality that uses the SentencePiece algorithm to convert text into token sequences. It can be used separately from the embedding models.

```typescript { .api }
function loadTokenizer(pathToVocabulary?: string): Promise<Tokenizer>;
function loadVocabulary(pathToVocabulary: string): Promise<Vocabulary>;
function stringToChars(input: string): string[];

class Tokenizer {
  constructor(vocabulary: Vocabulary, reservedSymbolsCount?: number);
  encode(input: string): number[];
}

class Trie {
  constructor();
  insert(word: string, score: number, index: number): void;
  commonPrefixSearch(symbols: string[]): Array<[string[], number, number]>;
}
```

[Tokenization](./tokenization.md)
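
To illustrate what a common-prefix search over a vocabulary looks like, here is a minimal trie sketch that mirrors the `insert` and `commonPrefixSearch` signatures listed above. It is not the package's implementation: `SketchTrie`, `SketchTrieNode`, and `PrefixMatch` are hypothetical names, and the example assumes each vocabulary entry is a `[piece, score]` pair whose array position serves as its index.

```typescript
// [symbols of the matched piece, score, vocabulary index]
type PrefixMatch = [string[], number, number];

class SketchTrieNode {
  children = new Map<string, SketchTrieNode>();
  // Set only on nodes where an inserted piece ends.
  entry: PrefixMatch | null = null;
}

class SketchTrie {
  private root = new SketchTrieNode();

  insert(word: string, score: number, index: number): void {
    const symbols = Array.from(word); // split by Unicode code point
    let node = this.root;
    for (const symbol of symbols) {
      let next = node.children.get(symbol);
      if (!next) {
        next = new SketchTrieNode();
        node.children.set(symbol, next);
      }
      node = next;
    }
    node.entry = [symbols, score, index];
  }

  // Returns every inserted piece that is a prefix of `symbols`, shortest first.
  commonPrefixSearch(symbols: string[]): PrefixMatch[] {
    const matches: PrefixMatch[] = [];
    let node = this.root;
    for (const symbol of symbols) {
      const next = node.children.get(symbol);
      if (!next) break; // no inserted piece continues along this path
      node = next;
      if (node.entry) matches.push(node.entry);
    }
    return matches;
  }
}

// Toy vocabulary with made-up pieces and scores, for illustration only.
const sketchVocabulary: Array<[string, number]> = [
  ['a', -1],
  ['ab', -2],
  ['abc', -3],
  ['b', -4],
];
const sketchTrie = new SketchTrie();
sketchVocabulary.forEach(([piece, score], index) => sketchTrie.insert(piece, score, index));

// Pieces that are prefixes of "abd": 'a' and 'ab'.
const sketchMatches = sketchTrie.commonPrefixSearch(Array.from('abd'));
console.log(sketchMatches.map(([symbols]) => symbols.join('')));
```

A tokenizer can call such a search at each position of the input to enumerate the candidate pieces that start there, then choose among them by score.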

## Types

```typescript { .api }
// TensorFlow.js tensors
import * as tf from '@tensorflow/tfjs-core';

// Core types
type Vocabulary = Array<[string, number]>;

// Version information
const version: string;
```