# Multimodal APIs

Image generation, audio processing (speech-to-text, text-to-speech, translation), and content moderation capabilities with OpenAI compatibility.

## Capabilities

### Image Generation

```python { .api }
class Images:
    def generate(self, **kwargs): ...
    def create_variation(self, **kwargs): ...
    def edit(self, **kwargs): ...
```
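
Because the interface is described as OpenAI-compatible, an image response can presumably be read the way OpenAI's image responses are. The sketch below is a hypothetical helper, not part of the SDK: `save_first_image` and the assumed response shape (`response.data[0]` carrying either `b64_json` or `url`) come from OpenAI's Images API and may not match every provider routed through Portkey.

```python
import base64
import urllib.request
from pathlib import Path


def save_first_image(response, path):
    """Write the first image of an OpenAI-style image response to `path`.

    Assumption: `response.data[0]` exposes `b64_json` or `url`, as OpenAI's
    Images API does. This helper is illustrative and not part of the SDK.
    """
    item = response.data[0]
    b64 = getattr(item, "b64_json", None)
    if b64:
        # Inline payload: decode the base64 string into raw image bytes.
        data = base64.b64decode(b64)
    else:
        url = getattr(item, "url", None)
        if not url:
            raise ValueError("response has neither b64_json nor url")
        # Hosted payload: download the image from the returned URL.
        with urllib.request.urlopen(url) as resp:
            data = resp.read()
    out = Path(path)
    out.write_bytes(data)
    return out
```

With a client in hand, the call would look like `save_first_image(portkey.images.generate(...), "cat.png")`.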

### Audio Processing

```python { .api }
class Audio:
    transcriptions: Transcriptions
    translations: Translations
    speech: Speech

class Transcriptions:
    def create(self, **kwargs): ...

class Translations:
    def create(self, **kwargs): ...

class Speech:
    def create(self, **kwargs): ...
```
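
The `Audio` class groups three sub-resources, each with its own `create` method. As a minimal sketch of how the first two might be used, the hypothetical helper below picks `transcriptions` or `translations` and opens the audio file in a context manager so the handle is closed even if the request fails. The `file` and `model` keyword names are assumed from OpenAI's audio API; `transcribe_file` itself is not part of the SDK.

```python
def transcribe_file(client, path, model="whisper-1", translate=False):
    """Send an audio file to the transcription or translation endpoint.

    `client` is any object with the `audio.transcriptions` and
    `audio.translations` layout listed above, e.g. a `Portkey` instance.
    Assumption: both endpoints accept `file` and `model` keyword arguments.
    """
    endpoint = client.audio.translations if translate else client.audio.transcriptions
    # The context manager closes the file handle even if the request raises.
    with open(path, "rb") as audio_file:
        return endpoint.create(file=audio_file, model=model)
```

For example, `transcribe_file(portkey, "audio.mp3")` would request a transcription, and `translate=True` would route the same file to the translation endpoint.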

### Content Moderation

```python { .api }
class Moderations:
    def create(self, **kwargs): ...
```
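
A moderation call is only useful once its result is inspected. The sketch below assumes the response follows OpenAI's moderation shape, where `response.results` is a list of entries with a boolean `flagged` field; that shape is an assumption here, and `is_flagged` and `check_text` are hypothetical helper names.

```python
def is_flagged(response):
    """Return True if any result in an OpenAI-style moderation response is flagged.

    Assumption: `response.results` is a list of objects with a boolean `flagged`.
    """
    return any(result.flagged for result in response.results)


def check_text(client, text):
    """Run `text` through `client.moderations.create` and report whether it was flagged."""
    response = client.moderations.create(input=text)
    return is_flagged(response)
```

A caller could use `check_text(portkey, user_input)` as a gate before passing the same text to another endpoint.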

## Usage Examples

```python
from portkey_ai import Portkey

# Replace the placeholder strings with your own keys.
portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="VIRTUAL_KEY"
)

# Generate an image
image = portkey.images.generate(
    prompt="A cat in a spacesuit",
    model="dall-e-3",
    size="1024x1024"
)

# Transcribe audio; the context manager closes the file afterwards
with open("audio.mp3", "rb") as audio_file:
    transcription = portkey.audio.transcriptions.create(
        file=audio_file,
        model="whisper-1"
    )

# Moderate content
moderation = portkey.moderations.create(
    input="This is a test message"
)
```
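
The strings passed to `Portkey(...)` above are placeholders. One common way to supply real credentials without hardcoding them is to read them from the environment. The sketch below is illustrative only: the helper `portkey_settings` and the variable names `PORTKEY_API_KEY` and `PORTKEY_VIRTUAL_KEY` are assumptions, not names the SDK is documented here to read on its own.

```python
import os


def portkey_settings(env=None):
    """Build keyword arguments for `Portkey(...)` from environment variables.

    Assumption: the variable names below are chosen for this example only.
    """
    env = os.environ if env is None else env
    names = {"api_key": "PORTKEY_API_KEY", "virtual_key": "PORTKEY_VIRTUAL_KEY"}
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {kwarg: env[var] for kwarg, var in names.items()}


# Usage (requires the portkey_ai package and both variables to be set):
#   from portkey_ai import Portkey
#   portkey = Portkey(**portkey_settings())
```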