# Core Word Cloud Generation

Primary functionality for creating and customizing word clouds, including text-to-visual conversion, frequency-based generation, layout control, and multiple output format support.

## Capabilities

### WordCloud Class

The main class for generating word clouds with comprehensive customization options for appearance, layout, text processing, and output formatting.

```python { .api }
class WordCloud:
    def __init__(
        self,
        font_path=None,
        width=400,
        height=200,
        margin=2,
        ranks_only=None,
        prefer_horizontal=0.9,
        mask=None,
        contour_width=0,
        contour_color='black',
        scale=1,
        color_func=None,
        colormap=None,
        max_words=200,
        min_font_size=4,
        font_step=1,
        stopwords=None,
        random_state=None,
        background_color='black',
        max_font_size=None,
        mode="RGB",
        relative_scaling='auto',
        regexp=None,
        collocations=True,
        normalize_plurals=True,
        repeat=False,
        include_numbers=False,
        min_word_length=0,
        collocation_threshold=30
    ):
        """
        Initialize WordCloud object with customization parameters.

        Parameters:
        - font_path (str, optional): Path to font file (OTF or TTF)
        - width (int): Canvas width in pixels (default: 400)
        - height (int): Canvas height in pixels (default: 200)
        - margin (int): Spacing around words in pixels (default: 2)
        - ranks_only (optional): Deprecated and has no effect; use relative_scaling instead
        - prefer_horizontal (float): Ratio of horizontal vs vertical placement attempts (0.0-1.0, default: 0.9)
        - mask (numpy.ndarray, optional): Binary mask for word placement shape; when given, width and height are ignored and the mask's shape is used
        - contour_width (float): Width of mask contour in pixels (default: 0)
        - contour_color (str): Color of mask contour (default: 'black')
        - scale (float): Scaling factor between computation and drawing (default: 1)
        - color_func (callable, optional): Custom color generation function; takes precedence over colormap
        - colormap (str, optional): Matplotlib colormap name ('viridis' is used when None)
        - max_words (int): Maximum number of words to display (default: 200)
        - min_font_size (int): Minimum font size in pixels (default: 4)
        - font_step (int): Font size step increment (default: 1)
        - stopwords (set, optional): Custom stopwords set (the built-in STOPWORDS set is used when None)
        - random_state (int or Random, optional): Random seed for reproducibility
        - background_color (str): Background color (default: 'black')
        - max_font_size (int, optional): Maximum font size in pixels
        - mode (str): Image mode 'RGB' or 'RGBA' (default: 'RGB')
        - relative_scaling (float or str): Importance of word frequencies for sizing ('auto' or 0.0-1.0, default: 'auto')
        - regexp (str, optional): Regular expression for tokenization
        - collocations (bool): Whether to include bigrams (default: True)
        - normalize_plurals (bool): Whether to normalize plural forms (default: True)
        - repeat (bool): Whether to repeat words until max_words is reached (default: False)
        - include_numbers (bool): Whether to include numbers (default: False)
        - min_word_length (int): Minimum word length to include (default: 0)
        - collocation_threshold (int): Threshold for bigram significance (default: 30)
        """
```
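To make `relative_scaling` more concrete, the sketch below shows one way a value between 0 and 1 can blend rank-only sizing with frequency-proportional sizing. The helper `next_font_size` and its formula are illustrative assumptions rather than a statement of the library's internals, and the `'auto'` setting is not modelled.

```python
def next_font_size(font_size, freq, prev_freq, relative_scaling):
    """Hypothetical helper: size for the next word, given the previous word's size.

    relative_scaling=0 leaves the size untouched (only rank matters);
    relative_scaling=1 scales the size by the frequency ratio.
    """
    if not 0.0 <= relative_scaling <= 1.0:
        raise ValueError("relative_scaling must be between 0 and 1")
    if relative_scaling == 0:
        return font_size
    ratio = freq / float(prev_freq)
    # Linear blend between "keep the size" and "follow the frequency ratio"
    blended = relative_scaling * ratio + (1 - relative_scaling)
    return int(round(blended * font_size))


# A word half as frequent as the previous one, whose size was 100 px
for rs in (0.0, 0.5, 1.0):
    print(rs, next_font_size(100, freq=5, prev_freq=10, relative_scaling=rs))
```

Under this model, higher values make size differences track frequency differences more closely, while lower values flatten them.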

### Text-Based Generation

Generate word clouds directly from text strings with automatic tokenization and frequency calculation.

```python { .api }
def generate(self, text):
    """
    Generate word cloud from text string.

    Parameters:
    - text (str): Input text for word cloud generation

    Returns:
    - WordCloud: Self for method chaining
    """

def generate_from_text(self, text):
    """
    Generate word cloud from text string (alias for generate).

    Parameters:
    - text (str): Input text for word cloud generation

    Returns:
    - WordCloud: Self for method chaining
    """
```

### Frequency-Based Generation

Generate word clouds from pre-calculated word frequency dictionaries for precise control over word importance.

```python { .api }
def generate_from_frequencies(self, frequencies, max_font_size=None):
    """
    Generate word cloud from word frequency dictionary.

    Parameters:
    - frequencies (dict): Dictionary mapping words to frequencies
    - max_font_size (int, optional): Override maximum font size

    Returns:
    - WordCloud: Self for method chaining
    """

def fit_words(self, frequencies):
    """
    Generate word cloud from word frequencies (alias for generate_from_frequencies).

    Parameters:
    - frequencies (dict): Dictionary mapping words to frequencies

    Returns:
    - WordCloud: Self for method chaining
    """
```
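When the input is already tokenized (for example, a list of tags per item), the frequency dictionary can be built directly, which also keeps multi-word phrases as single entries. A minimal sketch: `count_tags`, `cloud_from_tags`, and the sample data are hypothetical, and `wc` is assumed to be an existing `WordCloud` instance.

```python
from collections import Counter


def count_tags(tag_lists):
    """Count how often each tag occurs across several tagged items."""
    counts = Counter()
    for tags in tag_lists:
        # Normalize case and whitespace so "Python " and "python" merge
        counts.update(tag.strip().lower() for tag in tags if tag.strip())
    return dict(counts)


def cloud_from_tags(wc, tag_lists):
    """Feed the counts to an existing WordCloud instance (returns that instance)."""
    return wc.generate_from_frequencies(count_tags(tag_lists))


sample = [["Python", "machine learning"], ["python", "data science"], ["Machine Learning"]]
print(count_tags(sample))
```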

### Text Processing

Extract and process word frequencies from text with customizable tokenization and filtering.

```python { .api }
def process_text(self, text):
    """
    Process text and return word frequencies.

    Parameters:
    - text (str): Input text to process

    Returns:
    - dict: Dictionary mapping words to frequencies
    """
```
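As a rough mental model of this step, the sketch below tokenizes with a regular expression, drops stopwords, and counts what remains. It is a simplified stand-in built on the standard library, not the library's implementation: the pattern, the tiny stopword set, and the name `count_words` are assumptions, and collocations and plural normalization are left out.

```python
import re
from collections import Counter

# Assumed pattern: a word character followed by one or more word characters or apostrophes
TOKEN_PATTERN = re.compile(r"\w[\w']+")

# Tiny illustrative stopword set; a real default set would be much larger
SAMPLE_STOPWORDS = {"is", "for", "and", "the", "a"}


def count_words(text, stopwords=SAMPLE_STOPWORDS, min_word_length=0, include_numbers=False):
    """Return a dict mapping each kept token to its count."""
    counts = Counter()
    for token in TOKEN_PATTERN.findall(text):
        if token.lower() in stopwords:
            continue  # filtered as a stopword
        if len(token) < min_word_length:
            continue  # too short to keep
        if not include_numbers and token.isdigit():
            continue  # purely numeric token
        counts[token] += 1
    return dict(counts)


print(count_words("Python is great for data science and Python is fun"))
```

The resulting dictionary has the same shape as the input expected by `generate_from_frequencies`, so the two steps can be inspected or adjusted separately.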

### Output Generation

Convert generated word clouds to various output formats for display, saving, or further processing.

```python { .api }
def to_image(self):
    """
    Convert word cloud to PIL Image object.

    Returns:
    - PIL.Image: Word cloud as PIL Image
    """

def to_array(self):
    """
    Convert word cloud to numpy array.

    Returns:
    - numpy.ndarray: Word cloud as RGB array
    """

def __array__(self):
    """
    Support numpy array conversion.

    Returns:
    - numpy.ndarray: Word cloud as RGB array
    """

def to_file(self, filename):
    """
    Save word cloud to image file.

    Parameters:
    - filename (str): Output file path (supports PNG, JPEG, etc.)

    Returns:
    - WordCloud: Self for method chaining
    """

def to_svg(self, embed_font=False, optimize_embedded_font=True, embed_image=False):
    """
    Export word cloud as SVG format.

    Parameters:
    - embed_font (bool): Whether to embed font data (default: False)
    - optimize_embedded_font (bool): Whether to optimize embedded font (default: True)
    - embed_image (bool): Whether to embed a rasterized image of the cloud in the SVG (default: False)

    Returns:
    - str: SVG markup string
    """
```
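Because `to_svg` returns markup rather than writing a file, saving the result is left to the caller. A minimal sketch, assuming `wc` is an already generated `WordCloud`; the helper name `save_svg` is hypothetical.

```python
from pathlib import Path


def save_svg(wc, path):
    """Write the SVG markup of a generated word cloud to `path` and return the path."""
    svg_markup = wc.to_svg()  # the SVG document as a string
    target = Path(path)
    target.write_text(svg_markup, encoding="utf-8")
    return target
```

For raster formats, `to_file` already handles writing, so no such helper is needed.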

### Styling and Recoloring

Modify colors and appearance of existing word clouds without regenerating layout.

```python { .api }
def recolor(self, random_state=None, color_func=None, colormap=None):
    """
    Recolor existing word cloud with new color scheme.

    Parameters:
    - random_state (int or Random, optional): Random seed for color generation
    - color_func (callable, optional): Custom color function
    - colormap (str, optional): Matplotlib colormap name

    Returns:
    - WordCloud: Self for method chaining
    """
```
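A custom `color_func` can be passed to `recolor` to restyle an existing layout. The sketch below assumes the callable receives `word`, `font_size`, `position`, `orientation`, and `random_state` as keyword arguments and returns a color string that PIL understands; that calling convention and the helper names are assumptions not spelled out in the signatures above.

```python
from random import Random


def grey_color_func(word=None, font_size=None, position=None,
                    orientation=None, random_state=None, **kwargs):
    """Return a random light grey as an HSL color string."""
    rng = random_state if random_state is not None else Random()
    lightness = rng.randint(60, 100)
    return f"hsl(0, 0%, {lightness}%)"


def recolor_grey(wc, seed=42):
    """Recolor an already generated WordCloud; word positions are left as they are."""
    return wc.recolor(color_func=grey_color_func, random_state=seed)


print(grey_color_func("python", random_state=Random(0)))
```

Passing a fixed `seed` keeps the chosen shades reproducible between runs.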

### Word Cloud Attributes

Properties available after generation containing word and layout information.

```python { .api }
words_: dict[str, float]  # Word frequencies (normalized)
layout_: list[tuple]  # Layout data: (word_info, font_size, position, orientation, color)
```
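The normalized values in `words_` can be pictured as raw counts divided by the largest count, so that the most frequent word maps to 1.0. The sketch below illustrates that idea with the standard library; treating normalization as division by the maximum is an assumption about the library's behavior, and `normalize_frequencies` is a hypothetical helper.

```python
def normalize_frequencies(frequencies, max_words=200):
    """Keep the `max_words` most frequent entries and scale them so the top entry is 1.0."""
    ranked = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    ranked = ranked[:max_words]
    if not ranked:
        return {}
    top = float(ranked[0][1])
    if top <= 0:
        raise ValueError("the largest frequency must be positive")
    return {word: count / top for word, count in ranked}


print(normalize_frequencies({"python": 10, "data": 8, "science": 6, "analysis": 4}))
```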

## Usage Examples

### Basic Text Generation

```python
from wordcloud import WordCloud

# Create word cloud from text
wc = WordCloud(width=800, height=400, background_color='white')
wc.generate("Python is great for data science and machine learning")

# Save result
wc.to_file('wordcloud.png')
```

### Frequency-Based Generation

```python
from wordcloud import WordCloud

# Use custom word frequencies
frequencies = {'python': 10, 'data': 8, 'science': 6, 'analysis': 4}
wc = WordCloud().generate_from_frequencies(frequencies)
image = wc.to_image()
```

### Masked Shape Generation

```python
from wordcloud import WordCloud
import numpy as np
from PIL import Image

# Text to visualize (placeholder; substitute your own)
text = "Python is great for data science and machine learning"

# Load mask image (its shape replaces the width and height settings)
mask_image = np.array(Image.open('mask.png'))

# Generate word cloud in custom shape
wc = WordCloud(mask=mask_image, contour_width=2, contour_color='blue')
wc.generate(text)
wc.to_file('shaped_wordcloud.png')
```