# Utilities and Management

django-taggit ships utility functions for parsing and formatting tags, hooks for replacing that behavior through Django settings, and management commands for keeping tag data consistent.

## Capabilities

### Tag Parsing Functions

Functions for converting between string representations and tag lists.

```python { .api }
def parse_tags(tagstring):
    """
    Parse tag string into list of tag names.

    Supports comma-separated and space-separated formats,
    with quoted strings for multi-word tags.

    Parameters:
    - tagstring (str): String containing tags, or None

    Returns:
    list: Sorted list of unique tag names (empty list if input is None/empty)

    Examples:
    - "python, django" -> ["django", "python"]
    - "python django" -> ["django", "python"]
    - '"web development", python' -> ["python", "web development"]
    - None -> []
    - "" -> []
    """

def edit_string_for_tags(tags):
    """
    Convert tag list to editable string representation.

    Creates a string suitable for form editing that can be
    parsed back to the same tag list.

    Parameters:
    - tags: Iterable of Tag objects (the default formatter reads each tag's name)

    Returns:
    str: Formatted string representation of tags
    """
```
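
The parsing and formatting rules above can be illustrated without Django. The following is a minimal sketch that approximates the documented behavior; `sketch_parse_tags` and `sketch_edit_string` are hypothetical names, the formatter takes plain strings rather than Tag objects, and taggit's own implementation may differ in edge cases.

```python
import re


def sketch_parse_tags(tagstring):
    """Approximate the documented parsing rules; not taggit's own code."""
    if not tagstring:
        return []
    # Quoted segments are kept whole, so they may contain spaces or commas.
    quoted = re.findall(r'"([^"]+)"', tagstring)
    remainder = re.sub(r'"[^"]+"', " ", tagstring)
    # A comma anywhere outside quotes switches the delimiter from
    # whitespace to commas.
    delimiter = "," if "," in remainder else None
    unquoted = remainder.split(delimiter)
    names = {name.strip() for name in quoted + unquoted if name.strip()}
    return sorted(names)


def sketch_edit_string(names):
    """Quote names containing a space or comma, then join them."""
    quoted = [f'"{n}"' if " " in n or "," in n else n for n in names]
    return ", ".join(sorted(quoted))


tags = sketch_parse_tags('"web development", python')
print(tags)                      # ['python', 'web development']
print(sketch_edit_string(tags))  # "web development", python
```

Quoting multi-word names in the formatter is what lets the formatted string parse back to the same list.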

### Internal Parsing Functions

Lower-level functions used internally by the parsing system.

```python { .api }
def _parse_tags(tagstring):
    """
    Internal tag parsing implementation.

    Handles the actual parsing logic with support for
    quoted strings and various delimiters.
    """

def _edit_string_for_tags(tags):
    """
    Internal tag formatting implementation.

    Handles the conversion of tag objects to formatted strings
    with proper quoting for multi-word tags.
    """

def split_strip(string, delimiter=","):
    """
    Split string on delimiter and strip whitespace.

    Parameters:
    - string (str): String to split
    - delimiter (str): Delimiter character (default: comma)

    Returns:
    list: List of non-empty stripped strings
    """
```
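
As an illustration of the `split_strip` contract described above, here is a small stand-alone sketch. `split_strip_sketch` is a hypothetical name used so the example runs without taggit installed; it only mirrors the documented behavior.

```python
def split_strip_sketch(string, delimiter=","):
    """Split on delimiter, strip each piece, and drop empty pieces."""
    if not string:
        return []
    pieces = (piece.strip() for piece in string.split(delimiter))
    return [piece for piece in pieces if piece]


print(split_strip_sketch(" python, django ,, web "))  # ['python', 'django', 'web']
print(split_strip_sketch("a  b", delimiter=" "))      # ['a', 'b']
```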

### Configuration Functions

Functions for customizing tag behavior through Django settings.

```python { .api }
def get_func(key, default):
    """
    Get customizable function from Django settings.

    Allows overriding default tag parsing and formatting
    functions through Django settings.

    Parameters:
    - key (str): Settings key name
    - default (callable): Default function to use

    Returns:
    callable: Function from settings or default
    """
```
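
The settings-or-default lookup described above can be sketched in plain Python. This is an assumption-laden illustration, not taggit's code: `resolve_callable` is a hypothetical helper, a dict stands in for Django's settings object, and `shlex.split` is used only as a convenient importable callable.

```python
from importlib import import_module


def resolve_callable(settings, key, default):
    """Return the callable named by settings[key], or default if unset."""
    dotted_path = settings.get(key)
    if dotted_path is None:
        return default
    module_path, _, attribute = dotted_path.rpartition(".")
    return getattr(import_module(module_path), attribute)


def default_parser(tagstring):
    return sorted(set(tagstring.split()))


# No override configured: the default is returned unchanged.
assert resolve_callable({}, "TAGGIT_TAGS_FROM_STRING", default_parser) is default_parser

# Override configured: the dotted path is imported and returned.
overridden = resolve_callable(
    {"TAGGIT_TAGS_FROM_STRING": "shlex.split"},
    "TAGGIT_TAGS_FROM_STRING",
    default_parser,
)
print(overridden('python "web development"'))  # ['python', 'web development']
```

Storing a dotted path rather than the function itself keeps the settings module free of imports from application code.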

### Utility Decorators

Decorators for tag manager methods.

```python { .api }
def require_instance_manager(func):
    """
    Decorator ensuring method is called on instance manager.

    Prevents calling instance-specific methods on class-level
    tag managers, raising TypeError if misused.

    Parameters:
    - func (callable): Method to decorate

    Returns:
    callable: Decorated method with instance validation
    """
```
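
A decorator with the behavior described above might look like the following sketch. `require_instance` and `DemoManager` are hypothetical stand-ins, and the sketch assumes the manager exposes an `instance` attribute that is `None` when accessed at class level.

```python
from functools import wraps


def require_instance(func):
    """Reject calls made on a manager that is not bound to an instance."""
    @wraps(func)
    def inner(self, *args, **kwargs):
        if self.instance is None:
            raise TypeError(
                f"Can't call {func.__name__} with a non-instance manager"
            )
        return func(self, *args, **kwargs)
    return inner


class DemoManager:
    """Stand-in for a tag manager; `instance` is None at class level."""

    def __init__(self, instance=None):
        self.instance = instance
        self.names = []

    @require_instance
    def add(self, *names):
        self.names.extend(names)
        return self.names


print(DemoManager(instance=object()).add("python"))  # ['python']

try:
    DemoManager().add("python")
except TypeError as exc:
    print(exc)  # Can't call add with a non-instance manager
```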

## Management Commands

### Remove Orphaned Tags

Command to clean up tags that are no longer associated with any objects.

```python { .api }
# Command class
class Command(BaseCommand):
    """
    Remove orphaned tags from the database.

    Finds and deletes tags that have no associated TaggedItem
    relationships, cleaning up unused tags automatically.
    """
    help = "Remove orphaned tags"

    def handle(self, *args, **options):
        """Execute the orphaned tag cleanup process."""
```

Usage:

```bash
# Remove all orphaned tags
python manage.py remove_orphaned_tags

# Example output:
# Successfully removed 15 orphaned tags
```

### Deduplicate Tags

Command to identify and merge duplicate tags based on case-insensitive matching.

```python { .api }
# Command class
class Command(BaseCommand):
    """
    Remove duplicate tags based on case insensitivity.

    Identifies tags with the same name (case-insensitive) and
    merges them into a single tag, preserving all relationships.
    """
    help = "Identify and remove duplicate tags based on case insensitivity"

    def handle(self, *args, **kwargs):
        """Execute the tag deduplication process."""

    def _deduplicate_tags(self, existing_tag, tag_to_remove):
        """
        Merge two duplicate tags.

        Parameters:
        - existing_tag: Tag to keep
        - tag_to_remove: Tag to merge and delete
        """
```

Usage:

```bash
# Deduplicate tags (requires TAGGIT_CASE_INSENSITIVE = True)
python manage.py deduplicate_tags

# Example output:
# Tag deduplication complete.
```
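
To make the case-insensitive merge concrete, here is a database-free sketch of the idea. `plan_deduplication` and `apply_merges` are hypothetical helpers operating on plain strings and tuples; the sketch keeps the first spelling it sees in each group, which is an assumption and may not match the tag the real command keeps.

```python
def plan_deduplication(tag_names):
    """Map each duplicate tag name to the name it would be merged into.

    Names are grouped case-insensitively and the first name seen in
    each group is kept.
    """
    canonical = {}  # lowercased name -> name to keep
    merges = {}     # duplicate name -> name to keep
    for name in tag_names:
        key = name.lower()
        if key in canonical:
            merges[name] = canonical[key]
        else:
            canonical[key] = name
    return merges


def apply_merges(tagged_items, merges):
    """Repoint (object_id, tag_name) pairs and drop resulting duplicates."""
    return sorted({(obj, merges.get(tag, tag)) for obj, tag in tagged_items})


merges = plan_deduplication(["Python", "python", "Django", "PYTHON"])
print(merges)  # {'python': 'Python', 'PYTHON': 'Python'}
print(apply_merges([(1, "python"), (1, "Python"), (2, "PYTHON")], merges))
# [(1, 'Python'), (2, 'Python')]
```

Object 1 ends up with a single "Python" entry: repointing relationships can produce duplicates, so a merge has to collapse them.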

## Custom Tag Parsing

### Implementing Custom Parsers

Creating custom tag parsing functions for specialized formats.

```python
import re

from taggit.utils import _parse_tags


def custom_parse_tags(tagstring):
    """
    Custom tag parser with special rules.

    Example: Parse hashtag-style tags (#python #django)
    """
    if not tagstring:
        return []

    # Extract hashtags
    hashtags = re.findall(r'#(\w+)', tagstring)

    # Also parse regular comma-separated tags
    regular_tags = []
    cleaned = re.sub(r'#\w+', '', tagstring)
    if cleaned.strip():
        regular_tags = _parse_tags(cleaned)

    # Combine and deduplicate
    return sorted(set(hashtags + regular_tags))


def custom_format_tags(tags):
    """
    Custom tag formatter for display.

    Example: Format as hashtags for social media style
    """
    names = [f"#{tag.name}" for tag in tags]
    return " ".join(sorted(names))


# Configure in settings.py
TAGGIT_TAGS_FROM_STRING = 'myapp.utils.custom_parse_tags'
TAGGIT_STRING_FROM_TAGS = 'myapp.utils.custom_format_tags'
```
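
A custom parser and formatter are most useful when the formatted string parses back to the same tags. The sketch below checks that property for a hashtag-only pair. `FakeTag`, `parse_hashtags`, `format_hashtags`, and `round_trips` are hypothetical names, and the namedtuple merely stands in for a Tag object with a `name` attribute.

```python
import re
from collections import namedtuple

# Stand-in for taggit's Tag model; only the `name` attribute is used.
FakeTag = namedtuple("FakeTag", ["name"])


def parse_hashtags(tagstring):
    """Extract unique hashtag names, sorted."""
    return sorted(set(re.findall(r"#(\w+)", tagstring or "")))


def format_hashtags(tags):
    """Render tags as a space-separated hashtag string."""
    return " ".join(sorted(f"#{tag.name}" for tag in tags))


def round_trips(names):
    """True if formatting then parsing returns the same sorted names."""
    formatted = format_hashtags([FakeTag(name) for name in names])
    return parse_hashtags(formatted) == sorted(set(names))


print(round_trips(["django", "python"]))  # True
print(round_trips(["web development"]))   # False: \w+ stops at the space
```

The second call shows a limitation worth planning for: a hashtag format cannot represent multi-word tag names unless the formatter escapes or rejects them.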

### Advanced Parsing Examples

Different parsing strategies for various use cases.

```python
def scientific_tag_parser(tagstring):
    """
    Parser for scientific/academic tags with categories.

    Format: "category:value, category:value"
    Example: "field:biology, method:pcr, organism:ecoli"
    """
    if not tagstring:
        return []

    tags = []
    # Skip empty parts left by stray or trailing commas
    parts = [part.strip() for part in tagstring.split(',') if part.strip()]

    for part in parts:
        if ':' in part:
            # Structured tag
            category, value = part.split(':', 1)
            tags.append(f"{category.strip()}:{value.strip()}")
        else:
            # Regular tag
            tags.append(part)

    return sorted(set(tags))


def hierarchical_tag_parser(tagstring):
    """
    Parser for hierarchical tags with path-like structure.

    Format: "parent/child/grandchild"
    Example: "technology/web/frontend, technology/web/backend"
    """
    if not tagstring:
        return []

    tags = []
    parts = [part.strip() for part in tagstring.split(',')]

    for part in parts:
        # Add all levels of hierarchy, ignoring empty path segments
        levels = [level.strip() for level in part.split('/') if level.strip()]
        for i in range(len(levels)):
            hierarchical_tag = '/'.join(levels[:i + 1])
            tags.append(hierarchical_tag)

    return sorted(set(tags))
```
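
The two strategies above each rest on one small transformation, shown here in isolation. `expand_hierarchy` and `split_category` are hypothetical helper names introduced for this sketch; they restate the logic of the parsers rather than extend it.

```python
def expand_hierarchy(path, separator="/"):
    """Return every ancestor prefix of a path-like tag, shortest first."""
    levels = [level.strip() for level in path.split(separator) if level.strip()]
    return [separator.join(levels[: i + 1]) for i in range(len(levels))]


def split_category(tag):
    """Split "category:value" into a pair; category is None if absent."""
    category, found, value = tag.partition(":")
    if not found:
        return (None, tag.strip())
    return (category.strip(), value.strip())


print(expand_hierarchy("technology/web/frontend"))
# ['technology', 'technology/web', 'technology/web/frontend']
print(split_category("field: biology"))  # ('field', 'biology')
print(split_category("pcr"))             # (None, 'pcr')
```

Storing every ancestor prefix as its own tag means a lookup for `technology/web` also finds items tagged with deeper paths, at the cost of more tag rows per item.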

## Configuration Settings

### Available Settings

Django settings that customize tagging behavior.

```python
# settings.py

# Case-insensitive tag matching and creation
TAGGIT_CASE_INSENSITIVE = True

# Strip unicode characters when creating slugs
TAGGIT_STRIP_UNICODE_WHEN_SLUGIFYING = True

# Custom tag parsing function
TAGGIT_TAGS_FROM_STRING = 'myapp.utils.custom_parse_tags'

# Custom tag formatting function
TAGGIT_STRING_FROM_TAGS = 'myapp.utils.custom_format_tags'
```

### Settings Usage Examples

How different settings affect tag behavior.
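
```python
from taggit.models import Tag
from taggit.utils import parse_tags

# `article` is any model instance with a TaggableManager named `tags`

# With TAGGIT_CASE_INSENSITIVE = True, lookups made through a tag
# manager ignore case (plain Tag.objects queries are unaffected)
article.tags.add("Python")
article.tags.add("python")  # Reuses the existing "Python" tag

# With TAGGIT_STRIP_UNICODE_WHEN_SLUGIFYING = True
tag = Tag.objects.create(name="café")
print(tag.slug)  # "cafe" instead of "café"

# With custom parsing function: parse_tags() dispatches to
# TAGGIT_TAGS_FROM_STRING, and set() receives the parsed list
article.tags.set(parse_tags("#python #django #tutorial"))
```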

## Database Maintenance

### Manual Tag Maintenance

Programmatic approaches to tag maintenance beyond management commands.

```python
from difflib import SequenceMatcher

from django.db.models import Count
from taggit.models import Tag, TaggedItem

# Reverse relation from Tag to the default TaggedItem model. Custom
# through models expose a different name ("<app_label>_<model>_items").
TAG_ITEMS = 'taggit_taggeditem_items'


def cleanup_unused_tags():
    """Remove tags with no associated items."""
    unused_tags = Tag.objects.annotate(
        usage_count=Count(TAG_ITEMS)
    ).filter(usage_count=0)

    count = unused_tags.count()
    unused_tags.delete()
    return count


def _merge_tag(source, target):
    """Move source's tagged items onto target, then delete source."""
    for item in TaggedItem.objects.filter(tag=source):
        duplicate = TaggedItem.objects.filter(
            tag=target,
            content_type=item.content_type,
            object_id=item.object_id,
        ).exists()
        if not duplicate:
            # Objects already tagged with target keep their existing row
            item.tag = target
            item.save()
    source.delete()  # Cascades to any items still pointing at source


def merge_similar_tags(similarity_threshold=0.8):
    """Merge tags with similar names."""
    tags = list(Tag.objects.order_by('pk'))
    merged_count = 0

    for i, tag1 in enumerate(tags):
        if tag1.pk is None:  # Already merged; delete() clears the pk
            continue
        for tag2 in tags[i + 1:]:
            if tag2.pk is None:
                continue
            similarity = SequenceMatcher(
                None, tag1.name.lower(), tag2.name.lower()
            ).ratio()

            if similarity >= similarity_threshold:
                # Merge tag2 into tag1
                _merge_tag(source=tag2, target=tag1)
                merged_count += 1

    return merged_count


def normalize_tag_names():
    """Normalize tag names (lowercase, strip whitespace)."""
    updated_count = 0

    # Materialize the list first because tags may be deleted in the loop
    for tag in list(Tag.objects.all()):
        normalized_name = tag.name.strip().lower()
        if tag.name == normalized_name:
            continue

        # Check if normalized version already exists
        existing = (
            Tag.objects.filter(name=normalized_name).exclude(pk=tag.pk).first()
        )
        if existing:
            # Merge into existing tag
            _merge_tag(source=tag, target=existing)
        else:
            # Update tag name
            tag.name = normalized_name
            tag.save()
        updated_count += 1

    return updated_count
```
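
Before merging on similarity, it helps to preview which pairs a given threshold would catch. The following sketch does that on plain strings; `similar_pairs` is a hypothetical helper, and the sample names are invented for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(names, threshold=0.8):
    """Return (first, second, ratio) for name pairs at or above threshold."""
    pairs = []
    for first, second in combinations(names, 2):
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if ratio >= threshold:
            pairs.append((first, second, round(ratio, 2)))
    return pairs


names = ["django", "djangos", "python2", "python3", "java", "javascript"]
for pair in similar_pairs(names):
    print(pair)
# ('django', 'djangos', 0.92)
# ('python2', 'python3', 0.86)
```

At a threshold of 0.8, "python2" and "python3" would be merged even though they are distinct tags, so reviewing the candidate pairs before running a destructive merge is a sensible precaution.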

### Tag Statistics and Analysis

Functions for analyzing tag usage patterns.

```python
import re
from collections import Counter

from django.db.models import Avg, Count
from taggit.models import Tag

# Reverse relation from Tag to the default TaggedItem model
TAG_ITEMS = 'taggit_taggeditem_items'


def get_tag_statistics():
    """Get comprehensive tag usage statistics."""
    with_usage = Tag.objects.annotate(usage_count=Count(TAG_ITEMS))

    # Average the per-tag annotation
    stats = with_usage.aggregate(avg_usage=Avg('usage_count'))

    # Most used tags
    popular_tags = with_usage.order_by('-usage_count').values(
        'name', 'usage_count'
    )[:10]

    # Unused tags
    unused_count = with_usage.filter(usage_count=0).count()

    return {
        'total_tags': Tag.objects.count(),
        'average_usage': stats['avg_usage'] or 0,
        'popular_tags': list(popular_tags),
        'unused_tags': unused_count
    }


def find_tag_patterns():
    """Analyze tag naming patterns."""
    # Evaluate once; the names are iterated twice below
    tags = list(Tag.objects.values_list('name', flat=True))

    # Find common prefixes
    prefixes = Counter()
    for tag in tags:
        if ':' in tag:
            prefix = tag.split(':')[0]
            prefixes[prefix] += 1

    # Find common word patterns
    words = []
    for tag in tags:
        words.extend(re.findall(r'\w+', tag.lower()))
    word_freq = Counter(words)

    return {
        'common_prefixes': dict(prefixes.most_common(10)),
        'common_words': dict(word_freq.most_common(20))
    }
```

### Automated Maintenance Tasks

Setting up automated tag maintenance with Django management commands.

```python
# management/commands/maintain_tags.py
from django.conf import settings
from django.core.management import call_command
from django.core.management.base import BaseCommand

from myapp.utils import get_tag_statistics


class Command(BaseCommand):
    help = 'Comprehensive tag maintenance'

    def add_arguments(self, parser):
        parser.add_argument(
            '--dry-run',
            action='store_true',
            help='Show what would be done without making changes'
        )

    def handle(self, *args, **options):
        dry_run = options['dry_run']

        if dry_run:
            self.stdout.write('Dry run: no tags will be changed')
        else:
            # Remove orphaned tags
            call_command('remove_orphaned_tags')

            # Deduplicate tags if case insensitive mode is enabled
            if getattr(settings, 'TAGGIT_CASE_INSENSITIVE', False):
                call_command('deduplicate_tags')

        # Show statistics (in a dry run, unused tags are what would be removed)
        stats = get_tag_statistics()

        self.stdout.write(f"Total tags: {stats['total_tags']}")
        self.stdout.write(f"Unused tags: {stats['unused_tags']}")
        self.stdout.write(f"Average usage: {stats['average_usage']:.1f}")


# Usage: python manage.py maintain_tags --dry-run
```
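
The dry-run pattern used by the command can be exercised outside Django. In this sketch, `build_parser` and `remove_orphans` are hypothetical names, `argparse` stands in for Django's argument handling, and a dict of usage counts stands in for the tag table.

```python
import argparse


def build_parser():
    """Build a parser with the same --dry-run flag as the command above."""
    parser = argparse.ArgumentParser(prog="maintain_tags")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Show what would be done without making changes",
    )
    return parser


def remove_orphans(usage_counts, dry_run):
    """Return orphaned names; delete them from usage_counts unless dry_run."""
    orphaned = sorted(name for name, count in usage_counts.items() if count == 0)
    if not dry_run:
        for name in orphaned:
            del usage_counts[name]
    return orphaned


usage = {"python": 3, "legacy": 0, "django": 1, "tmp": 0}
options = build_parser().parse_args(["--dry-run"])
print(remove_orphans(usage, options.dry_run))  # ['legacy', 'tmp']
print(len(usage))                              # 4: nothing was deleted
```

Computing the list of affected tags first and applying the deletion as a separate step lets the same code path serve both the report and the real run.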