
tessl/pypi-wordcloud

A little word cloud generator for creating visually appealing word clouds from text data.


Command Line Interface

The wordcloud_cli command generates word clouds directly from text files. It supports input/output redirection through stdin and stdout and exposes WordCloud customization options as command line flags.

Capabilities

Main CLI Entry Point

The primary function: it builds a word cloud from already-parsed arguments and writes the result to the output file.

def main(args, text, imagefile):
    """
    Generate word cloud from command line arguments.

    Creates WordCloud instance with provided arguments, generates word cloud from text,
    and saves result to specified image file.

    Parameters:
    - args (dict): Parsed command line arguments as WordCloud parameters
    - text (str): Input text for word cloud generation  
    - imagefile (file-like): Output file object for PNG image

    Returns:
    - None: Saves word cloud directly to imagefile
    """

Argument Parsing

Functions that build the argument parser and convert the parsed options into WordCloud keyword arguments, with validation and type conversion.

def make_parser():
    """
    Create argument parser for wordcloud CLI.

    Builds ArgumentParser with all available WordCloud parameters as command line options,
    including input/output handling, styling options, and text processing controls.

    Returns:
    - argparse.ArgumentParser: Configured argument parser
    """

def parse_args(arguments):
    """
    Parse and validate command line arguments.

    Processes command line arguments, validates options, handles file I/O,
    and converts arguments to WordCloud-compatible format.

    Parameters:
    - arguments (list): Command line argument list (typically sys.argv[1:])

    Returns:
    - tuple: (args_dict, text_string, output_file) ready for main() function

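    Raises:
    - ValueError: For incompatible argument combinations (e.g., --color with --colormask)
    - ArgumentTypeError: For input or output files that cannot be opened
    - ArgumentError: For invalid regular expressions (see RegExpAction)
      (argparse normally reports the last two as a usage error and exits with status 2)
    """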
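
As a rough illustration of the return shape and the conflict check described above, the following sketch parses a tiny subset of options. `parse_demo_args` is a hypothetical helper, not the package's `parse_args`: it returns the output path as a string rather than an open file object, and it only knows four flags.

```python
import argparse
import sys


def parse_demo_args(arguments):
    """Return (args_dict, text, output_path) for a small subset of flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--text", default="-")
    parser.add_argument("--imagefile", default="-")
    parser.add_argument("--color")
    parser.add_argument("--colormask")
    args = vars(parser.parse_args(arguments))

    # Incompatible combination: both options decide how words are colored.
    if args["color"] and args["colormask"]:
        raise ValueError("specify either --color or --colormask, not both")

    # Read the input text; "-" means standard input.
    text_path = args.pop("text")
    if text_path == "-":
        text = sys.stdin.read()
    else:
        with open(text_path, encoding="utf-8") as handle:
            text = handle.read()

    # Whatever remains in args can be passed on as keyword arguments.
    output_path = args.pop("imagefile")
    return args, text, output_path
```

Splitting the input and output off the dictionary leaves only options that could be forwarded as keyword arguments, which is the idea behind the "ready for main()" tuple.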

Module Entry Point

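Entry point for python -m wordcloud. This zero-argument main is distinct from the three-argument main(args, text, imagefile) above, which it refers to as wordcloud_cli_main.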

def main():
    """
    Entry point for 'python -m wordcloud' command.

    Parses sys.argv arguments and calls wordcloud_cli_main with results.
    This function is registered as the script entry point in setup.py.
    """
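
The wiring described above amounts to parsing the argument list and forwarding the resulting 3-tuple positionally. The sketch below shows that pattern with stand-in functions; `parse_args_stub`, `main_stub`, and `entry_point` are hypothetical names used only for illustration, and the stubs do no real parsing or rendering.

```python
import sys


def parse_args_stub(arguments):
    """Stand-in for parse_args: returns (args_dict, text, output)."""
    return {"width": 400}, " ".join(arguments), []


def main_stub(args, text, output):
    """Stand-in for main(args, text, imagefile): records what it received."""
    output.append((args, text))


def entry_point(argv=None):
    """Parse the argument list, then forward the 3-tuple positionally."""
    arguments = sys.argv[1:] if argv is None else argv
    parsed = parse_args_stub(arguments)
    main_stub(*parsed)
    return parsed[2]
```

Accepting an optional `argv` is a design choice of this sketch; it makes the entry point callable from tests without touching `sys.argv`.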

Utility Classes

Helper classes for argument parsing and validation.

class FileType:
    def __init__(self, mode='r', bufsize=-1):
        """
        Factory for creating file object types with Unicode support.
        
        Parameters:
        - mode (str): File open mode (default: 'r')
        - bufsize (int): Buffer size (default: -1)
        """
        
    def __call__(self, string):
        """
        Convert string to file object, handling stdin/stdout redirection.
        
        Parameters:
        - string (str): File path or '-' for stdin/stdout
        
        Returns:
        - file-like: Opened file object (UTF-8 encoded in text modes)
        """

class RegExpAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        """
        Validate regular expression arguments.
        
        Parameters:
        - parser (ArgumentParser): The argument parser
        - namespace (Namespace): Parsed arguments namespace
        - values (str): Regular expression string to validate
        - option_string (str): Option that triggered this action
        
        Raises:
        - ArgumentError: If regular expression is invalid
        """
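
One way to validate a pattern at parse time is a custom `argparse.Action` that tries to compile it. The sketch below assumes that approach; `ValidateRegexp` and `regexp_parser` are hypothetical names, and the error message wording is made up for the example.

```python
import argparse
import re


class ValidateRegexp(argparse.Action):
    """Hypothetical action that rejects patterns re.compile cannot parse."""

    def __call__(self, parser, namespace, values, option_string=None):
        try:
            re.compile(values)
        except re.error as exc:
            # argparse turns ArgumentError into a usage message and exit status 2.
            raise argparse.ArgumentError(self, "invalid regular expression: %s" % exc)
        # Store the original string; compiling was only a validity check.
        setattr(namespace, self.dest, values)


regexp_parser = argparse.ArgumentParser()
regexp_parser.add_argument("--regexp", action=ValidateRegexp, default=None)
```

With the default parser settings, an invalid pattern therefore surfaces to the caller as `SystemExit` rather than as the `ArgumentError` itself.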

Command Line Options

The CLI exposes the following WordCloud parameters as command line flags:

Input/Output Options

  • --text FILE: Input text file (default: stdin)
  • --imagefile FILE: Output PNG file (default: stdout)
  • --stopwords FILE: Custom stopwords file (one word per line)
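
The stdin and stdout defaults above depend on treating "-" as a stand-in for the standard streams. A minimal sketch of such an argparse type factory follows; `Utf8FileType` is a hypothetical class written for this example, modeled on the FileType description earlier in this document rather than copied from the package.

```python
import io
import sys


class Utf8FileType:
    """Hypothetical argparse-style type that opens text files as UTF-8."""

    def __init__(self, mode="r", bufsize=-1):
        self._mode = mode
        self._bufsize = bufsize

    def __call__(self, string):
        # "-" selects the standard streams instead of a named file.
        if string == "-":
            if "r" in self._mode:
                return sys.stdin
            # Binary output has to go to the underlying byte stream.
            return sys.stdout.buffer if "b" in self._mode else sys.stdout
        # Only text modes take an encoding; binary modes must not be given one.
        encoding = None if "b" in self._mode else "utf-8"
        return io.open(string, self._mode, self._bufsize, encoding=encoding)
```

An instance can be passed as `type=Utf8FileType()` for text input, or `type=Utf8FileType("wb")` for binary output such as a PNG.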

Appearance Options

  • --width WIDTH: Canvas width in pixels (default: 400)
  • --height HEIGHT: Canvas height in pixels (default: 200)
  • --background COLOR: Background color (default: black)
  • --colormap COLORMAP: Matplotlib colormap name (default: viridis)
  • --color COLOR: Single color for all words
  • --fontfile PATH: Custom font file path
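
To make the defaults above concrete, here is a sketch of how these appearance flags could be declared with argparse. `make_demo_parser` is a hypothetical name, and mapping `--background` to a `background_color` key is an assumption about how the flag lines up with a WordCloud keyword argument.

```python
import argparse


def make_demo_parser():
    """Build a small parser covering the appearance flags and their defaults."""
    parser = argparse.ArgumentParser(description="Demo subset of the wordcloud CLI")
    # Integer options with the documented defaults.
    parser.add_argument("--width", type=int, default=400, metavar="WIDTH")
    parser.add_argument("--height", type=int, default=200, metavar="HEIGHT")
    # Assumed mapping: the flag name differs from the keyword it feeds.
    parser.add_argument("--background", dest="background_color",
                        default="black", metavar="COLOR")
    parser.add_argument("--colormap", default="viridis", metavar="COLORMAP")
    # No default: a single color is only used when explicitly requested.
    parser.add_argument("--color", default=None, metavar="COLOR")
    return parser


demo_args = vars(make_demo_parser().parse_args(["--width", "1200", "--background", "white"]))
```

Options that are not given on the command line keep their defaults, so `demo_args` holds a value for every declared flag.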

Layout Options

  • --mask FILE: Image file to use as shape mask
  • --contour_width WIDTH: Mask contour width (default: 0)
  • --contour_color COLOR: Mask contour color (default: black)
  • --prefer_horizontal RATIO: Horizontal vs vertical placement ratio (default: 0.9)
  • --scale SCALE: Scaling factor (default: 1)
  • --margin WIDTH: Spacing around words (default: 2)

Text Processing Options

  • --regexp PATTERN: Custom tokenization regular expression
  • --no_collocations: Disable bigram detection
  • --include_numbers: Include numbers in word cloud
  • --min_word_length LENGTH: Minimum word length (default: 0)
  • --no_normalize_plurals: Disable plural normalization
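
The interaction between a tokenization pattern, number filtering, and the minimum word length can be sketched as follows. `tokenize` is a hypothetical helper, its default pattern is chosen for the example, and the inclusive length threshold is an assumption; none of this is the library's own implementation.

```python
import re


def tokenize(text, regexp=r"\w+", min_word_length=0, include_numbers=False):
    """Split text with a regular expression, then apply the two filters."""
    words = re.findall(regexp, text)
    if not include_numbers:
        # Drop tokens made up only of digits.
        words = [word for word in words if not word.isdigit()]
    # Assumed to be inclusive: a word of exactly min_word_length is kept.
    return [word for word in words if len(word) >= min_word_length]


short_filtered = tokenize("In 2024 we saw big data", min_word_length=3)
with_numbers = tokenize("In 2024 we saw big data", min_word_length=3, include_numbers=True)
letters_only = tokenize("alpha beta gamma", regexp=r"[a-zA-Z]{5,}")
```

Note that a pattern matching only letters never produces numeric tokens, so number filtering has nothing to act on in that case.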

Font and Sizing Options

  • --max_words N: Maximum number of words (default: 200)
  • --min_font_size SIZE: Minimum font size (default: 4)
  • --max_font_size SIZE: Maximum font size
  • --font_step STEP: Font size increment (default: 1)
  • --relative_scaling RATIO: Word frequency scaling (default: 0)

Rendering Options

  • --mode MODE: Image mode RGB or RGBA (default: RGB)
  • --repeat: Repeat words until max_words reached
  • --random_state SEED: Random seed for reproducibility
  • --colormask FILE: Reference image for color extraction
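
The reproducibility that a fixed seed provides can be illustrated with a seeded random number generator. The sketch below is not the library's placement algorithm; `sample_layout` is a hypothetical function that only shows why the same seed yields the same pseudo-random choices.

```python
import random


def sample_layout(words, seed=None, width=400, height=200):
    """Assign each word a pseudo-random (x, y) position on the canvas."""
    # A private generator keeps the result independent of global random state.
    rng = random.Random(seed)
    return {word: (rng.randrange(width), rng.randrange(height)) for word in words}


first = sample_layout(["data", "cloud", "word"], seed=42)
second = sample_layout(["data", "cloud", "word"], seed=42)
```

Here `first` and `second` are equal because both runs start from the same seed; omitting the seed gives a different layout on each run.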

Usage Examples

Basic Usage

# Generate word cloud from text file
wordcloud_cli --text input.txt --imagefile output.png

# Read from stdin, write to stdout
cat document.txt | wordcloud_cli > wordcloud.png

# Using python -m syntax
python -m wordcloud --text input.txt --imagefile output.png

Customization Examples

# Custom size and colors
wordcloud_cli --text input.txt --width 1200 --height 800 \
              --background white --colormap plasma --imagefile large.png

# Using mask for custom shape
wordcloud_cli --text input.txt --mask shape.png \
              --contour_width 2 --contour_color blue --imagefile shaped.png

# Single color variation
wordcloud_cli --text input.txt --color darkblue --imagefile blue.png

# Custom font and text processing
wordcloud_cli --text input.txt --fontfile /path/to/font.ttf \
              --stopwords custom_stops.txt --min_word_length 3 \
              --imagefile custom.png
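
A stopwords file such as custom_stops.txt holds one word per line. A minimal sketch of turning such a file into a set is shown below; `load_stopwords` is a hypothetical helper, and skipping blank lines is an assumption that this document does not confirm.

```python
import io


def load_stopwords(lines):
    """Return the set of non-empty, whitespace-stripped lines."""
    return {line.strip() for line in lines if line.strip()}


# io.StringIO stands in for an opened stopwords file.
stopwords = load_stopwords(io.StringIO("said\nwould\n\ncould \n"))
```

A set makes membership checks cheap when each token is later compared against the stopword list.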

Advanced Options

# Image-based coloring
wordcloud_cli --text input.txt --colormask reference.jpg \
              --imagefile colored.png

# Fine-tuned layout
wordcloud_cli --text input.txt --max_words 500 --relative_scaling 0.8 \
              --prefer_horizontal 0.7 --scale 2 --imagefile detailed.png

# Custom tokenization: only words of four or more ASCII letters, no bigrams
wordcloud_cli --text input.txt --regexp "[a-zA-Z]{4,}" \
              --no_collocations --imagefile tokens.png

# Reproducible generation
wordcloud_cli --text input.txt --random_state 42 --imagefile consistent.png

Pipeline Usage

# Process PDF documents
pdftotext document.pdf - | wordcloud_cli --imagefile doc_cloud.png

# Filter and process
grep -E "important|critical|urgent" log.txt | \
wordcloud_cli --colormap Reds --imagefile alerts.png

# Multiple files
cat *.txt | wordcloud_cli --max_words 1000 --imagefile combined.png

Configuration Files

# Using custom stopwords
printf 'said\nwould\ncould\nmight\n' > mystops.txt
wordcloud_cli --text input.txt --stopwords mystops.txt --imagefile filtered.png

# Batch processing with consistent settings
for file in *.txt; do
    wordcloud_cli --text "$file" --width 800 --height 600 \
                  --colormap viridis --imagefile "${file%.txt}.png"
done
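
The same batch pattern can be driven from Python by building one argument list per input file. In this sketch `build_command` is a hypothetical helper; it only assembles the command using flags listed in this document and does not run anything.

```python
from pathlib import Path


def build_command(text_path, **options):
    """Return an argv list for wordcloud_cli; the output name mirrors the input."""
    text_path = Path(text_path)
    command = ["wordcloud_cli", "--text", str(text_path),
               "--imagefile", str(text_path.with_suffix(".png"))]
    # Each keyword becomes a "--name value" pair, in the order given.
    for name, value in options.items():
        command += ["--" + name, str(value)]
    return command


command = build_command("notes.txt", width=800, height=600, colormap="viridis")
```

On a system where wordcloud is installed, the resulting list could be handed to `subprocess.run(command, check=True)` for each file matched by `Path(".").glob("*.txt")`.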

Error Handling

The CLI reports errors in the following categories:

  • File I/O errors: Invalid input/output paths, permission issues
  • Image format errors: Unsupported mask or colormask formats
  • Regular expression errors: Invalid tokenization patterns
  • Color specification errors: Invalid color names or codes
  • Font loading errors: Missing or invalid font files
  • Argument conflicts: Incompatible option combinations (e.g., --color with --colormask)

Help and Version Information

# Display help message
wordcloud_cli --help
python -m wordcloud --help

# Show version
wordcloud_cli --version
python -m wordcloud --version
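
For scripts that wrap the CLI, it can help to know how argparse-based tools typically exit for these flags. The sketch below assumes a parser that registers `--version` through argparse's built-in version action; `version_parser`, `exit_code_for`, and the "0.0.0" string are placeholders, not the package's real values.

```python
import argparse

version_parser = argparse.ArgumentParser(prog="wordcloud_cli")
# "0.0.0" is a placeholder, not the package's real version string.
version_parser.add_argument("--version", action="version", version="%(prog)s 0.0.0")


def exit_code_for(arguments):
    """Return the exit status argparse uses for the arguments, or None."""
    try:
        version_parser.parse_args(arguments)
    except SystemExit as exc:
        return exc.code
    return None
```

Under these assumptions, `--help` and `--version` exit with status 0, while an unrecognized flag exits with status 2.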
