
tessl/pypi-pyautogui

Cross-platform GUI automation library that enables programmatic control of mouse, keyboard, and screen interactions.


Screen Capture and Image Recognition

Screenshot capture and image recognition for locating images and UI elements on screen, with exact pixel matching by default and optional confidence and tolerance controls. These tools support automated testing and GUI interaction.

Capabilities

Screen Information

Get basic screen dimensions and mouse position information.

def size():
    """
    Get screen size as (width, height) tuple.
    
    Returns:
    Tuple[int, int]: Screen dimensions in pixels (width, height)
    """

def resolution():
    """Alias for size() - get screen resolution."""

def position():
    """
    Get current mouse position.
    
    Returns:
    Tuple[int, int]: Current mouse coordinates (x, y)
    """

def onScreen(x, y=None):
    """
    Check if coordinates are within screen bounds.
    
    Parameters:
    - x (int or tuple): X coordinate, or (x, y) tuple
    - y (int, optional): Y coordinate if x is not tuple
    
    Returns:
    bool: True if coordinates are on screen, False otherwise
    """
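
The bounds check that onScreen() performs can be paired with a small helper that clamps a target point into the visible area before the mouse is moved. The following is a minimal sketch: clamp_to_screen and move_inside_screen are hypothetical helpers, not part of PyAutoGUI, and the sketch assumes valid coordinates satisfy 0 <= x < width and 0 <= y < height.

```python
def clamp_to_screen(x, y, screen_size):
    """Clamp (x, y) into the rectangle [0, width - 1] x [0, height - 1].

    Hypothetical helper, not a PyAutoGUI function. screen_size is a
    (width, height) tuple such as the value returned by pyautogui.size().
    """
    width, height = screen_size
    clamped_x = min(max(x, 0), width - 1)
    clamped_y = min(max(y, 0), height - 1)
    return clamped_x, clamped_y


def move_inside_screen(x, y):
    """Move the mouse to (x, y) after clamping it (needs a real display)."""
    import pyautogui  # imported lazily so clamp_to_screen works without a display

    target = clamp_to_screen(x, y, pyautogui.size())
    pyautogui.moveTo(*target)
    return target
```

Keeping the arithmetic in a function that does not touch the display makes it easy to unit test separately from the automation itself.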

Screenshot Capture

Capture screenshots of the entire screen or specific regions with optional file saving.

def screenshot(imageFilename=None, region=None):
    """
    Capture screenshot of screen or region.
    
    Parameters:
    - imageFilename (str, optional): Path to save the screenshot to. If None, nothing is written to disk
    - region (tuple, optional): (left, top, width, height) region to capture. If None, captures full screen
    
    Returns:
    PIL.Image: Screenshot image object (returned whether or not imageFilename is provided)
    
    Examples:
    screenshot('fullscreen.png')                    # Save full screen
    screenshot('region.png', (100, 100, 300, 200)) # Save specific region
    img = screenshot()                              # Return PIL Image object
    """
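
The region argument uses (left, top, width, height), whereas Pillow's Image.crop() expects a (left, upper, right, lower) box. Mixing the two conventions is an easy mistake when cropping a full-screen capture after the fact. A minimal sketch of the conversion follows; region_to_crop_box and crop_region are hypothetical helper names, not part of PyAutoGUI or Pillow.

```python
def region_to_crop_box(region):
    """Convert (left, top, width, height) to Pillow's (left, upper, right, lower)."""
    left, top, width, height = region
    return (left, top, left + width, top + height)


def crop_region(image, region):
    """Crop a PIL image, for example a full-screen screenshot, to a region."""
    return image.crop(region_to_crop_box(region))


# Usage on a machine with a display (not executed here):
#   import pyautogui
#   full = pyautogui.screenshot()
#   part = crop_region(full, (100, 100, 300, 200))
```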

Image Location - Single Match

Find single instances of images on screen with configurable matching parameters.

def locateOnScreen(image, **kwargs):
    """
    Find image on screen and return its location.
    
    Parameters:
    - image (str or PIL.Image): Path to template image or PIL Image object
    - region (tuple, optional): (left, top, width, height) search region
    - confidence (float, optional): Match confidence 0.0-1.0 (requires OpenCV)
    - grayscale (bool, optional): Convert to grayscale for faster matching (default: False)
    
    Returns:
    Box: Named tuple with (left, top, width, height), or None if not found and
         exceptions are disabled
    
    Raises:
    ImageNotFoundException: If the image is not found and exceptions are enabled
                            (the default since PyAutoGUI 0.9.41; see useImageNotFoundException())
    """

def locateCenterOnScreen(image, **kwargs):
    """
    Find image on screen and return center coordinates.
    
    Parameters:
    - image (str or PIL.Image): Path to template image or PIL Image object
    - Same parameters as locateOnScreen()
    
    Returns:
    Point: Named tuple with (x, y) center coordinates or None if not found
    """

def locate(needleImage, haystackImage, **kwargs):
    """
    Find needleImage within haystackImage.
    
    Parameters:
    - needleImage (str or PIL.Image): Template image to find
    - haystackImage (str or PIL.Image): Image to search within
    - region (tuple, optional): Search region within haystack
    - confidence (float, optional): Match confidence 0.0-1.0
    - grayscale (bool, optional): Use grayscale matching
    
    Returns:
    Box: Location of needle in haystack or None if not found
    """

Image Location - Multiple Matches

Find all instances of images on screen or within other images.

def locateAllOnScreen(image, **kwargs):
    """
    Find all instances of image on screen.
    
    Parameters:
    - image (str or PIL.Image): Path to template image or PIL Image object
    - region (tuple, optional): (left, top, width, height) search region
    - confidence (float, optional): Match confidence 0.0-1.0
    - grayscale (bool, optional): Use grayscale matching
    
    Returns:
    Generator[Box]: Generator yielding Box objects for each match
    
    Example:
    for match in pyautogui.locateAllOnScreen('button.png'):
        print(f"Found button at {match}")
    """

def locateAll(needleImage, haystackImage, **kwargs):
    """
    Find all instances of needleImage within haystackImage.
    
    Parameters:
    - needleImage (str or PIL.Image): Template image to find
    - haystackImage (str or PIL.Image): Image to search within
    - Same optional parameters as locateAllOnScreen()
    
    Returns:
    Generator[Box]: Generator yielding Box objects for each match
    """
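
When a confidence below 1.0 is used, neighbouring pixel offsets can also pass the threshold, so the generator may yield several nearly identical boxes for one on-screen element. A minimal sketch of filtering those out is shown below. dedupe_matches is a hypothetical helper, and the 10-pixel default is an arbitrary assumption to tune per application.

```python
def dedupe_matches(boxes, min_distance=10):
    """Drop matches that sit almost on top of an earlier match.

    A box is treated as a duplicate when its top-left corner is within
    min_distance pixels of an already kept box on both axes. boxes may be
    any iterable of (left, top, width, height) tuples, including the
    generator returned by locateAllOnScreen().
    """
    kept = []
    for box in boxes:
        left, top = box[0], box[1]
        is_duplicate = any(
            abs(left - other[0]) < min_distance and abs(top - other[1]) < min_distance
            for other in kept
        )
        if not is_duplicate:
            kept.append(box)
    return kept


# Usage on a machine with a display (not executed here):
#   import pyautogui
#   matches = pyautogui.locateAllOnScreen('close_button.png', confidence=0.9)
#   for box in dedupe_matches(matches):
#       pyautogui.click(pyautogui.center(box))
```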

Window-Specific Image Location

Find images within specific application windows (Windows platform only).

def locateOnWindow(image, title, **kwargs):
    """
    Find image within a specific window (Windows only).
    
    Parameters:
    - image (str or PIL.Image): Template image to find
    - title (str): Title text identifying the window to search within
    - Same optional parameters as locateOnScreen()
    
    Returns:
    Box: Location of the match in screen coordinates, or None if not found
    
    Note: Requires PyGetWindow. Windows platform only.
    """

Pixel Analysis

Analyze individual pixels and colors on screen with tolerance matching.

def pixel(x, y):
    """
    Get RGB color of pixel at screen coordinates.
    
    Parameters:
    - x, y (int): Screen coordinates
    
    Returns:
    Tuple[int, int, int]: RGB color values (red, green, blue) 0-255
    """

def pixelMatchesColor(x, y, expectedRGBColor, tolerance=0):
    """
    Check if pixel color matches expected color within tolerance.
    
    Parameters:
    - x, y (int): Screen coordinates
    - expectedRGBColor (tuple): Expected RGB color (red, green, blue)
    - tolerance (int): Color tolerance 0-255 (default: 0 for exact match)
    
    Returns:
    bool: True if pixel matches color within tolerance
    
    Example:
    # Check if pixel is red (within tolerance of 10)
    is_red = pyautogui.pixelMatchesColor(100, 200, (255, 0, 0), tolerance=10)
    """
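
The tolerance comparison can be reproduced on colors that have already been read, which is useful when checking many pixels of one screenshot without re-querying the screen. This is a minimal sketch that assumes the tolerance is applied to each of the red, green, and blue channels independently; color_within_tolerance is a hypothetical helper name.

```python
def color_within_tolerance(actual, expected, tolerance=0):
    """Return True if every RGB channel differs by at most tolerance.

    Only the first three components are compared, so an RGBA value
    can be passed and its alpha channel is ignored.
    """
    return all(
        abs(actual_channel - expected_channel) <= tolerance
        for actual_channel, expected_channel in zip(actual[:3], expected[:3])
    )


# Usage on a machine with a display (not executed here):
#   import pyautogui
#   color = pyautogui.pixel(100, 200)
#   is_red = color_within_tolerance(color, (255, 0, 0), tolerance=10)
```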

Utility Functions

Helper functions for working with image locations and regions.

def center(region):
    """
    Get center point of a region.
    
    Parameters:
    - region (Box or tuple): Region with (left, top, width, height)
    
    Returns:
    Point: Center coordinates (x, y)
    """
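
The center of a region is simple arithmetic on its four fields. A minimal sketch follows, assuming that half the width and height are truncated to whole pixels; box_center is a hypothetical stand-in for illustration, and the Point type is redefined locally so the snippet runs without PyAutoGUI installed.

```python
from collections import namedtuple

# Local stand-in for the Point type described in this document.
Point = namedtuple("Point", ["x", "y"])


def box_center(box):
    """Return the center of a (left, top, width, height) region as a Point."""
    left, top, width, height = box
    return Point(left + width // 2, top + height // 2)
```

For example, a 100x50 region whose top-left corner is at (10, 20) has its center at (60, 45).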

Image Recognition Configuration

Configure behavior of image recognition functions.

def useImageNotFoundException(value=None):
    """
    Configure whether image location functions raise exceptions.
    
    Parameters:
    - value (bool, optional): True to raise exceptions, False to return None.
                             If omitted (None), it is treated as True.
    
    Returns:
    None
    
    When enabled: locateOnScreen() raises ImageNotFoundException if image not found
    When disabled: locateOnScreen() returns None if image not found
    """
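
Because a failed lookup can surface either as None or as an exception depending on this setting, calling code is easier to reason about when both outcomes are normalized in one place. The sketch below does that with a hypothetical wrapper, locate_or_none, which takes the locate function and the exception type as arguments so it does not depend on how the library is configured.

```python
def locate_or_none(locate, image, not_found_errors, **kwargs):
    """Call a locate function and return None if the image is not found.

    locate is any callable such as pyautogui.locateOnScreen.
    not_found_errors is the exception class (or tuple of classes) to treat
    as "not found", for example pyautogui.ImageNotFoundException.
    """
    try:
        return locate(image, **kwargs)
    except not_found_errors:
        return None


# Usage on a machine with a display (not executed here):
#   import pyautogui
#   box = locate_or_none(
#       pyautogui.locateOnScreen,
#       'submit_button.png',
#       pyautogui.ImageNotFoundException,
#       grayscale=True,
#   )
#   if box is not None:
#       pyautogui.click(pyautogui.center(box))
```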

Image Formats and Requirements

Supported Image Formats

  • PNG (recommended for UI elements)
  • JPEG/JPG (for photographs)
  • BMP (Windows bitmap)
  • GIF (static images only)
  • TIFF (high quality images)

Template Matching Tips

  • Use PNG format for crisp UI elements
  • Without the confidence parameter, templates must match the screen pixel for pixel
  • Use the confidence parameter (requires OpenCV) to tolerate slight variations
  • Grayscale matching is faster but can produce false-positive matches
  • Capture template images directly from the target application, at the same resolution and scaling it will run at

Usage Examples

import pyautogui

# Get screen information
width, height = pyautogui.size()
print(f"Screen size: {width}x{height}")

current_pos = pyautogui.position()
print(f"Mouse position: {current_pos}")

# Take screenshots
screenshot = pyautogui.screenshot()  # Full screen PIL Image
pyautogui.screenshot('desktop.png')  # Save full screen
pyautogui.screenshot('region.png', region=(0, 0, 300, 400))  # Save region

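# Find images on screen
try:
    button_location = pyautogui.locateOnScreen('submit_button.png')
except pyautogui.ImageNotFoundException:
    button_location = None  # Raised instead of returning None when exceptions are enabled

if button_location:
    # Click the center of the found button
    center_point = pyautogui.center(button_location)
    pyautogui.click(center_point)
else:
    print("Button not found")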

# Find image with confidence (requires OpenCV)
try:
    location = pyautogui.locateOnScreen('logo.png', confidence=0.8)
    pyautogui.click(location)
except pyautogui.ImageNotFoundException:
    print("Logo not found with 80% confidence")

# Find all instances of an image
for button in pyautogui.locateAllOnScreen('close_button.png'):
    print(f"Close button found at: {button}")
    # Click each close button found
    pyautogui.click(pyautogui.center(button))

# Pixel color analysis
pixel_color = pyautogui.pixel(100, 200)
print(f"Pixel color at (100, 200): RGB{pixel_color}")

# Check if pixel matches expected color
is_white = pyautogui.pixelMatchesColor(100, 200, (255, 255, 255), tolerance=5)
if is_white:
    print("Pixel is approximately white")

# Configure exception behavior
pyautogui.useImageNotFoundException(True)  # Raise exceptions
try:
    location = pyautogui.locateOnScreen('nonexistent.png')
except pyautogui.ImageNotFoundException:
    print("Image not found - exception raised")

# Complex image recognition workflow
def find_and_click_button(button_image, timeout=10):
    """Find and click a button with timeout"""
    import time
    start_time = time.time()
    
    while time.time() - start_time < timeout:
        try:
            button_pos = pyautogui.locateOnScreen(button_image, confidence=0.7)
            if button_pos:
                pyautogui.click(pyautogui.center(button_pos))
                return True
        except pyautogui.ImageNotFoundException:
            pass
        time.sleep(0.5)
    
    return False  # Button not found within timeout

# Use the function
if find_and_click_button('login_button.png'):
    print("Login button clicked successfully")
else:
    print("Login button not found within timeout")

Data Types

from collections import namedtuple
from typing import Tuple, Generator, Union, Optional
import PIL.Image

# Region and position types
Box = namedtuple('Box', ['left', 'top', 'width', 'height'])
Point = namedtuple('Point', ['x', 'y'])

# Color type
Color = Tuple[int, int, int]  # RGB values 0-255

# Region specification (for screenshot and search areas)
Region = Tuple[int, int, int, int]  # (left, top, width, height)

# Image input types
ImageInput = Union[str, PIL.Image.Image]  # File path or PIL Image object

Performance Notes

  • Grayscale matching: Faster (the PyAutoGUI documentation cites roughly a 30% speedup) but can produce false-positive matches
  • Confidence matching: Requires OpenCV-Python (pip install opencv-python)
  • Region limiting: Specify search regions to improve performance
  • Template size: Smaller templates match faster
  • Screen resolution: Higher resolutions increase matching time
  • Multiple matches: locateAllOnScreen() is slower than single match functions
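
Region limiting is often the cheapest of these optimizations to apply: once an element has been found, later searches can be restricted to a padded area around its last known position. The following is a minimal sketch of computing such a region. search_region_around is a hypothetical helper, and the margin is an assumption to tune for how far the element is expected to move.

```python
def search_region_around(box, margin, screen_size):
    """Return a (left, top, width, height) region padded around box.

    The region is box grown by margin pixels on every side and then
    clipped to the screen, so it can be passed as the region argument
    of the locate functions.
    """
    left, top, width, height = box
    screen_width, screen_height = screen_size
    new_left = max(left - margin, 0)
    new_top = max(top - margin, 0)
    new_right = min(left + width + margin, screen_width)
    new_bottom = min(top + height + margin, screen_height)
    return (new_left, new_top, new_right - new_left, new_bottom - new_top)


# Usage on a machine with a display (not executed here):
#   import pyautogui
#   region = search_region_around(last_box, 50, pyautogui.size())
#   box = pyautogui.locateOnScreen('submit_button.png', region=region)
```

If the narrowed search fails, falling back to a full-screen search keeps the workflow robust when the element has moved further than the margin allows.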

Install with Tessl CLI

npx tessl i tessl/pypi-pyautogui
