tessl/pypi-filetype

Small, dependency-free Python package that infers file type and MIME type by checking the magic number signature of a file or buffer.

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/filetype@1.2.x

To install, run

npx @tessl/cli install tessl/pypi-filetype@1.2.0

filetype

A small, dependency-free Python package that infers file type and MIME type by checking the magic number signature of a file or buffer. It detects 80+ file formats across 7 categories and requires no external dependencies.

Package Information

  • Package Name: filetype
  • Language: Python
  • Installation: pip install filetype

Core Imports

import filetype

Common usage patterns:

from filetype import guess, guess_mime, guess_extension

For category-specific operations:

from filetype import is_image, is_video, is_audio

Basic Usage

import filetype

# Basic file type detection from file path
kind = filetype.guess('/path/to/file.jpg')
if kind is None:
    print('Cannot guess file type!')
else:
    print('File extension: %s' % kind.extension)
    print('File MIME type: %s' % kind.mime)

# Detection from bytes buffer
with open('/path/to/file.jpg', 'rb') as f:
    header = f.read(261)  # Read first 261 bytes
    kind = filetype.guess(header)
    print('File type:', kind.extension if kind else 'Unknown')

# Direct MIME type detection
mime = filetype.guess_mime('/path/to/file.pdf')
print('MIME type:', mime)  # 'application/pdf'

# Direct extension detection
ext = filetype.guess_extension('/path/to/file.png')
print('Extension:', ext)  # 'png'

Architecture

The filetype package is built around a modular type matcher system:

  • Type System: Base Type class with specialized implementations for each file format
  • Magic Number Detection: Analyzes the first 8192 bytes of the input to identify format signatures
  • Category Organization: File types grouped into logical categories (image, video, audio, etc.)
  • Extensible Design: Custom type matchers can be added via the add_type() function

The package supports multiple input types (file paths, bytes, bytearray, file-like objects) and provides both general detection and category-specific matching functions.
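
The matcher design described above can be illustrated with a standard-library-only sketch. This is a simplified illustration, not the package's actual implementation: the names SketchType, SketchPng, SketchPdf, SKETCH_TYPES and sketch_guess are hypothetical, and only two signatures are modelled.

```python
# Simplified, hypothetical sketch of a magic-number matcher registry.
# None of these names belong to the filetype package.


class SketchType:
    """Base matcher: stores a MIME type and extension; subclasses implement match()."""

    def __init__(self, mime, extension):
        self.mime = mime
        self.extension = extension

    def match(self, buf):
        raise NotImplementedError


class SketchPng(SketchType):
    SIGNATURE = b'\x89PNG'

    def __init__(self):
        super().__init__('image/png', 'png')

    def match(self, buf):
        return buf[:4] == self.SIGNATURE


class SketchPdf(SketchType):
    SIGNATURE = b'%PDF'

    def __init__(self):
        super().__init__('application/pdf', 'pdf')

    def match(self, buf):
        return buf[:4] == self.SIGNATURE


# Registry of matcher instances, checked in order.
SKETCH_TYPES = [SketchPng(), SketchPdf()]


def sketch_guess(data, limit=8192):
    """Return the first matcher that accepts the leading bytes, or None."""
    buf = bytes(data[:limit])
    for kind in SKETCH_TYPES:
        if kind.match(buf):
            return kind
    return None


png = sketch_guess(b'\x89PNG\r\n\x1a\n' + b'\x00' * 16)
print(png.extension, png.mime)        # png image/png
print(sketch_guess(b'plain text'))    # None
```

Because only the leading bytes are examined, the cost of a guess in this sketch does not depend on the size of the file.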

Constants and Variables

The package exposes several module-level constants and variables for advanced usage:

types: List[Type]
"""List of all supported type matcher instances. Allows iteration over all file types."""

__version__: str
"""Package version string (e.g., '1.2.0')."""

version: str  
"""Package version string (alias for __version__)."""

Usage Examples:

import filetype

# Get package version
print(f'filetype version: {filetype.__version__}')

# Iterate over all supported types
for file_type in filetype.types:
    print(f'{file_type.extension} -> {file_type.mime}')

# Filter types by category
image_types = [t for t in filetype.types if t.mime.startswith('image/')]
print(f'Supported image formats: {len(image_types)}')

Capabilities

Core Detection Functions

Primary file type detection functionality that analyzes magic number signatures to identify file types and return structured type information.

def guess(obj):
    """
    Infers the type of the given input.

    Args:
        obj: path to file, bytes or bytearray.

    Returns:
        The matched type instance. Otherwise None.

    Raises:
        TypeError: if obj is not a supported type.
    """

def guess_mime(obj):
    """
    Infers the file type and returns its MIME type.

    Args:
        obj: path to file, bytes or bytearray.

    Returns:
        The matched MIME type as string. Otherwise None.

    Raises:
        TypeError: if obj is not a supported type.
    """

def guess_extension(obj):
    """
    Infers the file type and returns its file extension.

    Args:
        obj: path to file, bytes or bytearray.

    Returns:
        The matched file extension as string. Otherwise None.

    Raises:
        TypeError: if obj is not a supported type.
    """

See core-detection.md for the full reference.

Type Management

Functions for managing and querying the type matcher system, including support for custom type matchers and type lookup by MIME type or extension.

def get_type(mime=None, ext=None):
    """
    Returns the file type instance searching by MIME type or file extension.

    Args:
        mime: MIME string. E.g: image/jpeg, video/mpeg
        ext: file extension string. E.g: jpg, png, mp4, mp3

    Returns:
        The matched file type instance. Otherwise None.
    """

def add_type(instance):
    """
    Adds a new type matcher instance to the supported types.

    Args:
        instance: Type inherited instance.

    Returns:
        None

    Raises:
        TypeError: if instance doesn't inherit from filetype.types.Type
    """

See type-management.md for the full reference.

Category Detection Functions

High-level functions for checking if a file belongs to specific categories like images, videos, or documents, providing quick boolean results for common use cases. Also includes utility functions for checking extension and MIME type support.

def is_extension_supported(ext):
    """
    Checks if the given extension string is supported by the file matchers.

    Args:
        ext (str): file extension string. E.g: jpg, png, mp4, mp3

    Returns:
        True if the file extension is supported. Otherwise False.
    """

def is_mime_supported(mime):
    """
    Checks if the given MIME type string is supported by the file matchers.

    Args:
        mime (str): MIME string. E.g: image/jpeg, video/mpeg

    Returns:
        True if the MIME type is supported. Otherwise False.
    """

def is_image(obj):
    """
    Checks if the given input is a supported image type.

    Args:
        obj: path to file, bytes or bytearray.

    Returns:
        True if obj is a valid image. Otherwise False.

    Raises:
        TypeError: if obj is not a supported type.
    """

def is_video(obj):
    """
    Checks if the given input is a supported video type.

    Args:
        obj: path to file, bytes or bytearray.

    Returns:
        True if obj is a valid video. Otherwise False.

    Raises:
        TypeError: if obj is not a supported type.
    """

def is_audio(obj):
    """
    Checks if the given input is a supported audio type.

    Args:
        obj: path to file, bytes or bytearray.

    Returns:
        True if obj is a valid audio file. Otherwise False.

    Raises:
        TypeError: if obj is not a supported type.
    """

See category-detection.md for the full reference.

Category-Specific Matching

Advanced matching functions that search within specific file type categories, providing more targeted detection when you know the expected file category.

def image_match(obj):
    """
    Matches the given input against the available image type matchers.

    Args:
        obj: path to file, bytes or bytearray.

    Returns:
        Type instance if matches. Otherwise None.

    Raises:
        TypeError: if obj is not a supported type.
    """

def video_match(obj):
    """
    Matches the given input against the available video type matchers.

    Args:
        obj: path to file, bytes or bytearray.

    Returns:
        Type instance if matches. Otherwise None.

    Raises:
        TypeError: if obj is not a supported type.
    """

def audio_match(obj):
    """
    Matches the given input against the available audio type matchers.

    Args:
        obj: path to file, bytes or bytearray.

    Returns:
        Type instance if matches. Otherwise None.

    Raises:
        TypeError: if obj is not a supported type.
    """

See category-matching.md for the full reference.

Supported File Types

The package supports 80+ file formats across 7 categories:

  • Images (17 types): JPG, PNG, GIF, WebP, TIFF, BMP, PSD, ICO, HEIC, AVIF, CR2, JXR, DCM, DWG, XCF, JPX, APNG
  • Videos (10 types): MP4, MKV, AVI, MOV, WebM, FLV, MPEG, WMV, M4V, 3GP
  • Audio (9 types): MP3, WAV, OGG, FLAC, AAC, MIDI, M4A, AMR, AIFF
  • Fonts (4 types): WOFF, WOFF2, TTF, OTF
  • Documents (9 types): DOC, DOCX, XLS, XLSX, PPT, PPTX, ODT, ODS, ODP
  • Archives (36 types): ZIP, TAR, RAR, 7Z, GZ, BZ2, XZ, DEB, RPM, CAB, EXE, SWF, RTF, PS, PDF, SQLITE, AR, Z, LZOP, LZ, ELF, LZ4, ZSTD, BR, DCM, EPUB, NES, CRX, EOT, and others
  • Applications (1 type): WASM

CLI Usage

The package includes a command-line interface for file type detection:

# Installing the package provides the 'filetype' command
pip install filetype

# Check file types
filetype -f file1.jpg file2.png file3.pdf

# Show version
filetype --version