
tessl/pypi-etils

Collection of common Python utilities for machine learning and scientific computing workflows


Etils

Etils (eclectic utils) is a collection of Python utilities for machine learning and scientific computing workflows. The package is organized as independent, self-contained submodules that can be imported individually, so a project only pulls in the dependencies of the submodules it uses.

Package Information

  • Package Name: etils
  • Language: Python
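  • Installation: pip install etils for the base package, or pip install "etils[array_types,epath,epy]" to also install the dependencies of selected submodules (the quotes stop shells such as zsh from expanding the brackets)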

Core Imports

Etils follows a modular import pattern where each submodule is imported individually:

from etils import epath  # Path utils
from etils import etree  # Tree utils
from etils import enp    # NumPy utils
from etils import ecolab # Colab utils
from etils import array_types # Type annotations
from etils import edc    # Dataclass utils
from etils import epy    # Python utils
from etils import eapp   # Absl app utils
from etils import etqdm  # TQDM utils
from etils import exm    # XManager utils
from etils import lazy_imports # Lazy import utils

Basic Usage

# Path operations with cloud storage support
from etils import epath
path = epath.Path('gs://my-bucket/data.txt')
path.write_text('Hello, world!')
content = path.read_text()

# Tree operations compatible with ML frameworks
from etils import etree
data = {'a': [1, 2], 'b': {'c': 3}}
mapped = etree.py.map(lambda x: x * 2, data)
# Result: {'a': [2, 4], 'b': {'c': 6}}

# Enhanced NumPy utilities
from etils import enp
import numpy as np
arrays = [np.array([1, 2, 3]), np.array([4, 5, 6])]
normalized = enp.check_and_normalize_arrays(arrays)

# Colab-specific utilities
from etils import ecolab
ecolab.auto_display(['item1', 'item2'])  # Enhanced display in Colab

# Type annotations for ML arrays
from etils import array_types
def process_data(data: array_types.FloatArray) -> array_types.IntArray:
    return data.astype(int)

Architecture

Etils is designed around independent submodules with minimal cross-dependencies:

  • Modular Design: Each submodule can be imported separately
  • Optional Dependencies: Dependencies loaded only when needed
  • Cloud Integration: Native support for gs://, s3:// through epath
  • ML Framework Compatibility: Works with TensorFlow, JAX, PyTorch, NumPy
  • Development Environment: Enhanced support for Jupyter/Colab workflows

Capabilities

Path Operations (epath)

Pathlib-compatible API that extends standard file operations to cloud storage systems including Google Cloud Storage (gs://), AWS S3 (s3://), and other remote filesystems.

class Path:
    def __init__(self, path: str | PathLike) -> None: ...
    def read_text(self, encoding: str = 'utf-8') -> str: ...
    def write_text(self, data: str, encoding: str = 'utf-8') -> int: ...
    def exists(self) -> bool: ...
    def mkdir(self, parents: bool = False, exist_ok: bool = False) -> None: ...
    def glob(self, pattern: str) -> Iterator[Path]: ...

def register_path_cls(cls: type[Path]) -> None: ...
def resource_path(package: str, resource: str) -> Path: ...
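Because the API mirrors pathlib, the calls listed above can be tried against a local temporary directory before pointing the same code at a bucket. The following is a minimal sketch under assumptions, not taken from the etils documentation: it falls back to the standard pathlib.Path when etils (or its epath extra) is not installed, and the directory layout and file name are made up for illustration.

```python
import tempfile

try:
    from etils import epath  # Requires etils with the `epath` extra.
    Path = epath.Path
except ImportError:
    # Fallback so the sketch runs without etils; the method names are the same.
    from pathlib import Path


def write_and_list(root: str) -> list:
    """Creates a nested directory, writes one file, returns matching file names."""
    out_dir = Path(root) / 'run' / 'logs'  # Hypothetical layout.
    out_dir.mkdir(parents=True, exist_ok=True)
    log_file = out_dir / 'train.txt'
    log_file.write_text('step=1 loss=0.5')
    assert log_file.exists()
    assert log_file.read_text() == 'step=1 loss=0.5'
    return sorted(p.name for p in out_dir.glob('*.txt'))


with tempfile.TemporaryDirectory() as tmp:
    names = write_and_list(tmp)
print(names)
```

Keeping the root as a plain string argument means the same function could later be called with a gs:// or s3:// prefix, provided the matching backend is installed.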

See path-operations.md for details.

Tree Manipulation (etree)

Universal tree manipulation utilities compatible with TensorFlow nest (tf.nest), JAX tree utilities (jax.tree_util), DeepMind tree (dm-tree), and pure Python data structures.

# Core API objects
jax: TreeAPI  # JAX tree operations
nest: TreeAPI  # TensorFlow nest operations
tree: TreeAPI  # DeepMind tree operations
py: TreeAPI    # Pure Python operations

# Core functions (via py API)
def map(fn: Callable, tree: Tree) -> Tree: ...
def parallel_map(fn: Callable, tree: Tree) -> Tree: ...
def unzip(tree: Tree) -> Tree: ...
def stack(tree: Tree) -> Tree: ...
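To make the idea of a tree map concrete, here is a pure-Python sketch of what mapping a function over a nested structure means. It is an illustration only, not the etree implementation: the name tree_map is hypothetical, and it handles plain dict, list, and tuple containers, treating everything else as a leaf.

```python
from typing import Any, Callable


def tree_map(fn: Callable[[Any], Any], tree: Any) -> Any:
    """Applies `fn` to every leaf while preserving the container structure."""
    if isinstance(tree, dict):
        return {key: tree_map(fn, value) for key, value in tree.items()}
    if isinstance(tree, list):
        return [tree_map(fn, value) for value in tree]
    if isinstance(tree, tuple):
        return tuple(tree_map(fn, value) for value in tree)
    return fn(tree)  # Anything that is not a container is a leaf.


data = {'a': [1, 2], 'b': {'c': 3}}
doubled = tree_map(lambda x: x * 2, data)
print(doubled)
```

The input is left unmodified and a new structure with the same shape is returned, which is the behavior the Basic Usage example relies on.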

See tree-manipulation.md for details.

NumPy Utilities (enp)

Enhanced NumPy utilities providing array specifications, compatibility layers, mathematical operations, and geometry utilities for scientific computing.

class ArraySpec:
    def __init__(self, shape: tuple, dtype: np.dtype) -> None: ...
    
def check_and_normalize_arrays(*arrays) -> list[np.ndarray]: ...
def is_array_str(arr: np.ndarray) -> bool: ...
def flatten(arr: np.ndarray, pattern: str) -> np.ndarray: ...
def angle_between(v1: np.ndarray, v2: np.ndarray) -> float: ...
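As a reference for the geometry involved, the angle between two vectors follows from the dot product: cos(theta) = (v1 · v2) / (|v1| |v2|). The sketch below computes it with the standard library only. It is not the enp implementation, and it assumes a result in radians and a single pair of vectors rather than batches.

```python
import math
from typing import Sequence


def angle_between(v1: Sequence[float], v2: Sequence[float]) -> float:
    """Returns the angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError('The angle is undefined for a zero-length vector.')
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.acos(cos_theta)


right_angle = angle_between([1.0, 0.0], [0.0, 1.0])
same_direction = angle_between([2.0, 0.0], [5.0, 0.0])
print(right_angle, same_direction)
```

The clamp matters for nearly parallel vectors, where floating-point error can produce a cosine slightly above 1.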

See numpy-utilities.md for details.

Google Colab Integration (ecolab)

Utilities specifically designed for Google Colab environments including enhanced display functions, code inspection, HTML rendering, and Python-JavaScript communication.

def auto_display(obj: Any) -> None: ...
def collapse(content: str, title: str = 'Details') -> None: ...
def inspect(obj: Any) -> None: ...
def highlight_html(code: str, language: str = 'python') -> str: ...

See colab-integration.md for details.

Array Type Annotations (array_types)

Type annotations for NumPy, JAX, TensorFlow, and PyTorch arrays, including precision-specific aliases such as f32 and i32, for documenting array dtypes in ML code.

# Core array types
ArrayLike = Union[np.ndarray, list, tuple]
Array = np.ndarray
FloatArray = np.ndarray  # Float arrays
IntArray = np.ndarray    # Integer arrays
BoolArray = np.ndarray   # Boolean arrays

# Precision-specific types
f32 = np.ndarray  # 32-bit float
f64 = np.ndarray  # 64-bit float
i32 = np.ndarray  # 32-bit int
ui32 = np.ndarray # 32-bit uint

See array-types.md for details.

Dataclass Enhancements (edc)

Enhanced dataclass functionality with automatic type casting, context management, and improved representation for robust data structures.

class AutoCast:
    def __init__(self, cast_fn: Callable) -> None: ...

def dataclass(cls: type) -> type: ...
def field(**kwargs) -> Any: ...
def repr(obj: Any) -> str: ...
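The idea behind automatic type casting can be shown with the standard dataclasses module alone. This sketch does not use edc and is not how edc implements it: the Config class and its fields are hypothetical, and the cast simply calls each field's annotated type on the value in __post_init__.

```python
import dataclasses
import pathlib


@dataclasses.dataclass
class Config:
    """Hypothetical config whose fields are cast to their annotated types."""
    steps: int
    workdir: pathlib.Path

    def __post_init__(self) -> None:
        # Cast each value through its annotated type when it does not match.
        for field in dataclasses.fields(self):
            value = getattr(self, field.name)
            if not isinstance(value, field.type):
                setattr(self, field.name, field.type(value))


cfg = Config(steps='100', workdir='/tmp/run')
print(cfg)
```

This only works when the annotations are real callables such as int or pathlib.Path; string annotations (for example under from __future__ import annotations) would need to be resolved first.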

See dataclass-enhancements.md for details.

Python Utilities (epy)

Collection of general-purpose Python utilities including environment detection, iteration helpers, text processing, error handling, and language feature enhancements.

def is_notebook() -> bool: ...
def is_test() -> bool: ...
def groupby(iterable: Iterable, key: Callable) -> dict: ...
def zip_dict(*dicts: dict) -> dict: ...
def lazy_imports(**modules) -> Any: ...
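For the grouping helper, a dict-returning groupby can be sketched in a few lines of standard Python. This is an illustration that matches the signature listed above, not the epy source, and the sample words are made up.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, TypeVar

T = TypeVar('T')


def groupby(iterable: Iterable[T], key: Callable[[T], Hashable]) -> dict:
    """Groups items into a dict of lists, keyed by `key(item)`."""
    groups = defaultdict(list)
    for item in iterable:
        groups[key(item)].append(item)
    return dict(groups)


words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
by_letter = groupby(words, key=lambda word: word[0])
print(by_letter)
```

Unlike itertools.groupby, this version does not require the input to be sorted by key, because each item is appended to its group regardless of position.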

See python-utilities.md for details.

Application Framework (eapp)

Absl flags and application utilities for building command-line applications with dataclass-based flag parsing and enhanced logging.

def make_flags_parser(dataclass_cls: type) -> Callable: ...
def better_logging(level: str = 'INFO') -> None: ...
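Dataclass-based flag parsing means that the fields of a dataclass define the command-line flags. The sketch below shows the concept with argparse from the standard library. It is not eapp, which builds on Absl, and the TrainArgs class, its fields, and the parse_into helper are all hypothetical.

```python
import argparse
import dataclasses


@dataclasses.dataclass
class TrainArgs:
    """Hypothetical flags for a training script."""
    learning_rate: float = 1e-3
    epochs: int = 10
    name: str = 'baseline'


def parse_into(cls, argv):
    """Builds one `--flag` per dataclass field and returns a populated instance."""
    parser = argparse.ArgumentParser()
    for field in dataclasses.fields(cls):
        # The annotated type doubles as the argparse converter.
        parser.add_argument(f'--{field.name}', type=field.type, default=field.default)
    return cls(**vars(parser.parse_args(argv)))


args = parse_into(TrainArgs, ['--learning_rate', '0.01', '--epochs', '3'])
print(args)
```

The benefit of this pattern is that flag names, types, and defaults live in one typed object that the rest of the program can pass around.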

See application-framework.md for details.

Progress Bars (etqdm)

Enhanced TQDM progress bars with smart defaults and improved integration for iterative operations.

def tqdm(iterable: Optional[Iterable] = None, **kwargs) -> Any: ...

XManager Integration (exm)

Google XManager experiment management utilities for distributed machine learning workflows and experiment tracking.

def current_experiment() -> Any: ...
def current_work_unit() -> Any: ...
def is_running_under_xmanager() -> bool: ...
def add_experiment_artifact(name: str, path: str) -> None: ...
def add_work_unit_artifact(name: str, path: str) -> None: ...
def curr_job_name() -> str: ...
def url_to_python_only_logs() -> str: ...
def set_citc_source(source: str) -> None: ...

Lazy Import Management (lazy_imports)

Utilities for managing lazy imports and module loading to optimize startup time and memory usage.

def print_current_imports() -> None: ...
def __dir__() -> list[str]: ...
LAZY_MODULES: dict[str, Any]
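The general technique of lazy importing defers the real import until a module attribute is first used. Here is a minimal sketch of that technique with the standard library. It is not the etils lazy_imports implementation: the LazyModule class is hypothetical, and colorsys is used only because it is a small stdlib module.

```python
import importlib


class LazyModule:
    """Proxy that imports the named module on first attribute access."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._module = None

    def __getattr__(self, attr: str):
        # Only reached for attributes that are not set on the proxy itself.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


lazy_colorsys = LazyModule('colorsys')
loaded_before_use = lazy_colorsys._module is not None  # Nothing imported yet.
hsv = lazy_colorsys.rgb_to_hsv(1.0, 0.0, 0.0)  # First use triggers the import.
loaded_after_use = lazy_colorsys._module is not None
print(loaded_before_use, hsv, loaded_after_use)
```

With this pattern the import cost is paid only by code paths that actually touch the module, which is where the startup-time saving comes from.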

Version Information

__version__: str  # Current version: "1.13.0"
