Scalable Python data science, in an API compatible & lightning fast way.
Core functions for initializing, managing, and shutting down Xorbits runtime environments. These functions handle both local runtime creation and connection to distributed Xorbits clusters.
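Because initialization and shutdown bracket every session, one convenient pattern is to pair them in a context manager so that shutdown still runs when the work in between raises. The sketch below is an illustration under stated assumptions, not part of Xorbits: `runtime_session` is a hypothetical helper, and the `init` and `shutdown` callables are passed in explicitly (in practice they would presumably be `xorbits.init` and `xorbits.shutdown`) so the snippet runs without a runtime installed.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def runtime_session(
    init: Callable[..., Any],
    shutdown: Callable[..., Any],
    **init_kwargs: Any,
) -> Iterator[None]:
    """Hypothetical helper: call init(**init_kwargs), then always call shutdown()."""
    init(**init_kwargs)
    try:
        yield
    finally:
        # Runs on normal exit and also when the body raises.
        shutdown()
```

With Xorbits available, usage might look like `with runtime_session(xorbits.init, xorbits.shutdown, n_worker=2): ...`. Note that if `init` itself raises, `shutdown` is not called, since no runtime was started.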
Initialize Xorbits runtime locally or connect to an existing Xorbits cluster. This is typically the first function called when using Xorbits.
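Whether `init` creates a local runtime or connects to a cluster depends on the `init_local` rules listed in the docstring below. As a rough sketch of that decision only, the following function restates the documented rules in plain Python. It is not the library's implementation: `resolve_init_local`, the `_NO_DEFAULT` sentinel, and the `already_initialized` flag are hypothetical names, and treating "cannot be True" as a raised error is an assumption.

```python
from typing import Optional

# Sentinel standing in for the library's `no_default` marker (hypothetical here).
_NO_DEFAULT = object()


def resolve_init_local(
    address: Optional[str],
    init_local: object = _NO_DEFAULT,
    already_initialized: bool = False,
) -> bool:
    """Return True if a new local runtime should be created, per the documented rules."""
    if already_initialized:
        if init_local is True:
            # Assumption: the "cannot be True" rule is enforced with an error.
            raise ValueError("`init_local` cannot be True once a runtime is initialized")
        return False  # creation is skipped
    if init_local is _NO_DEFAULT:
        # Unspecified: create a local runtime only when no address was given.
        return address is None
    return bool(init_local)
```

Under these assumptions, `resolve_init_local(None)` yields `True` (plain `init()` starts a local runtime), while passing only a cluster URL yields `False`.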
def init(
    address: Optional[str] = None,
    init_local: bool = no_default,
    session_id: Optional[str] = None,
    timeout: Optional[float] = None,
    n_worker: int = 1,
    n_cpu: Union[int, str] = "auto",
    mem_bytes: Union[int, str] = "auto",
    cuda_devices: Union[List[int], List[List[int]], str] = "auto",
    web: Union[bool, str] = "auto",
    new: bool = True,
    storage_config: Optional[Dict] = None,
    **kwargs
) -> None:
    """
    Initialize the Xorbits runtime locally or connect to an Xorbits cluster.
    Parameters:
    - address: str, optional
        - If None (default), the address is "127.0.0.1" and a local runtime is initialized.
        - To create a new local runtime at a specific address, use the form `<ip>:<port>`.
        - To connect to an existing Xorbits cluster, use its address, e.g. `http://<supervisor_ip>:<supervisor_web_port>`.
    - init_local: bool, no default value
        Whether to create a new local runtime.
        - If a runtime has already been initialized, `init_local` cannot be True; creation is skipped.
        - When `address` is None and no runtime has been initialized, `init_local` is set to True.
        - Otherwise, if it is not specified, it is set to False.
    - session_id: str, optional
        Session ID. If not specified, a new ID is generated automatically.
    - timeout: float, optional
        Timeout for creating a new runtime or connecting to an existing cluster.
    - n_worker: int, optional
        Number of workers to start when creating a local runtime (takes effect only when `init_local` is True).
    - n_cpu: int, str
        Number of CPUs. If "auto", the number of available cores is used (takes effect only when `init_local` is True).
    - mem_bytes: int, str
        Memory to use, in bytes. If "auto", the total memory is used (takes effect only when `init_local` is True).
    - cuda_devices: list of int, list of list of int, or str
        - When "auto" (default), all visible GPU devices are used.
        - When `n_worker` is 1, a list of int can be given: the device indexes to use.
        - When `n_worker` > 1, a list of lists can be given, one list per worker.
        (takes effect only when `init_local` is True)
    - web: bool, str
        Whether to create a web UI (takes effect only when `init_local` is True).
    - new: bool
        Whether to create a new session when connecting to an existing cluster (takes effect only when `init_local` is False).
    - storage_config: dict, optional
        Storage backend and its configuration when initializing a new local cluster.
        The `shared_memory` backend is used by default.
        Currently, two options are supported: `shared_memory` and `mmap`.
        (takes effect only when `init_local` is True)
    """
Usage Examples:
# Initialize local runtime with default settings
import xorbits
xorbits.init()
# Initialize local runtime with custom settings
xorbits.init(n_worker=4, n_cpu=8, mem_bytes=8 * 1024**3)  # mem_bytes is given in bytes (8 GiB here)
# Connect to existing cluster
xorbits.init(address="http://192.168.1.100:7103")
# Initialize with GPU support
xorbits.init(cuda_devices=[0, 1])
# Initialize with custom storage backend
xorbits.init(storage_config={"mmap": {"root_dirs": "/tmp/xorbits"}})
Shut down the current local runtime and clean up resources.
def shutdown(**kw) -> None:
    """
    Shut down the current local runtime.
    Parameters:
    - **kw: Additional keyword arguments passed to the shutdown process
    """
Usage Examples:
# Basic shutdown
xorbits.shutdown()
# When connected to a cluster, shutdown releases the session's resources
# but does not affect the cluster itself
Manually trigger execution of DataRef objects. Xorbits uses lazy evaluation by default, so computations are not executed until explicitly triggered.
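To make the idea of deferred execution concrete, here is a toy model in plain Python. It is only a conceptual sketch and says nothing about how Xorbits represents DataRef objects internally: `Deferred` and `run_all` are hypothetical names invented for this illustration.

```python
from typing import Any, Callable


class Deferred:
    """Toy stand-in for a lazy reference: records a computation, runs it on demand."""

    def __init__(self, compute: Callable[[], Any]) -> None:
        self._compute = compute
        self._done = False
        self._value: Any = None

    @property
    def executed(self) -> bool:
        return self._done

    def execute(self) -> Any:
        # Compute at most once, then reuse the cached value.
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value


def run_all(objs):
    """Hypothetical explicit trigger: execute one Deferred or a collection of them."""
    if isinstance(objs, Deferred):
        return objs.execute()
    return [o.execute() for o in objs]


total = Deferred(lambda: sum([1, 2, 3]))
print(total.executed)  # False: nothing has been computed yet
print(run_all(total))  # 6
print(total.executed)  # True
```

Building the object is cheap and does no work; the cost is paid only when the trigger is called, which is the behavior the explicit `run` function below exposes.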
def run(obj, **kwargs):
    """
    Manually trigger execution of DataRef objects.
    Parameters:
    - obj: DataRef or collection of DataRefs to execute
    - **kwargs: Additional execution parameters
    Returns:
    - Computed results as concrete objects (pandas DataFrame/Series, numpy arrays, etc.)
    """
Usage Examples:
import xorbits
import xorbits.pandas as pd
import xorbits.numpy as np
xorbits.init()
# Create lazy computations
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
result = df.sum()
# Execute computation
computed_result = xorbits.run(result)
print(computed_result) # Actual pandas Series
# Execute multiple objects
arr = np.array([1, 2, 3])
df_mean = df.mean()
# run() takes a single `obj` argument, so pass multiple DataRefs as one collection
computed_arr, computed_mean = xorbits.run([arr, df_mean])
xorbits.shutdown()
Install with Tessl CLI
npx tessl i tessl/pypi-xorbits