tessl/pypi-bson

Independent BSON codec for Python that doesn't depend on MongoDB

ObjectId Operations

A MongoDB-compatible ObjectId implementation that provides unique 12-byte identifiers. Each ObjectId combines a timestamp, machine identifier, process ID, and counter, which makes collisions across distributed systems very unlikely.

Capabilities

ObjectId Creation

Generate a new ObjectId, or build one from an existing ObjectId, a 12-byte value, or a 24-character hex string.

class ObjectId:
    def __init__(self, oid=None):
        """
        Initialize a new ObjectId.

        Parameters:
        - oid: None (generate new), ObjectId instance (copy), 
               bytes (12-byte binary), or str (24-char hex)

        Raises:
        InvalidId: If oid is a str or bytes value that is not a valid ObjectId
        TypeError: If oid is not an accepted type
        """

Usage examples:

from bson import ObjectId

# Generate new ObjectId
oid1 = ObjectId()
print(str(oid1))  # e.g., '507f1f77bcf86cd799439011'

# From hex string
oid2 = ObjectId('507f1f77bcf86cd799439011')

# From bytes
binary_data = b'foo-bar-quux'  # any 12 bytes
oid3 = ObjectId(binary_data)

# Copy existing ObjectId
oid4 = ObjectId(oid1)

ObjectId Validation

Check if strings or objects represent valid ObjectIds before attempting to create them.

@classmethod
def is_valid(cls, oid):
    """
    Check if oid string/object is valid ObjectId format.

    Parameters:
    - oid: Value to validate (str, bytes, ObjectId, or other)

    Returns:
    bool: True if valid ObjectId format, False otherwise
    """

Usage example:

from bson import ObjectId

# Valid cases
print(ObjectId.is_valid('507f1f77bcf86cd799439011'))  # True
print(ObjectId.is_valid(ObjectId()))  # True
print(ObjectId.is_valid(b'123456789012'))  # True

# Invalid cases  
print(ObjectId.is_valid('invalid'))  # False
print(ObjectId.is_valid('507f1f77bcf86cd79943901'))  # False (23 chars)
print(ObjectId.is_valid(None))  # False
print(ObjectId.is_valid(123))  # False
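
The shape rules described above (12 bytes, or a 24-character hex string) can be sketched without the library. This is only an illustration under those stated rules, not the package's implementation; `looks_like_object_id` is a hypothetical name, and unlike `ObjectId.is_valid` it does not accept ObjectId instances.

```python
import string

_HEX_DIGITS = frozenset(string.hexdigits)


def looks_like_object_id(value):
    """Hypothetical helper: True if value has the shape of an ObjectId."""
    if isinstance(value, bytes):
        # Raw form: exactly 12 bytes.
        return len(value) == 12
    if isinstance(value, str):
        # Text form: exactly 24 hexadecimal characters.
        return len(value) == 24 and all(ch in _HEX_DIGITS for ch in value)
    # Anything else (None, numbers, lists, ...) is rejected.
    return False


print(looks_like_object_id('507f1f77bcf86cd799439011'))  # True
print(looks_like_object_id('507f1f77bcf86cd79943901'))   # False (23 chars)
```

A check like this can be handy for pre-filtering user input, but `ObjectId.is_valid` remains the authoritative test.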

DateTime-based ObjectId Generation

Create ObjectIds with specific generation timestamps for range queries and time-based operations.

@classmethod
def from_datetime(cls, generation_time):
    """
    Create ObjectId with specific generation time for queries.

    Warning: Not safe for insertion - eliminates uniqueness guarantee.
    Use only for range queries.

    Parameters:
    - generation_time: datetime object (naive values are treated as UTC; timezone-aware values are converted to UTC)

    Returns:
    ObjectId: ObjectId with specified timestamp, other fields zeroed
    """

Usage example:

from bson import ObjectId
from datetime import datetime

# Create ObjectId for range query
query_time = datetime(2010, 1, 1)
dummy_id = ObjectId.from_datetime(query_time)

# Use in MongoDB-style range query
# collection.find({"_id": {"$lt": dummy_id}})

# Compare with current ObjectIds
current_id = ObjectId()
print(dummy_id < current_id)  # True (older timestamp)
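
To make "other fields zeroed" concrete, here is a rough sketch of the 12-byte value such an ID would carry. It assumes the timestamp is stored as a big-endian unsigned 32-bit integer followed by eight zero bytes; `boundary_id_bytes` is a hypothetical helper, not part of the package.

```python
import calendar
import struct
from datetime import datetime


def boundary_id_bytes(generation_time):
    """Hypothetical helper: 4-byte timestamp followed by 8 zero bytes."""
    offset = generation_time.utcoffset()
    if offset is not None:
        # Timezone-aware input: shift the wall-clock fields to UTC.
        generation_time = generation_time - offset
    # Interpret the (now UTC) wall-clock fields as seconds since the epoch.
    seconds = calendar.timegm(generation_time.timetuple())
    return struct.pack(">I", seconds) + b"\x00" * 8


raw = boundary_id_bytes(datetime(2010, 1, 1))
print(raw.hex())  # 4b3d3b000000000000000000
```

Because everything after the timestamp is zero, the value sorts before any ID generated in the same second, which is why it works as a range boundary and why it should not be inserted as a document ID.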

ObjectId Properties

Access ObjectId components and metadata including binary representation and generation timestamp.

@property
def binary(self):
    """
    12-byte binary representation of ObjectId.
    
    Returns:
    bytes: Raw 12-byte ObjectId data
    """

@property  
def generation_time(self):
    """
    Generation time as timezone-aware datetime.
    
    Returns:
    datetime: UTC datetime of ObjectId creation (precise to second)
    """

Usage example:

from bson import ObjectId
import datetime

oid = ObjectId()

# Get binary representation
binary_data = oid.binary
print(len(binary_data))  # 12
print(type(binary_data))  # <class 'bytes'>

# Get generation time
gen_time = oid.generation_time
print(type(gen_time))  # <class 'datetime.datetime'>
print(gen_time.tzinfo)  # UTC tzinfo (a bson.tz_util.FixedOffset instance)
print(gen_time.utcoffset())  # 0:00:00

# Time comparison
now = datetime.datetime.now(datetime.timezone.utc)
print(gen_time <= now)  # True
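
The generation time can also be recovered from a hex string alone, which is useful when the library is not at hand. The sketch below assumes the first four bytes hold a big-endian count of seconds since the Unix epoch; `generation_time_from_hex` is a hypothetical helper.

```python
from datetime import datetime, timezone


def generation_time_from_hex(hex_id):
    """Hypothetical helper: decode the leading 4-byte timestamp of a hex ObjectId."""
    seconds = int(hex_id[:8], 16)  # first 8 hex chars = 4 timestamp bytes
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(generation_time_from_hex("507f1f77bcf86cd799439011"))
# 2012-10-17 21:13:27+00:00
```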

ObjectId Comparison and Hashing

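ObjectIds support the standard comparison operators and hashing, so they can be sorted, used as dictionary keys, and stored in sets. Because the timestamp occupies the leading bytes, ordering is chronological to within one second.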

def __eq__(self, other): ...
def __ne__(self, other): ...  
def __lt__(self, other): ...
def __le__(self, other): ...
def __gt__(self, other): ...
def __ge__(self, other): ...
def __hash__(self): ...

Usage example:

from bson import ObjectId
import time

# Create ObjectIds at different times
oid1 = ObjectId()
time.sleep(0.1)
oid2 = ObjectId()

# Chronological comparison
print(oid1 < oid2)  # Normally True (earlier timestamp, or lower counter within the same second)
print(oid1 == oid1)  # True
print(oid1 != oid2)  # True

# Use as dictionary keys
data = {oid1: "first", oid2: "second"}
print(data[oid1])  # "first"

# Use in sets
oid_set = {oid1, oid2, oid1}  # Duplicates removed
print(len(oid_set))  # 2
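
The same ordering can be reproduced on plain hex strings. This sketch assumes ObjectIds compare by their raw 12-byte value, so sorting on the decoded bytes puts older IDs first; the sample IDs are made up for illustration.

```python
# Hex IDs in arbitrary order; the first 8 hex characters are the timestamp.
hex_ids = [
    "507f1f77bcf86cd799439012",
    "4b3d3b000000000000000000",
    "507f1f77bcf86cd799439011",
]

# Sort by the raw 12-byte value, which is how the IDs are assumed to compare.
ordered = sorted(hex_ids, key=bytes.fromhex)
print(ordered[0])  # 4b3d3b000000000000000000
```

Decoding with `bytes.fromhex` rather than sorting the strings directly keeps the result correct even if some IDs use uppercase hex digits.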

String Representation

Convert ObjectIds to various string formats for display and serialization.

def __str__(self): 
    """24-character hex string representation"""

def __repr__(self):
    """ObjectId('hex_string') representation"""

Usage example:

from bson import ObjectId

oid = ObjectId()

# String formats
hex_string = str(oid)  # '507f1f77bcf86cd799439011'
repr_string = repr(oid)  # "ObjectId('507f1f77bcf86cd799439011')"

print(f"ObjectId: {oid}")
print(f"Repr: {oid!r}")

# Roundtrip conversion
oid_copy = ObjectId(str(oid))
print(oid == oid_copy)  # True

Error Handling

ObjectId Exceptions

class InvalidId(ValueError):
    """Raised when creating ObjectId from invalid data"""

Common causes of InvalidId:

  • Hex strings not exactly 24 characters
  • Bytes not exactly 12 bytes
  • Non-hex characters in string

Invalid types (numbers, lists, etc.) raise TypeError instead.

Usage example:

from bson import ObjectId
from bson.objectid import InvalidId

try:
    # Invalid hex string (23 chars)
    bad_oid = ObjectId('507f1f77bcf86cd79943901')
except InvalidId as e:
    print(f"Invalid ObjectId: {e}")

try:
    # Invalid type
    bad_oid = ObjectId(12345)
except TypeError as e:
    print(f"Type error: {e}")

ObjectId Structure

ObjectIds contain the following components in 12 bytes:

  1. Timestamp (4 bytes): Seconds since Unix epoch
  2. Machine ID (3 bytes): Hash of hostname
  3. Process ID (2 bytes): Process ID modulo 65535
  4. Counter (3 bytes): Incrementing counter starting from random value

This structure makes collisions across distributed systems very unlikely and gives ObjectIds a natural chronological ordering.
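
The layout above can be illustrated by splitting a 12-byte value into its four fields. This is a minimal sketch that assumes big-endian encoding for the numeric fields; `split_object_id` and `ObjectIdParts` are hypothetical names, not part of the package.

```python
import struct
from collections import namedtuple

ObjectIdParts = namedtuple("ObjectIdParts", "timestamp machine pid counter")


def split_object_id(raw):
    """Hypothetical helper: split 12 raw bytes into the four documented fields."""
    if len(raw) != 12:
        raise ValueError("expected exactly 12 bytes")
    (timestamp,) = struct.unpack(">I", raw[0:4])   # 4 bytes: seconds since epoch
    machine = raw[4:7]                             # 3 bytes: machine identifier
    (pid,) = struct.unpack(">H", raw[7:9])         # 2 bytes: process ID
    counter = int.from_bytes(raw[9:12], "big")     # 3 bytes: counter
    return ObjectIdParts(timestamp, machine, pid, counter)


parts = split_object_id(bytes.fromhex("507f1f77bcf86cd799439011"))
print(parts.timestamp)  # 1350508407
print(parts.pid)        # 55193
print(parts.counter)    # 4427793
```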

Install with Tessl CLI

npx tessl i tessl/pypi-bson
