CtrlK
BlogDocsLog inGet started
Tessl Logo

tessl/pypi-cx-oracle

Python interface to Oracle Database implementing DB API 2.0 with Oracle-specific extensions

Pending
Overview
Eval results
Files

lobs.mddocs/

Large Objects (LOBs)

Handling of CLOB, BLOB, NCLOB, and BFILE objects with streaming read/write operations, temporary LOB creation, and file-based LOB operations for efficient processing of large data.

Capabilities

LOB Types and Creation

Oracle supports several Large Object types for different data requirements.

# LOB type constants
DB_TYPE_CLOB: DbType      # Character LOB (up to 4GB)
DB_TYPE_BLOB: DbType      # Binary LOB (up to 4GB)  
DB_TYPE_NCLOB: DbType     # National Character LOB (up to 4GB)
DB_TYPE_BFILE: DbType     # Binary file LOB (external file reference)

Create temporary LOBs for data processing:

class Connection:
    def createlob(self, lobtype) -> LOB:
        """
        Create temporary LOB object.
        
        Parameters:
        - lobtype: LOB type (DB_TYPE_CLOB, DB_TYPE_BLOB, DB_TYPE_NCLOB)
        
        Returns:
        Temporary LOB object
        """

Usage examples:

# Create temporary LOBs
clob = connection.createlob(cx_Oracle.DB_TYPE_CLOB)
blob = connection.createlob(cx_Oracle.DB_TYPE_BLOB)
nclob = connection.createlob(cx_Oracle.DB_TYPE_NCLOB)

# Use in SQL operations
cursor.execute("INSERT INTO documents (id, content) VALUES (:1, :2)", 
               (1, clob))

LOB Reading Operations

Read data from LOB objects with various methods and options.

class LOB:
    def read(self, offset=1, amount=None) -> Union[str, bytes]:
        """
        Read data from LOB starting at specified offset.
        
        Parameters:
        - offset (int): Starting position (1-based)
        - amount (int): Number of characters/bytes to read (None for all)
        
        Returns:
        str for CLOB/NCLOB, bytes for BLOB/BFILE
        """
        
    def getvalue(self) -> Union[str, bytes]:
        """
        Read entire LOB value as string or bytes.
        
        Returns:
        Complete LOB content (str for CLOB/NCLOB, bytes for BLOB/BFILE)
        """
        
    @property
    def size(self) -> int:
        """
        Get LOB size in characters (CLOB/NCLOB) or bytes (BLOB/BFILE).
        
        Returns:
        Size of LOB data
        """
        
    @property 
    def chunksize(self) -> int:
        """
        Get optimal chunk size for reading/writing LOB data.
        
        Returns:
        Recommended chunk size in bytes
        """

Usage examples:

# Fetch LOB from database
cursor.execute("SELECT document_content FROM documents WHERE id = :1", (1,))
lob = cursor.fetchone()[0]

# Read entire LOB
full_content = lob.getvalue()
print(f"Document size: {len(full_content)} characters")

# Read LOB in chunks
chunk_size = lob.chunksize
offset = 1
while offset <= lob.size:
    chunk = lob.read(offset, chunk_size)
    if not chunk:
        break
    # Process chunk...
    print(f"Read {len(chunk)} characters from offset {offset}")
    offset += len(chunk)

# Read specific portion
first_1000_chars = lob.read(1, 1000)
middle_section = lob.read(5000, 2000)

LOB Writing Operations

Write data to LOB objects with streaming support for large data.

class LOB:
    def write(self, data: Union[str, bytes], offset=1) -> int:
        """
        Write data to LOB at specified offset.
        
        Parameters:
        - data: Data to write (str for CLOB/NCLOB, bytes for BLOB)
        - offset (int): Starting position (1-based)
        
        Returns:
        Number of characters/bytes written
        """
        
    def trim(self, newSize=0) -> None:
        """
        Trim LOB to specified size.
        
        Parameters:
        - newSize (int): New size in characters/bytes (0 to empty)
        """

Usage examples:

# Create and populate temporary CLOB
clob = connection.createlob(cx_Oracle.DB_TYPE_CLOB)
clob.write("Initial content at the beginning")

# Append more data
current_size = clob.size
clob.write(" Additional content appended", current_size + 1)

# Write large data in chunks
large_text = "Very large document content..." * 10000
chunk_size = 32768  # 32KB chunks

offset = 1
for i in range(0, len(large_text), chunk_size):
    chunk = large_text[i:i + chunk_size]
    bytes_written = clob.write(chunk, offset)
    offset += bytes_written

# Insert into database
cursor.execute("INSERT INTO documents (id, content) VALUES (:1, :2)", 
               (2, clob))

# Update existing LOB in database
cursor.execute("SELECT content FROM documents WHERE id = :1 FOR UPDATE", (1,))
existing_lob = cursor.fetchone()[0]
existing_lob.write("Updated content", 1)
connection.commit()

LOB File Operations

Manage LOB lifecycle and external file references.

class LOB:
    def open(self) -> None:
        """Open LOB for reading or writing"""
        
    def close(self) -> None:
        """Close LOB and free resources"""
        
    def isopen(self) -> bool:
        """Check if LOB is currently open"""

BFILE-specific operations for external file references:

class LOB:
    def fileexists(self) -> bool:
        """
        Check if BFILE external file exists (BFILE only).
        
        Returns:
        True if file exists, False otherwise
        """
        
    def getfilename(self) -> tuple:
        """
        Get BFILE directory alias and filename (BFILE only).
        
        Returns:
        Tuple of (directory_alias, filename)
        """
        
    def setfilename(self, dir_alias: str, filename: str) -> None:
        """
        Set BFILE directory alias and filename (BFILE only).
        
        Parameters:
        - dir_alias (str): Oracle directory alias
        - filename (str): File name within directory
        """

Usage examples:

# Work with BFILEs (external files)
cursor.execute("SELECT file_content FROM file_references WHERE id = :1", (1,))
bfile = cursor.fetchone()[0]

# Check if external file exists
if bfile.fileexists():
    dir_alias, filename = bfile.getfilename()
    print(f"File: {filename} in directory {dir_alias}")
    
    # Read external file content
    bfile.open()
    try:
        content = bfile.read()
        print(f"File size: {len(content)} bytes")
    finally:
        bfile.close()
else:
    print("External file not found")

# Create new BFILE reference
new_bfile = connection.createlob(cx_Oracle.DB_TYPE_BFILE)
new_bfile.setfilename("DATA_DIR", "document.pdf")
cursor.execute("INSERT INTO file_references (id, file_content) VALUES (:1, :2)",
               (2, new_bfile))

LOB Context Management

LOBs support Python context managers for automatic resource cleanup:

# Automatic LOB cleanup
cursor.execute("SELECT content FROM documents WHERE id = :1", (1,))
with cursor.fetchone()[0] as lob:
    content = lob.read()
    # LOB automatically closed when exiting context

# Temporary LOB with context management
with connection.createlob(cx_Oracle.DB_TYPE_CLOB) as temp_lob:
    temp_lob.write("Temporary content")
    cursor.execute("INSERT INTO temp_docs VALUES (:1, :2)", (1, temp_lob))
    # Temporary LOB cleaned up automatically

Working with Large Data

Streaming Large Files

Efficiently handle very large files using streaming operations:

def stream_file_to_lob(filename, lob):
    """Stream file content to LOB in chunks"""
    chunk_size = lob.chunksize
    
    with open(filename, 'rb') as file:
        offset = 1
        while True:
            chunk = file.read(chunk_size)
            if not chunk:
                break
            lob.write(chunk, offset)
            offset += len(chunk)

def stream_lob_to_file(lob, filename):
    """Stream LOB content to file in chunks"""
    chunk_size = lob.chunksize
    
    with open(filename, 'wb') as file:
        offset = 1
        while offset <= lob.size:
            chunk = lob.read(offset, chunk_size)
            if not chunk:
                break
            file.write(chunk)
            offset += len(chunk)

# Usage
blob = connection.createlob(cx_Oracle.DB_TYPE_BLOB)
stream_file_to_lob("large_document.pdf", blob)

cursor.execute("INSERT INTO documents (id, content) VALUES (:1, :2)", 
               (1, blob))
connection.commit()

# Later, retrieve and save to file
cursor.execute("SELECT content FROM documents WHERE id = :1", (1,))
retrieved_blob = cursor.fetchone()[0]
stream_lob_to_file(retrieved_blob, "retrieved_document.pdf")

LOB Performance Optimization

Tips for optimal LOB performance:

# Use appropriate chunk sizes
def optimal_lob_copy(source_lob, target_lob):
    """Copy LOB using optimal chunk size"""
    chunk_size = max(source_lob.chunksize, target_lob.chunksize)
    
    offset = 1
    while offset <= source_lob.size:
        chunk = source_lob.read(offset, chunk_size)
        if not chunk:
            break
        target_lob.write(chunk, offset)
        offset += len(chunk)

# Batch LOB operations
def batch_lob_inserts(lob_data_list):
    """Insert multiple LOBs efficiently"""
    cursor = connection.cursor()
    
    # Prepare statement once
    cursor.prepare("INSERT INTO documents (id, content) VALUES (:1, :2)")
    
    for doc_id, content in lob_data_list:
        lob = connection.createlob(cx_Oracle.DB_TYPE_CLOB)
        lob.write(content)
        cursor.execute(None, (doc_id, lob))
    
    connection.commit()
    cursor.close()

LOB Data Type Conversion

Convert between different LOB types and Python data types:

# Convert string to CLOB
def string_to_clob(text_content):
    """Convert string to CLOB"""
    clob = connection.createlob(cx_Oracle.DB_TYPE_CLOB)
    clob.write(text_content)
    return clob

# Convert bytes to BLOB  
def bytes_to_blob(binary_content):
    """Convert bytes to BLOB"""
    blob = connection.createlob(cx_Oracle.DB_TYPE_BLOB)
    blob.write(binary_content)
    return blob

# Convert CLOB to string
def clob_to_string(clob_obj):
    """Convert CLOB to Python string"""
    return clob_obj.getvalue() if clob_obj else None

# Convert BLOB to bytes
def blob_to_bytes(blob_obj):
    """Convert BLOB to Python bytes"""
    return blob_obj.getvalue() if blob_obj else None

# Usage examples
text_data = "Large text document content..."
clob = string_to_clob(text_data)

with open("image.jpg", "rb") as f:
    binary_data = f.read()
blob = bytes_to_blob(binary_data)

# Insert both LOBs
cursor.execute("INSERT INTO mixed_content (id, text_data, binary_data) VALUES (:1, :2, :3)",
               (1, clob, blob))

Error Handling

Common LOB error handling patterns:

try:
    # LOB operations that might fail
    lob = connection.createlob(cx_Oracle.DB_TYPE_CLOB)
    lob.write(content)
    
except cx_Oracle.DatabaseError as e:
    error_obj, = e.args
    if error_obj.code == 22275:  # LOB write past end
        print("Attempted to write past end of LOB")
    elif error_obj.code == 1403:  # No data found
        print("LOB not found")
    else:
        print(f"LOB error: {error_obj.message}")
        
except IOError as e:
    print(f"File I/O error: {e}")
    
finally:
    # Always clean up LOB resources
    if 'lob' in locals() and lob:
        try:
            lob.close()
        except:
            pass  # Ignore cleanup errors

LOB Best Practices

  1. Use streaming for large data: Always process large LOBs in chunks
  2. Clean up resources: Use context managers or explicit close() calls
  3. Optimize chunk sizes: Use LOB.chunksize for optimal performance
  4. Handle encoding properly: Ensure correct character encoding for CLOBs
  5. Use temporary LOBs: Create temporary LOBs for intermediate processing
  6. Batch operations: Group LOB operations for better performance
  7. Monitor memory usage: Be aware of memory consumption with large LOBs

Install with Tessl CLI

npx tessl i tessl/pypi-cx-oracle

docs

advanced-queueing.md

connections.md

cursors.md

index.md

lobs.md

notifications.md

object-types.md

session-pools.md

soda.md

tile.json