Cluster Client

The Cluster Client provides high-availability OSS access through automatic failover, load balancing, and health checking across multiple OSS endpoints. It ensures resilient cloud storage operations by distributing requests and handling endpoint failures transparently.

Cluster Client Creation

ClusterClient Factory

function ClusterClient(options: ClusterOptions): ClusterClient;

interface ClusterOptions {
  clusters: ClusterConfig[];
  schedule?: 'roundRobin' | 'masterSlave';
  retryMax?: number;
  retryDelay?: number;
  timeout?: number | number[];
  headers?: Record<string, string>;
  masterOnly?: boolean;
  heartbeatInterval?: number;
  ignoreStatusFile?: boolean;
}

interface ClusterConfig {
  region: string;
  accessKeyId: string;
  accessKeySecret: string;
  stsToken?: string;
  bucket?: string;
  endpoint?: string;
  internal?: boolean;
  secure?: boolean;
  timeout?: number | number[];
  weight?: number;
}

Usage

const OSS = require('ali-oss');

// Create cluster client with multiple endpoints
const cluster = new OSS.ClusterClient({
  clusters: [
    {
      region: 'oss-cn-hangzhou',
      accessKeyId: 'your-access-key-id',
      accessKeySecret: 'your-access-key-secret',
      bucket: 'my-bucket',
      weight: 10  // Higher weight = more requests
    },
    {
      region: 'oss-cn-beijing',
      accessKeyId: 'your-access-key-id',
      accessKeySecret: 'your-access-key-secret',
      bucket: 'my-bucket-backup',
      weight: 5   // Lower weight = fewer requests
    },
    {
      region: 'oss-cn-shenzhen',
      accessKeyId: 'your-access-key-id',
      accessKeySecret: 'your-access-key-secret',
      bucket: 'my-bucket-south',
      weight: 3
    }
  ],
  schedule: 'roundRobin',  // or 'masterSlave'
  retryMax: 3,
  retryDelay: 1000
});
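
A misconfigured entry tends to surface only when a request is routed to it, so it can help to check the options before constructing the client. The following is a minimal sketch: `validateClusterOptions` is a hypothetical helper, not part of ali-oss, and it only checks the fields described in the ClusterOptions and ClusterConfig interfaces above.

```javascript
// Hypothetical helper (not part of ali-oss): returns a list of configuration
// problems based on the ClusterOptions / ClusterConfig shapes documented above.
function validateClusterOptions(options) {
  const problems = [];
  if (!options || !Array.isArray(options.clusters) || options.clusters.length === 0) {
    problems.push('clusters must be a non-empty array');
    return problems;
  }
  const schedules = ['roundRobin', 'masterSlave'];
  if (options.schedule !== undefined && !schedules.includes(options.schedule)) {
    problems.push(`schedule must be one of: ${schedules.join(', ')}`);
  }
  options.clusters.forEach((config, index) => {
    // These three fields are required by the ClusterConfig interface
    for (const field of ['region', 'accessKeyId', 'accessKeySecret']) {
      if (typeof config[field] !== 'string' || config[field] === '') {
        problems.push(`clusters[${index}].${field} is required`);
      }
    }
    if (config.weight !== undefined && !(config.weight > 0)) {
      problems.push(`clusters[${index}].weight must be a positive number`);
    }
  });
  return problems;
}
```

An empty result means the options passed these basic checks; it does not guarantee that the credentials or buckets are valid.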

Scheduling Strategies

Round Robin Scheduling

interface RoundRobinConfig {
  schedule: 'roundRobin';
  clusters: ClusterConfig[];
}

Characteristics:

  • Distributes requests across all healthy endpoints
  • Considers weight configuration for load distribution
  • Automatically excludes failed endpoints
  • Balances load for optimal performance
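
To make the characteristics above concrete, here is a sketch of one common way to do weighted round-robin over healthy endpoints (the "smooth" variant that interleaves picks). It is an illustration only and is not claimed to be the algorithm ali-oss uses; `createWeightedPicker` and the `name`, `weight`, and `healthy` fields are assumptions made for the example.

```javascript
// Illustrative smooth weighted round-robin; not taken from ali-oss.
// endpoints: [{ name, weight, healthy }]
function createWeightedPicker(endpoints) {
  // Running score per endpoint, keyed by name
  const current = new Map(endpoints.map((e) => [e.name, 0]));
  return function pick() {
    const healthy = endpoints.filter((e) => e.healthy);
    if (healthy.length === 0) return null; // nothing available
    const total = healthy.reduce((sum, e) => sum + e.weight, 0);
    let best = null;
    for (const e of healthy) {
      // Every healthy endpoint gains its weight on each pick
      const score = current.get(e.name) + e.weight;
      current.set(e.name, score);
      if (best === null || score > current.get(best.name)) best = e;
    }
    // The chosen endpoint pays back the total, so others get their turn
    current.set(best.name, current.get(best.name) - total);
    return best;
  };
}
```

With weights 2 and 1, the first endpoint is picked twice for every pick of the second, and an endpoint marked unhealthy is skipped until it is marked healthy again.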

Master-Slave Scheduling

interface MasterSlaveConfig {
  schedule: 'masterSlave';
  clusters: ClusterConfig[];
}

Characteristics:

  • Uses first healthy endpoint as master
  • Falls back to other endpoints if master fails
  • Provides consistent endpoint selection
  • Ideal for read-after-write consistency requirements
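
The selection rule described above can be sketched in a few lines. This is an illustration under the assumption that each endpoint carries a `healthy` flag and that array order defines priority; `pickMaster` is a hypothetical name, not an ali-oss function.

```javascript
// Illustrative master-slave selection: the first healthy endpoint wins.
// endpoints: [{ name, healthy }], ordered from primary to last backup.
function pickMaster(endpoints) {
  return endpoints.find((endpoint) => endpoint.healthy) || null;
}

const pair = [
  { name: 'primary', healthy: true },
  { name: 'backup', healthy: true }
];
const whileHealthy = pickMaster(pair).name;  // primary is chosen every time
pair[0].healthy = false;
const duringOutage = pickMaster(pair).name;  // falls back to the backup
pair[0].healthy = true;
const afterRecovery = pickMaster(pair).name; // primary is chosen again
```

Because the same endpoint is chosen for as long as it stays healthy, a read that follows a write normally lands on the endpoint that received the write.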

All OSS Operations Support

The Cluster Client supports all standard OSS operations with automatic endpoint selection and failover:

Object Operations

// All object operations work transparently
async function put(name: string, file: ObjectSource, options?: PutObjectOptions): Promise<PutObjectResult>;
async function get(name: string, file?: string | WriteStream, options?: GetObjectOptions): Promise<GetObjectResult>;
async function delete(name: string, options?: RequestOptions): Promise<DeleteObjectResult>;
async function list(query?: ListObjectsQuery, options?: RequestOptions): Promise<ListObjectsResult>;
async function head(name: string, options?: HeadObjectOptions): Promise<HeadObjectResult>;
async function copy(name: string, sourceName: string, options?: CopyObjectOptions): Promise<CopyObjectResult>;

Bucket Operations

// All bucket operations work transparently  
async function listBuckets(query?: ListBucketsQuery, options?: RequestOptions): Promise<ListBucketsResult>;
async function getBucketInfo(name: string, options?: RequestOptions): Promise<BucketInfoResult>;
async function putBucketACL(name: string, acl: string, options?: RequestOptions): Promise<PutBucketACLResult>;
async function getBucketACL(name: string, options?: RequestOptions): Promise<GetBucketACLResult>;

Multipart Operations

// All multipart operations work transparently
async function multipartUpload(name: string, file: MultipartSource, options?: MultipartUploadOptions): Promise<MultipartUploadResult>;
async function initMultipartUpload(name: string, options?: InitMultipartUploadOptions): Promise<InitMultipartUploadResult>;
async function uploadPart(name: string, uploadId: string, partNumber: number, file: PartSource, start?: number, end?: number, options?: RequestOptions): Promise<UploadPartResult>;
async function completeMultipartUpload(name: string, uploadId: string, parts: CompletedPart[], options?: CompleteMultipartUploadOptions): Promise<CompleteMultipartUploadResult>;

Configuration Examples

Geographic Distribution

// Multi-region cluster for global access
const globalCluster = new OSS.ClusterClient({
  clusters: [
    {
      region: 'oss-cn-hangzhou',
      accessKeyId: 'your-access-key-id',
      accessKeySecret: 'your-access-key-secret',
      bucket: 'asia-bucket',
      weight: 10
    },
    {
      region: 'oss-us-west-1',
      accessKeyId: 'your-access-key-id',
      accessKeySecret: 'your-access-key-secret',
      bucket: 'us-bucket',
      weight: 8
    },
    {
      region: 'oss-eu-central-1',
      accessKeyId: 'your-access-key-id',
      accessKeySecret: 'your-access-key-secret',
      bucket: 'eu-bucket',
      weight: 6
    }
  ],
  schedule: 'roundRobin',
  retryMax: 2,
  retryDelay: 500
});

Master-Slave Configuration

// Primary-backup configuration
const masterSlaveCluster = new OSS.ClusterClient({
  clusters: [
    {
      // Primary endpoint
      region: 'oss-cn-hangzhou',
      accessKeyId: 'your-access-key-id',
      accessKeySecret: 'your-access-key-secret',
      bucket: 'primary-bucket',
      internal: true,  // Use internal endpoint for better performance
      weight: 100
    },
    {
      // Backup endpoint
      region: 'oss-cn-beijing',
      accessKeyId: 'your-access-key-id',
      accessKeySecret: 'your-access-key-secret',
      bucket: 'backup-bucket',
      weight: 1  // Backup endpoint: used only when the primary is unavailable
    }
  ],
  schedule: 'masterSlave',
  retryMax: 3,
  retryDelay: 2000
});

STS Token Configuration

// Cluster with STS tokens
const stsCluster = new OSS.ClusterClient({
  clusters: [
    {
      region: 'oss-cn-hangzhou',
      accessKeyId: 'temp-access-key-1',
      accessKeySecret: 'temp-access-secret-1',
      stsToken: 'sts-token-1',
      bucket: 'secure-bucket-1'
    },
    {
      region: 'oss-cn-beijing',
      accessKeyId: 'temp-access-key-2',
      accessKeySecret: 'temp-access-secret-2',
      stsToken: 'sts-token-2',
      bucket: 'secure-bucket-2'
    }
  ],
  schedule: 'roundRobin'
});

Health Checking and Failover

Automatic Health Monitoring

The cluster client automatically monitors endpoint health through:

  • Request Success Rate: Tracks successful vs failed requests
  • Response Time: Monitors endpoint performance
  • Connection Status: Detects network connectivity issues
  • Error Patterns: Identifies persistent endpoint problems

Failover Behavior

// Failover is automatic and transparent
try {
  // This request will automatically failover if needed
  const result = await cluster.put('important-file.txt', fileContent);
  console.log('Upload successful');
} catch (error) {
  // Only throws if ALL endpoints fail
  console.error('All endpoints failed:', error);
}
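
The behavior in the example above (an error is thrown only when every endpoint has failed) can be pictured with the following sketch. It is a simplified model, not ali-oss internals: `putWithFailover` is a hypothetical function, and `endpoints` stands in for per-region clients that expose an async `put` method.

```javascript
// Simplified failover model: try each endpoint in order and return the
// first successful result. Throws only after every endpoint has failed.
async function putWithFailover(endpoints, name, content) {
  const errors = [];
  for (const endpoint of endpoints) {
    try {
      return await endpoint.put(name, content);
    } catch (error) {
      errors.push(error); // remember the failure and move on
    }
  }
  const failure = new Error(`All ${endpoints.length} endpoints failed`);
  failure.errors = errors; // keep per-endpoint errors for diagnostics
  throw failure;
}
```

Attaching the individual errors to the final error makes it possible to tell a credentials problem on one endpoint from a network outage on another.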

Manual Health Check

// Health checking methods (conceptual - actual implementation may vary)
async function checkClusterHealth(): Promise<ClusterHealthReport>;

interface ClusterHealthReport {
  totalEndpoints: number;
  healthyEndpoints: number;
  unhealthyEndpoints: number;
  endpoints: EndpointHealth[];
}

interface EndpointHealth {
  region: string;
  status: 'healthy' | 'unhealthy' | 'unknown';
  lastCheck: Date;
  responseTime?: number;
  errorRate?: number;
}
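
Since the health-check method above is conceptual, the sketch below only shows how a report of that shape could be assembled from a list of EndpointHealth entries. `summarizeHealth` is a hypothetical helper, and the input data would have to come from your own probing.

```javascript
// Builds an object shaped like ClusterHealthReport from EndpointHealth entries.
// Entries with status 'unknown' count toward the total only.
function summarizeHealth(endpoints) {
  const count = (status) => endpoints.filter((e) => e.status === status).length;
  return {
    totalEndpoints: endpoints.length,
    healthyEndpoints: count('healthy'),
    unhealthyEndpoints: count('unhealthy'),
    endpoints
  };
}

const sampleReport = summarizeHealth([
  { region: 'oss-cn-hangzhou', status: 'healthy', lastCheck: new Date(), responseTime: 42 },
  { region: 'oss-cn-beijing', status: 'unhealthy', lastCheck: new Date(), errorRate: 1 },
  { region: 'oss-cn-shenzhen', status: 'unknown', lastCheck: new Date() }
]);
```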

Usage Patterns

High-Availability Upload

// Upload with automatic failover
async function reliableUpload(filename, content, options = {}) {
  const uploadOptions = {
    ...options,
    timeout: [5000, 30000],  // 5s connect, 30s response
    headers: {
      'x-oss-storage-class': 'Standard',
      ...options.headers
    }
  };
  
  try {
    const result = await cluster.put(filename, content, uploadOptions);
    console.log(`Uploaded ${filename} successfully`);
    return result;
  } catch (error) {
    console.error(`Failed to upload ${filename}:`, error.message);
    throw error;
  }
}

// Usage (call from inside an async function; pdfBuffer is a Buffer holding the file content)
await reliableUpload('report.pdf', pdfBuffer, {
  meta: { author: 'John Doe', version: '1.0' }
});

Resilient File Processing

// Process files with automatic retry across endpoints
async function processFiles(filenames) {
  const results = [];
  
  for (const filename of filenames) {
    try {
      // Download with automatic failover
      const object = await cluster.get(filename);
      
      // Process the file (processFile is an application-defined transform)
      const processedContent = processFile(object.content);
      
      // Upload processed version with failover
      const uploadResult = await cluster.put(
        `processed/${filename}`,
        processedContent,
        { meta: { processed: 'true', timestamp: new Date().toISOString() } }
      );
      
      results.push({ filename, status: 'success', url: uploadResult.url });
      
    } catch (error) {
      console.error(`Failed to process ${filename}:`, error);
      results.push({ filename, status: 'failed', error: error.message });
    }
  }
  
  return results;
}

Multipart Upload with Cluster

// Large file upload with cluster resilience
async function clusterMultipartUpload(filename, filePath) {
  const options = {
    parallel: 3,
    partSize: 1024 * 1024,  // 1 MB parts
    progress: (percentage, checkpoint) => {
      console.log(`Upload progress: ${Math.round(percentage * 100)}%`);
      // Save checkpoint for resume capability (saveCheckpoint is application-defined)
      saveCheckpoint(filename, checkpoint);
    },
    timeout: [10000, 60000],  // Longer timeouts for large uploads
    retryMax: 2  // Retry failed parts
  };
  
  return cluster.multipartUpload(filename, filePath, options);
}
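
A fixed 1 MB part size works for moderately sized files, but very large files can exceed the number of parts a single multipart upload may contain. The sketch below picks a part size from the file size, assuming the commonly documented OSS limits of 10,000 parts per upload and a 100 KB minimum part size; check the current OSS documentation before relying on these numbers. `choosePartSize` and `countParts` are hypothetical helpers.

```javascript
const MAX_PARTS = 10000;          // assumed OSS limit on parts per upload
const MIN_PART_SIZE = 100 * 1024; // assumed OSS minimum part size (100 KB)

// Returns a part size in bytes that keeps the upload within MAX_PARTS.
function choosePartSize(fileSize, preferred = 1024 * 1024) {
  const needed = Math.ceil(fileSize / MAX_PARTS);
  return Math.max(MIN_PART_SIZE, preferred, needed);
}

// Number of parts an upload of fileSize bytes produces at the given part size.
function countParts(fileSize, partSize) {
  return Math.ceil(fileSize / partSize);
}
```

The returned value could be passed as `partSize` in the options object shown above.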

Performance Optimization

Endpoint Weight Tuning

// Adjust weights based on performance characteristics
const optimizedCluster = new OSS.ClusterClient({
  clusters: [
    {
      region: 'oss-cn-hangzhou',
      accessKeyId: 'your-key',
      accessKeySecret: 'your-secret',
      bucket: 'fast-endpoint',
      internal: true,     // Internal network for better performance
      weight: 15          // Higher weight for faster endpoint
    },
    {
      region: 'oss-cn-beijing',
      accessKeyId: 'your-key',
      accessKeySecret: 'your-secret',
      bucket: 'standard-endpoint',
      weight: 10          // Standard weight
    },
    {
      region: 'oss-cn-shenzhen',
      accessKeyId: 'your-key',
      accessKeySecret: 'your-secret',
      bucket: 'backup-endpoint',
      weight: 5           // Lower weight for backup
    }
  ],
  schedule: 'roundRobin'
});

Connection Pool Optimization

const http = require('http');
const https = require('https');
const OSS = require('ali-oss');

// Shared agents for connection pooling
const httpAgent = new http.Agent({
  keepAlive: true,
  maxSockets: 20,
  maxFreeSockets: 10
});

const httpsAgent = new https.Agent({
  keepAlive: true,
  maxSockets: 20,
  maxFreeSockets: 10
});

// `clusters` is assumed to be an array of ClusterConfig objects defined elsewhere
const cluster = new OSS.ClusterClient({
  clusters: clusters.map(config => ({
    ...config,
    agent: httpAgent,
    httpsAgent: httpsAgent,
    timeout: [5000, 30000]  // Optimized timeouts
  })),
  schedule: 'roundRobin'
});

Error Handling and Monitoring

Comprehensive Error Handling

async function robustClusterOperation(operation) {
  const maxAttempts = 3;
  let attempt = 0;
  
  while (attempt < maxAttempts) {
    try {
      const result = await operation();
      return result;
    } catch (error) {
      attempt++;
      
      // Log error details
      console.error(`Attempt ${attempt} failed:`, {
        error: error.message,
        code: error.code,
        status: error.status,
        endpoint: error.hostId || 'unknown'
      });
      
      // Check if error is retryable
      if (isRetryableError(error) && attempt < maxAttempts) {
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff
        console.log(`Retrying in ${delay}ms...`);
        await sleep(delay);
        continue;
      }
      
      throw error;
    }
  }
}

function isRetryableError(error) {
  const retryableCodes = [
    'RequestTimeout',
    'ConnectionTimeout',
    'ServiceUnavailable',
    'InternalError',
    'SlowDown'
  ];
  
  return retryableCodes.includes(error.code) || 
         (error.status >= 500 && error.status < 600);
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

Monitoring and Metrics

class ClusterMonitor {
  constructor(cluster) {
    this.cluster = cluster;
    this.metrics = {
      totalRequests: 0,
      successfulRequests: 0,
      failedRequests: 0,
      endpointStats: new Map()
    };
  }
  
  recordRequest(endpoint, success, responseTime, error = null) {
    this.metrics.totalRequests++;
    
    if (success) {
      this.metrics.successfulRequests++;
    } else {
      this.metrics.failedRequests++;
    }
    
    // Track per-endpoint stats
    if (!this.metrics.endpointStats.has(endpoint)) {
      this.metrics.endpointStats.set(endpoint, {
        requests: 0,
        successes: 0,
        failures: 0,
        totalResponseTime: 0,
        errors: []
      });
    }
    
    const stats = this.metrics.endpointStats.get(endpoint);
    stats.requests++;
    stats.totalResponseTime += responseTime || 0;
    
    if (success) {
      stats.successes++;
    } else {
      stats.failures++;
      if (error) {
        stats.errors.push({
          timestamp: new Date(),
          error: error.message,
          code: error.code
        });
        
        // Keep only recent errors
        if (stats.errors.length > 10) {
          stats.errors.shift();
        }
      }
    }
  }
  
  getHealthReport() {
    // Avoid NaN from dividing by zero before any request has been recorded
    const ratio = (part, total) => (total > 0 ? part / total : 0);
    const report = {
      overall: {
        totalRequests: this.metrics.totalRequests,
        successRate: ratio(this.metrics.successfulRequests, this.metrics.totalRequests),
        failureRate: ratio(this.metrics.failedRequests, this.metrics.totalRequests)
      },
      endpoints: []
    };
    
    for (const [endpoint, stats] of this.metrics.endpointStats) {
      report.endpoints.push({
        endpoint,
        requests: stats.requests,
        successRate: ratio(stats.successes, stats.requests),
        averageResponseTime: ratio(stats.totalResponseTime, stats.requests),
        recentErrors: stats.errors.slice(-3)
      });
    }
    
    return report;
  }
}
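
The monitor only collects what it is given, so each call needs to be timed and reported. One possible way to do that is a small wrapper like the one below. `recordTimed` is a hypothetical helper; it assumes a monitor object exposing the `recordRequest(endpoint, success, responseTime, error)` method from the class above, and an `endpoint` label that you supply yourself.

```javascript
// Times an async operation and reports the outcome to a monitor.
// The original result or error is passed through unchanged.
async function recordTimed(monitor, endpoint, operation) {
  const startedAt = Date.now();
  try {
    const result = await operation();
    monitor.recordRequest(endpoint, true, Date.now() - startedAt);
    return result;
  } catch (error) {
    monitor.recordRequest(endpoint, false, Date.now() - startedAt, error);
    throw error;
  }
}
```

A call would then look like `await recordTimed(monitor, 'oss-cn-hangzhou', () => cluster.get('report.pdf'))`, where the label is whatever identifies the endpoint in your reports.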

Cluster Management

Lifecycle Management

close(): void;

Usage:

// Cleanup cluster resources when done
cluster.close();
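
To make sure `close()` runs even when an operation throws, the call can be placed in a `finally` block. The following is a small sketch; `withCluster` is a hypothetical helper, and `createCluster` is any function that returns an object with a `close()` method, such as a function that constructs the cluster client.

```javascript
// Runs `work` with a freshly created cluster and always closes it afterwards.
async function withCluster(createCluster, work) {
  const cluster = createCluster();
  try {
    return await work(cluster);
  } finally {
    cluster.close(); // runs on success and on failure
  }
}
```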

Health Checking and Availability

The cluster client includes automatic health checking mechanisms:

  • Status File Mechanism: Monitors endpoint availability through status files
  • Heartbeat Interval: Configurable interval for health checks (default: 30000ms)
  • Automatic Recovery: Failed endpoints are periodically re-checked and restored when healthy
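
The heartbeat and recovery behavior listed above can be modeled roughly as follows. This is a sketch of the general idea, not the library's implementation: `heartbeatOnce`, `startHeartbeat`, the `available` flag, and the `probe` callback are all assumptions made for the example, and the 30000 ms default simply mirrors the value stated above.

```javascript
// One heartbeat pass: probe every endpoint and update its availability flag.
// A failed endpoint becomes available again as soon as a later probe succeeds.
async function heartbeatOnce(endpoints, probe) {
  await Promise.all(endpoints.map(async (endpoint) => {
    try {
      await probe(endpoint);
      endpoint.available = true;
    } catch (error) {
      endpoint.available = false;
    }
  }));
  return endpoints.filter((endpoint) => endpoint.available).length;
}

// Repeats the pass on an interval and returns a function that stops it.
function startHeartbeat(endpoints, probe, intervalMs = 30000) {
  const timer = setInterval(() => {
    heartbeatOnce(endpoints, probe);
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for health checks
  return () => clearInterval(timer);
}
```

A probe could be any cheap request against the endpoint, for example reading a small status object.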

Configuration Options Details

  • masterOnly: When true, forces master-slave mode regardless of schedule setting
  • heartbeatInterval: Milliseconds between health checks (default: 30000)
  • ignoreStatusFile: Skip status file based health checking when true

Implementation Details

The cluster client maintains availability state through:

  1. _checkAvailable(): Internal method to verify endpoint health
  2. _checkStatus(): Status file verification for availability
  3. Automatic Failover: Seamless switching between healthy endpoints
  4. Error Retry Logic: Intelligent retry across different cluster nodes

Best Practices

Configuration Best Practices

  1. Use Internal Endpoints: When possible, use internal endpoints for better performance
  2. Set Appropriate Weights: Distribute load based on endpoint capabilities
  3. Configure Timeouts: Set reasonable connection and response timeouts
  4. Plan for Geographic Distribution: Choose regions close to your users

Operational Best Practices

  1. Monitor Endpoint Health: Regularly check cluster health and performance
  2. Implement Circuit Breakers: Temporarily disable failing endpoints
  3. Use Exponential Backoff: Implement intelligent retry strategies
  4. Cache Aggressively: Reduce load on OSS endpoints through caching
  5. Plan Capacity: Ensure endpoints can handle expected load
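
As an illustration of the circuit-breaker practice listed above, the sketch below tracks consecutive failures for a single endpoint and blocks requests for a cooldown period once a threshold is reached. `CircuitBreaker` is a hypothetical class written for this example, not part of ali-oss, and the threshold and cooldown defaults are arbitrary.

```javascript
// Minimal per-endpoint circuit breaker. `now` is injectable for testing.
class CircuitBreaker {
  constructor({ failureThreshold = 3, cooldownMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
    this.failures = 0;    // consecutive failures seen so far
    this.openedAt = null; // time the breaker opened, or null when closed
  }

  // True when requests may be sent to the endpoint.
  allows() {
    if (this.openedAt === null) return true;
    // After the cooldown, let a trial request through
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure() {
    this.failures++;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }
}
```

In use, a caller would check `allows()` before sending a request to an endpoint and report the outcome with `recordSuccess()` or `recordFailure()`.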

Security Best Practices

  1. Rotate Credentials: Regularly update access keys across all endpoints
  2. Use STS Tokens: Prefer temporary credentials for enhanced security
  3. Network Security: Use VPC endpoints and security groups when available
  4. Audit Access: Monitor and log all cluster operations

Performance Best Practices

  1. Connection Pooling: Use shared HTTP agents for better connection reuse
  2. Parallel Operations: Leverage multiple endpoints for concurrent operations
  3. Optimize Part Sizes: Use appropriate part sizes for multipart uploads
  4. Regional Optimization: Route requests to geographically closer endpoints
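
For the parallel-operations practice above, unbounded concurrency can exhaust sockets in a shared agent, so a limit is usually worth having. The sketch below runs an async worker over a list with at most `limit` calls in flight. `mapWithConcurrency` is a hypothetical helper, not an ali-oss API, and it expects `limit` to be at least 1.

```javascript
// Applies `worker` to every item with at most `limit` calls in flight.
// Results are returned in the same order as the input items.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to claim

  async function run() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start up to `limit` runners that pull items until none are left
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}
```

A download batch could then be written as `await mapWithConcurrency(names, 5, (name) => cluster.get(name))`, keeping the limit at or below the agent's `maxSockets`.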