Core foundational classes and utilities for the aiSSEMBLE platform, providing authentication, metadata management, configuration, file storage, and policy management capabilities.
npx @tessl/cli install tessl/pypi-aissemble-foundation-core-python@1.12.00
# aiSSEMBLE Foundation Core Python

A comprehensive Python foundation package for the aiSSEMBLE platform that provides essential building blocks for machine learning, data engineering, and enterprise-grade applications. This package offers unified APIs for configuration management, cloud storage, metadata tracking, authentication, ML inference, and policy-based governance across distributed systems.
## Package Information

- **Package Name**: aissemble-foundation-core-python
- **Language**: Python
- **Installation**: Available through aiSSEMBLE distribution
- **Dependencies**: Pydantic, Krausening, LibCloud, Kafka-Python, PyJWT, PyJKS, AIOHttp
## Core Imports

```python
# Bill of Materials for ML training
from aissemble_core_bom.training_bom import TrainingBOM

# Configuration classes for Spark and databases
from aissemble_core_config import SparkRDBMSConfig, SparkElasticsearchConfig, SparkNeo4jConfig, MessagingConfig

# Cloud storage abstractions
from aissemble_core_filestore.file_store_factory import FileStoreFactory

# Metadata management API
from aissemble_core_metadata.metadata_api import MetadataAPI
from aissemble_core_metadata.metadata_model import MetadataModel
from aissemble_core_metadata.hive_metadata_api_service import HiveMetadataAPIService
from aissemble_core_metadata.logging_metadata_api_service import LoggingMetadataAPIService

# Authentication and JWT utilities
from aissembleauth.auth_config import AuthConfig
from aissembleauth.json_web_token_util import JsonWebTokenUtil, AissembleSecurityException

# ML inference client framework
from inference.inference_client import InferenceClient
from inference.inference_config import InferenceConfig
from inference.inference_request import InferenceRequest, InferenceRequestBatch
from inference.inference_result import InferenceResult, InferenceResultBatch
from inference.rest_inference_client import RestInferenceClient

# Policy-based configuration management
from policy_manager import AbstractPolicyManager, DefaultPolicyManager
```
## Basic Usage

```python
import asyncio

# Configure Spark for database connections
from aissemble_core_config import SparkRDBMSConfig

# Initialize database configuration
db_config = SparkRDBMSConfig()
jdbc_url = db_config.jdbc_url()   # Gets JDBC URL from properties
driver = db_config.jdbc_driver()  # Gets driver class name

# Create cloud file store
from aissemble_core_filestore.file_store_factory import FileStoreFactory

file_store = FileStoreFactory.create_file_store("my-s3-store")
# Now use the libcloud StorageDriver interface for file operations

# Track metadata for ML workflows
from aissemble_core_metadata.metadata_model import MetadataModel
from aissemble_core_metadata.hive_metadata_api_service import HiveMetadataAPIService

metadata = MetadataModel(
    resource="training-dataset-v1.0",
    subject="ml-pipeline",
    action="TRAINING_STARTED"
)

metadata_service = HiveMetadataAPIService()
metadata_service.create_metadata(metadata)

# Authenticate and validate JWT tokens
from aissembleauth.json_web_token_util import JsonWebTokenUtil, AissembleSecurityException

token_string = "..."  # JWT supplied by the caller, e.g. from an Authorization header

jwt_util = JsonWebTokenUtil()
try:
    parsed_token = jwt_util.parse_token(token_string)
    jwt_util.validate_token(token_string)  # Raises an exception if invalid
    print("Token is valid")
except AissembleSecurityException as e:
    print(f"Authentication failed: {e}")

# Perform ML inference; RestInferenceClient.infer is a coroutine,
# so it must be awaited inside an async function
from inference.rest_inference_client import RestInferenceClient
from inference.inference_request import InferenceRequest

async def detect_threats():
    client = RestInferenceClient()
    request = InferenceRequest(
        source_ip_address="192.168.1.100",
        kind="security-scan",
        category="network-traffic"
    )
    result = await client.infer(request)
    if result.threat_detected:
        print(f"Threat detected with score: {result.score}")

asyncio.run(detect_threats())
```
## Architecture

The aiSSEMBLE Foundation Core provides seven major functional areas that work together to support enterprise ML and data engineering workflows:

### Configuration Layer
Unified configuration management for distributed systems including Spark clusters, databases (PostgreSQL, Elasticsearch, Neo4j), and messaging systems (Kafka). Supports property-based configuration with environment overrides.

### Storage Abstraction
Cloud-agnostic file storage through LibCloud integration, supporting local filesystem, AWS S3, and other cloud providers with a consistent API interface.

### Metadata Management
Comprehensive metadata tracking for ML workflows, data lineage, and audit trails. Supports multiple backends including Kafka-based streaming and logging-based implementations.

### Security Framework
Enterprise-grade authentication and authorization using JWT tokens, Java keystore integration, and policy-based access control with pluggable security providers.

### ML Inference Platform
Standardized client framework for ML model inference supporting both REST and gRPC protocols, with batch processing capabilities for high-throughput scenarios.

### Policy Engine
Policy-based configuration and governance system supporting JSON-defined rules, targets, and alerts with an extensible rule evaluation framework.

### Training Metadata
Bill of Materials tracking for ML training workflows including dataset information, feature engineering details, model specifications, and MLflow integration.
## Capabilities

### Training BOM Management

Complete lifecycle tracking for machine learning training processes with structured metadata capture including dataset origins, feature engineering details, model architecture, and MLflow integration for experiment tracking.

```python { .api }
class TrainingBOM(BaseModel):
    id: str
    start_time: str
    end_time: str
    dataset_info: DatasetInfo
    feature_info: FeatureInfo
    model_info: ModelInfo
    mlflow_params: Dict
    mlflow_metrics: Dict

class TrainingBOM.DatasetInfo(BaseModel):
    origin: str
    size: int = 0

class TrainingBOM.FeatureInfo(BaseModel):
    original_features: List[str] = []
    selected_features: List[str] = []

class TrainingBOM.ModelInfo(BaseModel):
    type: str
    architecture: str
```
[Training BOM](./bom.md)
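As a concrete illustration of the model above, a BOM serialized with Pydantic might look like the following JSON document (all values here are hypothetical, chosen only to show the shape of each field):

```json
{
  "id": "bom-2024-001",
  "start_time": "2024-05-01T09:00:00Z",
  "end_time": "2024-05-01T09:42:17Z",
  "dataset_info": { "origin": "s3://training-data/customers.parquet", "size": 250000 },
  "feature_info": {
    "original_features": ["age", "income", "tenure", "region"],
    "selected_features": ["age", "income", "tenure"]
  },
  "model_info": { "type": "classifier", "architecture": "random-forest" },
  "mlflow_params": { "n_estimators": "200" },
  "mlflow_metrics": { "f1_score": "0.91" }
}
```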

### Spark and Database Configuration

Comprehensive configuration management for Spark clusters and database connections including PostgreSQL, Elasticsearch, and Neo4j with property-based settings and environment variable overrides.

```python { .api }
class SparkRDBMSConfig:
    def __init__(self) -> None: ...
    def jdbc_url(self) -> str: ...
    def jdbc_driver(self) -> str: ...
    def user(self) -> str: ...
    def password(self) -> str: ...

class SparkElasticsearchConfig:
    def __init__(self) -> None: ...
    def spark_es_nodes(self) -> str: ...
    def spark_es_port(self) -> str: ...
    def get_es_configs(self) -> dict: ...

class SparkNeo4jConfig:
    def __init__(self) -> None: ...
    def url(self) -> str: ...
    def get_spark_options(self) -> Dict[str, str]: ...

class MessagingConfig:
    def __init__(self) -> None: ...
    def server(self) -> str: ...
    def metadata_topic(self) -> str: ...
```
[Configuration Management](./config.md)
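These config classes read their values from Krausening-managed property files. As a rough sketch of what such a file could contain for `SparkRDBMSConfig` (the key names below are illustrative placeholders, not the actual property keys — consult the aiSSEMBLE distribution for those):

```properties
# Hypothetical property file backing SparkRDBMSConfig (illustrative keys)
jdbc.url=jdbc:postgresql://postgres:5432/warehouse
jdbc.driver=org.postgresql.Driver
jdbc.user=pipeline
jdbc.password=changeit
```

Krausening layers a base file with environment-specific overrides, which is how the "environment variable overrides" described above are applied without code changes.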

### Cloud File Storage

Cloud-agnostic file storage abstraction using LibCloud to provide a consistent API across local filesystem, AWS S3, and other cloud storage providers with automatic provider detection and configuration.

```python { .api }
class FileStoreFactory:
    @staticmethod
    def create_file_store(name: str) -> StorageDriver: ...

    @staticmethod
    def create_local_file_store(name: str, filtered, cls) -> StorageDriver: ...

    @staticmethod
    def create_s3_file_store(name: str, filtered, provider) -> StorageDriver: ...
```
[File Storage](./filestore.md)

### Metadata API and Services

Comprehensive metadata management system for tracking data lineage, ML workflows, and audit trails with support for multiple storage backends including Kafka streaming and logging-based implementations.

```python { .api }
class MetadataAPI(ABC):
    @abstractmethod
    def create_metadata(self, metadata: MetadataModel) -> None: ...

    @abstractmethod
    def get_metadata(self, search_params: Dict[str, Any]) -> List[MetadataModel]: ...

class MetadataModel(BaseModel):
    resource: str = uuid4().hex
    subject: str = ""
    action: str = ""
    timestamp: float = datetime.now().timestamp()
    additionalValues: Dict[str, str] = dict()

class HiveMetadataAPIService(MetadataAPI):
    def __init__(self) -> None: ...
    def create_metadata(self, metadata: MetadataModel) -> None: ...
    def get_metadata(self, search_params: Dict[str, Any]) -> List[MetadataModel]: ...
```
[Metadata Management](./metadata.md)
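The `MetadataAPI` contract is small enough to satisfy with a custom backend. A self-contained sketch of the create/get pattern, using a simplified record type as a stand-in for `MetadataModel` so it runs without aiSSEMBLE installed:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Record:
    """Simplified stand-in for MetadataModel."""
    resource: str
    subject: str = ""
    action: str = ""
    additionalValues: Dict[str, str] = field(default_factory=dict)


class InMemoryMetadataService:
    """Illustrative backend honoring the create/get contract of MetadataAPI."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def create_metadata(self, metadata: Record) -> None:
        self._records.append(metadata)

    def get_metadata(self, search_params: Dict[str, str]) -> List[Record]:
        # Return records whose attributes match every search parameter
        return [
            r for r in self._records
            if all(getattr(r, k, None) == v for k, v in search_params.items())
        ]


service = InMemoryMetadataService()
service.create_metadata(Record(resource="dataset-v1", subject="pipeline", action="INGEST"))
service.create_metadata(Record(resource="dataset-v1", subject="pipeline", action="TRAIN"))

matches = service.get_metadata({"action": "TRAIN"})
print([r.resource for r in matches])  # ['dataset-v1']
```

The shipped backends follow the same shape: `HiveMetadataAPIService` persists records instead of holding them in memory, and `LoggingMetadataAPIService` writes them to the log.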

### Authentication and JWT Handling

Enterprise-grade security framework with JWT token management, Java keystore integration, and configurable authentication providers supporting both token validation and generation capabilities.

```python { .api }
class AuthConfig:
    def __init__(self) -> None: ...
    def public_key_path(self) -> str: ...
    def jks_path(self) -> str: ...
    def jks_password(self) -> str: ...
    def jks_key_alias(self) -> str: ...
    def pdp_host_url(self) -> str: ...
    def is_authorization_enabled(self) -> bool: ...

class JsonWebTokenUtil:
    def __init__(self) -> None: ...
    def parse_token(self, token: str): ...
    def create_token(self): ...
    def validate_token(self, token: str) -> None: ...
    def get_sign_key(self) -> str: ...

class AissembleSecurityException(Exception): ...
```
[Authentication](./auth.md)
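Under the hood, `parse_token` and `validate_token` operate on standard JWTs (`header.payload.signature`, each segment base64url-encoded). Independent of `aissembleauth`, the token structure can be inspected with the standard library; signature verification, which `JsonWebTokenUtil` performs against the configured keystore key, is deliberately skipped in this sketch:

```python
import base64
import json

# Example token with a fake signature segment, for structure only
token = (
    "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiJhaXNzZW1ibGUtdXNlciIsImlzcyI6ImFpc3NlbWJsZSJ9."
    "c2lnbmF0dXJl"
)

def b64url_decode(segment: str) -> bytes:
    # Restore the padding that base64url encoding strips
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

header_b64, payload_b64, signature_b64 = token.split(".")
header = json.loads(b64url_decode(header_b64))
claims = json.loads(b64url_decode(payload_b64))

print(header["alg"])  # RS256
print(claims["sub"])  # aissemble-user
```

Never trust claims decoded this way without verifying the signature — that is exactly the gap `validate_token` closes.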

### ML Inference Client

Standardized client framework for machine learning model inference supporting both individual and batch processing with REST and gRPC protocol support for high-performance model serving.

```python { .api }
class InferenceClient(ABC):
    def __init__(self) -> None: ...

    @abstractmethod
    def infer(self, inference_request: InferenceRequest) -> InferenceResult: ...

    @abstractmethod
    def infer_batch(self, inference_request_batch: InferenceRequestBatch) -> list[InferenceResultBatch]: ...

class InferenceConfig:
    def __init__(self) -> None: ...
    def rest_service_url(self) -> str: ...
    def rest_service_port(self) -> str: ...
    def grpc_service_url(self) -> str: ...
    def grpc_service_port(self) -> str: ...

class InferenceRequest:
    def __init__(self, source_ip_address: str = "", created: int = 0, kind: str = "", category: str = "", outcome: str = "") -> None: ...

class InferenceRequestBatch:
    def __init__(self, row_id_key: str, data: list[InferenceRequest]) -> None: ...

class InferenceResult:
    def __init__(self, threat_detected: bool = False, score: int = 0) -> None: ...

class InferenceResultBatch:
    def __init__(self, row_id_key: str, result: InferenceResult) -> None: ...

class RestInferenceClient(InferenceClient):
    async def infer(self, inference_request: InferenceRequest) -> InferenceResult: ...
    async def infer_batch(self, inference_request_batch: InferenceRequestBatch) -> list[InferenceResultBatch]: ...
```
[ML Inference](./inference.md)
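The batch types pair each result with its originating row via `row_id_key`. A self-contained sketch of that flow, using simplified stand-in classes rather than the real `inference` package (the detection rule here is arbitrary, purely to produce distinct results):

```python
import asyncio
from dataclasses import dataclass


# Simplified stand-ins for InferenceRequest / InferenceResult
@dataclass
class Request:
    source_ip_address: str
    kind: str = ""


@dataclass
class Result:
    threat_detected: bool = False
    score: int = 0


class StubInferenceClient:
    """Illustrative client mirroring the infer / infer_batch split."""

    async def infer(self, request: Request) -> Result:
        # A real client would POST to rest_service_url(); flag 10.x IPs here
        detected = request.source_ip_address.startswith("10.")
        return Result(threat_detected=detected, score=90 if detected else 5)

    async def infer_batch(self, row_id_key: str, requests: list) -> list:
        # Fan out concurrently, keeping each result paired with its row id
        results = await asyncio.gather(*(self.infer(r) for r in requests))
        return [(getattr(r, row_id_key), res) for r, res in zip(requests, results)]


async def main():
    client = StubInferenceClient()
    batch = [Request("10.0.0.7", "scan"), Request("192.168.1.2", "scan")]
    return await client.infer_batch("source_ip_address", batch)


pairs = asyncio.run(main())
for row_id, result in pairs:
    print(row_id, result.threat_detected)
# 10.0.0.7 True
# 192.168.1.2 False
```

`RestInferenceClient` follows the same async pattern, which is why its `infer`/`infer_batch` calls must be awaited.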

### Policy Management

Policy-based configuration and governance system with JSON-defined rules, configurable targets, and an extensible rule evaluation framework supporting complex business logic and compliance requirements.

```python { .api }
class AbstractPolicyManager(ABC):
    def __init__(self) -> None: ...
    def getPolicy(self, policyIdentifier: str) -> Policy: ...
    def loadPolicyConfigurations(self, policiesLocation: str) -> None: ...

    @property
    def policies(self) -> Dict[str, Policy]: ...

class DefaultPolicyManager(AbstractPolicyManager):
    @staticmethod
    def getInstance() -> DefaultPolicyManager: ...

class AlertOptions:
    ALWAYS: str = "ALWAYS"
    ON_DETECTION: str = "ON_DETECTION"
    NEVER: str = "NEVER"

class Target(BaseModel):
    retrieve_url: Optional[str] = None
    type: Optional[str] = None

class ConfiguredTarget(Target):
    target_configurations: Dict[str, Any]

class ConfiguredRule(BaseModel):
    className: str
    configurations: Optional[Dict[str, Any]] = None
    configuredTargets: Optional[List[ConfiguredTarget]] = []

class Policy(BaseModel):
    alertOptions: AlertOptions = AlertOptions.ON_DETECTION
    identifier: str
    description: Optional[str] = None
    targets: Optional[List[Target]] = []
    rules: List[ConfiguredRule] = []
```
[Policy Management](./policy.md)
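Policies loaded via `loadPolicyConfigurations` are JSON documents matching the `Policy` model above. A sketch of what one might look like — the `className`, URL, and configuration values are illustrative, not real aiSSEMBLE rule classes:

```json
{
  "identifier": "data-quality-check",
  "description": "Reject training runs whose input dataset is too small",
  "alertOptions": "ON_DETECTION",
  "targets": [
    { "retrieve_url": "http://metadata-service/datasets/latest", "type": "rest" }
  ],
  "rules": [
    {
      "className": "rules.MinimumRowCountRule",
      "configurations": { "minimumRows": 10000 }
    }
  ]
}
```

At load time each `className` is resolved to a rule implementation and handed its `configurations`, which is what makes the rule evaluation framework extensible.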