Tessl Tile for pypi/cinemagoer@2022.12.0

# Core Data Access

Primary functions for creating IMDb access instances and retrieving basic system information. These form the foundation for all IMDb data operations across different access methods.

## Capabilities

### IMDb Instance Creation

Creates IMDb access system instances with configurable data sources and parameters. The factory function automatically selects appropriate parsers based on the specified access system.

```python { .api }
def IMDb(accessSystem=None, *arguments, **keywords):
    """
    Create an instance of the appropriate IMDb access system.

    Parameters:
    - accessSystem: str, optional - Access method ('http', 'sql', 's3', 'auto', 'config')
    - results: int - Default number of search results (default: 20)
    - keywordsResults: int - Default number of keyword results (default: 100)
    - reraiseExceptions: bool - Whether to re-raise exceptions (default: True)
    - loggingLevel: int - Logging level
    - loggingConfig: str - Path to logging configuration file
    - imdbURL_base: str - Base IMDb URL (default: 'https://www.imdb.com/')

    Returns:
    IMDbBase subclass instance (IMDbHTTPAccessSystem, IMDbSqlAccessSystem, or IMDbS3AccessSystem)
    """
```

**Usage Example:**

```python
from imdb import IMDb

# Default HTTP access
ia = IMDb()

# Explicit HTTP access with custom settings
ia = IMDb('http', results=50, reraiseExceptions=False)

# SQL database access (the connection URI is a placeholder)
ia = IMDb('sql', uri='postgresql://user:password@localhost/imdb')

# S3 dataset access (placeholder URI of the database holding the imported datasets)
ia = IMDb('s3', 'postgresql://user:password@localhost/imdb')

# Configuration file-based access
ia = IMDb('config')
```

### Cinemagoer Alias

Alias for the IMDb function providing identical functionality with updated branding.

```python { .api }
Cinemagoer = IMDb
```

**Usage Example:**

```python
from imdb import Cinemagoer

# Identical to IMDb() function
ia = Cinemagoer()
```

### Available Access Systems

Returns the list of currently available data access systems based on installed dependencies and system configuration.

```python { .api }
def available_access_systems():
    """
    Return the list of available data access systems.

    Returns:
    list: Available access system names (e.g., ['http', 'sql'])
    """
```

**Usage Example:**

```python
from imdb import available_access_systems

# Check what access systems are available
systems = available_access_systems()
print(f"Available systems: {systems}")
# Prints e.g. "Available systems: ['http']" or "Available systems: ['http', 'sql']", depending on installation
```
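
One way to use this list is to pick the first preferred system that is actually available. The sketch below is illustrative only: `pick_access_system` and its `preferred` ordering are hypothetical and not part of Cinemagoer, and the list of available systems is passed in as a plain argument so the code runs without the package installed.

```python
def pick_access_system(available, preferred=("sql", "http")):
    """Return the first entry of `preferred` that appears in `available`.

    `available` is meant to be the list returned by
    imdb.available_access_systems(); this helper is a hypothetical
    illustration, not a Cinemagoer API.
    """
    for name in preferred:
        if name in available:
            return name
    raise LookupError(f"none of {preferred!r} is available: {available!r}")


# Stand-in values for what available_access_systems() might return.
print(pick_access_system(["http"]))
print(pick_access_system(["http", "sql"]))
```

Keeping the preference order as a parameter lets a script favour a local database when present and fall back to HTTP otherwise.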

## Access System Types

### HTTP Access System

**Access Methods**: `'http'`, `'https'`, `'web'`, `'html'`
- Web scraping access to IMDb website
- Default access method
- No additional dependencies beyond base requirements
- Rate-limited by IMDb's website policies

### SQL Database Access System

**Access Methods**: `'sql'`, `'db'`, `'database'`
- Direct SQL database access to local IMDb data
- Requires separate IMDb database setup
- Fastest access for bulk operations
- Requires additional SQL database dependencies

### S3 Dataset Access System

**Access Methods**: `'s3'`, `'s3dataset'`, `'imdbws'`
- Access to IMDb's downloadable datasets, imported into a local database
- Official IMDb data source
- Requires downloading the dataset files and importing them into a SQL database before use
- Data is only as current as the most recently imported dataset snapshot
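
The alias groups listed in this section can be expressed as a small lookup table. This is an illustrative sketch only: `ACCESS_ALIASES` and `canonical_access_system` are hypothetical names, and the code mirrors the documented aliases rather than Cinemagoer's internal dispatch.

```python
# Alias groups copied from the "Access Methods" lines in this section.
ACCESS_ALIASES = {
    "http": ("http", "https", "web", "html"),
    "sql": ("sql", "db", "database"),
    "s3": ("s3", "s3dataset", "imdbws"),
}


def canonical_access_system(name):
    """Map a documented alias to the name of its alias group."""
    for canonical, aliases in ACCESS_ALIASES.items():
        if name in aliases:
            return canonical
    raise ValueError(f"unknown access system: {name!r}")


print(canonical_access_system("web"))
print(canonical_access_system("database"))
```

Normalizing aliases up front can be handy when the access system name comes from user input or a config file and is later compared against a single canonical value.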

## Configuration System

### Automatic Configuration

The IMDb function can automatically load configuration from files when `accessSystem='config'` or `accessSystem='auto'`.

**Configuration File Locations** (searched in order):
1. `./cinemagoer.cfg` or `./imdbpy.cfg` (current directory)
2. `./.cinemagoer.cfg` or `./.imdbpy.cfg` (current directory, hidden)
3. `~/cinemagoer.cfg` or `~/imdbpy.cfg` (home directory)
4. `~/.cinemagoer.cfg` or `~/.imdbpy.cfg` (home directory, hidden)
5. `/etc/cinemagoer.cfg` or `/etc/imdbpy.cfg` (Unix systems)
6. `/etc/conf.d/cinemagoer.cfg` or `/etc/conf.d/imdbpy.cfg` (Unix systems)
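
The search order above can be sketched as a function that builds the candidate paths. `candidate_config_paths` and `first_existing_config` are hypothetical helpers, not Cinemagoer functions, and trying `cinemagoer.cfg` before `imdbpy.cfg` within each location is an assumption, since the list does not say which name takes precedence.

```python
from pathlib import Path

# Assumed precedence within a location: cinemagoer.cfg first, then imdbpy.cfg.
CONFIG_NAMES = ("cinemagoer.cfg", "imdbpy.cfg")


def candidate_config_paths(cwd=None, home=None):
    """List config file candidates in the documented search order."""
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    # (directory, filename prefix) pairs, one per numbered location.
    locations = [
        (cwd, ""), (cwd, "."),
        (home, ""), (home, "."),
        (Path("/etc"), ""), (Path("/etc/conf.d"), ""),
    ]
    return [
        directory / f"{prefix}{name}"
        for directory, prefix in locations
        for name in CONFIG_NAMES
    ]


def first_existing_config(paths):
    """Return the first path that exists as a regular file, or None."""
    return next((p for p in paths if p.is_file()), None)


for path in candidate_config_paths(cwd="/tmp/project", home="/home/user"):
    print(path)
```

Printing the candidates this way can help when debugging why a particular configuration file is or is not being picked up.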

**Configuration File Format:**

```ini
[imdbpy]
accessSystem = http
results = 30
keywordsResults = 150
reraiseExceptions = true
imdbURL_base = https://www.imdb.com/
```

### Custom Configuration

```python { .api }
class ConfigParserWithCase:
    """
    Case-sensitive configuration parser for IMDb settings.

    Methods:
    - get(section, option, *args, **kwds): Get configuration value
    - getDict(section): Get section as dictionary
    - items(section, *args, **kwds): Get section items as list
    """
```
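
To illustrate why case sensitivity matters for option names such as `accessSystem`: the standard-library `configparser` lowercases option names by default, so the name would come back as `accesssystem`. The sketch below shows one way to keep names as written. `CasePreservingConfig` and `get_dict` are hypothetical and are not Cinemagoer's `ConfigParserWithCase`; the sample text is a shortened copy of the configuration format shown earlier.

```python
import configparser


class CasePreservingConfig(configparser.ConfigParser):
    """Hypothetical stand-in; not Cinemagoer's ConfigParserWithCase."""

    def optionxform(self, optionstr):
        # The stdlib default lowercases option names; keep them as written.
        return optionstr

    def get_dict(self, section):
        """Return a section as a plain dict, or {} if the section is missing."""
        if not self.has_section(section):
            return {}
        return dict(self.items(section))


SAMPLE = """
[imdbpy]
accessSystem = http
results = 30
reraiseExceptions = true
"""

cfg = CasePreservingConfig()
cfg.read_string(SAMPLE)
settings = cfg.get_dict("imdbpy")
print(sorted(settings))                              # option names keep their capitals
print(cfg.getint("imdbpy", "results"))               # values are strings until converted
print(cfg.getboolean("imdbpy", "reraiseExceptions"))
```

Preserving case means the resulting dictionary keys can be passed on as keyword arguments whose names contain capitals, such as `keywordsResults`.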

## Error Handling

All core access functions can raise IMDb-specific exceptions:

```python
from imdb import IMDb, IMDbError, IMDbDataAccessError

try:
    ia = IMDb('invalid_system')
except IMDbError as e:
    print(f"IMDb error: {e}")

try:
    ia = IMDb('sql')  # If SQL system not available
except IMDbError as e:
    print(f"SQL access not available: {e}")
```
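
The try/except pattern above can be generalized into a fallback loop over several access systems. This is a minimal sketch under stated assumptions: `create_with_fallback` is a hypothetical helper, `factory` stands in for `IMDb`, and `error_type` stands in for `IMDbError`. Both are parameters so the example runs with a stub and without Cinemagoer installed.

```python
def create_with_fallback(factory, systems, error_type=Exception):
    """Try each access system in order; return (name, instance) for the first success.

    `factory` stands in for imdb.IMDb and `error_type` for imdb.IMDbError.
    Exceptions of other types are not caught and propagate to the caller.
    """
    failures = {}
    for name in systems:
        try:
            return name, factory(name)
        except error_type as exc:
            failures[name] = exc  # remember why this system was skipped
    raise RuntimeError(f"no access system could be created: {failures}")


# Stub factory for demonstration: only 'http' succeeds.
def fake_factory(name):
    if name != "http":
        raise ValueError(f"{name} not installed")
    return object()


name, instance = create_with_fallback(fake_factory, ["sql", "http"], ValueError)
print(name)
```

With Cinemagoer installed, the call would presumably take `IMDb` and `IMDbError` in place of the stubs, with any system-specific arguments bound beforehand (for example via `functools.partial`).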

## Performance Best Practices

Guidelines for tuning performance across different use cases and access patterns.

### Access System Selection

**HTTP Access (Default):**
- Best for: Small to medium applications, one-off scripts, development
- Performance: Moderate, dependent on network latency
- Rate limiting: Subject to IMDb's rate limits
- Best practices: Cache results, use batch operations when possible

```python
from imdb import IMDb

# HTTP access - good for most use cases
ia = IMDb()  # Default HTTP access
```

**SQL Access:**
- Best for: Large-scale applications, high-volume queries, analytics
- Performance: Excellent for complex queries and bulk operations
- Setup required: Local IMDb database installation
- Best practices: Use for production applications with heavy usage

```python
# SQL access - optimal for large-scale applications (placeholder connection URI)
ia = IMDb('sql', uri='postgresql://imdb:password@localhost/imdb')
```

**S3 Access:**
- Best for: Offline bulk processing and analytics over IMDb's dataset files
- Performance: Good for bulk data processing
- Requirements: The IMDb dataset files imported into a local SQL database
- Best practices: Use for batch processing and analytics

```python
# S3 access - good for bulk processing (placeholder URI of the imported database)
ia = IMDb('s3', 'postgresql://imdb:password@localhost/imdb')
```

### Information Set Optimization

**Selective Information Loading:**
```python
# Efficient - only load needed information
movie = ia.get_movie('0133093', info=['main', 'plot'])

# Inefficient - loads all available information
movie = ia.get_movie('0133093', info='all')
```

**Batch Updates:**
```python
# Efficient - batch processing
movies = ia.search_movie('Matrix')
for movie in movies[:5]:  # Limit results
    ia.update(movie, info=['main'])  # Minimal info for listings

# Inefficient - individual detailed updates
for movie in movies:
    ia.update(movie, info='all')  # Excessive information
```

### Memory Management

**Large Dataset Handling:**
```python
# Process results in batches to manage memory
def process_large_chart():
    top_movies = ia.get_top250_movies()

    # Process in smaller chunks
    chunk_size = 50
    for i in range(0, len(top_movies), chunk_size):
        chunk = top_movies[i:i + chunk_size]
        # Process chunk
        for movie in chunk:
            # Minimal processing to conserve memory
            print(f"{movie['title']} ({movie['year']})")
```
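
The slicing loop above needs a list with a known length. As a hedged alternative, the sketch below chunks any iterable with a generator; `iter_chunks` is a hypothetical helper, and the data is a stand-in for the list that `get_top250_movies()` would return.

```python
from itertools import islice


def iter_chunks(items, chunk_size):
    """Yield successive lists of at most `chunk_size` items from any iterable."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Stand-in data; with Cinemagoer this could be the result of get_top250_movies().
titles = [f"movie-{n}" for n in range(7)]
for chunk in iter_chunks(titles, 3):
    print(len(chunk), chunk[0])
```

Chunking only reduces peak memory when the source itself is lazy or when each chunk's derived data is released before the next chunk is processed; a list that is already fully loaded stays in memory either way.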

### Caching Strategies

**Results Caching:**
```python
from functools import lru_cache

# Cache expensive operations
@lru_cache(maxsize=100)
def cached_movie_search(title):
    return ia.search_movie(title, results=5)

# Reuse cached results
movies1 = cached_movie_search('Matrix')  # Network call
movies2 = cached_movie_search('Matrix')  # Cached result
```
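
`lru_cache` never expires entries on its own, so long-running processes may keep serving stale results. One possible variation is a time-bounded cache. The sketch below is an assumption-laden illustration: `ttl_cache` is a hypothetical decorator, and `fake_search` stands in for a call such as `ia.search_movie(title)` so the example runs without network access.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds, clock=time.monotonic):
    """Cache results per positional-argument tuple for `ttl_seconds`."""
    def decorator(func):
        store = {}  # args -> (expiry_time, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh
            value = func(*args)
            store[args] = (now + ttl_seconds, value)
            return value

        return wrapper
    return decorator


calls = []


@ttl_cache(ttl_seconds=60)
def fake_search(title):
    # Stand-in for ia.search_movie(title); records each real invocation.
    calls.append(title)
    return [f"{title} result"]


fake_search("Matrix")
fake_search("Matrix")
print(len(calls))
```

The `clock` parameter exists so the expiry logic can be exercised with a fake clock; arguments must be hashable, as with `lru_cache`.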
