Python package for retrieving and managing data from the Internet Movie Database (IMDb) about movies, people, characters and companies
npx @tessl/cli install tessl/pypi-cinemagoer@2022.12.00
# Cinemagoer

A Python library for retrieving and managing data from the Internet Movie Database (IMDb). Cinemagoer exposes one API for movies, people, characters, and companies, backed by several access methods: web scraping over HTTP, direct SQL database access, and S3 dataset integration.

## Package Information

- **Package Name**: cinemagoer
- **Language**: Python
- **Installation**: `pip install cinemagoer`
- **Dependencies**: SQLAlchemy, lxml

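## Core Imports

```python
from imdb import IMDb, Cinemagoer
```

Import the container modules and the base exception. In the `imdb` package, `Movie`, `Person`, `Character`, and `Company` are modules, and each defines a class of the same name (for example, `Movie.Movie`):

```python
from imdb import Movie, Person, Character, Company, IMDbError
```
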
## Basic Usage

```python
from imdb import IMDb

# Create IMDb instance (default HTTP access)
ia = IMDb()

# Search for movies
movies = ia.search_movie('The Matrix')
movie = movies[0]
print(f"Title: {movie['title']}")
print(f"Year: {movie['year']}")

# Get detailed movie information
ia.update(movie)
print(f"Director: {movie['director'][0]['name']}")
print(f"Cast: {[actor['name'] for actor in movie['cast'][:5]]}")

# Search for people
people = ia.search_person('Keanu Reeves')
person = people[0]
ia.update(person)
print(f"Name: {person['name']}")
print(f"Birth date: {person.get('birth date')}")
```

## Architecture

Cinemagoer is built around a modular access system:

- **Factory Function**: `IMDb()` creates the appropriate access system instance
- **Access Systems**: Multiple parsers (HTTP, SQL, S3) for different data sources
- **Container Classes**: Movie, Person, Character, and Company objects with dictionary-like access
- **Data Retrieval**: Lazy loading of detailed information through the `update()` method
- **Configuration**: A configuration system that supports multiple file formats and locations

This design lets you switch between data sources while keeping a consistent API, in use cases ranging from simple web scraping to database-backed applications.

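The lazy-loading flow in the list above can be illustrated with a toy model. This is a minimal sketch, not Cinemagoer's implementation: `ToyMovie`, `ToyAccessSystem`, and the hard-coded records are hypothetical stand-ins that only mimic the shape of the real flow, in which a search returns a lightly populated container and `update()` fills in the rest.

```python
class ToyMovie(dict):
    """Hypothetical stand-in for a container with dictionary-like access."""

    def __init__(self, movie_id, **data):
        super().__init__(data)
        self.movie_id = movie_id


class ToyAccessSystem:
    """Hypothetical stand-in for an access system (HTTP, SQL, or S3)."""

    # Hard-coded records keyed by ID; a real access system would fetch these.
    _DETAILS = {
        "0133093": {"director": ["Lana Wachowski", "Lilly Wachowski"]},
    }

    def search_movie(self, title):
        # A search returns containers that hold only basic fields.
        return [ToyMovie("0133093", title="The Matrix", year=1999)]

    def update(self, movie):
        # Detailed fields are merged in only when explicitly requested.
        movie.update(self._DETAILS.get(movie.movie_id, {}))


ia = ToyAccessSystem()
movie = ia.search_movie("The Matrix")[0]
has_director_before = "director" in movie  # not loaded yet
ia.update(movie)
has_director_after = "director" in movie   # loaded by update()
```

The point of the pattern is that a search stays cheap, and the cost of fetching details is paid only for the objects you actually inspect.
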
## Capabilities

### Core Data Access

Primary functions for creating IMDb access instances and retrieving basic system information. These form the foundation for all IMDb data operations.

```python { .api }
def IMDb(accessSystem=None, *arguments, **keywords):
    """Create IMDb access system instance."""

def available_access_systems():
    """Return list of available access systems."""
```

[Core Data Access](./core-access.md)

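One way to combine the two functions is to check which access systems are available before creating an instance. The sketch below is an illustration under assumptions: `pick_access_system` and `create_client` are hypothetical helper names, and the list returned by `available_access_systems()` is assumed to contain names such as `'http'` that `IMDb()` accepts. The package is imported lazily, so the selection logic can be exercised without it installed.

```python
def pick_access_system(preferred, available):
    """Return the first name in `preferred` that also appears in `available`."""
    for name in preferred:
        if name in available:
            return name
    raise ValueError(f"none of {list(preferred)} is available: {list(available)}")


def create_client(preferred=("http",)):
    """Create an access system instance for the first usable preference."""
    # Imported here so pick_access_system() works without cinemagoer installed.
    from imdb import IMDb, available_access_systems

    name = pick_access_system(preferred, available_access_systems())
    return IMDb(name)
```

For example, `create_client(("sql", "http"))` would prefer a SQL backend and fall back to HTTP when the SQL access system cannot be imported.
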
### Movie Operations

Movie search, detailed information retrieval, and specialized movie lists and charts.

```python { .api }
def search_movie(title, results=None):
    """Search for movies by title."""

def get_movie(movieID, info=('main', 'plot'), modFunct=None):
    """Get movie by ID with specified information sets."""

def get_top250_movies():
    """Get top 250 movies list."""
```

[Movie Operations](./movie-operations.md)

### Person Operations

Person search, retrieval, and biographical information, including filmography and career details.

```python { .api }
def search_person(name, results=None):
    """Search for people by name."""

def get_person(personID, info=('main', 'filmography', 'biography'), modFunct=None):
    """Get person by ID with specified information sets."""
```

[Person Operations](./person-operations.md)

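A name search can return several people, so taking the first result is not always what you want. As a sketch of one possible selection rule, the hypothetical helper `best_name_match` prefers a case-insensitive exact match on `'name'` and otherwise falls back to the first result. It assumes only that each result supports `.get('name')`, so plain dictionaries work as stand-ins.

```python
def best_name_match(results, query):
    """Return the exact (case-insensitive) name match, else the first result."""
    wanted = query.strip().casefold()
    for item in results:
        if str(item.get("name", "")).strip().casefold() == wanted:
            return item
    # No exact match: fall back to the top result, or None for an empty list.
    return results[0] if results else None


# Stand-in search results.
candidates = [{"name": "Keanu Reeves Jr."}, {"name": "Keanu Reeves"}]
chosen = best_name_match(candidates, "keanu reeves")
```

With a real client, `best_name_match(ia.search_person('Keanu Reeves'), 'Keanu Reeves')` would apply the same rule to live search results.
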
### Character and Company Operations

Search and retrieval of fictional characters and production companies.

```python { .api }
def search_character(name, results=None):
    """Search for characters by name."""

def search_company(name, results=None):
    """Search for companies by name."""

def get_character(characterID, info=('main', 'filmography', 'biography'), modFunct=None):
    """Get character by ID."""

def get_company(companyID, info=('main',), modFunct=None):
    """Get company by ID."""
```

[Character and Company Operations](./character-company-operations.md)

### Data Container Classes

Core classes for representing and manipulating IMDb data objects with dictionary-like access and specialized methods.

```python { .api }
class Movie:
    """Movie data container with dictionary-like access."""

class Person:
    """Person data container with dictionary-like access."""

class Character:
    """Character data container with dictionary-like access."""

class Company:
    """Company data container with dictionary-like access."""
```

[Data Container Classes](./data-containers.md)

### Advanced Features

Advanced functionality including URL/ID conversion, specialized charts, keyword operations, and data updates.

```python { .api }
def get_imdbID(mop):
    """Get IMDb ID for Movie/Person/Character/Company object."""

def search_keyword(keyword, results=None):
    """Search for existing keywords."""

def update(mop, info=None, override=0):
    """Update object with additional information."""
```

[Advanced Features](./advanced-features.md)

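To make the URL/ID conversion idea concrete, here is a standalone sketch that does not call the library at all. `parse_imdb_id` and `build_imdb_url` are hypothetical helpers. They rely on the public IMDb identifier format (a two-letter prefix such as `tt` for titles or `nm` for names, followed by digits) and on the fact that Cinemagoer IDs are the numeric part without the prefix.

```python
import re

# Prefix (title, name, company, character) followed by at least seven digits.
_IMDB_ID = re.compile(r"\b(tt|nm|co|ch)(\d{7,})\b")

# URL path segment for the two most common prefixes.
_PATHS = {"tt": "title", "nm": "name"}


def parse_imdb_id(text):
    """Return (prefix, numeric_id) found in a URL or ID string, or None."""
    match = _IMDB_ID.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def build_imdb_url(prefix, numeric_id):
    """Build a title or name page URL from a prefix and a numeric ID."""
    return f"https://www.imdb.com/{_PATHS[prefix]}/{prefix}{numeric_id}/"


parsed = parse_imdb_id("https://www.imdb.com/title/tt0133093/")
rebuilt = build_imdb_url(*parsed)
```

The numeric part, `'0133093'` here, is the form that `get_movie()` expects as `movieID`.
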
### Configuration and Utilities

Configuration management, utility functions, and helper methods for customizing behavior and processing IMDb data.

```python { .api }
class ConfigParserWithCase:
    """Case-sensitive configuration parser."""

def canonicalName(name):
    """Convert name to canonical format."""

def canonicalTitle(title, lang=None, imdbIndex=None):
    """Convert title to canonical format."""
```

[Configuration and Utilities](./config-utilities.md)

### Command-Line Interface

Command-line tools and console scripts for interactive IMDb data access and batch operations.

```python { .api }
def main():
    """Main entry point for imdbpy command-line interface."""
```

**Console Script**: `imdbpy` (installed with package)

**CLI Tools** (in bin/ directory):

- `search_movie.py`, `search_person.py`, `search_company.py` - Search operations
- `get_movie.py`, `get_person.py`, `get_company.py` - Data retrieval
- `get_keyword.py`, `get_movie_list.py` - Specialized retrieval
- `get_top_bottom_movies.py` - Chart access
- `imdbpy2sql.py` - Database conversion utility
- `s32cinemagoer.py` - S3 dataset converter

## Exception Handling

```python { .api }
class IMDbError(Exception):
    """Base exception for IMDb operations."""

class IMDbDataAccessError(IMDbError):
    """Exception for data access problems."""

class IMDbParserError(IMDbError):
    """Exception for parsing errors."""
```

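Since both specific exceptions derive from `IMDbError`, catching the base class is enough to guard a call. The sketch below shows one way to do that. `search_titles` is a hypothetical helper, and the fallback `IMDbError` class is a stand-in defined only so the example still runs where the package is not installed.

```python
import logging

try:
    from imdb import IMDbError
except ImportError:
    # Stand-in so this sketch still runs where cinemagoer is not installed.
    class IMDbError(Exception):
        pass

logger = logging.getLogger(__name__)


def search_titles(ia, query):
    """Return matching titles, or an empty list if the lookup fails."""
    try:
        results = ia.search_movie(query)
    except IMDbError as exc:
        # Covers subclasses such as data access and parser errors.
        logger.warning("IMDb search for %r failed: %s", query, exc)
        return []
    return [movie["title"] for movie in results]
```

Returning an empty list keeps callers simple. Code that needs to tell "no results" apart from "lookup failed" would re-raise or return a distinct value instead.
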
## Constants

```python { .api }
VERSION: str       # Package version
encoding: str      # Default character encoding
imdbURL_base: str  # Base IMDb URL
```