# Utilities

Additional utility functions for date manipulation, timezone handling, date range generation, and text processing that support complex date processing workflows and integration scenarios.

## Capabilities

### Date Range Generation

Generate sequences of dates between specified start and end points with flexible step configurations.

```python { .api }
def date_range(begin, end, **kwargs):
    """
    Generate sequence of dates between begin and end.

    Parameters:
    - begin (datetime): Start date
    - end (datetime): End date (exclusive)
    - **kwargs: Step parameters (days, weeks, months, years, hours, minutes, seconds).
      Defaults to a one-day step when omitted.
      Note: The singular names (year, month, week, day, hour, minute, second)
      are reserved and will raise ValueError

    Returns:
    generator: Generator yielding datetime objects

    Raises:
    ValueError: If invalid step arguments are provided
    """
```

**Usage Examples:**

```python
from dateparser.date import date_range
from datetime import datetime

# Daily range
start = datetime(2023, 1, 1)
end = datetime(2023, 1, 10)
for date in date_range(start, end):
    print(date)  # 2023-01-01, 2023-01-02, ..., 2023-01-09

# Weekly intervals
for date in date_range(start, end, weeks=1):
    print(date)  # 2023-01-01, 2023-01-08

# Monthly intervals
start = datetime(2023, 1, 1)
end = datetime(2023, 6, 1)
for date in date_range(start, end, months=1):
    print(date)  # First of each month

# Custom step sizes
for date in date_range(start, end, days=3):
    print(date)  # Every 3 days

# Hourly intervals
start = datetime(2023, 1, 1, 0, 0)
end = datetime(2023, 1, 1, 12, 0)
for date in date_range(start, end, hours=2):
    print(date)  # Every 2 hours: 00:00, 02:00, ..., 10:00
```
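
The behavior described above (a half-open range, a default one-day step, and rejection of the singular keyword names) can be sketched with the standard library alone. This is a minimal illustration, not dateparser's implementation: `simple_date_range` is a hypothetical name, and it accepts only the fixed-length units that `timedelta` supports, so `months` and `years` are out of scope.

```python
from datetime import datetime, timedelta

# Singular names are easy to confuse with the plural step arguments,
# so this sketch rejects them up front (hypothetical helper, not dateparser API).
RESERVED = {"year", "month", "week", "day", "hour", "minute", "second"}


def simple_date_range(begin, end, **step):
    """Yield datetimes from begin (inclusive) to end (exclusive)."""
    bad = RESERVED.intersection(step)
    if bad:
        raise ValueError(f"Invalid argument(s): {sorted(bad)}")
    # Default to a one-day step when no step arguments are given.
    delta = timedelta(**step) if step else timedelta(days=1)
    if delta <= timedelta(0):
        raise ValueError("Step must be positive")
    current = begin
    while current < end:
        yield current
        current += delta


dates = list(simple_date_range(datetime(2023, 1, 1), datetime(2023, 1, 4)))
print(dates[-1])  # 2023-01-03 00:00:00
```

Because the loop condition is `current < end`, the end date itself is never yielded, which matches the "exclusive" wording in the parameter list.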

### Time Period Analysis

Find the calendar periods (days, months, hours, and so on) that overlap a given time range, for scheduling and temporal data analysis.

```python { .api }
def get_intersecting_periods(low, high, period="day"):
    """
    Get periods that intersect with given range.

    Parameters:
    - low (datetime): Start of time range
    - high (datetime): End of time range
    - period (str): Period type ('year', 'month', 'week', 'day', 'hour', 'minute', 'second', 'microsecond')

    Returns:
    generator: Generator yielding the start of each period that intersects the range

    Raises:
    ValueError: If invalid period type is provided
    """
```

**Usage Examples:**

```python
from dateparser.date import get_intersecting_periods
from datetime import datetime

# Find intersecting days
start = datetime(2023, 1, 15, 14, 30)
end = datetime(2023, 1, 18, 10, 15)
days = list(get_intersecting_periods(start, end, "day"))
# Start of each day the range touches (Jan 15 through Jan 18, at midnight)

# Find intersecting months
start = datetime(2023, 1, 15)
end = datetime(2023, 3, 20)
months = list(get_intersecting_periods(start, end, "month"))
# Start of each month the range touches (Jan 1, Feb 1, Mar 1)

# Hourly intersections for scheduling
start = datetime(2023, 1, 1, 9, 30)
end = datetime(2023, 1, 1, 14, 45)
hours = list(get_intersecting_periods(start, end, "hour"))
# Start of each hour the range touches (09:00, 10:00, 11:00, 12:00, 13:00, 14:00)
```
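
One way to picture what the generator yields is to truncate the lower bound to the start of its period and then step forward until the upper bound is reached. The sketch below does this for two fixed-length periods using only the standard library. `period_starts` is a hypothetical helper, and whether dateparser computes its results in exactly this way is an assumption.

```python
from datetime import datetime, timedelta

# Fixed-length periods only; months and years need calendar-aware stepping.
STEPS = {"day": timedelta(days=1), "hour": timedelta(hours=1)}


def period_starts(low, high, period="day"):
    """Yield the start of every `period` that overlaps [low, high)."""
    if period not in STEPS:
        raise ValueError(f"Invalid period: {period}")
    if high <= low:
        return  # empty or inverted range: nothing intersects
    # Truncate the lower bound to the start of the period that contains it.
    if period == "day":
        current = low.replace(hour=0, minute=0, second=0, microsecond=0)
    else:
        current = low.replace(minute=0, second=0, microsecond=0)
    while current < high:
        yield current
        current += STEPS[period]


hours = list(period_starts(datetime(2023, 1, 1, 9, 30),
                           datetime(2023, 1, 1, 14, 45), "hour"))
print([h.hour for h in hours])  # [9, 10, 11, 12, 13, 14]
```

Truncating first is what makes the 09:00 hour appear in the result even though the range only begins at 09:30.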

### Date String Processing

Utilities for cleaning, normalizing, and preprocessing date strings before parsing.

```python { .api }
def sanitize_date(date_string):
    """
    Sanitize and normalize date strings for better parsing.

    Removes unwanted characters, normalizes whitespace, handles
    special Unicode characters, and prepares strings for parsing.

    Parameters:
    - date_string (str): Raw date string to clean

    Returns:
    str: Cleaned and normalized date string
    """

def sanitize_spaces(date_string):
    """
    Normalize whitespace in date strings.

    Parameters:
    - date_string (str): Date string with irregular spacing

    Returns:
    str: Date string with normalized spaces
    """
```

**Usage Examples:**

```python
import dateparser
from dateparser.date import sanitize_date, sanitize_spaces

# Clean messy date strings
messy_date = " Jan\t15,\n\n2023 \xa0 "
clean_date = sanitize_date(messy_date)
# Returns: "Jan 15, 2023"

# Collapse runs of spaces into single spaces
normalized = sanitize_spaces("Jan   15,    2023")
# Returns: "Jan 15, 2023"

# Use in preprocessing pipeline
def robust_parse(date_string):
    cleaned = sanitize_date(date_string)
    return dateparser.parse(cleaned)

date = robust_parse(" \tJanuary\n15\xa0, 2023 ")
```
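
The whitespace handling shown above can be approximated with a few regular expressions. This is a rough standard-library sketch of the idea, not the library's own rules: `collapse_whitespace` is a hypothetical name, and the real `sanitize_date` may handle more cases than this.

```python
import re

NBSP = re.compile("\xa0")                  # non-breaking space
SPACES = re.compile(r"\s+")                # any run of whitespace
SPACE_BEFORE_COMMA = re.compile(r"\s+,")   # stray space left before a comma


def collapse_whitespace(text):
    """Replace non-breaking spaces, collapse whitespace runs, and trim."""
    text = NBSP.sub(" ", text)
    text = SPACES.sub(" ", text)
    text = SPACE_BEFORE_COMMA.sub(",", text)
    return text.strip()


print(collapse_whitespace(" \tJanuary\n15\xa0, 2023 "))  # January 15, 2023
```

Running the cleanup before parsing keeps tabs, newlines, and non-breaking spaces copied from web pages from splitting a date into unexpected tokens.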

### Timezone Utilities

Classes and functions for parsing timezone strings and for localizing and converting datetime objects.

```python { .api }
class StaticTzInfo(tzinfo):
    """
    Static timezone information class for representing fixed timezone offsets.

    Used internally by dateparser for timezone-aware datetime objects when
    parsing dates with timezone information.
    """

    def __init__(self, name, offset):
        """
        Initialize static timezone.

        Parameters:
        - name (str): Timezone name or abbreviation
        - offset (timedelta): UTC offset for this timezone
        """

    def tzname(self, dt):
        """Return timezone name."""

    def utcoffset(self, dt):
        """Return UTC offset."""

    def dst(self, dt):
        """Return DST offset (always zero for static timezones)."""

    def localize(self, dt, is_dst=False):
        """
        Localize naive datetime to this timezone.

        Parameters:
        - dt (datetime): Naive datetime to localize
        - is_dst (bool): DST flag (ignored for static timezones)

        Returns:
        datetime: Timezone-aware datetime
        """

def get_timezone_from_tz_string(tz_string):
    """
    Parse timezone string and return timezone object.

    Parameters:
    - tz_string (str): Timezone identifier or abbreviation

    Returns:
    tzinfo: Timezone object for the given string
    """

def apply_timezone(date_time, tz_string):
    """
    Apply timezone to datetime object.

    Parameters:
    - date_time (datetime): Datetime to apply timezone to
    - tz_string (str): Timezone identifier

    Returns:
    datetime: Timezone-aware datetime object
    """

def apply_timezone_from_settings(date_obj, settings):
    """
    Apply timezone based on settings configuration.

    Parameters:
    - date_obj (datetime): Datetime object
    - settings (Settings): Settings containing timezone preferences

    Returns:
    datetime: Datetime with applied timezone settings
    """

def localize_timezone(date_time, tz_string):
    """
    Localize naive datetime to specific timezone.

    Parameters:
    - date_time (datetime): Naive datetime object
    - tz_string (str): Target timezone

    Returns:
    datetime: Localized datetime object
    """

def pop_tz_offset_from_string(date_string, as_offset=True):
    """
    Extract timezone offset from date string.

    Parameters:
    - date_string (str): Date string potentially containing timezone info
    - as_offset (bool): Return as offset object rather than string

    Returns:
    tuple: (cleaned_date_string, timezone_offset_or_name)
    """

def convert_to_local_tz(datetime_obj, datetime_tz_offset):
    """
    Convert datetime with timezone offset to local timezone.

    Parameters:
    - datetime_obj (datetime): Datetime object to convert
    - datetime_tz_offset (timedelta): Timezone offset of the datetime

    Returns:
    datetime: Datetime converted to local timezone
    """
```

**Usage Examples:**

```python
import dateparser
from dateparser.utils import (
    get_timezone_from_tz_string,
    apply_timezone,
    apply_timezone_from_settings,
    localize_timezone
)
from dateparser.conf import Settings
from datetime import datetime

# Parse timezone strings
tz = get_timezone_from_tz_string("America/New_York")
utc_tz = get_timezone_from_tz_string("UTC")

# Apply timezone to datetime
naive_dt = datetime(2023, 1, 15, 14, 30)
aware_dt = apply_timezone(naive_dt, "Europe/London")

# Use settings for timezone application
settings = Settings({
    'TIMEZONE': 'America/Los_Angeles',
    'TO_TIMEZONE': 'UTC'
})
converted_dt = apply_timezone_from_settings(naive_dt, settings)

# Localize naive datetime
localized = localize_timezone(naive_dt, "Asia/Tokyo")

# Timezone conversion pipeline
def parse_with_timezone(date_string, target_tz="UTC"):
    # Parse with automatic timezone detection
    date = dateparser.parse(date_string)
    if date:
        # Apply target timezone
        return apply_timezone(date, target_tz)
    return None

# Usage
date = parse_with_timezone("2023-01-15 2:30 PM EST", "Europe/Paris")
```
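
For readers unfamiliar with fixed-offset timezones, the sketch below shows what a class with the interface listed for `StaticTzInfo` might look like when built on the standard `tzinfo` base class. `FixedOffset` is a hypothetical stand-in written for illustration; it is not dateparser's implementation.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """A timezone with a constant UTC offset and no DST rules."""

    def __init__(self, name, offset):
        self._name = name
        self._offset = offset

    def tzname(self, dt):
        return self._name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # A static offset never observes daylight saving time.
        return timedelta(0)

    def localize(self, dt, is_dst=False):
        # Attach this timezone to a naive datetime; is_dst is ignored.
        if dt.tzinfo is not None:
            raise ValueError("Not naive datetime (tzinfo is already set)")
        return dt.replace(tzinfo=self)


est = FixedOffset("EST", timedelta(hours=-5))
aware = est.localize(datetime(2023, 1, 15, 14, 30))
print(aware.isoformat())  # 2023-01-15T14:30:00-05:00
```

Localizing only attaches the offset and leaves the wall-clock time unchanged, which is the difference between localizing a naive datetime and converting an aware one.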

### Text Processing Utilities

Helper functions for text processing and Unicode handling in date parsing contexts.

```python { .api }
def strip_braces(date_string):
    """
    Remove braces from date string.

    Parameters:
    - date_string (str): String potentially containing braces

    Returns:
    str: String with braces removed
    """

def normalize_unicode(string, form="NFKD"):
    """
    Normalize Unicode string for consistent processing.

    Parameters:
    - string (str): Unicode string to normalize
    - form (str): Normalization form ('NFC', 'NFKC', 'NFD', 'NFKD')

    Returns:
    str: Normalized Unicode string
    """

def combine_dicts(primary_dict, supplementary_dict):
    """
    Combine dictionaries with primary taking precedence.

    Parameters:
    - primary_dict (dict): Primary dictionary
    - supplementary_dict (dict): Supplementary values

    Returns:
    dict: Combined dictionary
    """
```

**Usage Examples:**

```python
from dateparser.date import sanitize_spaces
from dateparser.utils import strip_braces, normalize_unicode, combine_dicts

# Clean bracketed dates
date_with_braces = "[January 15, 2023]"
clean_date = strip_braces(date_with_braces)
# Returns: "January 15, 2023"

# Unicode normalization
unicode_date = "Jänüary 15, 2023"  # Contains non-ASCII characters
normalized = normalize_unicode(unicode_date)
# Returns the string with accents removed: "January 15, 2023"

# Configuration merging
default_config = {'TIMEZONE': 'UTC', 'STRICT_PARSING': False}
user_config = {'TIMEZONE': 'America/New_York'}
final_config = combine_dicts(user_config, default_config)
# Returns: {'TIMEZONE': 'America/New_York', 'STRICT_PARSING': False}

# Preprocessing pipeline
def preprocess_date_string(raw_string):
    # Remove braces
    cleaned = strip_braces(raw_string)
    # Normalize Unicode
    normalized = normalize_unicode(cleaned)
    # Sanitize spacing
    return sanitize_spaces(normalized)

processed = preprocess_date_string("[Jänüary 15,\t2023]")
```
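
Unicode normalization for date strings usually means decomposing accented characters and discarding the combining marks. The following standard-library sketch illustrates that idea together with brace removal. `strip_accents` and `drop_braces` are hypothetical names, and the exact character sets the library handles are not confirmed here.

```python
import re
import unicodedata


def strip_accents(text, form="NFKD"):
    """Decompose characters, then drop combining marks (category Mn)."""
    decomposed = unicodedata.normalize(form, text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def drop_braces(text):
    """Remove square, curly, round, and angle brackets."""
    return re.sub(r"[\[\]{}()<>]+", "", text)


print(strip_accents(drop_braces("[Jänüary 15, 2023]")))  # January 15, 2023
```

The decomposed forms (`NFD`, `NFKD`) are the ones that split an accented letter into a base letter plus a combining mark, so they are the forms for which the filtering step has something to remove.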

### Calendar Utilities

Utilities for working with calendar-specific operations and date calculations.

```python { .api }
def get_last_day_of_month(year, month):
    """
    Get the last day of a specific month and year.

    Parameters:
    - year (int): Year
    - month (int): Month (1-12)

    Returns:
    int: Last day of the month
    """

def get_previous_leap_year(year):
    """
    Find the previous leap year before given year.

    Parameters:
    - year (int): Reference year

    Returns:
    int: Previous leap year
    """

def get_next_leap_year(year):
    """
    Find the next leap year after given year.

    Parameters:
    - year (int): Reference year

    Returns:
    int: Next leap year
    """

def set_correct_day_from_settings(date_obj, settings, current_day=None):
    """
    Adjust day based on settings preferences.

    Parameters:
    - date_obj (datetime): Date to adjust
    - settings (Settings): Settings with day preferences
    - current_day (int, optional): Current day reference

    Returns:
    datetime: Date with adjusted day
    """

def set_correct_month_from_settings(date_obj, settings, current_month=None):
    """
    Adjust month based on settings preferences.

    Parameters:
    - date_obj (datetime): Date to adjust
    - settings (Settings): Settings with month preferences
    - current_month (int, optional): Current month reference

    Returns:
    datetime: Date with adjusted month
    """
```

**Usage Examples:**

```python
from dateparser.utils import (
    get_last_day_of_month,
    get_previous_leap_year, get_next_leap_year,
    set_correct_day_from_settings, set_correct_month_from_settings
)
from dateparser.conf import Settings
from datetime import datetime

# Calendar calculations
last_day = get_last_day_of_month(2023, 2)  # 28 (not a leap year)
last_day_leap = get_last_day_of_month(2024, 2)  # 29 (leap year)

prev_leap = get_previous_leap_year(2023)  # 2020
next_leap = get_next_leap_year(2023)  # 2024

# Settings-based date adjustment
date = datetime(2023, 1, 15)
settings = Settings({'PREFER_DAY_OF_MONTH': 'first'})
adjusted = set_correct_day_from_settings(date, settings)
# Adjusts to first day of month based on settings

settings = Settings({'PREFER_MONTH_OF_YEAR': 'current'})
adjusted = set_correct_month_from_settings(date, settings, current_month=3)
# Adjusts month based on preference and current context
```
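
The calendar helpers map closely onto Python's `calendar` module. The sketch below shows equivalent standard-library calculations. The function names are hypothetical, chosen to mirror the API above, and the code is an illustration of the arithmetic, not a copy of the library's source.

```python
import calendar


def last_day_of_month(year, month):
    """Return the number of the last day (28-31) of the given month."""
    # monthrange returns (weekday of first day, number of days in month).
    return calendar.monthrange(year, month)[1]


def previous_leap_year(year):
    """Return the closest leap year strictly before `year`."""
    year -= 1
    while not calendar.isleap(year):
        year -= 1
    return year


def next_leap_year(year):
    """Return the closest leap year strictly after `year`."""
    year += 1
    while not calendar.isleap(year):
        year += 1
    return year


print(last_day_of_month(2024, 2))  # 29
print(previous_leap_year(2023), next_leap_year(2023))  # 2020 2024
```

Using `calendar.isleap` handles the century rule as well, so 1900 is skipped while 2000 counts as a leap year.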

## Additional Timezone Parsing Functions

Lower-level helpers for extracting timezone information from date strings and converting offset-based times to the local timezone.

```python { .api }
def pop_tz_offset_from_string(date_string, as_offset=True):
    """
    Extract timezone offset from date string and return cleaned string.

    Args:
        date_string (str): Date string potentially containing timezone info
        as_offset (bool): If True, return StaticTzInfo object; if False, return timezone name

    Returns:
        tuple: (cleaned_date_string, timezone_info_or_name)

    Examples:
        >>> pop_tz_offset_from_string("2023-01-15 14:30 UTC")
        ("2023-01-15 14:30 ", StaticTzInfo('UTC', timedelta(0)))

        >>> pop_tz_offset_from_string("2023-01-15 14:30 EST", as_offset=False)
        ("2023-01-15 14:30 ", "EST")
    """

def word_is_tz(word):
    """
    Check if a word represents a timezone abbreviation.

    Args:
        word (str): Word to check for timezone abbreviation

    Returns:
        bool: True if word is a recognized timezone abbreviation

    Examples:
        >>> word_is_tz("UTC")
        True
        >>> word_is_tz("EST")
        True
        >>> word_is_tz("hello")
        False
    """

def convert_to_local_tz(datetime_obj, datetime_tz_offset):
    """
    Convert datetime with timezone offset to local timezone.

    Args:
        datetime_obj (datetime): Datetime object to convert
        datetime_tz_offset (timedelta): Timezone offset of the datetime

    Returns:
        datetime: Datetime converted to local timezone

    Examples:
        >>> from datetime import datetime, timedelta
        >>> dt = datetime(2023, 1, 15, 14, 30)
        >>> offset = timedelta(hours=-5)  # EST offset
        >>> local_dt = convert_to_local_tz(dt, offset)
    """
```

### Advanced Timezone Integration

```python
from datetime import datetime, timedelta
from dateparser.timezone_parser import pop_tz_offset_from_string, word_is_tz, convert_to_local_tz

# Extract timezone from date string
date_string = "Meeting at 2:30 PM EST on January 15th"
cleaned_string, tz_info = pop_tz_offset_from_string(date_string)
print(f"Cleaned: {cleaned_string}")
print(f"Timezone: {tz_info}")

# Check if word is timezone
words = ["UTC", "EST", "hello", "PST", "world"]
timezones = [word for word in words if word_is_tz(word)]
print(f"Timezones found: {timezones}")  # ['UTC', 'EST', 'PST']

# Convert to local timezone
est_time = datetime(2023, 1, 15, 14, 30)  # 2:30 PM in EST (UTC-5)
est_offset = timedelta(hours=-5)  # Offset of the timezone est_time is expressed in
local_time = convert_to_local_tz(est_time, est_offset)
```
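
Converting a wall-clock time from one fixed offset to another is plain `timedelta` arithmetic: subtract the source offset to reach UTC, then add the target offset. The sketch below makes the target offset an explicit argument so that the result does not depend on the machine it runs on. `shift_offset` is a hypothetical helper, not part of dateparser.

```python
from datetime import datetime, timedelta


def shift_offset(naive_dt, source_offset, target_offset):
    """Re-express a naive wall-clock time under a different UTC offset."""
    as_utc = naive_dt - source_offset   # wall time -> UTC
    return as_utc + target_offset       # UTC -> target wall time


est_time = datetime(2023, 1, 15, 14, 30)  # 2:30 PM at UTC-5
paris_time = shift_offset(est_time, timedelta(hours=-5), timedelta(hours=1))
print(paris_time)  # 2023-01-15 20:30:00
```

A conversion to local time follows the same pattern, with the target offset taken from the system's local timezone instead of being passed in.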