# String Operations

Text processing and manipulation functions for string columns in datatable.

## Capabilities

### String Functions

```python { .api }
def str.len(x):
    """
    String length function.

    Parameters:
    - x: String column expression

    Returns:
    Integer column with string lengths
    """

def str.slice(x, start, stop=None):
    """
    String slicing function.

    Parameters:
    - x: String column expression
    - start: Starting index
    - stop: Ending index (optional)

    Returns:
    String column with sliced strings
    """

def str.split_into_nhot(x):
    """
    Split strings into n-hot encoding.

    Parameters:
    - x: String column expression

    Returns:
    Frame with n-hot encoded columns
    """
```
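To make "n-hot encoding" concrete, here is a minimal plain-Python sketch of the idea. It does not call datatable. The helper name `nhot_encode`, the comma separator, the whitespace trimming, and the sorted column order are all assumptions made for illustration, and datatable's handling of separators and missing values may differ.

```python
def nhot_encode(values, sep=","):
    """Plain-Python illustration of n-hot encoding (hypothetical helper, not datatable code)."""
    # Split each string into a set of trimmed, non-empty labels.
    label_sets = [
        {part.strip() for part in value.split(sep) if part.strip()}
        for value in values
    ]
    # One output column per distinct label, sorted so the order is deterministic.
    labels = sorted(set().union(*label_sets)) if label_sets else []
    # Each column holds 1 where the row contains the label and 0 otherwise.
    return {label: [int(label in row) for row in label_sets] for label in labels}


encoded = nhot_encode(["cat,dog", "dog", "bird,cat"])
print(encoded)
# {'bird': [0, 0, 1], 'cat': [1, 0, 1], 'dog': [1, 1, 0]}
```

Each distinct label becomes its own 0/1 column, so a row can be "hot" in several columns at once, unlike one-hot encoding.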

### Regular Expression Functions

```python { .api }
def re.match(x, pattern):
    """
    Regular expression matching.

    Parameters:
    - x: String column expression
    - pattern: Regular expression pattern

    Returns:
    Boolean column indicating matches
    """
```
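The shape of the result, one boolean per row, can be sketched with Python's standard `re` module. This is an illustration only: the helper `match_column` is hypothetical, and the sketch assumes the pattern must match the whole string. If datatable anchors the match differently, the third value below would change, so verify against the datatable documentation before relying on it.

```python
import re


def match_column(values, pattern):
    """Hypothetical helper: return one boolean per input string."""
    compiled = re.compile(pattern)
    # fullmatch() succeeds only when the entire string matches the pattern
    # (an assumption about the intended semantics, see the note above).
    return [compiled.fullmatch(value) is not None for value in values]


flags = match_column(["ABC-123", "abc-123", "ABC-1234"], r"[A-Z]{3}-\d{3}")
print(flags)
# [True, False, False]
```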

## Examples

```python
import datatable as dt
from datatable import f  # column-reference namespace used in the expressions below

DT = dt.Frame({
    'text': ['hello', 'world', 'datatable', 'python'],
    'codes': ['ABC-123', 'DEF-456', 'GHI-789', 'JKL-012']
})

# String operations: dt.update() adds the new columns to DT in place,
# so the result is read from DT itself rather than from a return value
DT[:, dt.update(
    text_length=dt.str.len(f.text),
    first_3_chars=dt.str.slice(f.text, 0, 3),
    last_2_chars=dt.str.slice(f.text, -2),
    matches_pattern=dt.re.match(f.codes, r'[A-Z]{3}-\d{3}')
)]
```
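As a cross-check on the length and slice columns in the example, the same values can be computed in plain Python. This sketch assumes that datatable's slicing follows Python's `[start:stop]` semantics, including negative indices. It runs without datatable installed.

```python
texts = ['hello', 'world', 'datatable', 'python']

# Plain-Python equivalents of the three string columns in the example.
text_length = [len(s) for s in texts]
first_3_chars = [s[0:3] for s in texts]
last_2_chars = [s[-2:] for s in texts]

print(text_length)    # [5, 5, 9, 6]
print(first_3_chars)  # ['hel', 'wor', 'dat', 'pyt']
print(last_2_chars)   # ['lo', 'ld', 'le', 'on']
```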