# Set Operations

Set operations for combining and comparing data frames: union, intersection, difference, and symmetric difference.

## Capabilities

### Set Operation Functions

```python { .api }
def union(*frames) -> Frame:
    """
    Union of data frames (all unique rows from all frames).

    Parameters:
    - *frames: Frame objects to combine

    Returns:
    Frame with all unique rows from input frames
    """

def intersect(*frames) -> Frame:
    """
    Intersection of data frames (rows common to all frames).

    Parameters:
    - *frames: Frame objects to intersect

    Returns:
    Frame with rows present in all input frames
    """

def setdiff(frame1, frame2) -> Frame:
    """
    Set difference (rows in frame1 but not in frame2).

    Parameters:
    - frame1: First Frame
    - frame2: Second Frame

    Returns:
    Frame with rows in frame1 that are not in frame2
    """

def symdiff(frame1, frame2) -> Frame:
    """
    Symmetric difference (rows in either frame but not both).

    Parameters:
    - frame1: First Frame
    - frame2: Second Frame

    Returns:
    Frame with rows in either frame but not in both
    """
```
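The semantics described in the docstrings above can be modelled in plain Python. The block below is only an illustrative sketch, not datatable's implementation: the helper names (`unique_rows`, `union_rows`, `intersect_rows`, `setdiff_rows`, `symdiff_rows`) are hypothetical, rows are represented as hashable Python values, and keeping first-seen order is an assumption made for readability rather than a statement about the order datatable returns.

```python
from typing import Hashable, Iterable, List


def unique_rows(rows: Iterable[Hashable]) -> List[Hashable]:
    """Drop duplicate rows, keeping the order of first appearance."""
    return list(dict.fromkeys(rows))


def union_rows(*frames: Iterable[Hashable]) -> List[Hashable]:
    """All unique rows from all inputs."""
    return unique_rows(row for frame in frames for row in frame)


def intersect_rows(*frames: Iterable[Hashable]) -> List[Hashable]:
    """Rows that appear in every input."""
    if not frames:
        return []
    first, *rest = frames
    rest_sets = [set(frame) for frame in rest]
    return [row for row in unique_rows(first)
            if all(row in s for s in rest_sets)]


def setdiff_rows(frame1: Iterable[Hashable],
                 frame2: Iterable[Hashable]) -> List[Hashable]:
    """Rows of frame1 that do not appear in frame2."""
    exclude = set(frame2)
    return [row for row in unique_rows(frame1) if row not in exclude]


def symdiff_rows(frame1: Iterable[Hashable],
                 frame2: Iterable[Hashable]) -> List[Hashable]:
    """Rows that appear in exactly one of the two inputs."""
    frame1, frame2 = list(frame1), list(frame2)
    return setdiff_rows(frame1, frame2) + setdiff_rows(frame2, frame1)


# Hypothetical inputs with duplicates, to show that results are deduplicated
left = [1, 1, 2, 3]
right = [3, 4, 4]

print(union_rows(left, right))      # [1, 2, 3, 4]
print(intersect_rows(left, right))  # [3]
print(setdiff_rows(left, right))    # [1, 2]
print(symdiff_rows(left, right))    # [1, 2, 4]
```

Duplicates inside a single input collapse in every result, which is what "unique rows" in the docstrings implies.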

## Examples

```python
import datatable as dt

# Create sample frames. datatable's set functions expect single-column
# frames, so each row here holds exactly one value.
A = dt.Frame({'x': [1, 2, 3]})
B = dt.Frame({'x': [2, 3, 4]})
C = dt.Frame({'x': [3, 4, 5]})

# Union - all unique rows
union_AB = dt.union(A, B)
union_ABC = dt.union(A, B, C)

# Intersection - common rows
intersect_AB = dt.intersect(A, B)
intersect_ABC = dt.intersect(A, B, C)

# Set difference
diff_AB = dt.setdiff(A, B)  # Rows in A but not in B
diff_BA = dt.setdiff(B, A)  # Rows in B but not in A

# Symmetric difference
symdiff_AB = dt.symdiff(A, B)  # Rows in A or B but not both
```
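As a cross-check on the example above, the `x` values of the sample frames can be run through Python's built-in `set` operators to see which values each result should contain. This is a sketch that says nothing about the row order datatable returns, and the `expected` dictionary is introduced here purely for illustration.

```python
# x values of the sample frames A, B and C, as plain Python sets
a = {1, 2, 3}
b = {2, 3, 4}
c = {3, 4, 5}

# Expected contents of each result in the example, ignoring order
expected = {
    "union_AB": a | b,
    "union_ABC": a | b | c,
    "intersect_AB": a & b,
    "intersect_ABC": a & b & c,
    "diff_AB": a - b,
    "diff_BA": b - a,
    "symdiff_AB": a ^ b,
}

# The symmetric difference is the union minus the intersection,
# which equals the two one-sided differences combined.
assert a ^ b == (a | b) - (a & b)
assert a ^ b == (a - b) | (b - a)

for name, values in expected.items():
    print(name, sorted(values))
```

With datatable installed, a result frame could be compared against these sets by converting its column to a Python list first, for example with `Frame.to_list()`.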