# Memory Management

Spark Unsafe provides memory management for both on-heap and off-heap allocation. The package consists of allocators (`MemoryAllocator`), memory blocks (`MemoryBlock`), and memory locations (`MemoryLocation`), all designed for high-performance data processing workloads.

## Core Imports

```java
import org.apache.spark.unsafe.Platform; // needed for Platform.BYTE_ARRAY_OFFSET in the examples
import org.apache.spark.unsafe.memory.MemoryAllocator;
import org.apache.spark.unsafe.memory.MemoryBlock;
import org.apache.spark.unsafe.memory.MemoryLocation;
import org.apache.spark.unsafe.memory.HeapMemoryAllocator;
import org.apache.spark.unsafe.memory.UnsafeMemoryAllocator;
```

## Usage Examples

### Basic Memory Allocation

```java
// Use heap memory allocator
MemoryAllocator heapAllocator = new HeapMemoryAllocator();
MemoryBlock block = heapAllocator.allocate(1024);

// Fill the block with data
block.fill((byte) 0xFF);

// Clean up
heapAllocator.free(block);
```

### Off-heap Memory Allocation

```java
// Use unsafe (off-heap) memory allocator
MemoryAllocator unsafeAllocator = new UnsafeMemoryAllocator();
MemoryBlock offHeapBlock = unsafeAllocator.allocate(2048);

// Use the memory block
System.out.println("Block size: " + offHeapBlock.size());

// Clean up
unsafeAllocator.free(offHeapBlock);
```

### Global Allocator Instances

```java
// Use global allocator instances
MemoryBlock heapBlock = MemoryAllocator.HEAP.allocate(512);
MemoryBlock unsafeBlock = MemoryAllocator.UNSAFE.allocate(512);

// Clean up
MemoryAllocator.HEAP.free(heapBlock);
MemoryAllocator.UNSAFE.free(unsafeBlock);
```

### Working with Memory Locations

```java
import org.apache.spark.unsafe.Platform;

// Create an empty memory location (null base object, zero offset)
MemoryLocation location = new MemoryLocation(null, 0);

// Point the location at the first element of a byte array
byte[] data = new byte[100];
location.setObjAndOffset(data, Platform.BYTE_ARRAY_OFFSET);

// Access location properties
Object baseObject = location.getBaseObject();
long baseOffset = location.getBaseOffset();
```

### Creating Memory Block from Arrays

```java
long[] array = {1L, 2L, 3L, 4L, 5L};
// Wraps the array without copying it
MemoryBlock arrayBlock = MemoryBlock.fromLongArray(array);

// size() is in bytes: 5 longs * 8 bytes = 40
System.out.println("Array block size: " + arrayBlock.size());
```

## API Reference

### MemoryAllocator Interface

```java { .api }
public interface MemoryAllocator {
    // Debug configuration constants
    public static final boolean MEMORY_DEBUG_FILL_ENABLED;
    public static final byte MEMORY_DEBUG_FILL_CLEAN_VALUE;
    public static final byte MEMORY_DEBUG_FILL_FREED_VALUE;

    // Global allocator instances
    public static final MemoryAllocator UNSAFE;
    public static final MemoryAllocator HEAP;

    /**
     * Allocates a contiguous memory block of the specified size in bytes.
     */
    MemoryBlock allocate(long size) throws OutOfMemoryError;

    /**
     * Frees a previously allocated memory block.
     */
    void free(MemoryBlock memory);
}
```

### MemoryBlock Class

```java { .api }
public class MemoryBlock extends MemoryLocation {
    // Page number constants for TaskMemoryManager integration
    public static final int NO_PAGE_NUMBER;                 // -1
    public static final int FREED_IN_TMM_PAGE_NUMBER;       // -2
    public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER; // -3

    // Optional page number for TaskMemoryManager allocated pages
    public int pageNumber;

    /**
     * Creates memory block with specified base object, offset, and length.
     */
    public MemoryBlock(Object obj, long offset, long length);

    /**
     * Returns the size of this memory block in bytes.
     */
    public long size();

    /**
     * Fills the entire memory block with the specified byte value.
     */
    public void fill(byte value);

    /**
     * Creates a memory block wrapping a long array.
     */
    public static MemoryBlock fromLongArray(long[] array);
}
```

### MemoryLocation Class

```java { .api }
public class MemoryLocation {
    /**
     * Creates memory location with specified base object and offset.
     */
    public MemoryLocation(Object obj, long offset);

    /**
     * Creates memory location with null base and zero offset.
     */
    public MemoryLocation();

    /**
     * Updates the base object and offset of this memory location.
     */
    public void setObjAndOffset(Object newObj, long newOffset);

    /**
     * Returns the base object for memory access.
     */
    public Object getBaseObject();

    /**
     * Returns the base offset for memory access.
     */
    public long getBaseOffset();
}
```

### HeapMemoryAllocator Class

```java { .api }
public class HeapMemoryAllocator implements MemoryAllocator {
    /**
     * Allocates heap memory block of specified size.
     */
    public MemoryBlock allocate(long size);

    /**
     * Frees previously allocated heap memory block.
     */
    public void free(MemoryBlock memory);
}
```

### UnsafeMemoryAllocator Class

```java { .api }
public class UnsafeMemoryAllocator implements MemoryAllocator {
    /**
     * Allocates off-heap memory block using unsafe operations.
     */
    public MemoryBlock allocate(long size);

    /**
     * Frees previously allocated off-heap memory block.
     */
    public void free(MemoryBlock memory);
}
```

## Memory Allocation Strategies

### Heap Allocation

- **Use Case**: When you need memory that's managed by the JVM garbage collector
- **Advantages**: Automatic garbage collection, safer memory management
- **Disadvantages**: Subject to GC pauses, limited by heap size
- **Implementation**: Uses Java `long[]` arrays internally

### Off-heap Allocation

- **Use Case**: When you need large amounts of memory without GC overhead
- **Advantages**: No GC impact, not limited by the JVM heap size
- **Disadvantages**: Manual memory management required, potential for memory leaks
- **Implementation**: Uses `Platform.allocateMemory()` directly

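
The heap versus off-heap trade-off can be seen with plain JDK buffers, without Spark on the classpath. The sketch below is an analogy only: it uses `java.nio.ByteBuffer` in place of `HeapMemoryAllocator` and `UnsafeMemoryAllocator`, and the class name `HeapVsOffHeapSketch` and its helper `writeAndSum` are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;

final class HeapVsOffHeapSketch {
    /** Writes the slot index into every 8-byte slot, then reads all slots back and sums them. */
    static long writeAndSum(ByteBuffer buf) {
        int words = buf.capacity() / Long.BYTES;
        for (int i = 0; i < words; i++) {
            buf.putLong(i * Long.BYTES, i);   // absolute write at a byte offset
        }
        long sum = 0;
        for (int i = 0; i < words; i++) {
            sum += buf.getLong(i * Long.BYTES);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Backed by an array on the JVM heap; reclaimed by the garbage collector.
        ByteBuffer onHeap = ByteBuffer.allocate(1024);
        // Backed by native memory outside the heap; not counted against the heap size.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(1024);

        System.out.println("on-heap buffer is direct:  " + onHeap.isDirect());
        System.out.println("off-heap buffer is direct: " + offHeap.isDirect());
        System.out.println("same contents: " + (writeAndSum(onHeap) == writeAndSum(offHeap)));
    }
}
```

The same read and write code works against both kinds of memory, which is the property the base-object-plus-offset model of `MemoryLocation` provides. Unlike blocks from `UnsafeMemoryAllocator`, direct buffers are released by the JVM once the buffer object becomes unreachable, so this sketch has no explicit `free()` step.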
## Debug Support

The allocators include debug support for detecting reads of uninitialized or already freed memory:

- `MEMORY_DEBUG_FILL_ENABLED`: When true, fills allocated and freed memory with specific patterns. It is read from the `spark.memory.debugFill` JVM system property and defaults to `false`
- `MEMORY_DEBUG_FILL_CLEAN_VALUE`: Byte value used to fill newly allocated memory
- `MEMORY_DEBUG_FILL_FREED_VALUE`: Byte value used to fill freed memory

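
To make the fill patterns concrete, here is a minimal sketch of the idea on plain byte arrays. It does not call Spark: `DebugFillSketch` and its methods are hypothetical, and the byte values `0xa5` and `0x5a` are assumed to mirror Spark's clean and freed fill values.

```java
import java.util.Arrays;

final class DebugFillSketch {
    // Assumed to mirror MEMORY_DEBUG_FILL_CLEAN_VALUE and MEMORY_DEBUG_FILL_FREED_VALUE.
    static final byte CLEAN = (byte) 0xa5;
    static final byte FREED = (byte) 0x5a;

    /** Allocates a block and, in debug mode, overwrites the default zeros with CLEAN. */
    static byte[] allocate(int size, boolean debugFill) {
        byte[] block = new byte[size];
        if (debugFill) {
            Arrays.fill(block, CLEAN);
        }
        return block;
    }

    /** "Frees" a block; in debug mode the contents are overwritten with FREED. */
    static void free(byte[] block, boolean debugFill) {
        if (debugFill) {
            Arrays.fill(block, FREED);
        }
    }

    /** Returns true if every byte of the block equals the given value. */
    static boolean isFilledWith(byte[] block, byte value) {
        for (byte b : block) {
            if (b != value) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Same switch the allocators are described as using.
        boolean debugFill =
            Boolean.parseBoolean(System.getProperty("spark.memory.debugFill", "false"));
        byte[] block = allocate(16, debugFill);
        free(block, debugFill);
        // A stale reference still sees the array; in debug mode it now holds only FREED bytes.
        System.out.println("looks freed: " + isFilledWith(block, FREED));
    }
}
```

Running the sketch with `-Dspark.memory.debugFill=true` makes a read through a stale reference return the freed pattern instead of plausible-looking old data, which is what makes use-after-free bugs visible.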
## Integration with TaskMemoryManager

Memory blocks can be integrated with Spark's `TaskMemoryManager` for memory tracking:

- The `pageNumber` field holds the page number assigned by `TaskMemoryManager`; blocks obtained directly from an allocator keep `NO_PAGE_NUMBER`
- `FREED_IN_TMM_PAGE_NUMBER` marks a page released by `TaskMemoryManager`, and `FREED_IN_ALLOCATOR_PAGE_NUMBER` marks a block freed by its allocator
- Supports both heap and off-heap memory tracking

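
The page number states can be read as a small lifecycle. The sketch below models it without depending on Spark: `PageStateSketch` and its methods are hypothetical, the constant values are assumed to match `MemoryBlock`, and the `mayFreeInAllocator` rule reflects my understanding of the allocators' assertions rather than a documented contract.

```java
final class PageStateSketch {
    // Assumed to match the constants declared on MemoryBlock.
    static final int NO_PAGE_NUMBER = -1;
    static final int FREED_IN_TMM_PAGE_NUMBER = -2;
    static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;

    /** Describes what a given pageNumber value says about a block. */
    static String describe(int pageNumber) {
        switch (pageNumber) {
            case NO_PAGE_NUMBER:
                return "not managed by a TaskMemoryManager";
            case FREED_IN_TMM_PAGE_NUMBER:
                return "released by the TaskMemoryManager";
            case FREED_IN_ALLOCATOR_PAGE_NUMBER:
                return "freed by the allocator";
            default:
                return pageNumber >= 0 ? "live page " + pageNumber : "unknown state";
        }
    }

    /**
     * Assumed rule: a block may be handed to an allocator's free() only if it was
     * never a TaskMemoryManager page or the TaskMemoryManager has already released it.
     */
    static boolean mayFreeInAllocator(int pageNumber) {
        return pageNumber == NO_PAGE_NUMBER || pageNumber == FREED_IN_TMM_PAGE_NUMBER;
    }

    public static void main(String[] args) {
        int[] states = {NO_PAGE_NUMBER, 7, FREED_IN_TMM_PAGE_NUMBER, FREED_IN_ALLOCATOR_PAGE_NUMBER};
        for (int state : states) {
            System.out.println(describe(state) + " (free allowed: " + mayFreeInAllocator(state) + ")");
        }
    }
}
```

Under this reading, a block that still carries a live page number belongs to `TaskMemoryManager` and should be released through it first, and a block already marked as freed by the allocator must not be freed again.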
## Usage Notes

1. **Always Free Memory**: Pair every `allocate()` call with a corresponding `free()` call on the same allocator to prevent memory leaks. This matters most for off-heap blocks, which the garbage collector never reclaims.

2. **Choose Appropriate Allocator**: Use heap allocation for smaller, short-lived objects and off-heap allocation for large, long-lived data structures.

3. **Debug Mode**: Enable debug filling in development (`-Dspark.memory.debugFill=true`) to catch use-after-free bugs.

4. **Thread Safety**: Memory allocators are thread-safe, but individual memory blocks are not; synchronize externally if a block is shared between threads.

5. **No Use After Free**: Do not read from or write to a memory block after it has been freed.
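
One way to follow the first note is to release the block in a `finally` clause so that it is freed even when the work in between throws. The sketch below shows the pattern with a hypothetical `CountingAllocator` stand-in that only counts outstanding blocks; with Spark on the classpath the same structure would wrap `MemoryAllocator.allocate()` and `free()`.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class PairedFreeSketch {
    /** Hypothetical stand-in for an allocator; tracks how many blocks are still outstanding. */
    static final class CountingAllocator {
        private final AtomicInteger outstanding = new AtomicInteger();

        long[] allocate(int words) {
            outstanding.incrementAndGet();
            return new long[words];
        }

        void free(long[] block) {
            outstanding.decrementAndGet();
        }

        int outstanding() {
            return outstanding.get();
        }
    }

    /** Sums 1..n using a temporary block that is always returned to the allocator. */
    static long sumFirst(CountingAllocator allocator, int n) {
        long[] block = allocator.allocate(n);
        try {
            for (int i = 0; i < n; i++) {
                block[i] = i + 1;
            }
            long sum = 0;
            for (long value : block) {
                sum += value;
            }
            return sum;
        } finally {
            allocator.free(block); // runs on normal return and on exceptions alike
        }
    }

    public static void main(String[] args) {
        CountingAllocator allocator = new CountingAllocator();
        System.out.println("sum = " + sumFirst(allocator, 10));
        System.out.println("outstanding blocks = " + allocator.outstanding());
    }
}
```

A counter like `outstanding()` is also a cheap leak check in tests: it should be back at zero once every code path under test has finished.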