0
# Bags and Counting Collections
1
2
Bags (also known as MultiSets) are collections that count the number of times an object appears in the collection. Unlike regular sets which only track presence/absence, bags maintain occurrence counts for each unique element.
3
4
## Core Interfaces
5
6
### Bag<E> Interface
7
8
The primary interface for bag collections that count element occurrences.
9
10
```java { .api }
11
import org.apache.commons.collections4.Bag;
12
import org.apache.commons.collections4.bag.HashBag;
13
import java.util.Set;
14
15
Bag<String> bag = new HashBag<>();
16
17
// Add elements (returns true if collection changed)
18
boolean added = bag.add("apple"); // Count: apple=1
19
bag.add("apple", 3); // Count: apple=4
20
bag.add("banana", 2); // Count: banana=2
21
22
// Query counts
23
int appleCount = bag.getCount("apple"); // Returns 4
24
int bananaCount = bag.getCount("banana"); // Returns 2
25
int orangeCount = bag.getCount("orange"); // Returns 0 (not present)
26
27
// Collection properties
28
int totalSize = bag.size(); // Returns 6 (4+2)
29
Set<String> unique = bag.uniqueSet(); // Returns {"apple", "banana"}
30
int uniqueCount = unique.size(); // Returns 2
31
32
// Remove elements
33
boolean removed = bag.remove("apple"); // Removes 1, count now 3
34
bag.remove("apple", 2); // Removes 2, count now 1
35
bag.remove("banana", 5); // Removes all (only 2 existed)
36
37
// Check presence
38
boolean hasApple = bag.contains("apple"); // true (count > 0)
39
boolean hasBanana = bag.contains("banana"); // false (count = 0)
40
```
41
42
### SortedBag<E> Interface
43
44
A bag that maintains elements in sorted order by their natural ordering or a provided comparator.
45
46
```java { .api }
47
import org.apache.commons.collections4.SortedBag;
48
import org.apache.commons.collections4.bag.TreeBag;
49
import java.util.Comparator;
50
51
// Natural ordering
52
SortedBag<String> sortedBag = new TreeBag<>();
53
sortedBag.add("zebra", 2);
54
sortedBag.add("apple", 3);
55
sortedBag.add("banana", 1);
56
57
// Elements maintain sorted order
58
String first = sortedBag.first(); // Returns "apple"
59
String last = sortedBag.last(); // Returns "zebra"
60
61
// Custom comparator (reverse order)
62
SortedBag<String> reverseBag = new TreeBag<>(Comparator.reverseOrder());
63
reverseBag.addAll(sortedBag);
64
String firstReverse = reverseBag.first(); // Returns "zebra"
65
String lastReverse = reverseBag.last(); // Returns "apple"
66
```
67
68
### MultiSet<E> Interface
69
70
An alternative interface to Bag with slightly different semantics and additional methods.
71
72
```java { .api }
73
import org.apache.commons.collections4.MultiSet;
74
import org.apache.commons.collections4.multiset.HashMultiSet;
75
import java.util.Set;
76
77
MultiSet<String> multiset = new HashMultiSet<>();
78
79
// Add and set counts
80
multiset.add("item", 5); // Add 5 occurrences
81
int oldCount = multiset.setCount("item", 3); // Set to 3, returns old count (5)
82
83
// Entry iteration with counts
84
Set<MultiSet.Entry<String>> entries = multiset.entrySet();
85
for (MultiSet.Entry<String> entry : entries) {
86
String element = entry.getElement();
87
int count = entry.getCount();
88
System.out.println(element + ": " + count);
89
}
90
91
// Unique elements
92
Set<String> uniqueElements = multiset.uniqueSet();
93
```
94
95
## Concrete Implementations
96
97
### HashBag<E>
98
99
A hash-based bag implementation offering O(1) average case performance for basic operations.
100
101
```java { .api }
102
import org.apache.commons.collections4.bag.HashBag;
103
import java.util.Collection;
104
import java.util.Arrays;
105
106
// Create empty bag
107
HashBag<Integer> numbers = new HashBag<>();
108
109
// Create from existing collection
110
Collection<String> words = Arrays.asList("the", "quick", "brown", "the", "fox");
111
HashBag<String> wordCounts = new HashBag<>(words);
112
// Result: {the=2, quick=1, brown=1, fox=1}
113
114
// Bulk operations
115
numbers.addAll(Arrays.asList(1, 2, 2, 3, 3, 3));
116
System.out.println(numbers.getCount(1)); // 1
117
System.out.println(numbers.getCount(2)); // 2
118
System.out.println(numbers.getCount(3)); // 3
119
```
120
121
### TreeBag<E>
122
123
A tree-based sorted bag implementation with O(log n) performance and element ordering.
124
125
```java { .api }
126
import org.apache.commons.collections4.bag.TreeBag;
127
import java.util.Comparator;
128
129
// Natural ordering
130
TreeBag<String> words = new TreeBag<>();
131
words.add("zebra", 1);
132
words.add("apple", 3);
133
words.add("banana", 2);
134
135
// Iterate in sorted order
136
for (String word : words.uniqueSet()) {
137
System.out.println(word + ": " + words.getCount(word));
138
}
139
// Output: apple: 3, banana: 2, zebra: 1
140
141
// Custom comparator (by string length)
142
TreeBag<String> byLength = new TreeBag<>(Comparator.comparing(String::length));
143
byLength.add("a", 1);
144
byLength.add("hello", 2);
145
byLength.add("hi", 3);
146
147
String shortest = byLength.first(); // "a" (length 1)
148
String longest = byLength.last(); // "hello" (length 5)
149
```
150
151
### HashMultiSet<E>
152
153
Hash-based implementation of the MultiSet interface.
154
155
```java { .api }
156
import org.apache.commons.collections4.multiset.HashMultiSet;
157
158
HashMultiSet<Character> charCounts = new HashMultiSet<>();
159
160
// Count character frequencies
161
String text = "hello world";
162
for (char c : text.toCharArray()) {
163
if (c != ' ') {
164
charCounts.add(c);
165
}
166
}
167
168
// Get most frequent character
169
char mostFrequent = charCounts.entrySet().stream()
170
.max(Comparator.comparing(MultiSet.Entry::getCount))
171
.map(MultiSet.Entry::getElement)
172
.orElse('\0');
173
System.out.println("Most frequent: " + mostFrequent); // 'l' (count: 3)
174
```
175
176
### CollectionBag<E>
177
178
A bag implementation that wraps an existing collection and maintains element counts.
179
180
```java { .api }
181
import org.apache.commons.collections4.bag.CollectionBag;
182
import java.util.ArrayList;
183
import java.util.List;
184
185
// Wrap an existing collection
186
List<String> existingList = new ArrayList<>();
187
existingList.add("apple");
188
existingList.add("banana");
189
existingList.add("apple");
190
191
CollectionBag<String> collectionBag = new CollectionBag<>(existingList);
192
193
// Bag operations work on the underlying collection
194
int appleCount = collectionBag.getCount("apple"); // 2
195
collectionBag.add("cherry", 3);
196
197
// Changes are reflected in original collection
198
System.out.println(existingList.size()); // 6 (apple, banana, apple, cherry, cherry, cherry)
199
```
200
201
### CollectionSortedBag<E>
202
203
A sorted bag implementation that wraps an existing sorted collection.
204
205
```java { .api }
206
import org.apache.commons.collections4.bag.CollectionSortedBag;
207
import java.util.TreeSet;
208
import java.util.Set;
209
210
// Wrap an existing sorted collection
211
Set<String> sortedSet = new TreeSet<>();
212
sortedSet.add("zebra");
213
sortedSet.add("apple");
214
sortedSet.add("banana");
215
216
CollectionSortedBag<String> sortedCollectionBag = new CollectionSortedBag<>(sortedSet);
217
218
// Maintains sorted order
219
String first = sortedCollectionBag.first(); // "apple"
220
String last = sortedCollectionBag.last(); // "zebra"
221
```
222
223
### UnmodifiableBag<E> and UnmodifiableSortedBag<E>
224
225
Immutable views of bags that prevent modification.
226
227
```java { .api }
228
import org.apache.commons.collections4.bag.UnmodifiableBag;
229
import org.apache.commons.collections4.bag.UnmodifiableSortedBag;
230
import org.apache.commons.collections4.BagUtils;
231
232
Bag<String> mutableBag = new HashBag<>();
233
mutableBag.add("apple", 3);
234
mutableBag.add("banana", 2);
235
236
// Create unmodifiable view
237
Bag<String> immutableBag = UnmodifiableBag.unmodifiableBag(mutableBag);
238
// Or use utility method
239
Bag<String> immutableBag2 = BagUtils.unmodifiableBag(mutableBag);
240
241
// Reading operations work
242
int count = immutableBag.getCount("apple"); // 3
243
244
// Modification attempts throw UnsupportedOperationException
245
// immutableBag.add("cherry"); // Throws exception
246
247
// For sorted bags
248
SortedBag<String> mutableSorted = new TreeBag<>();
249
mutableSorted.add("zebra", 2);
250
mutableSorted.add("apple", 1);
251
252
SortedBag<String> immutableSorted = UnmodifiableSortedBag.unmodifiableSortedBag(mutableSorted);
253
String first = immutableSorted.first(); // "apple" (reading works)
254
// immutableSorted.add("banana"); // Throws exception
255
```
256
257
## Bag Decorators
258
259
### Thread-Safe Bags
260
261
```java { .api }
262
import org.apache.commons.collections4.BagUtils;
263
import org.apache.commons.collections4.bag.HashBag;
264
import org.apache.commons.collections4.bag.SynchronizedBag;
265
266
// Create thread-safe bag
267
Bag<String> safeBag = BagUtils.synchronizedBag(new HashBag<>());
268
269
// Or create directly
270
Bag<String> directSafe = SynchronizedBag.synchronizedBag(new HashBag<>());
271
272
// Thread-safe operations
273
safeBag.add("item", 5); // Safe for concurrent access
274
int count = safeBag.getCount("item"); // Safe for concurrent access
275
276
// Note: Iteration requires external synchronization
277
synchronized(safeBag) {
278
for (String item : safeBag.uniqueSet()) {
279
System.out.println(item + ": " + safeBag.getCount(item));
280
}
281
}
282
```
283
284
### Unmodifiable Bags
285
286
```java { .api }
287
import org.apache.commons.collections4.BagUtils;
288
import org.apache.commons.collections4.bag.HashBag;
289
290
Bag<String> mutableBag = new HashBag<>();
291
mutableBag.add("read-only", 3);
292
293
// Create unmodifiable view
294
Bag<String> readOnlyBag = BagUtils.unmodifiableBag(mutableBag);
295
296
// Reading works fine
297
int count = readOnlyBag.getCount("read-only"); // Returns 3
298
boolean contains = readOnlyBag.contains("read-only"); // true
299
300
// Modifications throw UnsupportedOperationException
301
try {
302
readOnlyBag.add("new-item");
303
} catch (UnsupportedOperationException e) {
304
System.out.println("Cannot modify unmodifiable bag");
305
}
306
```
307
308
### Predicated Bags
309
310
Bags that validate elements before adding them.
311
312
```java { .api }
313
import org.apache.commons.collections4.BagUtils;
314
import org.apache.commons.collections4.Predicate;
315
import org.apache.commons.collections4.bag.HashBag;
316
317
// Only allow positive integers
318
Predicate<Integer> positiveOnly = n -> n > 0;
319
Bag<Integer> positiveBag = BagUtils.predicatedBag(new HashBag<>(), positiveOnly);
320
321
positiveBag.add(5, 2); // Succeeds, count: 5=2
322
positiveBag.add(10); // Succeeds, count: 10=1
323
324
try {
325
positiveBag.add(-1); // Throws IllegalArgumentException
326
} catch (IllegalArgumentException e) {
327
System.out.println("Negative numbers not allowed");
328
}
329
330
// String length validation
331
Predicate<String> maxLength = s -> s.length() <= 10;
332
Bag<String> shortStrings = BagUtils.predicatedBag(new HashBag<>(), maxLength);
333
```
334
335
### Transformed Bags
336
337
Bags that transform elements before storing them.
338
339
```java { .api }
340
import org.apache.commons.collections4.BagUtils;
341
import org.apache.commons.collections4.Transformer;
342
import org.apache.commons.collections4.bag.HashBag;
343
344
// Transform to uppercase
345
Transformer<String, String> upperCase = String::toUpperCase;
346
Bag<String> upperBag = BagUtils.transformedBag(new HashBag<>(), upperCase);
347
348
upperBag.add("hello"); // Stored as "HELLO"
349
upperBag.add("world"); // Stored as "WORLD"
350
upperBag.add("Hello"); // Adds to existing "HELLO" count
351
352
int helloCount = upperBag.getCount("HELLO"); // Returns 2
353
int worldCount = upperBag.getCount("WORLD"); // Returns 1
354
355
// Numeric transformation
356
Transformer<Integer, Integer> absolute = Math::abs;
357
Bag<Integer> absBag = BagUtils.transformedBag(new HashBag<>(), absolute);
358
absBag.add(-5); // Stored as 5
359
absBag.add(5); // Adds to existing 5 count
360
int fiveCount = absBag.getCount(5); // Returns 2
361
```
362
363
## Utility Operations
364
365
### BagUtils Class
366
367
Comprehensive utility methods for bag operations.
368
369
```java { .api }
370
import org.apache.commons.collections4.BagUtils;
371
import org.apache.commons.collections4.bag.HashBag;
372
373
// Empty immutable bag
374
Bag<String> empty = BagUtils.emptyBag();
375
System.out.println(empty.isEmpty()); // true
376
377
// Create collection-backed bag
378
Bag<String> baseBag = new HashBag<>();
379
baseBag.add("item1", 2);
380
Bag<String> collectionBag = BagUtils.collectionBag(baseBag);
381
382
// Synchronization
383
Bag<Integer> concurrent = BagUtils.synchronizedBag(new HashBag<>());
384
385
// Sorted bag synchronization
386
SortedBag<String> sortedConcurrent = BagUtils.synchronizedSortedBag(new TreeBag<>());
387
388
// Predicate validation
389
Predicate<String> nonEmpty = s -> !s.isEmpty();
390
Bag<String> validatedBag = BagUtils.predicatedBag(new HashBag<>(), nonEmpty);
391
392
// Transformation
393
Transformer<String, String> trim = String::trim;
394
Bag<String> trimmedBag = BagUtils.transformingBag(new HashBag<>(), trim);
395
```
396
397
## Common Use Cases
398
399
### Frequency Analysis
400
401
```java { .api }
402
import java.util.Arrays;
403
import java.util.List;
404
import java.util.Map;
405
import java.util.stream.Collectors;
406
407
// Count word frequencies in text
408
List<String> words = Arrays.asList("the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog", "the");
409
410
Bag<String> frequencies = new HashBag<>(words);
411
412
// Find most common word
413
String mostCommon = frequencies.uniqueSet().stream()
414
.max(Comparator.comparing(frequencies::getCount))
415
.orElse(null);
416
System.out.println("Most common: " + mostCommon); // "the" (appears 3 times)
417
418
// Get frequency distribution
419
Map<String, Integer> distribution = frequencies.uniqueSet().stream()
420
.collect(Collectors.toMap(
421
word -> word,
422
frequencies::getCount
423
));
424
```
425
426
### Statistical Operations
427
428
```java { .api }
429
import java.util.DoubleSummaryStatistics;
430
431
Bag<Integer> scores = new HashBag<>();
432
scores.add(85, 3); // 3 students scored 85
433
scores.add(90, 2); // 2 students scored 90
434
scores.add(75, 1); // 1 student scored 75
435
scores.add(95, 1); // 1 student scored 95
436
437
// Calculate statistics considering frequencies
438
DoubleSummaryStatistics stats = scores.uniqueSet().stream()
439
.flatMapToInt(score -> IntStream.generate(() -> score).limit(scores.getCount(score)))
440
.summaryStatistics();
441
442
double average = stats.getAverage(); // 85.71
443
int totalStudents = (int)stats.getCount(); // 7
444
int maxScore = (int)stats.getMax(); // 95
445
int minScore = (int)stats.getMin(); // 75
446
```
447
448
### Inventory Management
449
450
```java { .api }
451
public class Inventory {
452
private final Bag<String> stock = new HashBag<>();
453
454
public void addStock(String product, int quantity) {
455
stock.add(product, quantity);
456
}
457
458
public boolean sellProduct(String product, int quantity) {
459
if (stock.getCount(product) >= quantity) {
460
stock.remove(product, quantity);
461
return true;
462
}
463
return false;
464
}
465
466
public int getStockLevel(String product) {
467
return stock.getCount(product);
468
}
469
470
public Set<String> getProductsInStock() {
471
return stock.uniqueSet();
472
}
473
474
public int getTotalItems() {
475
return stock.size();
476
}
477
478
public boolean isInStock(String product) {
479
return stock.contains(product);
480
}
481
}
482
```
483
484
## Performance Considerations
485
486
### Choosing the Right Implementation
487
488
- **HashBag**: Use for fast lookups and when element ordering is not important
489
- Add/Remove/Contains: O(1) average case
490
- Memory efficient for sparse data
491
492
- **TreeBag**: Use when you need sorted iteration or range operations
493
- Add/Remove/Contains: O(log n)
494
- Maintains elements in sorted order
495
496
- **HashMultiSet**: Use when you need MultiSet interface semantics
497
- Similar performance to HashBag
498
- Different API for entry iteration
499
500
### Memory Usage
501
502
```java { .api }
503
// Memory-efficient for high counts
504
Bag<String> efficient = new HashBag<>();
505
efficient.add("repeated-item", 1_000_000); // Stores count, not 1M objects
506
507
// Less efficient with standard collections
508
List<String> inefficient = new ArrayList<>();
509
for (int i = 0; i < 1_000_000; i++) {
510
inefficient.add("repeated-item"); // Stores 1M string references
511
}
512
```
513
514
### Thread Safety Performance
515
516
```java { .api }
517
// For high-concurrency scenarios, consider external synchronization
518
ConcurrentHashMap<String, AtomicInteger> concurrentCounts = new ConcurrentHashMap<>();
519
520
// Atomic increment
521
concurrentCounts.computeIfAbsent("key", k -> new AtomicInteger(0)).incrementAndGet();
522
523
// Or use synchronized bag with careful synchronization
524
Bag<String> syncBag = BagUtils.synchronizedBag(new HashBag<>());
525
```
526
527
Bags and MultiSets provide powerful abstractions for counting and frequency analysis scenarios. Choose the appropriate implementation based on your performance requirements, thread safety needs, and whether you need ordered iteration.