Tessl Tile for maven/org.apache.flink/flink-sql-connector-hive-3.1.2_2.11@1.14.0

or run

npx @tessl/cli init

Version

Tile

Overview

Evals

Files

docs

catalog.md configuration.md data-source.md functions.md index.md table-api.md

functions.mddocs/

0
# Function Module
1

2
Integration with Hive built-in functions through Flink's module system, enabling access to hundreds of Hive functions within Flink SQL queries and providing seamless function compatibility between Hive and Flink.
3

4
## Capabilities
5

6
### HiveModule
7

8
Main module class that provides access to Hive built-in functions within Flink.
9

10
```java { .api }
11
/**
12
 * Module providing Hive built-in functions to Flink
13
 */
14
public class HiveModule implements Module {
15
    
16
    /**
17
     * List all available Hive functions
18
     * @return Set of function names
19
     */
20
    public Set<String> listFunctions();
21
    
22
    /**
23
     * Get function definition by name
24
     * @param name - Function name
25
     * @return Optional function definition
26
     */
27
    public Optional<FunctionDefinition> getFunctionDefinition(String name);
28
    
29
    /**
30
     * Get Hive version used by this module
31
     * @return Hive version string
32
     */
33
    public String getHiveVersion();
34
}
35
```
36

37
**Usage Examples:**
38

39
```java
40
import org.apache.flink.table.api.TableEnvironment;
41
import org.apache.flink.table.module.hive.HiveModule;
42

43
// Create table environment
44
TableEnvironment tableEnv = TableEnvironment.create(settings);
45

46
// Load HiveModule to access Hive functions
47
tableEnv.loadModule("hive", new HiveModule());
48

49
// Use Hive functions in SQL queries
50
tableEnv.executeSql(
51
    "SELECT " +
52
    "  regexp_replace(name, '[0-9]', '') as clean_name, " +  // Hive regex function
53
    "  from_unixtime(timestamp) as formatted_time, " +        // Hive date function
54
    "  size(split(tags, ',')) as tag_count " +                // Hive array functions
55
    "FROM my_table"
56
);
57

58
// Use Hive aggregate functions
59
tableEnv.executeSql(
60
    "SELECT " +
61
    "  department, " +
62
    "  percentile_approx(salary, 0.5) as median_salary, " +   // Hive percentile
63
    "  collect_set(role) as unique_roles " +                  // Hive collection function
64
    "FROM employees " +
65
    "GROUP BY department"
66
);
67
```
68

69
### HiveModuleFactory
70

71
Factory for creating HiveModule instances from configuration properties.
72

73
```java { .api }
74
/**
75
 * Factory for creating HiveModule instances
76
 */
77
public class HiveModuleFactory implements ModuleFactory {
78
    
79
    /**
80
     * Create HiveModule from configuration
81
     * @param properties - Configuration properties
82
     * @return HiveModule instance
83
     */
84
    public Module createModule(Map<String, String> properties);
85
    
86
    /**
87
     * Get factory identifier
88
     * @return Factory identifier string
89
     */
90
    public String factoryIdentifier();
91
    
92
    /**
93
     * Get required configuration options
94
     * @return Set of required options
95
     */
96
    public Set<ConfigOption<?>> requiredOptions();
97
    
98
    /**
99
     * Get optional configuration options
100
     * @return Set of optional options
101
     */
102
    public Set<ConfigOption<?>> optionalOptions();
103
}
104
```
105

106
### Configuration Options
107

108
Configuration options for HiveModule behavior and version specification.
109

110
```java { .api }
111
/**
112
 * Configuration options for HiveModule
113
 */
114
@PublicEvolving
115
public class HiveModuleOptions {
116
    
117
    /** Hive version for module functionality */
118
    public static final ConfigOption<String> HIVE_VERSION;
119
}
120
```
121

122
### SQL Parser Integration
123

124
Hive SQL dialect support through Flink's parser factory system.
125

126
```java { .api }
127
/**
128
 * Parser factory for Hive SQL dialect support
129
 */
130
public class HiveParserFactory implements ParserFactory {
131
    
132
    /**
133
     * Create parser instance for Hive SQL dialect
134
     * @return Parser instance
135
     */
136
    public Parser create();
137
    
138
    /**
139
     * Get factory identifier
140
     * @return Factory identifier string
141
     */
142
    public String factoryIdentifier();
143
    
144
    /**
145
     * Get required configuration options
146
     * @return Set of required options
147
     */
148
    public Set<ConfigOption<?>> requiredOptions();
149
    
150
    /**
151
     * Get optional configuration options
152
     * @return Set of optional options
153
     */
154
    public Set<ConfigOption<?>> optionalOptions();
155
}
156
```
157

158
## Available Function Categories
159

160
### String Functions
161

162
Hive provides comprehensive string manipulation functions:
163

164
```sql
165
-- Pattern matching and replacement
166
SELECT regexp_replace(text, 'pattern', 'replacement') FROM table;
167
SELECT regexp_extract(text, 'pattern', group_index) FROM table;
168

169
-- String manipulation
170
SELECT concat_ws(',', col1, col2, col3) FROM table;
171
SELECT split(text, ',') FROM table;
172
SELECT trim(text) FROM table;
173

174
-- Case conversion
175
SELECT upper(text), lower(text) FROM table;
176
```
177

178
### Date and Time Functions
179

180
Access to Hive's date and time functions:
181

182
```sql
183
-- Date formatting and parsing
184
SELECT from_unixtime(timestamp_col) FROM table;
185
SELECT unix_timestamp(date_string, 'yyyy-MM-dd') FROM table;
186

187
-- Date arithmetic
188
SELECT date_add(date_col, 30) FROM table;
189
SELECT datediff(end_date, start_date) FROM table;
190

191
-- Date extraction
192
SELECT year(date_col), month(date_col), day(date_col) FROM table;
193
```
194

195
### Mathematical Functions
196

197
Comprehensive mathematical operations:
198

199
```sql
200
-- Aggregation functions
201
SELECT percentile_approx(value, 0.5) FROM table;
202
SELECT stddev_pop(value), variance(value) FROM table;
203

204
-- Mathematical operations
205
SELECT round(value, 2), ceil(value), floor(value) FROM table;
206
SELECT abs(value), pow(base, exponent) FROM table;
207
```
208

209
### Collection Functions
210

211
Array and map manipulation functions:
212

213
```sql
214
-- Array functions
215
SELECT size(array_col) FROM table;
216
SELECT array_contains(array_col, 'value') FROM table;
217
SELECT sort_array(array_col) FROM table;
218

219
-- Map functions
220
SELECT map_keys(map_col), map_values(map_col) FROM table;
221
SELECT map_size(map_col) FROM table;
222

223
-- Collection aggregation
224
SELECT collect_list(col), collect_set(col) FROM table GROUP BY key;
225
```
226

227
### Conditional Functions
228

229
Control flow and conditional logic:
230

231
```sql
232
-- Conditional expressions
233
SELECT if(condition, true_value, false_value) FROM table;
234
SELECT coalesce(col1, col2, 'default') FROM table;
235

236
-- Case expressions with Hive extensions
237
SELECT case 
238
  when col > 100 then 'high'
239
  when col > 50 then 'medium'
240
  else 'low'
241
end FROM table;
242
```
243

244
## Advanced Usage
245

246
### Module Loading and Configuration
247

248
Configure HiveModule with specific versions and options:
249

250
```java
251
// Load HiveModule with specific version
252
Map<String, String> moduleProperties = new HashMap<>();
253
moduleProperties.put("hive-version", "3.1.2");
254

255
HiveModuleFactory factory = new HiveModuleFactory();
256
Module hiveModule = factory.createModule(moduleProperties);
257
tableEnv.loadModule("hive", hiveModule);
258

259
// Use modules with priority
260
tableEnv.loadModule("hive", new HiveModule());
261
tableEnv.loadModule("core", CoreModule.INSTANCE);
262

263
// List loaded modules
264
String[] modules = tableEnv.listModules();
265
```
266

267
### Function Resolution Order
268

269
Understand how Flink resolves functions when multiple modules are loaded:
270

271
```java
272
// Modules are resolved in load order
273
tableEnv.loadModule("core", CoreModule.INSTANCE);  // First priority
274
tableEnv.loadModule("hive", new HiveModule());     // Second priority
275

276
// Explicitly use Hive functions when name conflicts exist
277
tableEnv.executeSql("SELECT hive.size(array_col) FROM table");
278
```
279

280
### Custom Function Integration
281

282
Combine Hive functions with custom UDFs:
283

284
```java
285
// Register custom UDF
286
tableEnv.createTemporarySystemFunction("my_func", MyCustomUDF.class);
287

288
// Use both Hive and custom functions
289
tableEnv.executeSql(
290
    "SELECT " +
291
    "  my_func(col1) as custom_result, " +
292
    "  regexp_replace(col2, 'pattern', 'replacement') as hive_result " +
293
    "FROM table"
294
);
295
```
296

297
### Performance Considerations
298

299
Optimize function usage for better performance:
300

301
```sql
302
-- Use Hive functions for complex string operations
303
SELECT regexp_replace(large_text, complex_pattern, replacement) FROM large_table;
304

305
-- Leverage Hive's optimized aggregate functions
306
SELECT percentile_approx(value, array(0.25, 0.5, 0.75)) FROM table GROUP BY key;
307

308
-- Use collect functions for data restructuring
309
SELECT key, collect_list(struct(col1, col2, col3)) as nested_data 
310
FROM table GROUP BY key;
311
```

Version

Tile

Files

functions.md.css-3qkkll{font-size:var(--chakra-font-sizes-sm);font-weight:var(--chakra-font-weights-normal);color:var(--chakra-colors-gray-300);}docs/

functions.mddocs/