
# Utility Functions

Helper functions for file processing, filtering, and output formatting. These utilities support the core analysis functionality with file discovery, result filtering, and various output options.

## Capabilities

### File Discovery

Functions for finding and filtering source files for analysis.

```python { .api }
def get_all_source_files(paths, exclude_patterns, lans):
    """
    Gets all source files from paths with exclusion patterns and language filtering.
    Includes gitignore support and recursive directory traversal.

    Args:
        paths (list): List of file or directory paths to search
        exclude_patterns (list): List of glob patterns to exclude from analysis
        lans (list): List of language names to filter (None for all languages)

    Returns:
        iterator: Iterator of filtered source file paths

    Example:
        files = get_all_source_files(
            ['src/', 'lib/'],
            ['*test*', '*.min.js', 'build/*'],
            ['python', 'javascript']
        )
        for filepath in files:
            print(f"Found: {filepath}")
    """
```
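
To make the filtering behaviour described above concrete, here is a minimal standard-library sketch of recursive discovery with glob exclusions. It is not lizard's implementation: `iter_source_files` is a hypothetical helper, it filters by file extension rather than by language name, it has no gitignore support, and it matches patterns against the path relative to each search root (an assumption that may differ from how lizard applies its patterns).

```python
import fnmatch
import os
import tempfile


def iter_source_files(paths, exclude_patterns, extensions):
    """Yield files under `paths` with an allowed extension and no excluded match.

    Hypothetical helper for illustration only; not part of lizard.
    """
    for path in paths:
        if os.path.isfile(path):
            candidates = [(path, os.path.basename(path))]
        else:
            # Walk the directory tree and remember each root-relative path.
            candidates = (
                (os.path.join(root, name),
                 os.path.relpath(os.path.join(root, name), path))
                for root, _dirs, names in os.walk(path)
                for name in names
            )
        for full_path, rel_path in candidates:
            if os.path.splitext(full_path)[1] not in extensions:
                continue  # not one of the requested file types
            if any(fnmatch.fnmatch(rel_path, pat) for pat in exclude_patterns):
                continue  # matched an exclusion glob
            yield full_path


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("app.py", "app_test.py", "bundle.min.js", "notes.txt"):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write("")
    found = sorted(
        os.path.basename(p)
        for p in iter_source_files([tmp], ["*test*", "*.min.js"], {".py", ".js"})
    )
    print(found)
```

Like the documented function, the sketch returns a lazy iterator, so large trees are filtered as they are walked rather than collected up front.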

### Result Filtering

Functions for filtering analysis results based on thresholds and criteria.

```python { .api }
def warning_filter(option, module_infos):
    """
    Filters functions that exceed specified thresholds.

    Args:
        option: Configuration object with threshold settings (CCN, length, etc.)
        module_infos: Iterator of FileInformation objects

    Returns:
        generator: Generator yielding functions exceeding thresholds

    Example:
        # option.CCN = 10, option.length = 50
        warnings = warning_filter(options, analysis_results)
        for func_info in warnings:
            print(f"Warning: {func_info.name} exceeds thresholds")
    """

def whitelist_filter(warnings, script=None, whitelist=None):
    """
    Filters warnings based on whitelist configuration.
    Removes warnings for functions/files specified in whitelist.

    Args:
        warnings: Iterator of warning objects
        script (str): Path to whitelist script (optional)
        whitelist (str): Path to whitelist file (optional, default: "whitelizard.txt")

    Returns:
        generator: Generator yielding filtered warnings not in whitelist

    Example:
        filtered_warnings = whitelist_filter(warnings, whitelist="ignore.txt")
        for warning in filtered_warnings:
            print(f"Genuine warning: {warning}")
    """
```
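
The two filters compose as a generator pipeline: first keep functions that exceed a limit, then drop those that are whitelisted. The sketch below illustrates that idea with plain Python objects. `FunctionStats`, `exceeds_any`, and `drop_whitelisted` are hypothetical names, and the exact-match whitelist is an assumption made for brevity, not a description of lizard's matching rules.

```python
from dataclasses import dataclass


@dataclass
class FunctionStats:
    """Hypothetical stand-in for a per-function analysis record."""
    name: str
    filename: str
    cyclomatic_complexity: int
    length: int


def exceeds_any(functions, limits):
    """Yield functions where any measured attribute is above its limit."""
    for func in functions:
        if any(getattr(func, attr) > limit for attr, limit in limits.items()):
            yield func


def drop_whitelisted(warnings, allowed):
    """Yield warnings whose (filename, name) pair is not in `allowed`."""
    for warning in warnings:
        if (warning.filename, warning.name) not in allowed:
            yield warning


functions = [
    FunctionStats("parse", "src/app.py", 12, 30),
    FunctionStats("render", "src/app.py", 3, 80),
    FunctionStats("helper", "src/util.py", 2, 10),
    FunctionStats("old_complex_function", "src/legacy.py", 25, 200),
]
limits = {"cyclomatic_complexity": 10, "length": 50}
allowed = {("src/legacy.py", "old_complex_function")}

# Chain the two generators; nothing is evaluated until list() consumes them.
flagged = list(drop_whitelisted(exceeds_any(functions, limits), allowed))
print([func.name for func in flagged])
```

Because both stages are generators, each can be consumed only once. Re-run the pipeline if the results are needed a second time.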

### File Hashing

Function for generating file hashes for duplicate detection.

```python { .api }
def md5_hash_file(full_path_name):
    """
    Calculates MD5 hash of a file for duplicate detection.

    Args:
        full_path_name (str): Full path to the file to hash

    Returns:
        str: MD5 hash string of file content

    Example:
        hash1 = md5_hash_file('src/file1.py')
        hash2 = md5_hash_file('src/file2.py')
        if hash1 == hash2:
            print("Files are identical")
    """
```
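
For readers who want to see what content hashing involves, the following is a small sketch built on the standard `hashlib` module. `md5_of_file` is a hypothetical helper, not lizard's function. It reads the file in binary chunks, which is one reasonable approach and not necessarily the one lizard takes.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo: two files with identical content and one that differs.
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "a.py")
    second = os.path.join(tmp, "b.py")
    third = os.path.join(tmp, "c.py")
    for path, text in ((first, "x = 1\n"), (second, "x = 1\n"), (third, "x = 2\n")):
        with open(path, "w", newline="") as handle:
            handle.write(text)
    same = md5_of_file(first) == md5_of_file(second)
    different = md5_of_file(first) != md5_of_file(third)
    digest_length = len(md5_of_file(first))

print(same, different, digest_length)
```

MD5 is adequate for spotting identical files, as here, but it should not be relied on where collision resistance matters.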

### Output Functions

Functions for formatting and displaying analysis results in different styles.

```python { .api }
def print_clang_style_warning(code_infos, option, scheme, _):
    """
    Prints warnings in clang/gcc compiler format.

    Args:
        code_infos: Iterator of code information objects
        option: Configuration options object
        scheme: Output formatting scheme
        _: Unused parameter (for interface compatibility)

    Returns:
        int: Number of warnings printed

    Example Output:
        src/app.py:25: warning: function has high complexity (15)
    """

def print_msvs_style_warning(code_infos, option, scheme, _):
    """
    Prints warnings in Microsoft Visual Studio format.

    Args:
        code_infos: Iterator of code information objects
        option: Configuration options object
        scheme: Output formatting scheme
        _: Unused parameter

    Returns:
        int: Number of warnings printed

    Example Output:
        src/app.py(25) : warning: function has high complexity (15)
    """

def silent_printer(result, *_):
    """
    Silent printer that exhausts result iterator without output.
    Used for analysis without display output.

    Args:
        result: Iterator of results to consume
        *_: Additional arguments (ignored)

    Returns:
        int: Always returns 0

    Example:
        # Analyze without printing results
        exit_code = silent_printer(analysis_results)
    """
```
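
The two warning styles differ only in how the file name and line number are laid out. The sketch below reproduces the example output lines shown in the docstrings above. `clang_style`, `msvs_style`, and `print_warnings` are hypothetical helpers written for illustration and say nothing about how lizard builds its messages internally.

```python
def clang_style(filename, line, message):
    """Format a warning the way clang/gcc do: file:line: warning: text."""
    return f"{filename}:{line}: warning: {message}"


def msvs_style(filename, line, message):
    """Format a warning the way Visual Studio does: file(line) : warning: text."""
    return f"{filename}({line}) : warning: {message}"


def print_warnings(warnings, formatter):
    """Print each warning with `formatter` and return how many were printed."""
    count = 0
    for filename, line, message in warnings:
        print(formatter(filename, line, message))
        count += 1
    return count


warnings = [("src/app.py", 25, "function has high complexity (15)")]
clang_count = print_warnings(warnings, clang_style)
msvs_count = print_warnings(warnings, msvs_style)
```

Returning the count mirrors the documented printers, which lets a caller turn "any warnings printed" into a non-zero exit status.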

### Threading and Parallel Processing

Functions for managing multi-threaded analysis and parallel file processing.

```python { .api }
def map_files_to_analyzer(files, analyzer, working_threads):
    """
    Maps files to analyzer using appropriate threading method.

    Args:
        files: Iterator of file paths to analyze
        analyzer: FileAnalyzer instance to use for analysis
        working_threads (int): Number of threads to use (1 for single-threaded)

    Returns:
        iterator: Results from analyzing files

    Example:
        analyzer = FileAnalyzer([])
        files = ['app.py', 'utils.py', 'config.py']
        results = map_files_to_analyzer(files, analyzer, 4)
        for result in results:
            print(f"Analyzed: {result.filename}")
    """

def get_map_method(working_threads):
    """
    Returns appropriate mapping method based on thread count.

    Args:
        working_threads (int): Number of working threads

    Returns:
        function: Either multiprocessing.Pool.imap_unordered or built-in map

    Example:
        map_func = get_map_method(4)  # Returns pool.imap_unordered
        map_func = get_map_method(1)  # Returns built-in map
    """

def print_extension_results(extensions):
    """
    Prints results from analysis extensions that have print_result method.

    Args:
        extensions (list): List of extension objects

    Example:
        extensions = get_extensions(['wordcount', 'duplicate'])
        print_extension_results(extensions)
    """
```
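
The choice between the built-in `map` and a process pool can be sketched in a few lines. This is an illustration of the behaviour described in the `get_map_method` docstring, not lizard's source: `pick_map_method` and `count_characters` are hypothetical names, and the demo deliberately uses one worker so that it runs without starting any processes.

```python
import multiprocessing


def pick_map_method(workers):
    """Return built-in map for one worker, else a pool's imap_unordered."""
    if workers == 1:
        return map
    pool = multiprocessing.Pool(processes=workers)
    return pool.imap_unordered


def count_characters(text):
    """Stand-in for an analyzer: any picklable callable taking one item."""
    return len(text)


sources = ["def a(): pass", "x = 1", ""]

# With one worker the built-in map is used and results keep input order.
mapper = pick_map_method(1)
results = list(mapper(count_characters, sources))
print(results)
```

With more than one worker, `imap_unordered` yields results as workers finish, so output order may differ from input order. On platforms that spawn rather than fork, pool creation should sit under an `if __name__ == "__main__":` guard.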

### Constants

Default configuration values used throughout the system.

```python { .api }
DEFAULT_CCN_THRESHOLD: int = 15
"""Default cyclomatic complexity threshold for warnings"""

DEFAULT_WHITELIST: str = "whitelizard.txt"
"""Default whitelist filename for filtering warnings"""

DEFAULT_MAX_FUNC_LENGTH: int = 1000
"""Default maximum function length threshold"""
```

## Usage Examples

### File Discovery with Filtering

```python
from lizard import get_all_source_files

# Find all Python and JavaScript files, excluding tests and build artifacts
source_files = get_all_source_files(
    paths=['src/', 'lib/', 'app/'],
    exclude_patterns=[
        '*test*',            # Exclude test files
        '*Test*',            # Exclude Test files
        '*/tests/*',         # Exclude tests directories
        '*/node_modules/*',  # Exclude npm dependencies
        '*/build/*',         # Exclude build artifacts
        '*.min.js',          # Exclude minified files
        '*/migrations/*'     # Exclude database migrations
    ],
    lans=['python', 'javascript']
)

print("Source files found:")
for filepath in source_files:
    print(f"  {filepath}")
```

### Threshold-Based Filtering

```python
import lizard
from lizard import warning_filter

# Create configuration with custom thresholds
class AnalysisOptions:
    def __init__(self):
        self.CCN = 8        # Complexity threshold
        self.length = 40    # Function length threshold
        self.arguments = 4  # Parameter count threshold
        self.nloc = 30      # Lines of code threshold
        # Assumption: newer lizard releases read limits from a `thresholds`
        # mapping keyed by function attribute, so mirror the values there too.
        self.thresholds = {
            'cyclomatic_complexity': self.CCN,
            'length': self.length,
            'parameter_count': self.arguments,
            'nloc': self.nloc,
        }

options = AnalysisOptions()

# Analyze files
results = lizard.analyze(['src/'])

# Filter functions exceeding thresholds
warnings = warning_filter(options, results)

print("Functions exceeding thresholds:")
for func_info in warnings:
    # Work out which limits this function broke
    issues = []
    if func_info.cyclomatic_complexity > options.CCN:
        issues.append(f"CCN={func_info.cyclomatic_complexity}")
    if func_info.length > options.length:
        issues.append(f"Length={func_info.length}")
    if func_info.parameter_count > options.arguments:
        issues.append(f"Args={func_info.parameter_count}")
    if func_info.nloc > options.nloc:
        issues.append(f"NLOC={func_info.nloc}")

    print(f"  {func_info.name}: {', '.join(issues)}")
```

### Whitelist Filtering

```python
from types import SimpleNamespace

import lizard
from lizard import warning_filter, whitelist_filter

# Threshold settings for warning_filter (illustrative values).
# Assumption: newer lizard releases read limits from `thresholds`.
options = SimpleNamespace(
    CCN=10,
    length=50,
    arguments=5,
    thresholds={'cyclomatic_complexity': 10, 'length': 50, 'parameter_count': 5},
)

# Create whitelist file
whitelist_content = """
# Ignore complex legacy functions
src/legacy.py:old_complex_function
src/legacy.py:another_complex_function

# Ignore generated code
src/generated/*

# Ignore specific patterns
*_test.py:*
"""

with open('project_whitelist.txt', 'w') as f:
    f.write(whitelist_content)

# Analyze and filter
results = lizard.analyze(['src/'])
warnings = warning_filter(options, results)
filtered_warnings = whitelist_filter(warnings, whitelist='project_whitelist.txt')

print("Warnings after whitelist filtering:")
for warning in filtered_warnings:
    print(f"  {warning.name} in {warning.filename}")
```

### File Duplicate Detection

```python
import os

from lizard import md5_hash_file

def find_duplicate_files(directory):
    """Find duplicate files by MD5 hash comparison."""
    file_hashes = {}  # hash -> first path seen with that hash
    duplicates = []   # (duplicate path, original path) pairs

    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith(('.py', '.js', '.java', '.cpp')):
                filepath = os.path.join(root, file)
                try:
                    filehash = md5_hash_file(filepath)
                    if filehash is None:
                        # No digest produced (e.g. unreadable file): skip it
                        # so such files are not reported as duplicates.
                        continue
                    if filehash in file_hashes:
                        duplicates.append((filepath, file_hashes[filehash]))
                    else:
                        file_hashes[filehash] = filepath
                except Exception as e:
                    print(f"Error hashing {filepath}: {e}")

    return duplicates

# Find duplicates in source directory
duplicates = find_duplicate_files('src/')
if duplicates:
    print("Duplicate files found:")
    for file1, file2 in duplicates:
        print(f"  {file1} == {file2}")
else:
    print("No duplicate files found")
```

### Custom Output Formatting

```python
import lizard
from lizard import print_clang_style_warning, print_msvs_style_warning

class CustomOptions:
    def __init__(self):
        self.CCN = 10
        self.length = 50
        # Assumption: newer lizard releases read limits from `thresholds`.
        self.thresholds = {
            'cyclomatic_complexity': self.CCN,
            'length': self.length,
        }

class CustomScheme:
    def function_info(self, func):
        return f"{func.name}: CCN={func.cyclomatic_complexity}, NLOC={func.nloc}"

options = CustomOptions()
scheme = CustomScheme()

# Analyze code
results = lizard.analyze(['src/'])
warnings = lizard.warning_filter(options, results)

# Print warnings in different formats
print("Clang-style warnings:")
clang_count = print_clang_style_warning(warnings, options, scheme, None)

print(f"\nTotal warnings: {clang_count}")

# The generator is exhausted after one pass, so rebuild it for the second format
warnings = lizard.warning_filter(options, lizard.analyze(['src/']))
print("\nVisual Studio-style warnings:")
msvs_count = print_msvs_style_warning(warnings, options, scheme, None)
```

### Silent Analysis

```python
import lizard
from lizard import silent_printer

# Perform analysis without output (for programmatic use)
results = lizard.analyze(['src/'])

# Count results without printing
result_list = list(results)
total_files = len(result_list)
total_functions = sum(len(fi.function_list) for fi in result_list)

print("Silent analysis complete:")
print(f"  Files analyzed: {total_files}")
print(f"  Functions found: {total_functions}")

# Use silent printer to consume iterator without output
results = lizard.analyze(['src/'])
exit_code = silent_printer(results)
print(f"Analysis exit code: {exit_code}")
```