# Content Extraction

PurgeCSS uses a flexible extractor system to analyze different file types and extract the CSS selectors used in content files.

## Capabilities

### Extractor Function Type

Function signature for content extractors, which analyze file content and return the selectors found in it.

```typescript { .api }
/**
 * Function type for extracting selectors from content
 * @param content - Content string to analyze
 * @returns Extracted selectors as detailed object or string array
 */
type ExtractorFunction<T = string> = (content: T) => ExtractorResult;
```

### Extractor Result Types

Union type representing the output of extractor functions.

```typescript { .api }
type ExtractorResult = ExtractorResultDetailed | string[];
```
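
Because an extractor may return either variant of this union, code that consumes results usually has to tell the two apart. The sketch below shows one way to do that with `Array.isArray`. The `toDetailed` helper is hypothetical and not part of the documented API, the types are local copies of the ones in this document, and filing plain array entries under `undetermined` is an assumption of this sketch.

```typescript
// Local copies of the documented types, so the sketch is self-contained.
interface ExtractorResultDetailed {
  attributes: { names: string[]; values: string[] };
  classes: string[];
  ids: string[];
  tags: string[];
  undetermined: string[];
}
type ExtractorResult = ExtractorResultDetailed | string[];

// Hypothetical helper: converts either variant of the union into the detailed shape.
function toDetailed(result: ExtractorResult): ExtractorResultDetailed {
  if (Array.isArray(result)) {
    // A plain array carries no category information, so this sketch
    // files every entry under `undetermined` (an assumption).
    return {
      attributes: { names: [], values: [] },
      classes: [],
      ids: [],
      tags: [],
      undetermined: [...result],
    };
  }
  return result;
}

const fromArray = toDetailed(["btn", "btn-primary"]);
const fromObject = toDetailed({
  attributes: { names: ["class"], values: ["btn"] },
  classes: ["btn"],
  ids: [],
  tags: ["button"],
  undetermined: [],
});

console.log(fromArray.undetermined); // [ 'btn', 'btn-primary' ]
console.log(fromObject.tags); // [ 'button' ]
```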

### Detailed Extractor Result

Comprehensive result object with categorized selectors extracted from content.

```typescript { .api }
interface ExtractorResultDetailed {
  /** HTML/CSS attributes */
  attributes: {
    /** Attribute names (e.g., 'class', 'id', 'data-toggle') */
    names: string[];
    /** Attribute values */
    values: string[];
  };
  /** CSS class names */
  classes: string[];
  /** CSS ID selectors */
  ids: string[];
  /** HTML tag names */
  tags: string[];
  /** Selectors that couldn't be categorized */
  undetermined: string[];
}
```
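
To make the categories concrete, here is a minimal sketch of an extractor that fills in each field from an HTML string. The function name `simpleHtmlExtractor` and its regular expressions are illustrative assumptions. It only handles double-quoted attributes, and a real HTML parser would be more robust.

```typescript
// Local copy of the documented interface, so the sketch is self-contained.
interface ExtractorResultDetailed {
  attributes: { names: string[]; values: string[] };
  classes: string[];
  ids: string[];
  tags: string[];
  undetermined: string[];
}

// Hypothetical regex-based extractor returning the detailed result shape.
function simpleHtmlExtractor(content: string): ExtractorResultDetailed {
  const result: ExtractorResultDetailed = {
    attributes: { names: [], values: [] },
    classes: [],
    ids: [],
    tags: [],
    undetermined: [],
  };
  let match: RegExpExecArray | null;

  // Opening tags such as "<div" or "<a"; closing tags ("</div") are skipped.
  const tagPattern = /<([a-zA-Z][\w-]*)/g;
  while ((match = tagPattern.exec(content)) !== null) {
    result.tags.push(match[1].toLowerCase());
  }

  // Double-quoted attributes of the form name="value".
  const attrPattern = /([\w-]+)="([^"]*)"/g;
  while ((match = attrPattern.exec(content)) !== null) {
    const name = match[1];
    const tokens = match[2].split(/\s+/).filter(Boolean);
    result.attributes.names.push(name);
    result.attributes.values.push(...tokens);
    if (name === "class") result.classes.push(...tokens);
    if (name === "id") result.ids.push(...tokens);
  }
  return result;
}

const detailed = simpleHtmlExtractor(
  '<div class="hero banner" id="main" data-toggle="modal"></div>'
);
console.log(detailed.classes); // [ 'hero', 'banner' ]
console.log(detailed.ids); // [ 'main' ]
console.log(detailed.tags); // [ 'div' ]
console.log(detailed.attributes.names); // [ 'class', 'id', 'data-toggle' ]
```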

### Extractor Configuration

Configuration object linking file extensions to their corresponding extractor functions.

```typescript { .api }
interface Extractors {
  /** File extensions this extractor handles (e.g., ['.html', '.vue']) */
  extensions: string[];
  /** Extractor function for processing content */
  extractor: ExtractorFunction;
}
```

**Usage Example:**

```typescript
import { PurgeCSS, ExtractorFunction } from "purgecss";

// Custom extractor for Vue files: collects the names listed in class="..." attributes
const vueExtractor: ExtractorFunction = (content: string) => {
  const classAttributes = content.match(/class="([^"]+)"/g) || [];
  return classAttributes
    .map((attr) => attr.replace(/class="([^"]+)"/, '$1').split(/\s+/))
    .flat()
    .filter(Boolean); // drop empty strings left by extra whitespace
};

// Configure extractor
const extractors = [{
  extensions: ['.vue'],
  extractor: vueExtractor
}];

const results = await new PurgeCSS().purge({
  content: ['src/**/*.vue'],
  css: ['styles/*.css'],
  extractors
});
```
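
The `extensions` list is what ties a file to an extractor. As a rough illustration of that idea, the sketch below picks an extractor by checking how a file name ends and falls back to a default when nothing matches. `findExtractor` is a hypothetical helper, the `ExtractorFunction` type is simplified to return `string[]` only, and PurgeCSS's own matching logic may differ.

```typescript
// Simplified local types: this sketch only deals with string[] results.
type ExtractorFunction = (content: string) => string[];
interface Extractors {
  extensions: string[];
  extractor: ExtractorFunction;
}

// Hypothetical lookup: first entry whose extension matches the file name wins.
function findExtractor(
  filename: string,
  extractors: Extractors[],
  fallback: ExtractorFunction
): ExtractorFunction {
  for (const entry of extractors) {
    if (entry.extensions.some((ext) => filename.endsWith(ext))) {
      return entry.extractor;
    }
  }
  return fallback;
}

// Fallback using the same pattern as the documented default extractor.
const wordExtractor: ExtractorFunction = (content) =>
  content.match(/[A-Za-z0-9_-]+/g) || [];

// Extractor that only reads class="..." attributes.
const classAttrExtractor: ExtractorFunction = (content) => {
  const found: string[] = [];
  const pattern = /class="([^"]+)"/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(content)) !== null) {
    found.push(...m[1].split(/\s+/).filter(Boolean));
  }
  return found;
};

const configured: Extractors[] = [
  { extensions: [".vue"], extractor: classAttrExtractor },
];

const vueResult = findExtractor("src/App.vue", configured, wordExtractor)(
  '<p class="lead muted">Hi</p>'
);
const txtResult = findExtractor("notes.txt", configured, wordExtractor)(
  "hello-world"
);
console.log(vueResult); // [ 'lead', 'muted' ]
console.log(txtResult); // [ 'hello-world' ]
```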

## ExtractorResultSets Class

Management class for organizing and querying extracted selectors from content analysis.

### Constructor and Merging

```typescript { .api }
/**
 * ExtractorResultSets constructor
 * @param er - Initial extractor result to populate the sets
 */
constructor(er: ExtractorResult);

/**
 * Merge another extractor result or result set into this one
 * @param that - ExtractorResult or ExtractorResultSets to merge
 * @returns This instance for method chaining
 */
merge(that: ExtractorResult | ExtractorResultSets): this;
```
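
The chaining behaviour of `merge` can be illustrated with a small set-backed model. `MiniResultSets` below is a hypothetical stand-in that tracks only classes and undetermined selectors. It is not the library's implementation, and placing plain array entries in `undetermined` is an assumption of the sketch.

```typescript
// Local copies of the documented types, so the sketch is self-contained.
interface ExtractorResultDetailed {
  attributes: { names: string[]; values: string[] };
  classes: string[];
  ids: string[];
  tags: string[];
  undetermined: string[];
}
type ExtractorResult = ExtractorResultDetailed | string[];

const addAll = (target: Set<string>, items: string[]): void => {
  items.forEach((item) => target.add(item));
};

// Hypothetical, reduced model of a result-set class.
class MiniResultSets {
  readonly classes = new Set<string>();
  readonly undetermined = new Set<string>();

  constructor(er: ExtractorResult) {
    this.merge(er);
  }

  merge(that: ExtractorResult | MiniResultSets): this {
    if (that instanceof MiniResultSets) {
      addAll(this.classes, Array.from(that.classes));
      addAll(this.undetermined, Array.from(that.undetermined));
    } else if (Array.isArray(that)) {
      // Assumption: uncategorized strings are stored as undetermined.
      addAll(this.undetermined, that);
    } else {
      addAll(this.classes, that.classes);
      addAll(this.undetermined, that.undetermined);
    }
    return this; // returning `this` is what enables chaining
  }
}

const sets = new MiniResultSets(["btn", "card"])
  .merge({
    attributes: { names: [], values: [] },
    classes: ["btn", "nav"],
    ids: [],
    tags: [],
    undetermined: ["card"],
  })
  .merge(new MiniResultSets(["footer"]));

console.log(Array.from(sets.classes)); // [ 'btn', 'nav' ]
console.log(Array.from(sets.undetermined)); // [ 'btn', 'card', 'footer' ]
```

Backing each category with a `Set` means duplicates collapse automatically when results from many files are merged.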

### Selector Query Methods

Methods for checking the presence of specific selector types.

```typescript { .api }
/**
 * Check if a CSS class name exists in the extracted selectors
 * @param name - Class name to check
 * @returns True if class is found
 */
hasClass(name: string): boolean;

/**
 * Check if a CSS ID selector exists in the extracted selectors
 * @param id - ID selector to check
 * @returns True if ID is found
 */
hasId(id: string): boolean;

/**
 * Check if an HTML tag name exists in the extracted selectors
 * @param tag - Tag name to check
 * @returns True if tag is found
 */
hasTag(tag: string): boolean;

/**
 * Check if an attribute name exists in the extracted selectors
 * @param name - Attribute name to check
 * @returns True if attribute name is found
 */
hasAttrName(name: string): boolean;

/**
 * Check if an attribute value exists in the extracted selectors
 * @param value - Attribute value to check
 * @returns True if attribute value is found
 */
hasAttrValue(value: string): boolean;
```

### Advanced Attribute Matching

Methods for sophisticated attribute selector matching.

```typescript { .api }
/**
 * Check if any attribute values start with the given prefix
 * @param prefix - Prefix to match against attribute values
 * @returns True if matching prefix is found
 */
hasAttrPrefix(prefix: string): boolean;

/**
 * Check if any attribute values end with the given suffix
 * @param suffix - Suffix to match against attribute values
 * @returns True if matching suffix is found
 */
hasAttrSuffix(suffix: string): boolean;

/**
 * Check if any attribute values contain the given substring
 * @param substr - Substring to match (supports space-separated words)
 * @returns True if matching substring is found
 */
hasAttrSubstr(substr: string): boolean;
```

**Usage Examples:**

```typescript
import { PurgeCSS, ExtractorResultSets } from "purgecss";

const purgeCSS = new PurgeCSS();

// Same pattern as the default extractor shown under "Default Extractor"
const defaultExtractor = (content: string) =>
  content.match(/[A-Za-z0-9_-]+/g) || [];

const extractors = [
  { extensions: ['.html', '.js'], extractor: defaultExtractor }
];

// Extract selectors from files
const selectors = await purgeCSS.extractSelectorsFromFiles(
  ['src/**/*.html', 'src/**/*.js'],
  extractors
);

// Query the extracted selectors
if (selectors.hasClass('btn-primary')) {
  console.log('btn-primary class found in content');
}

// hasAttrPrefix tests attribute values, not attribute names
if (selectors.hasAttrPrefix('data-')) {
  console.log('Attribute value starting with "data-" found in content');
}

// Extract from raw content
const rawSelectors = await purgeCSS.extractSelectorsFromString([
  { raw: '<div class="hero-banner" id="main"></div>', extension: 'html' }
], extractors);

// Merge multiple result sets
const combinedSelectors = new ExtractorResultSets([])
  .merge(selectors)
  .merge(rawSelectors);
```
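
The attribute matching methods described under Advanced Attribute Matching can be pictured as simple string tests over the collected attribute values. The helpers below are hypothetical and operate on a plain array rather than an `ExtractorResultSets` instance. In particular, reading "space-separated words" as "every word must occur in some value" is an assumption of this sketch.

```typescript
// Hypothetical helpers modelling prefix, suffix, and substring matching.
const hasPrefix = (values: string[], prefix: string): boolean =>
  values.some((value) => value.startsWith(prefix));

const hasSuffix = (values: string[], suffix: string): boolean =>
  values.some((value) => value.endsWith(suffix));

// Assumption: `substr` is split on whitespace and every word must be
// contained in at least one value.
const hasSubstr = (values: string[], substr: string): boolean =>
  substr
    .trim()
    .split(/\s+/)
    .every((word) => values.some((value) => value.includes(word)));

const attrValues = ["modal-dialog", "btn-large", "icon-close"];

console.log(hasPrefix(attrValues, "modal")); // true
console.log(hasSuffix(attrValues, "-large")); // true
console.log(hasSubstr(attrValues, "dialog close")); // true
console.log(hasSubstr(attrValues, "dialog missing")); // false
```

These tests correspond to the CSS attribute selector forms `[attr^="..."]`, `[attr$="..."]`, and `[attr*="..."]`.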

## Utility Functions

### Merge Extractor Selectors

Utility function for merging multiple extractor results into a single set.

```typescript { .api }
/**
 * Merge multiple extractor selectors into a single result set
 * @param extractors - Variable number of ExtractorResultDetailed or ExtractorResultSets
 * @returns Merged ExtractorResultSets containing all selectors
 */
function mergeExtractorSelectors(
  ...extractors: (ExtractorResultDetailed | ExtractorResultSets)[]
): ExtractorResultSets;
```

**Usage Example:**

```typescript
import { PurgeCSS, mergeExtractorSelectors } from "purgecss";

const purgeCSS = new PurgeCSS();

// htmlExtractors and jsExtractors are Extractors[] arrays defined elsewhere
const htmlSelectors = await purgeCSS.extractSelectorsFromFiles(['*.html'], htmlExtractors);
const jsSelectors = await purgeCSS.extractSelectorsFromFiles(['*.js'], jsExtractors);

const combinedSelectors = mergeExtractorSelectors(htmlSelectors, jsSelectors);
```

## Default Extractor

The default extractor used when no specific extractor is configured for a file type.

**Default Implementation:**

```typescript
const defaultExtractor = (content: string): ExtractorResult =>
  content.match(/[A-Za-z0-9_-]+/g) || [];
```

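The default extractor uses a simple regex that matches runs of ASCII letters, digits, underscores, and hyphens, which suits basic CSS class names and IDs. Any other character ends a match, so a name containing characters such as `:` or `/` is split into several pieces.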
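
A quick way to see what this pattern captures is to run it on sample input. The snippet below redefines the same one-line function locally (typed as returning `string[]` so it stands alone). The sample strings are made up for illustration.

```typescript
// Same pattern as the default implementation, redefined locally.
const defaultExtractor = (content: string): string[] =>
  content.match(/[A-Za-z0-9_-]+/g) || [];

// Every run of matching characters is returned, including tag and attribute names.
const simple = defaultExtractor('<div class="hero-banner" id="main"></div>');
console.log(simple);
// [ 'div', 'class', 'hero-banner', 'id', 'main', 'div' ]

// Characters outside the pattern, such as ':' and '/', split a name apart.
const split = defaultExtractor('<p class="md:w-1/2">');
console.log(split);
// [ 'p', 'class', 'md', 'w-1', '2' ]
```

The second call shows why projects whose class names contain such characters typically configure a custom extractor with a wider pattern.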