# HTML Extraction

Parses HTML attributes to extract UnoCSS utility classes from attributify syntax, converting attribute-based utilities into standard UnoCSS selectors.

## Capabilities

### Extractor Function

Creates an extractor that parses HTML attributes for UnoCSS utilities.

```typescript { .api }
/**
 * Creates an extractor that parses HTML attributes for UnoCSS utilities
 * @param options - Configuration options for extraction behavior
 * @returns UnoCSS Extractor object
 */
function extractorAttributify(options?: AttributifyOptions): Extractor;

interface Extractor {
  name: '@unocss/preset-attributify/extractor';
  extract(context: ExtractorContext): string[];
}

interface ExtractorContext {
  code: string;
}
```

**Usage Examples:**

```typescript
import { extractorAttributify } from '@unocss/preset-attributify'

// Create extractor with default options
const extractor = extractorAttributify()

// Create extractor with custom options
const customExtractor = extractorAttributify({
  ignoreAttributes: ['placeholder', 'data-testid'],
  prefixedOnly: true,
  prefix: 'ui-'
})

// Sample input (illustrative)
const htmlString = '<div bg="blue-500" text="white"></div>'

// Extract utilities from HTML
const utilities = extractor.extract({ code: htmlString })
```

### Default Ignored Attributes

Attributes that are ignored by default during extraction.

```typescript { .api }
/**
 * Default list of attributes to ignore during extraction
 * These attributes typically contain non-utility values
 */
const defaultIgnoreAttributes: string[] = [
  'placeholder',
  'fill',
  'opacity',
  'stroke-opacity'
];
```
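
A project that needs extra ignored attributes may want to keep the defaults as well. The sketch below assumes that a custom `ignoreAttributes` list replaces the default list rather than extending it, which is worth verifying against the installed version. `defaultIgnoreAttributesCopy` is a local copy of the list above so the snippet runs on its own, and `extraIgnored` is a hypothetical project-specific list.

```typescript
// Local copy of the defaults listed above (an assumption for this sketch);
// in a real project the list would come from the package itself.
const defaultIgnoreAttributesCopy: string[] = [
  'placeholder',
  'fill',
  'opacity',
  'stroke-opacity',
]

// Hypothetical project-specific attributes that never hold utilities.
const extraIgnored: string[] = ['data-testid', 'placeholder']

// Merge and de-duplicate so the defaults stay ignored alongside the extras.
const mergedIgnoreAttributes: string[] = Array.from(
  new Set([...defaultIgnoreAttributesCopy, ...extraIgnored]),
)
```

The merged list would then be passed as the `ignoreAttributes` option of `extractorAttributify`.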

### Extraction Patterns

The extractor processes different types of attribute patterns:

**Non-valued Attributes:**
```html
<!-- Generates: [mt-2=""] -->
<div mt-2></div>

<!-- With trueToNonValued: true, also generates: [mt-2="true"] -->
<div mt-2="true"></div>
```

**Valued Attributes:**
```html
<!-- Generates: [bg~="blue-500"], [bg~="hover:red-500"] -->
<div bg="blue-500 hover:red-500"></div>

<!-- Generates: [text~="sm"], [text~="white"] -->
<div text="sm white"></div>
```

**Class Attributes (Special Case):**
```html
<!-- Processes as regular classes, not attributify -->
<div class="text-red-500 bg-blue-500"></div>
<div className="text-red-500 bg-blue-500"></div>
```

**Prefixed Attributes:**
```html
<!-- With prefix="un-" and prefixedOnly=true -->
<div un-bg="blue-500" un-text="white"></div>
```
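
The effect of `prefixedOnly` can be pictured as a gate on the attribute name. The following is a minimal sketch under stated assumptions, not the library's code: `shouldExtract` and `PrefixOptionsSketch` are hypothetical names, and the `'un-'` fallback is assumed to mirror the preset's documented default prefix.

```typescript
// Hypothetical option shape covering only the two options discussed here.
interface PrefixOptionsSketch {
  prefix?: string
  prefixedOnly?: boolean
}

// Hypothetical helper: decides whether an attribute name passes the
// prefixedOnly gate. The 'un-' fallback is an assumption for this sketch.
function shouldExtract(name: string, options: PrefixOptionsSketch = {}): boolean {
  const { prefix = 'un-', prefixedOnly = false } = options
  // Without prefixedOnly every attribute name is a candidate.
  if (!prefixedOnly)
    return true
  // With prefixedOnly only names carrying the prefix are kept.
  return name.startsWith(prefix)
}

const keepsPrefixed = shouldExtract('un-bg', { prefixedOnly: true })
const dropsUnprefixed = shouldExtract('bg', { prefixedOnly: true })
```

With `prefixedOnly` enabled, `un-bg` passes the gate while a bare `bg` does not.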

### Extraction Logic

The extractor follows this processing logic:

1. **Parse HTML Elements**: Uses a regular expression to find HTML elements and their attributes
2. **Extract Attributes**: Parses attribute name-value pairs from elements
3. **Filter Ignored**: Skips attributes in the `ignoreAttributes` list
4. **Handle Prefixes**: Strips framework prefixes (`v-bind:`, `:`) from attribute names
5. **Process Values**:
   - Non-valued attributes: Generate `[name=""]` selectors
   - Class attributes: Process as regular utility classes
   - Other attributes: Split values and generate `[name~="value"]` selectors
6. **Apply Options**: Respect `prefixedOnly`, `prefix`, and other configuration

### Extraction Examples

**Input HTML:**
```html
<button
  bg="blue-500 hover:blue-600"
  text="white sm"
  p="x-4 y-2"
  border="rounded"
  disabled
>
  Click me
</button>
```

**Extracted Selectors:**
```typescript
[
  '[bg~="blue-500"]',
  '[bg~="hover:blue-600"]',
  '[text~="white"]',
  '[text~="sm"]',
  '[p~="x-4"]',
  '[p~="y-2"]',
  '[border~="rounded"]',
  '[disabled=""]'
]
```
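
To show how the processing steps produce this list, here is a simplified, self-contained sketch. It is not the library's implementation: `extractSketch`, `ignoredSketch`, `strippedPrefixesSketch`, and both regular expressions are assumptions made for illustration. It handles double-quoted values only, and it omits the option handling and the `isValidSelector` filtering.

```typescript
// Simplified stand-ins; names and regexes are assumptions for this sketch.
const ignoredSketch = ['placeholder', 'fill', 'opacity', 'stroke-opacity']
const strippedPrefixesSketch = ['v-bind:', ':']

function extractSketch(code: string): string[] {
  // Step 1: find opening tags and capture their attribute text.
  const tagRE = /<[A-Za-z][\w-]*\s+([^>]*)>/g
  const selectors: string[] = []
  let tag: RegExpExecArray | null
  while ((tag = tagRE.exec(code)) !== null) {
    // Step 2: read `name` or `name="value"` pairs.
    const attrRE = /([^\s"'=<>\/]+)(?:\s*=\s*"([^"]*)")?/g
    let attr: RegExpExecArray | null
    while ((attr = attrRE.exec(tag[1])) !== null) {
      let name = attr[1]
      const value: string | undefined = attr[2]
      // Step 3: skip ignored attributes.
      if (ignoredSketch.includes(name))
        continue
      // Step 4: strip the first matching framework prefix.
      const prefix = strippedPrefixesSketch.find(p => name.startsWith(p))
      if (prefix)
        name = name.slice(prefix.length)
      // Step 5: turn the attribute into selectors.
      if (value === undefined) {
        selectors.push(`[${name}=""]`)
      }
      else if (name === 'class' || name === 'className') {
        selectors.push(...value.split(/\s+/).filter(Boolean))
      }
      else {
        for (const v of value.split(/\s+/).filter(Boolean))
          selectors.push(`[${name}~="${v}"]`)
      }
    }
  }
  return selectors
}

const buttonHtml = `<button
  bg="blue-500 hover:blue-600"
  text="white sm"
  p="x-4 y-2"
  border="rounded"
  disabled
>
  Click me
</button>`

const buttonSelectors = extractSketch(buttonHtml)
```

Run on the button markup, the sketch yields the same eight selectors in the same order as the list above. Attribute values that contain `>` or use single quotes would need a more careful pattern than the one assumed here.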

### Internal Regex Patterns

The extractor uses these internal regex patterns for parsing HTML:

```typescript { .api }
/**
 * Matches HTML elements with their attributes
 * Used to identify elements that may contain attributify utilities
 */
const elementRE: RegExp;

/**
 * Matches attribute name-value pairs within HTML elements
 * Captures both the attribute name and its value
 */
const valuedAttributeRE: RegExp;

/**
 * Splits attribute values on whitespace and quote boundaries
 * Used to separate multiple utilities within a single attribute
 */
const splitterRE: RegExp;
```

### Framework Integration

The extractor handles framework-specific attribute prefixes:

```typescript { .api }
const strippedPrefixes: string[] = [
  'v-bind:', // Vue.js v-bind prefix
  ':'        // Vue.js shorthand prefix
];
```

**Vue.js Examples:**
```html
<!-- After prefix stripping, all three are extracted under the attribute name `bg` -->
<div v-bind:bg="color"></div>
<div :bg="color"></div>
<div bg="blue-500"></div>
```
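
The name normalization behind these examples can be sketched in a few lines. `normalizeAttributeName` and `vuePrefixesSketch` are hypothetical names used only for illustration; the prefix list mirrors the one documented above.

```typescript
// Prefixes to strip, copied from the list documented above.
const vuePrefixesSketch: string[] = ['v-bind:', ':']

// Hypothetical helper: removes the first matching prefix, if any,
// and returns the name unchanged otherwise.
function normalizeAttributeName(name: string): string {
  for (const prefix of vuePrefixesSketch) {
    if (name.startsWith(prefix))
      return name.slice(prefix.length)
  }
  return name
}

const normalizedNames = ['v-bind:bg', ':bg', 'bg'].map(normalizeAttributeName)
```

All three inputs normalize to `bg`. Only the name is normalized: the attribute value is still split into tokens as written, so a bound expression is treated as literal text rather than evaluated.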

### Error Handling

The extractor gracefully handles:

- Malformed HTML attributes
- Empty attribute values
- Nested HTML in attribute values
- Special characters in attribute names
- Invalid utility names (filtered by `isValidSelector`)