
# LoRA Methods

Low-Rank Adaptation (LoRA) and related methods that decompose weight updates into low-rank matrices, enabling efficient fine-tuning with minimal parameter overhead. This includes standard LoRA, adaptive variants, and structural improvements.

## Capabilities

### Standard LoRA

Low-Rank Adaptation decomposes weight updates into two smaller matrices, dramatically reducing the number of trainable parameters.

```python { .api }
@dataclass
class LoraConfig(PeftConfig):
    """Configuration for LoRA (Low-Rank Adaptation)."""
    r: int = 8  # LoRA attention dimension (rank)
    lora_alpha: int = 8  # LoRA scaling parameter
    target_modules: Optional[Union[List[str], str]] = None  # Names of modules to apply LoRA to
    exclude_modules: Optional[Union[List[str], str]] = None  # Names of modules to exclude from LoRA
    lora_dropout: float = 0.0  # LoRA dropout probability
    fan_in_fan_out: bool = False  # Set True if layer stores weight like (fan_in, fan_out)
    bias: Literal["none", "all", "lora_only"] = "none"  # Bias type for LoRA
    use_rslora: bool = False  # Whether to use rank-stabilized LoRA
    modules_to_save: Optional[List[str]] = None  # Modules apart from LoRA layers to be trainable
    init_lora_weights: Union[bool, Literal["gaussian", "eva", "olora", "pissa", "corda", "loftq", "orthogonal"]] = True
    layers_to_transform: Optional[Union[List[int], int]] = None  # Layers to apply LoRA to
    layers_pattern: Optional[str] = None  # Pattern for layer names
    rank_pattern: Optional[dict] = None  # Mapping from layer names to different ranks
    alpha_pattern: Optional[dict] = None  # Mapping from layer names to different alphas
    megatron_config: Optional[dict] = None  # Megatron-specific configuration
    megatron_core: Optional[str] = None  # Megatron core module version
    loftq_config: Optional[LoftQConfig] = None  # LoftQ initialization configuration
    use_dora: bool = False  # Whether to use DoRA (Weight-Decomposed LoRA)
    layer_replication: Optional[List[Tuple[int, int]]] = None  # Layer replication for parameter sharing
    runtime_config: Optional[LoraRuntimeConfig] = None  # Runtime configuration for LoRA
    eva_config: Optional[EvaConfig] = None  # EVA initialization configuration
    # Additional parameters for specific use cases
    target_parameters: Optional[List[str]] = None  # Parameters to target instead of modules

class LoraModel:
    """LoRA model implementation."""
    def __init__(self, model, config: LoraConfig, adapter_name: str): ...

class LoraRuntimeConfig:
    """Runtime configuration for LoRA that can be changed during inference."""
    def __init__(
        self,
        ephemeral_gpu_offload: bool = False,
        **kwargs
    ): ...
```
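The decomposition is easy to see with plain arithmetic: for a `d x k` weight, LoRA trains `B` (`d x r`) and `A` (`r x k`) and applies `W + (lora_alpha / r) * B @ A`, so the trainable count drops from `d*k` to `r*(d+k)`. A minimal sketch (illustrative only, not PEFT's implementation):

```python
# Minimal LoRA parameter arithmetic (illustrative; not PEFT's implementation).
d, k, r, lora_alpha = 768, 768, 8, 8

full_params = d * k        # trainable params of the dense layer
lora_params = r * (d + k)  # trainable params of the two LoRA factors
scaling = lora_alpha / r   # scaling applied to B @ A before adding to W

print(full_params)  # 589824
print(lora_params)  # 12288
print(scaling)      # 1.0
# LoRA here trains about 2% of the dense layer's parameters.
print(round(lora_params / full_params, 4))  # 0.0208
```

With `use_rslora=True` the scaling becomes `lora_alpha / sqrt(r)` instead, which the rank-stabilized LoRA paper argues behaves better at high ranks.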

### AdaLoRA (Adaptive LoRA)

Adaptive LoRA that dynamically allocates the parameter budget across weight matrices based on importance scores.

```python { .api }
class AdaLoraConfig(PeftConfig):
    """Configuration for AdaLoRA (Adaptive LoRA)."""
    def __init__(
        self,
        target_r: int = 8,
        init_r: int = 12,
        tinit: int = 0,
        tfinal: int = 0,
        deltaT: int = 1,
        beta1: float = 0.85,
        beta2: float = 0.85,
        orth_reg_weight: float = 0.5,
        total_step: Optional[int] = None,
        rank_pattern: Optional[dict] = None,
        **kwargs
    ):
        """
        Args:
            target_r: Target average rank of incremental matrix
            init_r: Initial rank for each incremental matrix
            tinit: Number of warmup steps for rank reduction
            tfinal: Final step for rank reduction
            deltaT: Step interval for rank reduction
            beta1: Hyperparameter of EMA for sensitivity smoothing
            beta2: Hyperparameter of EMA for uncertainty quantification
            orth_reg_weight: Orthogonal regularization weight
            total_step: Total training steps (for automatic scheduling)
            rank_pattern: Mapping from layer names to different target ranks
        """

class AdaLoraModel:
    """AdaLoRA model implementation."""
    def __init__(self, model, config: AdaLoraConfig, adapter_name: str): ...

    def update_and_allocate(self, global_step: int): ...
```
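The `tinit`/`tfinal`/`init_r`/`target_r` parameters drive a budget schedule: the rank stays at `init_r` for the first `tinit` steps, decays cubically, and holds at `target_r` after `tfinal`. A sketch of that schedule, following the cubic decay described in the AdaLoRA paper (illustrative, not PEFT's exact code):

```python
def adalora_budget(step: int, init_r: int, target_r: int, tinit: int, tfinal: int) -> float:
    """Cubic rank-budget schedule used by AdaLoRA (illustrative sketch)."""
    if step < tinit:
        return float(init_r)    # warmup: keep the initial rank
    if step > tfinal:
        return float(target_r)  # after tfinal: hold the target rank
    # cubic decay from init_r down to target_r between tinit and tfinal
    progress = (step - tinit) / (tfinal - tinit)
    return target_r + (init_r - target_r) * (1 - progress) ** 3

print(adalora_budget(0, init_r=12, target_r=8, tinit=200, tfinal=1000))     # 12.0
print(adalora_budget(600, init_r=12, target_r=8, tinit=200, tfinal=1000))   # 8.5
print(adalora_budget(2000, init_r=12, target_r=8, tinit=200, tfinal=1000))  # 8.0
```

Within the decay window, importance scores (smoothed by the `beta1`/`beta2` EMAs) decide which singular directions are pruned to meet the shrinking budget.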

### LoRA Variants

Alternative LoRA formulations that modify the decomposition or combination strategy.

```python { .api }
class LoHaConfig(PeftConfig):
    """Configuration for LoHa (Low-Rank Hadamard Product)."""
    def __init__(
        self,
        r: int = 8,
        alpha: int = 8,
        rank_dropout: float = 0.0,
        module_dropout: float = 0.0,
        use_effective_conv2d: bool = False,
        **kwargs
    ):
        """
        Args:
            r: LoHa rank
            alpha: LoHa alpha scaling parameter
            rank_dropout: Rank dropout probability
            module_dropout: Module dropout probability
            use_effective_conv2d: Use parameter-effective decomposition for Conv2d
        """

class LoHaModel:
    """LoHa model implementation."""
    def __init__(self, model, config: LoHaConfig, adapter_name: str): ...

class LoKrConfig(PeftConfig):
    """Configuration for LoKr (Low-Rank Kronecker Product)."""
    def __init__(
        self,
        r: int = 8,
        alpha: int = 8,
        rank_dropout: float = 0.0,
        module_dropout: float = 0.0,
        use_effective_conv2d: bool = False,
        decompose_both: bool = False,
        decompose_factor: int = -1,
        **kwargs
    ):
        """
        Args:
            r: LoKr rank
            alpha: LoKr alpha scaling parameter
            rank_dropout: Rank dropout probability
            module_dropout: Module dropout probability
            use_effective_conv2d: Use parameter-effective decomposition for Conv2d
            decompose_both: Decompose both input and output dimensions
            decompose_factor: Factor for matrix decomposition
        """

class LoKrModel:
    """LoKr model implementation."""
    def __init__(self, model, config: LoKrConfig, adapter_name: str): ...
```
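The two variants differ in how the low-rank factors are combined: LoHa takes the elementwise (Hadamard) product of two rank-`r` products, which can reach an effective rank of up to `r**2`, while LoKr builds the update as a Kronecker product of smaller blocks. A tiny sketch of both combinations using plain Python lists so the arithmetic is explicit (illustrative only):

```python
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def hadamard(X, Y):
    # elementwise product of two same-shaped matrices
    return [[x * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def kron(X, Y):
    # Kronecker product: each entry of X scales a full copy of Y (here Y has one row)
    return [[x * y for x in rx for y in ry] for rx in X for ry in Y]

# LoHa: delta_W = (B1 @ A1) * (B2 @ A2) -- Hadamard product of two rank-1 updates
B1, A1 = [[1], [2]], [[3, 4]]
B2, A2 = [[1], [1]], [[2, 0]]
print(hadamard(matmul(B1, A1), matmul(B2, A2)))  # [[6, 0], [12, 0]]

# LoKr: delta_W = kron(W1, W2) -- structured update from two small blocks
print(kron([[1, 2]], [[0, 1]]))  # [[0, 1, 0, 2]]
```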

### Advanced LoRA Configurations

Specialized configurations and initialization methods for LoRA.

```python { .api }
@dataclass
class LoftQConfig:
    """Configuration for LoftQ initialization."""
    loftq_bits: int = 4  # Quantization bits for LoftQ
    loftq_iter: int = 1  # Number of LoftQ iterations

@dataclass
class EvaConfig:
    """Configuration for EVA (Explained Variance Adaptation) initialization."""
    rho: float = 2.0  # Rho value for EVA redistribution (>= 1.0)
    tau: float = 0.99  # Cosine similarity threshold for early stopping
    use_label_mask: bool = True  # Use label mask for EVA initialization
    label_mask_value: int = -100  # Value to look for to mask out ignored tokens
    whiten: bool = False  # Apply whitening to singular vectors
    adjust_scaling_factors: bool = True  # Adjust scaling factors during EVA

class VBLoRAConfig(PeftConfig):
    """Configuration for VB-LoRA (Vector Bank LoRA)."""
    def __init__(
        self,
        r: int = 8,
        lora_alpha: int = 8,
        target_modules: Optional[Union[List[str], str]] = None,
        lora_dropout: float = 0.0,
        **kwargs
    ): ...

class VBLoRAModel:
    """VB-LoRA model implementation."""
    def __init__(self, model, config: VBLoRAConfig, adapter_name: str): ...

class RandLoraConfig(PeftConfig):
    """Configuration for RandLoRA (Randomized LoRA)."""
    def __init__(
        self,
        r: int = 8,
        lora_alpha: int = 8,
        target_modules: Optional[Union[List[str], str]] = None,
        lora_dropout: float = 0.0,
        **kwargs
    ): ...

class RandLoraModel:
    """RandLoRA model implementation."""
    def __init__(self, model, config: RandLoraConfig, adapter_name: str): ...
```
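EVA initializes the LoRA `A` matrix from the directions that explain the most variance in a layer's incoming activations, obtained via SVD over a batch of inputs; `whiten` rescales those directions and `rho`/`tau` control rank redistribution and early stopping. The core idea can be sketched with NumPy (illustrative; not PEFT's implementation):

```python
import numpy as np

def eva_style_init(activations: np.ndarray, r: int, whiten: bool = False) -> np.ndarray:
    """Return an (r, in_features) init for LoRA A from the top right-singular
    vectors of a batch of layer inputs (sketch of the EVA idea)."""
    # SVD of the (batch, in_features) activation matrix
    _, s, vt = np.linalg.svd(activations, full_matrices=False)
    a = vt[:r]                # top-r directions of explained variance
    if whiten:
        a = a / s[:r, None]   # rescale directions by their singular values
    return a

rng = np.random.default_rng(0)
acts = rng.standard_normal((64, 16))  # stand-in batch of layer inputs
a_init = eva_style_init(acts, r=4)
print(a_init.shape)                                # (4, 16)
# rows are orthonormal right-singular vectors
print(np.allclose(a_init @ a_init.T, np.eye(4)))   # True
```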

### LoRA Utilities

Utility functions for LoRA weight management and initialization.

```python { .api }
def get_eva_state_dict(model, adapter_name: str = "default") -> dict:
    """
    Get the EVA state dictionary for a LoRA model.

    Args:
        model: PEFT model with EVA initialization
        adapter_name: Name of the adapter

    Returns:
        State dictionary for EVA weights
    """

def initialize_lora_eva_weights(model, adapter_name: str = "default"):
    """
    Initialize LoRA weights using the EVA method.

    Args:
        model: PEFT model to initialize
        adapter_name: Name of the adapter to initialize
    """

def replace_lora_weights_loftq(
    peft_model,
    quantized_model,
    num_iter: int = 1,
    device: Optional[str] = None
):
    """
    Replace LoRA weights with LoftQ initialization.

    Args:
        peft_model: PEFT model with LoRA adapters
        quantized_model: Quantized base model
        num_iter: Number of LoftQ iterations
        device: Device to perform computation on
    """
```
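`replace_lora_weights_loftq` applies the LoftQ idea: alternate between quantizing the residual weight and fitting a low-rank correction to the quantization error, so that `Q + B @ A` approximates the original `W` better than quantization alone; `num_iter` controls how many alternations run. A NumPy sketch of the alternation (illustrative; the `fake_quant` helper here is a hypothetical stand-in for a real quantizer):

```python
import numpy as np

def fake_quant(w: np.ndarray, bits: int = 4) -> np.ndarray:
    # Uniform fake quantization onto 2**bits levels (hypothetical stand-in quantizer).
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (2 ** bits - 1)
    return np.round((w - lo) / step) * step + lo

def loftq_sketch(w: np.ndarray, r: int = 2, bits: int = 4, num_iter: int = 3):
    """Alternate quantization and rank-r SVD so that q + b @ a approximates w."""
    b = np.zeros((w.shape[0], r))
    a = np.zeros((r, w.shape[1]))
    for _ in range(num_iter):
        q = fake_quant(w - b @ a, bits)   # quantize what the low-rank part missed
        u, s, vt = np.linalg.svd(w - q, full_matrices=False)
        b, a = u[:, :r] * s[:r], vt[:r]   # best rank-r fit to the quantization error
    return q, b, a

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 8))
q, b, a = loftq_sketch(w)
# The low-rank correction can only shrink the residual left by quantization.
print(np.linalg.norm(w - (q + b @ a)) <= np.linalg.norm(w - q))  # True
```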

## Usage Examples

### Basic LoRA Setup

```python
from transformers import AutoModelForCausalLM
from peft import get_peft_model, LoraConfig

model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Standard LoRA configuration
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["c_attn", "c_proj"],
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM"
)

peft_model = get_peft_model(model, lora_config)
```

### AdaLoRA with Dynamic Rank Allocation

```python
from peft import AdaLoraConfig

adalora_config = AdaLoraConfig(
    target_r=8,
    init_r=12,
    tinit=200,
    tfinal=1000,
    deltaT=10,
    beta1=0.85,
    beta2=0.85,
    orth_reg_weight=0.5,
    task_type="CAUSAL_LM"
)

peft_model = get_peft_model(model, adalora_config)

# During training, trigger the rank update each step
peft_model.base_model.update_and_allocate(global_step)
```

### LoRA with LoftQ Initialization

```python
from peft import LoraConfig, LoftQConfig

loftq_config = LoftQConfig(loftq_bits=4, loftq_iter=1)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    init_lora_weights="loftq",  # activate LoftQ initialization
    loftq_config=loftq_config,
    task_type="CAUSAL_LM"
)

peft_model = get_peft_model(quantized_model, lora_config)
```

### DoRA (Weight-Decomposed LoRA)

```python
dora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    use_dora=True,  # Enable DoRA
    task_type="CAUSAL_LM"
)

peft_model = get_peft_model(model, dora_config)
```

### LoHa (Low-Rank Adaptation with Hadamard Product)

LoRA variant using Hadamard products for improved expressiveness with fewer parameters.

```python { .api }
class LoHaConfig(PeftConfig):
    """Configuration for LoHa (Low-Rank Adaptation with Hadamard Product)."""
    def __init__(
        self,
        r: int = 8,
        alpha: int = 8,
        target_modules: Optional[Union[List[str], str]] = None,
        exclude_modules: Optional[Union[List[str], str]] = None,
        dropout: float = 0.0,
        modules_to_save: Optional[List[str]] = None,
        **kwargs
    ): ...

class LoHaModel:
    """LoHa model implementation."""
    def __init__(self, model, config: LoHaConfig, adapter_name: str): ...
```

Usage:

```python
loha_config = LoHaConfig(
    r=8,
    alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    dropout=0.1,
    task_type="CAUSAL_LM"
)

peft_model = get_peft_model(model, loha_config)
```

### LoKr (Low-Rank Adaptation with Kronecker Product)

LoRA variant using Kronecker products for structured low-rank decomposition.

```python { .api }
class LoKrConfig(PeftConfig):
    """Configuration for LoKr (Low-Rank Adaptation with Kronecker Product)."""
    def __init__(
        self,
        r: int = 8,
        alpha: int = 8,
        target_modules: Optional[Union[List[str], str]] = None,
        exclude_modules: Optional[Union[List[str], str]] = None,
        dropout: float = 0.0,
        modules_to_save: Optional[List[str]] = None,
        decompose_both: bool = False,
        decompose_factor: int = -1,
        **kwargs
    ): ...

class LoKrModel:
    """LoKr model implementation."""
    def __init__(self, model, config: LoKrConfig, adapter_name: str): ...
```

Usage:

```python
lokr_config = LoKrConfig(
    r=8,
    alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    decompose_both=True,
    decompose_factor=8,
    task_type="CAUSAL_LM"
)

peft_model = get_peft_model(model, lokr_config)
```

### LoftQ Configuration

Configuration class for LoftQ initialization used with LoRA methods.

```python { .api }
class LoftQConfig:
    """Configuration for LoftQ (LoRA-Fine-Tuning-aware Quantization)."""
    def __init__(
        self,
        loftq_bits: int = 4,
        loftq_iter: int = 1,
        fake_quant: bool = True,
        **kwargs
    ):
        """
        Args:
            loftq_bits: Number of bits for quantization
            loftq_iter: Number of alternating steps
            fake_quant: Whether to use fake quantization
        """
```

### Runtime Configuration

Runtime configuration for LoRA that can be modified during inference.

```python { .api }
class LoraRuntimeConfig:
    """Runtime configuration for LoRA."""
    def __init__(
        self,
        ephemeral_gpu_offload: bool = False,
        **kwargs
    ):
        """
        Args:
            ephemeral_gpu_offload: Whether to use ephemeral GPU offloading
        """
```
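A runtime config is attached through `LoraConfig`'s `runtime_config` field at adapter creation. A sketch under the fields documented above (illustrative; `model` stands for an already-loaded base model, as in the earlier examples):

```python
from peft import LoraConfig, LoraRuntimeConfig, get_peft_model

# Offload-eligible LoRA computations move to the GPU only while needed.
runtime_config = LoraRuntimeConfig(ephemeral_gpu_offload=True)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    runtime_config=runtime_config,
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(model, lora_config)
```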