Agent skills for iOS, iPadOS, Swift, SwiftUI, and modern Apple framework development.
Complete reference for running open-source LLMs on Apple platforms using MLX Swift and llama.cpp.
MLX Swift is Apple's ML framework for Swift. Of the options covered here, it offers the highest sustained generation throughput on Apple Silicon, because it works directly on the unified memory architecture.
```swift
import MLX
import MLXLLM
import MLXLMCommon  // ModelConfiguration, UserInput, GenerateParameters, generate

// Load a 4-bit quantized model published by mlx-community.
let config = ModelConfiguration(
    id: "mlx-community/Mistral-7B-Instruct-v0.3-4bit"
)
let model = try await LLMModelFactory.shared.loadContainer(
    configuration: config
)

// Run generation inside the container's isolated context.
try await model.perform { context in
    let input = try await context.processor.prepare(
        input: UserInput(prompt: "Hello")
    )
    let stream = try generate(
        input: input,
        parameters: GenerateParameters(temperature: 0.0),
        context: context
    )
    // Print text chunks as they arrive.
    for await part in stream {
        print(part.chunk ?? "", terminator: "")
    }
}
```

| Device | RAM | Recommended Model | Disk Size | RAM Usage |
|---|---|---|---|---|
| iPhone 12-14 | 4-6 GB | SmolLM2-135M or Qwen 2.5 0.5B | ~278 MB | ~0.3 GB |
| iPhone 15 Pro+ | 8 GB | Gemma 3n E4B 4-bit | ~2.7 GB | ~3.5 GB |
| Mac 8 GB | 8 GB | Llama 3.2 3B 4-bit | ~1.8 GB | ~3 GB |
| Mac 16 GB+ | 16 GB+ | Mistral 7B 4-bit | ~4 GB | ~6 GB |
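
The table can be turned into a simple lookup at launch. The following is a minimal sketch, not a definitive policy: `DeviceClass`, `recommendedModel(for:ramBytes:)`, and the GiB thresholds are hypothetical names and values chosen for illustration, and the returned strings are just the model names from the table. The thresholds sit a little below the nominal RAM sizes on the assumption that the OS may report less than the marketed capacity.

```swift
import Foundation

/// Device families that appear in the table above (hypothetical type).
enum DeviceClass {
    case iPhone
    case mac
}

/// Maps a device class and RAM size to the table's recommended model.
/// The thresholds are illustrative assumptions, not measured limits.
func recommendedModel(for device: DeviceClass, ramBytes: UInt64) -> String {
    let gib = Double(ramBytes) / 1_073_741_824  // bytes -> GiB
    switch device {
    case .iPhone:
        // 8 GB phones get the larger model; everything else gets the smallest.
        return gib >= 7.0 ? "Gemma 3n E4B 4-bit" : "SmolLM2-135M"
    case .mac:
        // 16 GB and larger Macs have room for a 7B model at 4-bit.
        return gib >= 15.0 ? "Mistral 7B 4-bit" : "Llama 3.2 3B 4-bit"
    }
}

// Query the RAM installed on the current machine.
let installedRAM = ProcessInfo.processInfo.physicalMemory
#if os(iOS)
let deviceClass = DeviceClass.iPhone
#else
let deviceClass = DeviceClass.mac
#endif
print(recommendedModel(for: deviceClass, ramBytes: installedRAM))
```

Keeping the decision in a pure function makes it easy to unit test against each row of the table without needing the actual hardware.
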
```swift
// Cap MLX's GPU buffer cache.
MLX.GPU.set(cacheLimit: 512 * 1024 * 1024) // 512 MB
```

```swift
@Observable
class ModelManager {
    // ModelContainer is the type returned by loadContainer(configuration:).
    private var model: ModelContainer?
    private var generationCount = 0

    func loadModel() async throws {
        let config = ModelConfiguration(
            id: "mlx-community/Llama-3.2-3B-Instruct-4bit"
        )
        model = try await LLMModelFactory.shared.loadContainer(
            configuration: config
        )
    }

    // Drop the model reference and shrink the GPU buffer cache to zero.
    func unloadModel() {
        model = nil
        MLX.GPU.set(cacheLimit: 0)
    }
}
```

Key lifecycle patterns:
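```swift
import UIKit

// iOS: Observe app lifecycle
NotificationCenter.default.addObserver(
    forName: UIApplication.didEnterBackgroundNotification,
    object: nil, queue: .main
) { _ in
    // cancelGeneration() is assumed to be defined on ModelManager.
    modelManager.cancelGeneration()
    // Give in-flight work a short grace period, then free the model.
    Task {
        try await Task.sleep(for: .seconds(5))
        modelManager.unloadModel()
    }
}
```

llama.cpp is a C/C++ LLM inference engine with the broadest cross-platform support. It uses the GGUF model format.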
```swift
import SwiftLlamaCpp

let service = LlamaService(
    modelUrl: modelURL,
    config: .init(
        batchSize: 256,
        maxTokenCount: 4096,
        useGPU: true
    )
)
let messages = [
    LlamaChatMessage(role: .system, content: "You are helpful."),
    LlamaChatMessage(role: .user, content: "Hello")
]
let stream = try await service.streamCompletion(
    of: messages,
    samplingConfig: .init(temperature: 0.8)
)
for try await token in stream {
    print(token, terminator: "")
}
```

| Level | Quality | Size | Use Case |
|---|---|---|---|
| Q2_K | Lowest | Smallest | Extreme memory constraints |
| Q4_K_M | Good | Balanced | Mobile devices (recommended) |
| Q5_K_M | Higher | Larger | When quality matters more |
| Q8_0 | Near-original | Largest | Desktop with ample RAM |
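
To make the "Size" column concrete, weight storage scales roughly with parameter count times bits per weight. The sketch below is a back-of-the-envelope estimate only: `nominalBits` and `estimatedWeightsGB(parameters:quantization:)` are hypothetical names, and the bit widths are simply the nominal numbers in each quantization name. Real GGUF files store additional scale and metadata information, so treat the result as a lower bound.

```swift
import Foundation

/// Nominal bits per weight implied by each quantization name (assumption:
/// per-block scales and metadata are ignored, so real files are larger).
let nominalBits: [String: Double] = [
    "Q2_K": 2, "Q4_K_M": 4, "Q5_K_M": 5, "Q8_0": 8
]

/// Lower-bound estimate of weight storage in gigabytes (10^9 bytes).
/// Returns nil for quantization levels not listed in `nominalBits`.
func estimatedWeightsGB(parameters: Double, quantization: String) -> Double? {
    guard let bits = nominalBits[quantization] else { return nil }
    return parameters * bits / 8 / 1_000_000_000
}

// Compare the four levels for a 7-billion-parameter model.
for level in ["Q2_K", "Q4_K_M", "Q5_K_M", "Q8_0"] {
    if let gb = estimatedWeightsGB(parameters: 7_000_000_000, quantization: level) {
        print(level, String(format: "%.2f GB", gb))
    }
}
```

For a 7B model at 4 bits this gives 3.5 GB, which lines up with the roughly 4 GB disk size quoted earlier for Mistral 7B 4-bit once overhead is included.
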

| Aspect | llama.cpp | MLX Swift |
|---|---|---|
| Model format | GGUF | Hugging Face / MLX format |
| Platform support | Cross-platform | Apple only |
| Throughput (Apple Silicon) | Good | Best |
| Model ecosystem | Broadest | mlx-community models |
| Maturity | Very mature | Evolving |
| Memory efficiency | Excellent | Good |
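
Because the two engines consume different model formats, an app that ships both can often pick the engine from how the model is stored. This is a rough sketch under stated assumptions: `InferenceBackend` and `backend(forModelAt:)` are hypothetical names, GGUF models are assumed to be single `.gguf` files, and MLX-format models are assumed to be directories containing a config and weight files, as is typical for mlx-community downloads.

```swift
import Foundation

/// The two engines compared in the table above (hypothetical type).
enum InferenceBackend {
    case llamaCpp
    case mlx
}

/// Guesses which engine can load a model from its on-disk layout.
/// Assumption: `.gguf` file -> llama.cpp, directory -> MLX Swift.
func backend(forModelAt url: URL) -> InferenceBackend? {
    if url.pathExtension.lowercased() == "gguf" {
        return .llamaCpp
    }
    if url.hasDirectoryPath {
        return .mlx
    }
    return nil  // Unknown layout; let the caller decide.
}

let ggufModel = URL(fileURLWithPath: "/models/llama.gguf", isDirectory: false)
let mlxModel = URL(fileURLWithPath: "/models/Llama-3.2-3B-Instruct-4bit", isDirectory: true)
print(backend(forModelAt: ggufModel) == .llamaCpp)
print(backend(forModelAt: mlxModel) == .mlx)
```

A production check would likely also inspect the directory contents before committing to a backend; the path-only test here just illustrates the format split.
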

When an app needs multiple AI backends:

```swift
import FoundationModels

// The per-backend helpers and AIError are assumed to be defined elsewhere.
func respond(to prompt: String) async throws -> String {
    // Try Foundation Models first (zero setup, best integration)
    if SystemLanguageModel.default.isAvailable {
        return try await foundationModelsRespond(prompt)
    }
    // Fall back to MLX Swift (best throughput)
    if canLoadMLXModel() {
        return try await mlxRespond(prompt)
    }
    // Fall back to llama.cpp (broadest compatibility)
    if llamaModelAvailable() {
        return try await llamaRespond(prompt)
    }
    throw AIError.noBackendAvailable
}
```

```swift
actor ModelCoordinator {
    private var activeBackend: Backend?

    // Note: actor methods are re-entrant across `await`, so this wrapper
    // alone does not stop overlapping calls from interleaving.
    func withExclusiveAccess<T>(
        _ work: () async throws -> T
    ) async rethrows -> T {
        try await work()
    }

    enum Backend {
        case foundationModels
        case mlx
        case llamaCpp
    }
}
```

Before reaching for custom models, consider built-in frameworks:
No model downloads required:
- `NLLanguageRecognizer` -- Language detection
- `NLTokenizer` -- Word, sentence, paragraph tokenization
- `NLTagger` -- Parts of speech, named entity recognition, sentiment
- `NLEmbedding` -- Word and sentence vectors, similarity search

Built-in computer vision (legacy `VN*` API; for iOS 18+ prefer modern Swift equivalents like `RecognizeTextRequest`):

- `VNRecognizeTextRequest` -- OCR
- `VNClassifyImageRequest` -- Image classification
- `VNDetectFaceRectanglesRequest` -- Face detection
- `VNDetectHumanBodyPoseRequest` -- Body pose estimation

Training custom classifiers directly on device or Mac:

- `session.prewarm()` for Foundation Models before user interaction
- `perform()` call.
- fast recognition level for real-time camera processing
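
Returning to the built-in Natural Language classes listed earlier, the sketch below shows how little setup language detection and tokenization need compared with loading an LLM. It uses `NLLanguageRecognizer.dominantLanguage(for:)` and `NLTokenizer` from the NaturalLanguage framework; the sample text and variable names are illustrative, and the code only runs on Apple platforms.

```swift
import NaturalLanguage

let text = "Hello world"

// Language detection: the result is nil when the recognizer cannot decide.
let language = NLLanguageRecognizer.dominantLanguage(for: text)
print(language?.rawValue ?? "undetermined")

// Word tokenization: collect the substring covered by each token range.
let tokenizer = NLTokenizer(unit: .word)
tokenizer.string = text
let words = tokenizer
    .tokens(for: text.startIndex..<text.endIndex)
    .map { String(text[$0]) }
print(words)
```

Neither call downloads a model, so they are reasonable first choices for lightweight text tasks before reaching for MLX Swift or llama.cpp.
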