Getting started

Install Edge Kit, load a local model, and stream tokens from an on-device LLM.

Developer Preview

Edge Kit is in Developer Preview. Pin the package version you test with and re-validate on real devices after each upgrade.

Requirements

Edge Kit ships on Apple platforms. Android, Linux, HarmonyOS, and Windows support is planned.

Requirement    Version
iOS            17.0 or later
macOS          14.0 or later
Xcode          15 or later
Swift          5.9 or later

For iOS apps that run larger models, enable the Increased Memory Limit entitlement (com.apple.developer.kernel.increased-memory-limit) in your app target.
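Whether a larger model is worth offering also depends on the device. The following is a minimal sketch, not part of Edge Kit, that uses Foundation's ProcessInfo to read installed RAM. The helper name hasPhysicalMemory and the 8 GiB threshold are hypothetical and chosen only for illustration; pick a threshold by testing your model on real devices.

```swift
import Foundation

/// Returns true when the device reports at least `minimumGiB` of physical RAM.
/// The threshold is an app-chosen value, not an Edge Kit requirement.
func hasPhysicalMemory(atLeastGiB minimumGiB: Double) -> Bool {
    let bytes = Double(ProcessInfo.processInfo.physicalMemory)
    let gib = bytes / 1_073_741_824  // 1 GiB = 2^30 bytes
    return gib >= minimumGiB
}

// Example: decide whether to offer a larger model on this device.
let canOfferLargeModel = hasPhysicalMemory(atLeastGiB: 8)
print("Offer larger model:", canOfferLargeModel)
```

Physical memory is only an upper bound. The memory an iOS app may actually use is lower and depends on the entitlement and on system state.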

Install with Swift Package Manager

Add Edge Kit to your package:

// Package.swift
dependencies: [
    .package(url: "https://github.com/AtomGradient/edge-kit.git", exact: "1.0.0-rc97")
]

Pin preview releases exactly. Re-validate on real devices before moving to a newer 1.0.0-rcN tag.

Preview access

Some package resolution paths may require AtomGradient preview access or SSH access for transitive dependencies such as Edge Engine. Run swift package resolve in both your development and CI environments to confirm that every dependency resolves.

Then add the product you need:

.target(
    name: "MyApp",
    dependencies: [
        .product(name: "EdgeInference", package: "edge-kit")
    ]
)

Use EdgeKit if you want the umbrella product:

.product(name: "EdgeKit", package: "edge-kit")

Run your first LLM

import EdgeInference
import Foundation

let engine = LLMEngine()
// Replace with the directory that holds your local model files.
let modelURL = URL(fileURLWithPath: "/path/to/qwen-model")

try await engine.loadLocal(directory: modelURL)

// Print each chunk as it arrives, without a trailing newline.
for try await chunk in engine.generate(
    messages: [.user("Write a one sentence definition of edge AI.")]
) {
    print(chunk.text, terminator: "")
}
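If you need the full response as a single string rather than printed chunks, you can accumulate the stream. The sketch below is not Edge Kit API. collectText is a hypothetical helper that works on any async sequence of strings, and the demo stream is a stand-in so the code runs without a model. It assumes that engine.generate returns an async sequence whose elements expose a text property, as the loop above suggests.

```swift
/// Joins every text piece from an async sequence into one string.
func collectText<S: AsyncSequence>(from pieces: S) async throws -> String
where S.Element == String {
    var result = ""
    for try await piece in pieces {
        result += piece
    }
    return result
}

// Stand-in stream so the sketch runs without a model.
// With Edge Kit you would presumably pass something like
// engine.generate(messages: ...).map { $0.text } instead (an assumption).
let demo = AsyncThrowingStream<String, Error> { continuation in
    for piece in ["Edge AI ", "runs models ", "on the device."] {
        continuation.yield(piece)
    }
    continuation.finish()
}

let text = try await collectText(from: demo)
print(text)
```

Accumulating trades latency for simplicity: nothing is shown to the user until generation finishes, so streaming remains the better fit for chat-style interfaces.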

Prepare a registered model

ModelConfig contains entries for supported model families. Prepare the model with EdgeModelKit, then load the local cache directory with loadLocal(directory:).

import EdgeInference
import EdgeModelKit

let engine = LLMEngine()

guard let config = ModelConfig.find(modelID: "qwen3.5-9b-4bit") else {
    throw EdgeRuntimeError.modelNotFound("qwen3.5-9b-4bit")
}

try await HFDownloader.shared.download(config: config) { progress in
    print("Download progress:", progress)
}

try await engine.loadLocal(directory: ModelCache.shared.cachedURL(for: config))
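To avoid starting a download when the files are already on disk, you could check the cache directory first. This is a minimal sketch using only Foundation. isPopulatedDirectory is a hypothetical helper, not part of EdgeModelKit, and the sketch assumes that ModelCache.shared.cachedURL(for:) returns a plain file URL, as its use above suggests.

```swift
import Foundation

/// Returns true when `url` points at an existing directory with at least one entry.
func isPopulatedDirectory(_ url: URL) -> Bool {
    var isDirectory: ObjCBool = false
    guard FileManager.default.fileExists(atPath: url.path, isDirectory: &isDirectory),
          isDirectory.boolValue else {
        return false
    }
    let contents = (try? FileManager.default.contentsOfDirectory(atPath: url.path)) ?? []
    return !contents.isEmpty
}

// Demo with a path that does not exist yet.
// With Edge Kit you might pass ModelCache.shared.cachedURL(for: config) (an assumption).
let missing = FileManager.default.temporaryDirectory
    .appendingPathComponent(UUID().uuidString)
print("Cached:", isPopulatedDirectory(missing))
```

A non-empty directory does not prove that a download completed. Treat this as a quick pre-check and still handle errors from loadLocal(directory:).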

Run your first VLM

Use VLMEngine when the model accepts images and text.

import EdgeInference
import Foundation

let engine = VLMEngine()
// Replace both paths with your local model directory and image file.
let modelURL = URL(fileURLWithPath: "/path/to/vlm-model")
let imageURL = URL(fileURLWithPath: "/path/to/image.jpg")

try await engine.loadLocal(directory: modelURL)

for try await chunk in engine.generate(
    messages: [.user("Describe this image in one paragraph.")],
    images: [imageURL]
) {
    print(chunk.text, terminator: "")
}

On iOS, prefer the ciImages: overload after loading an image into memory:

// ciImage is a CIImage that your app has already created.
for try await chunk in engine.generate(
    messages: [.user("What is visible in this photo?")],
    ciImages: [ciImage]
) {
    print(chunk.text, terminator: "")
}
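The snippet above assumes that a ciImage value already exists. One way to create it from a file is sketched below with Core Image directly. loadCIImage is a hypothetical helper, not Edge Kit API; applying the orientation property is an illustrative choice so that photos taken in portrait are not passed to the model rotated.

```swift
import CoreImage
import Foundation

/// Loads a CIImage from a file URL, honoring the file's orientation metadata.
/// Returns nil when the file is missing or cannot be decoded.
func loadCIImage(from url: URL) -> CIImage? {
    CIImage(contentsOf: url, options: [.applyOrientationProperty: true])
}

// Example: a missing file yields nil rather than a crash.
if let image = loadCIImage(from: URL(fileURLWithPath: "/path/to/image.jpg")) {
    print("Loaded image with extent:", image.extent)
} else {
    print("Could not load image")
}
```

If your app already holds a UIImage or a camera pixel buffer, Core Image provides other CIImage initializers that avoid a round trip through disk.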

Next steps

Task                         Guide
Text generation              LLM guide
Vision-language inference    VLM guide
Model cache and downloads    Model management
iOS memory guidance          Memory management
Platform support             Platform requirements