Supported models

Edge Kit supports local model directories that follow the expected model family layout for each engine.

Developer Preview

Model support is expanding during Developer Preview. Validate each model on the target device class before shipping.

Categories

| Category | Engine | Input | Output |
| --- | --- | --- | --- |
| LLM | LLMEngine | Text messages | Streaming text |
| VLM | VLMEngine | Text messages and images | Streaming text |
| STT | STTEngine for native ASR; WhisperEngine only as a preview bridge | Audio | Text |
| TTS | TTSEngine | Text | PCM audio |

| Category | Recommended starting point |
| --- | --- |
| LLM | Qwen3-4B-4bit, Qwen3.5-0.8B, Qwen3.5-4B-4bit, Qwen3.5-9B-4bit |
| VLM | Qwen3.5-4B-4bit VLM variant |
| STT | Qwen3-ASR-0.6B-8bit for STTEngine; Whisper-family files only when your app supplies a real Whisper binding |
| TTS | Qwen3-TTS-12Hz-0.6B-CustomVoice-bf16 |

Device fit

| Model size | Recommended device class |
| --- | --- |
| 0.8B | Any Apple Silicon device |
| 4B | 8 GB or more unified memory recommended |
| 9B | 16 GB or more unified memory recommended, or a validated high-memory iOS device |

On iOS, the memory available to an app is lower than the device's physical RAM. Test on the exact device class you plan to support.
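
The size thresholds in the table above can be turned into a first-pass check at launch. The following is a minimal sketch, not part of Edge Kit: `ModelSize` and `largestRecommendedSize(forMemoryGB:)` are hypothetical names introduced here, and the only system call used is Foundation's `ProcessInfo.processInfo.physicalMemory`. It treats physical memory as an upper bound only, so it does not replace testing on the target device.

```swift
import Foundation

/// Model size tiers taken from the device-fit table (hypothetical type, not an Edge Kit API).
enum ModelSize: String, CaseIterable {
    case small = "0.8B"
    case medium = "4B"
    case large = "9B"

    /// Suggested minimum unified memory in GB; 0 means the table states no minimum.
    var minimumMemoryGB: Double {
        switch self {
        case .small: return 0
        case .medium: return 8
        case .large: return 16
        }
    }
}

/// Returns the largest tier whose suggested minimum fits within `memoryGB`.
func largestRecommendedSize(forMemoryGB memoryGB: Double) -> ModelSize {
    // allCases follows declaration order (smallest to largest),
    // so the last matching case is the largest tier that fits.
    return ModelSize.allCases.last(where: { $0.minimumMemoryGB <= memoryGB }) ?? .small
}

// Physical memory is an upper bound; the per-app limit on iOS is lower.
let physicalGB = Double(ProcessInfo.processInfo.physicalMemory) / 1_073_741_824
let startingTier = largestRecommendedSize(forMemoryGB: physicalGB)
print("Physical memory: \(physicalGB) GB, suggested starting tier: \(startingTier.rawValue)")
```

Some devices report slightly less physical memory than their nominal size, in which case this check falls back to the smaller tier. That is the conservative direction, but the result is still only a starting point for on-device validation.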

Model source

Preview models are distributed through Hugging Face. Edge Kit can load from:

  • A local directory that already contains the model files.
  • A ModelConfig entry that points to a Hugging Face repository.
  • A model exported from Edge Studio.
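
```swift
import Foundation      // provides URL
import EdgeInference

// Create a text-generation engine.
let engine = LLMEngine()

// Point at a local directory that already contains the model files.
let modelURL = URL(fileURLWithPath: "/path/to/model")

// Loading is asynchronous and can throw; call it from an async, throwing context.
try await engine.loadLocal(directory: modelURL)
```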

Custom models

Use safetensors-format models compatible with the supported model families. For best results:

  • Keep tokenizer files next to config.json.
  • Test generation, memory use, and unload/reload behavior.
  • Export through Edge Studio when you need an Edge Kit-ready bundle.
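
Before handing a custom model directory to an engine, the layout points above can be checked with plain Foundation calls. This is a minimal sketch under stated assumptions: `ModelLayoutIssue` and `inspectModelDirectory(_:tokenizerFileNames:)` are hypothetical helpers, not Edge Kit APIs, and the default tokenizer file names (`tokenizer.json`, `tokenizer_config.json`) are assumed from common Hugging Face layouts rather than taken from the Edge Kit documentation. Adjust them to match your model family.

```swift
import Foundation

/// Problems found when inspecting a candidate model directory (hypothetical type).
enum ModelLayoutIssue: Equatable {
    case missingConfig
    case missingWeights
    case missingTokenizer
}

/// Checks a directory for config.json, at least one .safetensors file,
/// and at least one tokenizer file in the same directory as config.json.
/// The default tokenizer file names are an assumption, not an Edge Kit requirement.
func inspectModelDirectory(
    _ directory: URL,
    tokenizerFileNames: Set<String> = ["tokenizer.json", "tokenizer_config.json"]
) -> [ModelLayoutIssue] {
    // A missing or unreadable directory is treated as empty.
    let names = (try? FileManager.default.contentsOfDirectory(atPath: directory.path)) ?? []
    var issues: [ModelLayoutIssue] = []
    if !names.contains("config.json") {
        issues.append(.missingConfig)
    }
    if !names.contains(where: { $0.hasSuffix(".safetensors") }) {
        issues.append(.missingWeights)
    }
    if tokenizerFileNames.isDisjoint(with: names) {
        issues.append(.missingTokenizer)
    }
    return issues
}

// Example: report layout problems before attempting to load.
let issues = inspectModelDirectory(URL(fileURLWithPath: "/path/to/model"))
print(issues.isEmpty ? "Layout looks plausible" : "Layout issues: \(issues)")
```

A check like this only confirms that the expected files are present. It does not verify that the weights are compatible with a supported model family, so generation, memory use, and unload/reload still need to be tested on the device.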