OpenELM: Apple's Open Language Model Family

07/09/2025 15 min

Listen "OpenELM: Apple's Open Language Model Family"

Episode Synopsis

The May 2024 sources center on CoreNet, an Apple-developed library for training deep neural networks, and OpenELM, an efficient language-model family built with CoreNet. CoreNet is a versatile toolkit that supports a range of tasks, including foundation models such as large language models (LLMs), object classification, and semantic segmentation, and it evolved from the earlier CVNets. A key innovation is OpenELM's layer-wise scaling strategy, which reallocates parameters across the transformer's layers to reach higher accuracy with fewer pre-training tokens than comparable open LLMs (a minimal sketch of the idea follows the source list). The resources emphasize reproducibility and transparency: they provide a complete framework for OpenELM's training and evaluation, code for inference and fine-tuning on Apple devices using the MLX library, and detailed benchmarks on both NVIDIA CUDA and Apple Silicon hardware.

Sources:

https://arxiv.org/pdf/2404.14619
https://machinelearning.apple.com/research/openelm
https://github.com/apple/corenet
https://github.com/apple/corenet/tree/main/projects/kv-prediction
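
To make the layer-wise scaling idea concrete, here is a minimal Python sketch of a linear schedule that widens the attention and feed-forward blocks from the first transformer layer to the last. The parameter names (alpha_min, alpha_max, beta_min, beta_max) and the example dimensions are illustrative assumptions, not OpenELM's published hyperparameters; the actual schedule is defined in the paper and the CoreNet configuration files.

```python
# Illustrative sketch of a layer-wise scaling schedule.
# Assumption: head count and FFN width grow linearly with depth; the real
# OpenELM configuration (in CoreNet and the arXiv paper) may differ in detail.

def layer_wise_scaling(num_layers: int,
                       d_model: int,
                       head_dim: int,
                       alpha_min: float = 0.5, alpha_max: float = 1.0,  # attention-width scalers (assumed)
                       beta_min: float = 0.5, beta_max: float = 4.0):   # FFN-width scalers (assumed)
    """Return per-layer (num_heads, ffn_dim) under a linear schedule."""
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)              # 0.0 at the first layer, 1.0 at the last
        alpha = alpha_min + (alpha_max - alpha_min) * t
        beta = beta_min + (beta_max - beta_min) * t
        num_heads = max(1, round(alpha * d_model / head_dim))
        ffn_dim = max(head_dim, round(beta * d_model))
        configs.append((num_heads, ffn_dim))
    return configs

if __name__ == "__main__":
    # Hypothetical dimensions, chosen only to show the shape of the schedule.
    for layer, (heads, ffn) in enumerate(layer_wise_scaling(8, d_model=1024, head_dim=64)):
        print(f"layer {layer}: heads={heads}, ffn_dim={ffn}")
```

The point of such a schedule is that a fixed parameter budget is distributed unevenly across depth rather than uniformly, which the sources credit for OpenELM's accuracy at a given pre-training token count.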