Apple Debuts Third-Generation Foundation Models and AFM Core Advanced

Apple introduced the third generation of Apple Foundation Models (AFM), a family of five models spanning on-device and server deployments, in a June 8, 2026 post on its machine learning research site. The set includes two on-device models, AFM 3 Core and AFM 3 Core Advanced, and three server models that run on Private Cloud Compute: AFM 3 Cloud, ADM 3 Cloud (an image model), and AFM 3 Cloud Pro. Apple describes AFM 3 Core Advanced as a 20-billion-parameter, natively multimodal on-device model that uses a sparse architecture, activating only 1 to 4 billion parameters per request so it can run on Apple silicon. Apple worked with Google and NVIDIA to extend Private Cloud Compute for AFM 3 Cloud Pro to NVIDIA GPUs in Google Cloud while, Apple says, preserving its privacy guarantees. A January 12, 2026 joint statement from Apple and Google framed the next-generation AFM family as built with Google and its Gemini technology, though Apple's June 8 post emphasizes its own architecture and Apple silicon optimization.
What happened
Apple announced the third generation of Apple Foundation Models (AFM) in a June 8, 2026 post on its machine learning research site, describing a family of five models that run across devices and Apple's Private Cloud Compute. The family includes two on-device models, AFM 3 Core (the successor to Apple's roughly 3-billion-parameter dense model) and AFM 3 Core Advanced, plus three server models: AFM 3 Cloud, ADM 3 Cloud (a dedicated image model for creation, editing, and Genmoji), and AFM 3 Cloud Pro. Apple says AFM 3 Core Advanced is its most powerful on-device model, a 20-billion-parameter, natively multimodal system that uses a sparse architecture to activate only 1 to 4 billion parameters at a time depending on the request.
Technical details
Apple frames the sparse design as how it fits a 20-billion-parameter model onto consumer hardware: the full parameter store is retained, but only a 1-to-4-billion-parameter subset (5 to 20 percent of the total) is active per prompt, easing memory and compute pressure on Apple silicon. AFM 3 Core, AFM 3 Core Advanced, AFM 3 Cloud, and ADM 3 Cloud (Image) are optimized to run on Apple silicon, while AFM 3 Cloud Pro, positioned for the most demanding agentic tool use and complex reasoning, is optimized for NVIDIA GPUs. Apple says it worked with Google and NVIDIA to extend Private Cloud Compute so AFM 3 Cloud Pro can run on NVIDIA GPUs in Google Cloud while preserving the same privacy guarantees Apple describes for on-device and Apple-silicon server inference, namely that user data is not stored or shared, including with Apple.
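To make the ratio concrete, the sketch below works through the arithmetic implied by Apple's figures. The 20-billion total and the 1-to-4-billion active range come from Apple's post; the 4-bit weight width is purely an assumption for illustration, since no quantization format is stated here, and the helper names are hypothetical.

```python
# Figures reported in Apple's post.
TOTAL_PARAMS = 20_000_000_000
ACTIVE_MIN = 1_000_000_000
ACTIVE_MAX = 4_000_000_000

# ASSUMPTION: 4 bits per weight, chosen only for illustration.
ASSUMED_BITS_PER_WEIGHT = 4


def active_fraction(active_params, total_params=TOTAL_PARAMS):
    """Share of the parameter store that is active for one request."""
    return active_params / total_params


def weight_gigabytes(params, bits_per_weight=ASSUMED_BITS_PER_WEIGHT):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    rows = [
        ("total", TOTAL_PARAMS),
        ("active (min)", ACTIVE_MIN),
        ("active (max)", ACTIVE_MAX),
    ]
    for label, n in rows:
        print(
            f"{label:>12}: {active_fraction(n):.0%} of parameters, "
            f"~{weight_gigabytes(n):.1f} GB at {ASSUMED_BITS_PER_WEIGHT}-bit"
        )
```

Under that assumed width the full store would occupy about 10 GB while the active subset would cover roughly 0.5 to 2 GB per request. Real figures depend on the actual weight format, activations, and caches, none of which are modeled here.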
The Google and NVIDIA partnership
A January 12, 2026 joint statement from Apple and Google characterized the next-generation AFM family as built in collaboration with Google and based on its Gemini technology and cloud infrastructure. Apple's June 8 technical post emphasizes its own model architecture and Apple-silicon optimization, and some independent reporting describes the on-device models as distilled from Gemini rather than running Gemini directly. The practical division reported around the launch is that Google and NVIDIA supply cloud capacity and GPUs for the most capable server model, while Apple retains control of the on-device stack.
Why it matters
For practitioners, the release illustrates two converging trends. First, sparse activation and memory-efficient parameter storage are becoming standard tools for pushing larger, multimodal models onto constrained consumer silicon, with AFM 3 Core Advanced's 20-billion-parameter, 1-to-4-billion-active design a prominent example. Second, even a vendor with deep in-house silicon and model capability is leaning on external frontier-model and cloud partners for its most demanding server workloads, a hybrid device-plus-cloud pattern that blends local inference with privacy-scoped cloud compute.
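How AFM 3 Core Advanced selects its active parameters is not described here. As a generic illustration of sparse activation only, and not a description of Apple's design, the following sketches top-k gating as commonly used in mixture-of-experts models: a gate scores every expert, only the k best-scoring experts run, and their outputs are mixed by softmax weight. All function and variable names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(
        range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True
    )
    chosen = ranked[:k]
    # Renormalize over the chosen experts only, so the weights sum to 1.
    weights = softmax([gate_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def sparse_forward(x, experts, gate_logits, k):
    """Evaluate only the routed experts and mix their outputs by weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))


if __name__ == "__main__":
    # Toy scalar "experts"; real experts would be neural network blocks.
    experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
    gate_logits = [0.1, 2.0, -1.0, 2.0]  # hypothetical gate scores for one input
    print(top_k_route(gate_logits, k=2))
    print(sparse_forward(1.0, experts, gate_logits, k=2))
```

With k set to 2 out of 4 experts, half of the expert parameters are skipped for that input, which is the same kind of saving the 1-to-4-billion-active figure describes at much larger scale.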
What to watch
Open questions include what technical detail Apple will publish on its sparse-activation routing and parameter storage, benchmarks comparing AFM 3 Core Advanced against dense and other sparse on-device models, the developer APIs Apple exposes for on-device versus server calls, and how AFM 3 Cloud Pro performs and scales on NVIDIA GPUs in Google Cloud under Private Cloud Compute. Compatibility across Apple silicon generations and the real memory and latency tradeoffs for multimodal workloads will determine how widely AFM 3 Core Advanced can actually be deployed.
Scoring Rationale
Verified: Apple's third-generation AFM family is a flagship release spanning a novel 20-billion-parameter sparse on-device model and Private Cloud Compute server models, with the most capable server model running on NVIDIA GPUs in Google Cloud. It is a major, deployment-defining model release for a billion-device ecosystem and highly relevant to on-device and hybrid-inference practitioners, though it is scoped to Apple's own platform rather than a field-wide frontier shift.


