Apple's third-generation Foundation Models explained

During the WWDC26 keynote, Apple announced the third generation of Apple Foundation Models (AFM), which consists of five models: some on-device, some cloud-based, and one that lives on Google servers running on Nvidia chips. Here’s a quick look at how it will work.

A little background

When Apple first announced its foundation models in 2024, the lineup included an on-device language model with nearly 3 billion parameters and a “larger server-based language model available with Private Cloud Compute and running on Apple silicon servers,” as the company put it at the time.

Private Cloud Compute was an ambitious effort, as it aimed to deliver cloud-based AI capabilities while maintaining the same privacy guarantees users expect from on-device processing.

That’s why it was important to keep everything in-house. Private Cloud Compute ran in Apple data centers, on servers equipped with Apple silicon. In addition, its privacy guarantees could be independently verified by third-party security researchers.

However, as Apple struggled to get its AI aspirations off the ground, the company partnered with Google to use Gemini as the basis of its new AI efforts, the results of which were announced during the WWDC26 keynote earlier this week.

Apple’s new flagship models

The third-generation AFM lineup includes five models: AFM 3 Core and AFM 3 Core Advanced, which are on-device models, and AFM 3 Cloud, ADM 3 Cloud (Image), and AFM 3 Cloud Pro, which are server-based. In ADM 3 Cloud (Image), the D stands for diffusion, a technology we’ve covered in the past.

All of the models except AFM 3 Cloud Pro are designed to run on Apple silicon. AFM 3 Cloud Pro, meanwhile, runs on NVIDIA GPUs hosted on Google Cloud.

This became possible after Apple extended its Private Cloud Compute architecture to third-party infrastructure for the first time, “while maintaining Apple’s strong security and privacy protections,” according to the company.

As for the models themselves, here’s a breakdown of each, as explained by Apple:

AFM 3 Core, the next generation of our dense model with 3 billion parameters, providing a step up in quality.

AFM 3 Core Advanced, our most powerful on-device model. It is natively multimodal, providing useful features such as expressive sounds and higher-accuracy dictation. Built on cutting-edge Apple research, this 20-billion-parameter model uses a sparse architecture, activating only 1 to 4 billion parameters at a time, depending on demand. AFM 3 Core Advanced is unlocked by and optimized for our most capable Apple silicon systems.

AFM 3 Cloud, our server-side model optimized for speed, efficiency and performance.

ADM 3 Cloud (Image) for image creation and editing, which unlocks advanced photo editing tools, the all-new Image Playground, and more.

AFM 3 Cloud Pro, our most capable server-based model, which powers our most demanding use cases, such as agentic tool use and complex reasoning.

The highlights here are AFM 3 Core Advanced and AFM 3 Cloud Pro.

Starting with AFM 3 Core Advanced, it packs 20 billion parameters into an on-device model, which is no small feat. Most on-device models aimed at the general public tend to stay in the low single-digit billions of parameters.

To make AFM 3 Core Advanced work well, Apple used a sparse architecture that activates only 1 to 4 billion parameters at a time, depending on demand, rather than a dense architecture that must keep all 20 billion parameters active for each request.

Although conceptually similar to a Mixture of Experts approach, this selective activation is based on a technique invented by Apple and detailed in an interesting study, “Instruction-Following Pruning for Large Language Models,” released a year ago.
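To make the idea of selective activation more concrete, here is a minimal Python sketch of generic Mixture-of-Experts-style top-k routing. It is not Apple’s technique, which the article only describes as conceptually similar. The function names, the toy “experts,” and the router scores are all hypothetical and exist purely for illustration. The last helper simply works out what share of a 20-billion-parameter model is active when 1 to 4 billion parameters are in use.

```python
import math


def top_k_gate(scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in chosen)  # subtract the max for numerical stability
    exps = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]


def sparse_forward(x, experts, scores, k):
    """Run only the selected experts and blend their outputs by gate weight."""
    chosen, weights = top_k_gate(scores, k)
    output = sum(w * experts[i](x) for i, w in zip(chosen, weights))
    return output, chosen


def active_fraction(active_params, total_params):
    """Share of the model's parameters that take part in one request."""
    return active_params / total_params


# Hypothetical toy setup: four "experts" that each just scale their input.
experts = [lambda x, scale=i + 1: x * scale for i in range(4)]
scores = [0.1, 2.0, -1.0, 1.5]  # made-up router scores for a single request

output, chosen = sparse_forward(1.0, experts, scores, k=2)
print(chosen)  # [1, 3]: only two of the four experts ran
print(active_fraction(1_000_000_000, 20_000_000_000))  # 0.05
print(active_fraction(4_000_000_000, 20_000_000_000))  # 0.2
```

Under these assumptions, the 1 to 4 billion active parameters Apple cites would correspond to between 5% and 20% of the full 20-billion-parameter model doing work on any given request.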

As for AFM 3 Cloud Pro, it is the one that runs on third-party infrastructure. You can read the technical details of this expansion in an article posted on Apple’s Security blog earlier this week, but here’s the most important part:

On this foundation, Apple and Google have collaborated to create capabilities that go beyond traditional confidential computing deployments:

We do not rely solely on confidential computing technologies to mitigate attacks that use privileged access outside the confidential VM, including side-channel attacks. We consider every component, from software to host and guest OS stacks to application code, part of our trusted computing base, subject to our auditable transparency and no-privileged-access guarantees.

To reduce the risk of supply chain attacks, we maintain a cryptographically verified, append-only ledger of all Google Cloud hardware that is part of the PCC fleet. Software attestation for components that can be exploited to compromise user data relies on at least two separate roots of trust from independent vendors.

Even when combined with confidential computing, we believe that the inference stack should be designed with privacy and security in mind from the start. PCC on Google Cloud uses the same architectural security patterns as PCC on Apple silicon to implement these layered protections: analysis of the initial network data for each request occurs in a dedicated process in its own namespace, the shared result program is recycled in a short period of time, and verified keys are stored in a special secret entry, separate from the external VM.
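For readers unfamiliar with the term, a “cryptographically verified, append-only ledger” can be illustrated with a simple hash chain. The Python sketch below is only an illustration of that general idea. Apple has not published this code, and the class name, record fields, and serial numbers are hypothetical. Each entry’s hash covers the previous entry’s hash, so altering or removing an earlier record breaks every hash that follows.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


class AppendOnlyLedger:
    """Toy hash-chained log: entries can be added, never edited in place."""

    def __init__(self):
        self._entries = []  # list of (record, hash) pairs

    def head(self):
        """Hash of the latest entry, which commits to the whole history."""
        return self._entries[-1][1] if self._entries else GENESIS

    def append(self, record):
        digest = entry_hash(self.head(), record)
        self._entries.append((record, digest))
        return digest

    def verify(self):
        """Recompute the chain and confirm every stored hash still matches."""
        prev = GENESIS
        for record, digest in self._entries:
            if entry_hash(prev, record) != digest:
                return False
            prev = digest
        return True


ledger = AppendOnlyLedger()
ledger.append({"serial": "HYPOTHETICAL-0001", "role": "gpu-node"})
ledger.append({"serial": "HYPOTHETICAL-0002", "role": "gpu-node"})
print(ledger.verify())  # True: the chain is intact
```

In a real deployment, the head hash would typically be published or signed so outside parties can check that the history has not been rewritten. How Apple and Google actually do this is not covered in the excerpt above.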

On its Machine Learning Research blog, Apple says all five models “share a common initial foundation before specializing for their respective architectures and use cases, adding multimodal capabilities such as audio, image understanding, long-context reasoning, and high-quality visual generation.”

The company adds that it used “a mix of data that includes public data, data licensed or purchased from third parties, open source data, data derived from proprietary research, and synthetic data.” Apple also emphasizes that no user data or interactions are used in the training process, and that web publishers can opt out of foundation model training.

Results

Apple said it conducted extensive human evaluations of its third-generation foundation models, with internal reviewers evaluating responses in categories such as following instructions, accuracy, presentation and image understanding.

Models are evaluated against their predecessors (where applicable) and you can see some of the results below:

Proportion of preferred responses in side-by-side human evaluations of overall text capabilities, comparing AFM 3 Core and AFM 3 Cloud to our previous generation models. The results are presented across four different locale groups to demonstrate consistent performance across international variants. “English” represents our global English-language evaluation set, and “PFIGSCJK”, “DNNSTV” and “AFIHHMPRTU” represent the rest of our supported global languages.

Proportion of preferred responses in side-by-side human assessments of descriptive comprehension in English. The results compare AFM 3 Core and AFM 3 Cloud to their 2025 predecessors.

Proportion of preferred responses in side-by-side human ratings for dictation tasks. The results compare AFM 3 Core Advanced with Apple’s current production dictation system across seven quality measures. AFM 3 Core Advanced demonstrates a positive win rate in overall quality, with a consistently widening advantage across all individual formatting and comprehension metrics.
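As a rough guide to reading these charts, the Python sketch below shows one common way to turn side-by-side judgments into preference shares and a net win rate. Apple does not spell out its exact calculation here, so the labels (“new,” “tie,” “old”), the sample judgments, and the definition of net win rate as the new model’s share minus the old model’s share are all assumptions made for illustration.

```python
from collections import Counter

LABELS = ("new", "tie", "old")  # hypothetical labels for each side-by-side judgment


def preference_shares(judgments):
    """Fraction of comparisons won by the new model, tied, or won by the old one."""
    counts = Counter(judgments)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("no judgments to score")
    return {label: counts[label] / total for label in LABELS}


def net_win_rate(judgments):
    """Assumed definition: share preferring the new model minus share preferring the old."""
    shares = preference_shares(judgments)
    return shares["new"] - shares["old"]


# Made-up sample: 10 comparisons, not Apple's data.
sample = ["new"] * 6 + ["tie"] * 2 + ["old"] * 2
print(preference_shares(sample))
print(net_win_rate(sample) > 0)  # True: a positive win rate under this definition
```

Under this reading, a “positive win rate” simply means reviewers preferred the new model’s responses more often than the older system’s.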

For an even deeper look at the third-generation Apple Foundation Models, follow this link.
