Running local models on Macs gets faster with Ollama’s MLX support



Ollama, a runtime system for running large language models on a local computer, has introduced support for MLX, Apple’s open source machine learning framework. Ollama also says it has improved caching performance and now supports Nvidia’s NVFP4 model compression format, which allows more efficient use of memory for certain models.

Together, these developments promise significantly improved performance on Macs with Apple Silicon chips (M1 or later), and the timing couldn’t be better, as local models are gaining steam in ways they haven’t before outside the researcher and hobbyist communities.

OpenClaw, the latest runaway success, with more than 300,000 stars on GitHub, has made headlines with experiences like Moltbook and become an obsession, especially in China, where many people are testing models that run on their own machines.

As developers grew frustrated with rate limits and the high cost of premium subscriptions to tools like Claude Code or ChatGPT Codex, experimentation with local coding models heated up. (Ollama also recently expanded its Visual Studio Code integration.)

The new support is available in preview (in Ollama 0.19) and currently covers only one model, a 35-billion-parameter variant of Alibaba’s Qwen3.5. The hardware requirements are steep by ordinary users’ standards: a Mac with Apple Silicon and at least 32 GB of RAM, according to the announcement.
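For readers who want to try the preview once the model is pulled, the sketch below shows one way to query a locally running Ollama server from Python over its standard HTTP API. The model tag "qwen3.5:35b" is an assumption for illustration, since the announcement does not spell out the exact name; check ollama list on your machine for the real tag.

    import json
    import urllib.request

    # Ollama listens on localhost:11434 by default; /api/generate is its
    # standard text-generation endpoint.
    OLLAMA_URL = "http://localhost:11434/api/generate"

    payload = {
        # Placeholder tag -- substitute the actual name of the MLX-backed
        # Qwen3.5 preview model as reported by `ollama list`.
        "model": "qwen3.5:35b",
        "prompt": "Write a Python function that reverses a string.",
        "stream": False,  # return one JSON object instead of a token stream
    }

    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])

The same request works against any model Ollama has loaded, so the snippet is a convenient way to compare the MLX-backed preview with an existing GGUF-based model on the same machine.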


