xiand.ai
April 11, 2026 · Updated 09:03 UTC
Artificial Intelligence

New Mac toolkit enables local multimodal fine-tuning of Gemma models

A new open-source framework allows developers to fine-tune Google's Gemma models on image, audio, and text data directly on Apple Silicon without relying on expensive NVIDIA GPUs or cloud infrastructure.

Alex Chen

2 min read

Image credit: apple.com

Developers can now fine-tune Google’s Gemma 3n and Gemma 4 models on image, audio, and text data directly on macOS hardware. The new toolkit, gemma-tuner-multimodal, leverages Apple’s Metal Performance Shaders (MPS) backend to run training jobs that previously required high-end NVIDIA H100 GPU clusters.

Built on PyTorch and PEFT’s LoRA (Low-Rank Adaptation) adapters, the framework gives users a native path to customize models for domain-specific tasks such as medical dictation, specialized legal transcription, or visual analysis of manufacturing defects and charts. Because training runs locally, sensitive data never leaves the user’s machine, satisfying privacy requirements for enterprise and personal use.

Streamlining complex workflows on local hardware

A primary challenge in training multimodal models is the sheer volume of data, which often exceeds the capacity of a standard laptop’s SSD. The toolkit addresses this by integrating cloud-native data streaming. Developers can pull datasets directly from Google Cloud Storage or BigQuery, allowing for the training of models on terabytes of information without requiring massive local storage.

"If you want to fine-tune Gemma on text, images, or audio without renting an H100 or copying a terabyte of data to your laptop, this is the only toolkit that does all three modalities on Apple Silicon," the project documentation states. The system is designed to be highly modular, with a hierarchical configuration system that allows users to define custom model profiles and dataset splits via simple INI files.
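The article doesn’t reproduce the configuration schema, but a hierarchical profile of roughly this shape illustrates the idea; every section and key name below is hypothetical, not the toolkit’s documented format:

```ini
; Hypothetical model profile -- names are illustrative only.
[model]
checkpoint = google/gemma-2b
modalities = text,image,audio

[dataset]
train_split = 0.9
eval_split  = 0.1

[lora]
rank  = 16
alpha = 32
```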

The project supports a variety of Gemma checkpoints, including the 2B and 4B variants of Gemma 4 and Gemma 3n. While larger models—such as the 26B or 31B versions—are not yet supported due to architectural differences, the current implementation covers the most common use cases for on-device AI tuning.

To begin, users require Python 3.10 or higher and a Mac running macOS 12.3 or later. The framework includes a command-line interface and a guided wizard to simplify the setup process, ensuring that the MPS environment is correctly initialized before training begins. Once training is complete, the toolkit exports the results as a merged Hugging Face or SafeTensors tree, making the fine-tuned adapters ready for immediate use in inference pipelines.
