
![Google Gemma logo](/model-logos/gemma.svg)

# Run Gemma 2 9B IT on iPad & Mac

Gemma 2 9B IT runs entirely on-device on iPad and Mac inside Private LLM: no internet connection is required, and no data is sent to any server.

9.2B · iPad · Mac

## Download the quantized weights

These are the exact OmniQuant weights Private LLM runs for Gemma 2 9B IT, published on our Hugging Face org. They're standard weights you can load in any app that supports the format, not just Private LLM.

-   [iPad — 4-bit OmniQuant →](https://huggingface.co/numen-tech/gemma-2-9b-it-w4a16g128asym)
-   [Mac — 4-bit OmniQuant →](https://huggingface.co/numen-tech/gemma-2-9b-it-w4a16g128asym)
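The repository name ends in `w4a16g128asym`. A minimal sketch of how that suffix could be read follows, assuming the common quantization naming convention (weight bits, activation bits, group size, asymmetric mode). That reading is an assumption and is not stated on this page, and `QuantTag` and `parse_quant_tag` are hypothetical helpers written for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantTag:
    """Hypothetical container for the fields assumed to be encoded in the suffix."""
    weight_bits: int
    activation_bits: int
    group_size: int
    asymmetric: bool


# Assumed pattern: w<bits>a<bits>g<group size>, optionally followed by asym/sym.
_TAG = re.compile(r"w(\d+)a(\d+)g(\d+)(asym|sym)?$")


def parse_quant_tag(repo_id: str) -> QuantTag:
    """Extract the quantization suffix from a repository id."""
    match = _TAG.search(repo_id)
    if match is None:
        raise ValueError(f"no quantization tag found in {repo_id!r}")
    w_bits, a_bits, group, mode = match.groups()
    return QuantTag(int(w_bits), int(a_bits), int(group), mode == "asym")


tag = parse_quant_tag("numen-tech/gemma-2-9b-it-w4a16g128asym")
print(tag)
# QuantTag(weight_bits=4, activation_bits=16, group_size=128, asymmetric=True)
```

Under this reading, the weights are stored in 4 bits, activations stay in 16 bits, and each group of 128 weights has its own quantization parameters.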

## Specifications

-   Parameters: 9.2B
-   License: Gemma
-   Quantization: OmniQuant 4-bit
-   Family: Gemma 2 9B
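As a rough back-of-the-envelope check, the parameter count and bit width above give an approximate size for the quantized weights. The sketch below assumes a group size of 128 and 32 bits of per-group metadata (a 16-bit scale and a 16-bit offset); both values are assumptions, not figures from this page. It also ignores layers that may be stored at higher precision and the memory needed for the KV cache and activations, so real files and real memory use can differ.

```python
def quantized_size_gb(params: float, weight_bits: int, group_size: int,
                      overhead_bits_per_group: int = 32) -> float:
    """Estimate the size of group-quantized weights in decimal gigabytes.

    overhead_bits_per_group is an assumed cost for each group's scale and
    offset; it is not taken from the published model.
    """
    bits_per_weight = weight_bits + overhead_bits_per_group / group_size
    return params * bits_per_weight / 8 / 1e9


size_gb = quantized_size_gb(params=9.2e9, weight_bits=4, group_size=128)
print(f"Estimated weight size: {size_gb:.1f} GB")
# Estimated weight size: 4.9 GB
```

The estimate covers the weights alone; a device needs additional headroom for the operating system, the app, and the context window.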

## What Gemma 2 9B IT is good at

general

Gemma 2 9B IT is part of the Gemma 2 9B family, tuned for general use on Apple devices.

## Which of your devices can run it

### iPad

-   iPad Pro (M5, 16GB)
-   iPad Pro (M4, 16GB)
-   iPad Pro (M2, 16GB)
-   iPad Pro (M1, 16GB)

### Mac

-   Mac (Apple Silicon, 192GB)
-   MacBook Pro (M4 Max, 128GB)
-   Mac Studio / Pro (Apple Silicon, 96GB)
-   MacBook Pro (M4 Max, 64GB)
-   MacBook Pro (M4 Max, 48GB)
-   MacBook Pro (M4 Max, 36GB)
-   Mac (Apple Silicon, 32GB)
-   MacBook Air (M4, 24GB)
-   MacBook Air (M-series, 16GB)

Browse every model that fits [iPhone](/models/best-for-iphone) or [Mac](/models/best-for-mac).

## How to run Gemma 2 9B IT in Private LLM

1.  Download Private LLM from the App Store.
2.  Open the in-app model library and choose Gemma 2 9B IT.
3.  Download the model once, then chat fully offline.

[![Download Private LLM on the App Store](/app-store/download-badge/en/download.svg)](/download) · [Join our Discord](/discord)

## Variants & related models

![Google Gemma logo](/model-logos/gemma.svg)

### FuseChat Gemma 2 9B Instruct

9.2B

general

8K context · iPhone & iPad · Mac

[View details →](/models/fusechat-gemma-2-9b-instruct)

![Google Gemma logo](/model-logos/gemma.svg)

### Gemma 2 9B IT SPPO Iter3

9.2B

general

8K context · iPhone & iPad · Mac

[View details →](/models/gemma-2-9b-it-sppo-iter3)

![Google Gemma logo](/model-logos/gemma.svg)

### Gemma 2 Ifable 9B

9B

creative

8K context · iPhone & iPad · Mac

[View details →](/models/gemma-2-ifable-9b)

![Google Gemma logo](/model-logos/gemma.svg)

### Tiger Gemma 9B v3

9.2B

uncensored

8K context · iPhone & iPad · Mac

[View details →](/models/tiger-gemma-9b-v3)

![Meta logo](/model-logos/meta.svg)

### Airoboros l2 7b 3.0

6.7B

general

4K context · iPhone & iPad · Mac

[View details →](/models/airoboros-l2-7b-3.0)

![Mistral logo](/model-logos/mistral.svg)

### Airoboros M 7B

7.2B

general

33K context · Mac

[View details →](/models/airoboros-m-7b-3.1.2)

## Frequently asked questions

-   Can I run Gemma 2 9B IT on iPhone?
    
    Gemma 2 9B IT is too large for current iPhones. It runs inside Private LLM on iPads and Macs with enough memory, fully offline.
    
-   Can I run Gemma 2 9B IT on Mac?
    
    Yes. Gemma 2 9B IT runs fully on-device in Private LLM on Macs with enough unified memory, such as Mac (Apple Silicon, 192GB), MacBook Pro (M4 Max, 128GB), Mac Studio / Pro (Apple Silicon, 96GB), MacBook Pro (M4 Max, 64GB), MacBook Pro (M4 Max, 48GB), MacBook Pro (M4 Max, 36GB), Mac (Apple Silicon, 32GB), MacBook Air (M4, 24GB), and MacBook Air (M-series, 16GB).
    
-   Does Gemma 2 9B IT work offline?
    
    Yes. Once downloaded in Private LLM, Gemma 2 9B IT runs 100% on-device — no internet connection, and nothing is sent to any server.
    
-   Is Gemma 2 9B IT free to use?
    
    Private LLM is a one-time purchase with no subscription and no per-message cost. The Gemma 2 9B IT weights are openly available under the Gemma license, and once downloaded they run offline with nothing to pay per use.
    

## Why run Gemma 2 9B IT in Private LLM

Private LLM has run local AI on iPhone, iPad, and Mac since 2023, before Apple Intelligence existed. Inference happens on your device, so your Gemma 2 9B IT conversations never reach a server. The part most apps gloss over is quantization, and that is exactly where on-device quality is won or lost. Most llama.cpp and MLX wrappers ship the same off-the-shelf 4-bit RTN weights. Private LLM ships GPTQ and OmniQuant quantization, tuned per model, and our 3-bit OmniQuant models match or beat those 4-bit RTN builds on the same Apple Silicon. Run the same model both ways and you feel it in the first reply. [See how our quantization works](/en/faq#Why-can-t-Private-LLM-load-models-directly-from-Hugging-Face).
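To make the RTN baseline mentioned above concrete, here is a minimal sketch of group-wise asymmetric round-to-nearest quantization in plain Python. It illustrates only the off-the-shelf baseline: it is not OmniQuant, not GPTQ, and not Private LLM's pipeline, all of which use calibration data to choose better quantization parameters. The group size of 128, the function names, and the random test weights are assumptions made for illustration.

```python
import random


def rtn_quantize_group(weights, bits=4):
    """Quantize one group to integer codes using its own min/max range."""
    qmax = (1 << bits) - 1                      # 15 for 4-bit codes
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    # Round each weight to the nearest code and clamp to the valid range.
    codes = [min(qmax, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate floating-point weights."""
    return [lo + c * scale for c in codes]


def rtn_quantize(weights, group_size=128, bits=4):
    """Split the weights into fixed-size groups and quantize each separately."""
    return [rtn_quantize_group(weights[i:i + group_size], bits)
            for i in range(0, len(weights), group_size)]


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(256)]  # synthetic weights
groups = rtn_quantize(weights)
restored = [w for codes, scale, lo in groups
            for w in dequantize_group(codes, scale, lo)]
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"groups: {len(groups)}, max reconstruction error: {max_err:.6f}")
```

In this scheme the worst-case error per weight is half a quantization step, and the step size is set by each group's raw min/max range. Calibrated methods aim to reduce the error that matters for model output rather than accepting that range as given.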


Specifications and summary come from Gemma 2 9B IT's [Hugging Face model card](https://huggingface.co/google/gemma-2-9b-it), released under the Gemma license. Private LLM ships its own quantized models, built with OmniQuant quantization tuned per model, and isn't affiliated with the model's authors.