FuseChat 3.0 Models Now Available in Private LLM for iPhone, iPad & Mac
We're excited to announce that Private LLM for iPhone and iPad (v1.9.3) and Mac (v1.9.5) now supports the advanced FuseChat 3.0 series of language models. These models enhance your local AI chatbot experience by fusing multiple large language models into a single, efficient target model.
About FuseChat 3.0 Models
The FuseChat 3.0 series uses Implicit Model Fusion (IMF) to combine several source LLMs into a compact target LLM. This improves performance in conversation, instruction following, math, and coding.
Models now supported in Private LLM include:
- FuseAI/FuseChat Llama 3.1 8B Instruct
- FuseAI/FuseChat Llama 3.2 3B Instruct
- FuseAI/FuseChat Llama 3.2 1B Instruct (OmniQuant quantized and unquantized versions)
- FuseAI/FuseChat Qwen 2.5 7B Instruct
- FuseAI/FuseChat Gemma 2 9B Instruct
What Makes FuseChat 3.0 Models Stand Out?
FuseChat 3.0 models use advanced training and optimization:
- Two-Stage Training Pipeline:
  - Supervised Fine-Tuning (SFT): narrows the distribution gap between the target LLM and the source LLMs to improve downstream performance
  - Direct Preference Optimization (DPO): refines the target LLM's responses using preference pairs derived from the source LLMs (a minimal sketch of this objective follows below)
- Implicit Model Fusion (IMF):
  - Combines multiple source LLMs without complex vocabulary alignment or weight-matrix merging
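For readers curious about what the DPO stage optimizes, here is a minimal PyTorch sketch of the standard DPO objective; the function name, tensor arguments, and `beta` value are illustrative and not taken from the FuseChat 3.0 training code.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective (Rafailov et al., 2023).

    Each argument is a tensor of summed log-probabilities assigned by the
    trainable target model (policy) or a frozen reference model to the
    preferred (chosen) and dispreferred (rejected) response in a pair.
    """
    chosen_margin = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_margin = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the policy to favor the chosen response over the rejected one.
    return -F.logsigmoid(chosen_margin - rejected_margin).mean()
```

In FuseChat 3.0, the preference pairs are built from the source LLMs' outputs, which is how the compact target model learns from the various source models described above.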
FuseChat Models vs Base Models
Llama 3.1 8B Instruct vs FuseChat Llama 3.1 8B Instruct
FuseChat Llama 3.1 8B Instruct performs better than the base model:
- Instruction Following: 37% improvement on AlpacaEval-2 and Arena-Hard tasks
- Mathematics and Coding: Better results in GSM8K and HumanEval
Qwen 2.5 7B Instruct vs FuseChat Qwen 2.5 7B Instruct
FuseChat Qwen 2.5 7B Instruct shows major improvements:
- Instruction Following: 90% better in AlpacaEval-2
- Code Generation: Higher scores in HumanEval and MBPP benchmarks
Gemma 2 9B Instruct vs FuseChat Gemma 2 9B Instruct
FuseChat Gemma 2 9B Instruct shows clear gains:
- General Conversation: 37% improvement in AlpacaEval-2
- Mathematics: 2% better performance in GSM8K
For more details, see the FuseChat 3.0 blog post.
Private LLM vs Ollama for FuseChat 3.0 Models
When choosing a local AI chatbot, here's how Private LLM compares to Ollama:
- Quantization Technology:
  - Private LLM: uses OmniQuant quantization for better weight distribution and lower perplexity (see the sketch below)
  - Ollama: uses basic round-to-nearest (RTN) quantization
- Performance:
  - Private LLM: faster responses and higher-quality text generation
  - Ollama: baseline performance with standard quantization
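To make the quantization difference concrete, below is a minimal NumPy sketch of plain round-to-nearest (RTN) weight quantization; the function name and the 4-bit setting are illustrative. OmniQuant starts from the same integer grid but additionally learns clipping and transformation parameters to minimize the quantization error, which is why it typically preserves perplexity better.

```python
import numpy as np

def rtn_quantize(weights, n_bits=4):
    """Plain round-to-nearest (RTN) quantization, per-tensor, asymmetric.

    RTN simply scales weights onto an integer grid and rounds; it has no
    learned parameters, unlike OmniQuant's learned clipping thresholds.
    Returns the dequantized weights so the error can be inspected.
    """
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (weights.max() - weights.min()) / (qmax - qmin)
    zero_point = np.round(-weights.min() / scale)
    q = np.clip(np.round(weights / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

# Example: measure the round-trip error on random weights
w = np.random.randn(4096).astype(np.float32)
print("RTN 4-bit mean abs error:", np.abs(w - rtn_quantize(w)).mean())
```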
See the difference for yourself in the Private LLM vs Ollama comparison.
Try FuseChat 3.0 Models Today
Update Private LLM on your iPhone or iPad (v1.9.3) or Mac (v1.9.5) to use the FuseChat 3.0 models. Get subscription-free, unlimited chat with local AI right on your Apple device.