FuseChat 3.0 Models Now Available in Private LLM for iPhone, iPad & Mac
We're excited to announce that Private LLM for iPhone and iPad (v1.9.3) and Mac (v1.9.5) now supports the advanced FuseChat 3.0 series of language models. These models enhance your local AI chatbot experience by fusing multiple large language models into a single, efficient target model.
About FuseChat 3.0 Models
The FuseChat 3.0 series uses Implicit Model Fusion (IMF) to combine several source LLMs into a compact target LLM. This improves performance in conversation, instruction following, math, and coding.
Models now supported in Private LLM include:
- FuseAI/FuseChat Llama 3.1 8B Instruct
- FuseAI/FuseChat Llama 3.2 3B Instruct
- FuseAI/FuseChat Llama 3.2 1B Instruct (OmniQuant quantized and unquantized versions)
- FuseAI/FuseChat Qwen 2.5 7B Instruct
- FuseAI/FuseChat Gemma 2 9B Instruct
What Makes FuseChat 3.0 Models Stand Out?
FuseChat 3.0 models use advanced training and optimization:
- Two-Stage Training Pipeline:
  - Supervised Fine-Tuning (SFT): narrows the distribution gap between the target LLM and the source LLMs to improve downstream performance
  - Direct Preference Optimization (DPO): refines the target LLM's responses using preference pairs derived from the source LLMs (a minimal sketch of this objective follows below)
- Implicit Model Fusion (IMF):
  - Combines multiple source LLMs without complex vocabulary alignment or weight-matrix merging
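For readers curious about what the DPO stage optimizes, here is a minimal PyTorch sketch of the standard DPO objective; the function name, tensor arguments, and `beta` value are illustrative and not taken from the FuseChat 3.0 training code.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective (Rafailov et al., 2023).

    Each argument is a tensor of summed log-probabilities assigned by the
    trainable target model (policy) or a frozen reference model to the
    preferred (chosen) and dispreferred (rejected) response in a pair.
    """
    chosen_margin = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_margin = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the policy to favor the chosen response over the rejected one.
    return -F.logsigmoid(chosen_margin - rejected_margin).mean()
```

In FuseChat 3.0, the preference pairs are built from the source LLMs' outputs, which is how the compact target model learns from the various source models described above.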
FuseChat Models vs Base Models
Llama 3.1 8B Instruct vs FuseChat Llama 3.1 8B Instruct
FuseChat Llama 3.1 8B Instruct performs better than the base model:
- Instruction Following: 37% improvement on AlpacaEval-2 and Arena-Hard tasks
- Mathematics and Coding: Better results in GSM8K and HumanEval
Qwen 2.5 7B Instruct vs FuseChat Qwen 2.5 7B Instruct
FuseChat Qwen 2.5 7B Instruct shows major improvements:
- Instruction Following: 90% better in AlpacaEval-2
- Code Generation: Higher scores in HumanEval and MBPP benchmarks
Gemma 2 9B Instruct vs FuseChat Gemma 2 9B Instruct
FuseChat Gemma 2 9B Instruct shows clear gains:
- General Conversation: 37% improvement in AlpacaEval-2
- Mathematics: 2% better performance in GSM8K
For more details, see the FuseChat 3.0 blog post.
Private LLM vs Ollama for FuseChat 3.0 Models
When choosing a local AI chatbot, here's how Private LLM compares to Ollama:
- Quantization Technology:
  - Private LLM: uses OmniQuant quantization for better weight distribution and lower perplexity (see the sketch below)
  - Ollama: uses basic round-to-nearest (RTN) quantization
- Performance:
  - Private LLM: faster responses and higher-quality text generation
  - Ollama: baseline performance with standard quantization
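To make the quantization difference concrete, below is a minimal NumPy sketch of plain round-to-nearest (RTN) weight quantization; the function name and the 4-bit setting are illustrative. OmniQuant starts from the same integer grid but additionally learns clipping and transformation parameters to minimize the quantization error, which is why it typically preserves perplexity better.

```python
import numpy as np

def rtn_quantize(weights, n_bits=4):
    """Plain round-to-nearest (RTN) quantization, per-tensor, asymmetric.

    RTN simply scales weights onto an integer grid and rounds; it has no
    learned parameters, unlike OmniQuant's learned clipping thresholds.
    Returns the dequantized weights so the error can be inspected.
    """
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (weights.max() - weights.min()) / (qmax - qmin)
    zero_point = np.round(-weights.min() / scale)
    q = np.clip(np.round(weights / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

# Example: measure the round-trip error on random weights
w = np.random.randn(4096).astype(np.float32)
print("RTN 4-bit mean abs error:", np.abs(w - rtn_quantize(w)).mean())
```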
See the difference for yourself in the Private LLM vs Ollama comparison.
Try FuseChat 3.0 Models Today
Update Private LLM on your iPhone or iPad (v1.9.3) or Mac (v1.9.5) to use the FuseChat 3.0 models. Get subscription-free, unlimited chat with local AI right on your Apple device.