We are pleased to announce the release of Private LLM v1.5 for macOS, which gives you the flexibility to choose between two great models: Mistral-7B-OpenOrca and WizardLM-13B. Both the 7B and 13B models are quantized with state-of-the-art OmniQuant quantization, which offers the best model perplexity and fluency among all current quantization methods.
A significant improvement in this release is a reduction in the models’ memory footprint. The WizardLM-13B model now consumes 5% less memory, saving around 720MB of RAM. Similarly, the Mistral-7B-OpenOrca model requires 14% less RAM than the earlier Luna-AI-Llama2-Uncensored model, saving nearly 800MB of RAM. These optimizations improve the app’s experience for all users, especially those on Macs with 8GB or 16GB of RAM.
The new Mistral-7B-OpenOrca model punches well above its weight. On the Hugging Face Open LLM Leaderboard, it currently ranks #2 among all models with fewer than 30B parameters.
Stay tuned for more updates, and share your experiences with the new version of Private LLM on our Discord. Your feedback and suggestions greatly contribute to our community and help us continually improve the app. We look forward to bringing more updates and models to the app soon!