Release Notes

v1.8.6 - macOS
about 15 hours ago
  • Support for downloading a 4-bit OmniQuant quantized version of the Meta-Llama-3-70B-Instruct model on Apple Silicon Macs with 48GB or more RAM.
  • Support for downloading a 4-bit OmniQuant quantized version of the new Phi-3-Mini based kappa-3-phi-abliterated model on all Macs.
  • Stability improvements and bug fixes.
v1.8.4 - iOS
about 15 hours ago
  • Support for downloading a 4-bit OmniQuant quantized version of the new Phi-3-Mini based kappa-3-phi-abliterated model on all devices with 6GB or more RAM.
  • Stability improvements and bug fixes.
v1.8.5 - macOS
4 days ago
  • Support for downloading a 4-bit OmniQuant quantized version of the Llama 3 8B Instruct model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Dolphin 2.9 Llama 3 8B model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Llama 3 Smaug 8B model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Llama 3 8B based OpenBioLLM-8B model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Hermes 2 Pro - Llama-3 8B model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Phi-3-Mini model.
  • Support for downloading a 4-bit OmniQuant quantized version of the bilingual (Hebrew, English) DictaLM-2.0-Instruct model.
  • Stability improvements and bug fixes.
v1.8.3 - iOS
4 days ago
  • Support for downloading a 3-bit OmniQuant quantized version of the Llama 3 8B based OpenBioLLM-8B model.
  • Support for downloading a 3-bit OmniQuant quantized version of the Hermes 2 Pro - Llama-3 8B model.
  • Support for downloading a 3-bit OmniQuant quantized version of the bilingual (Hebrew, English) DictaLM-2.0-Instruct model.
  • Users on iPhone 11, iPhone 12, and iPhone 13 Pro and Pro Max devices can now download the older, faster fully quantized version of the Phi-3-Mini model.
  • Private LLM now uses the loaded model's default system prompt if the system prompt is blank when invoked from app intents (Siri and Shortcuts).
  • Fixed a bug where temperature and top-p settings were not being persisted across app restarts.
  • Stability improvements and bug fixes.
v1.8.2 - iOS
11 days ago
  • Support for downloading an improved version of the new Phi-3-mini-4k-instruct model with an unquantized embedding layer.
  • The old Phi-3-mini-4k-instruct model has been deprecated but will remain functional for the next two releases.
  • Fixed a bug where the "+" character was elided from prompts when Private LLM is invoked from iOS Shortcuts.
  • Stability improvements and bug fixes.
v1.8.1 - iOS
15 days ago
  • Support for downloading the new Phi-3-mini-4k-instruct model.
  • Support for downloading the Llama 3 based Smaug-8B model.
  • Stability improvements and bug fixes.
v1.8.0 - iOS
17 days ago
  • Support for downloading the new Dolphin 2.9 Llama 3 8B uncensored model.
v1.7.9 - iOS
19 days ago
  • Fixed issues with loading the built-in StableLM 2 1.6B model, along with stability fixes for older iOS devices.
v1.7.8 - iOS
19 days ago
  • Support for downloading the new Llama 3 8B Instruct model (Supported on all iOS and iPadOS devices with 6GB or more RAM).
v1.7.7 - iOS
24 days ago
  • Fix for a compatibility issue with Yi-6B on iPhone 13 Pro and Pro Max devices running iOS 17.4.1.
  • Fix for a compatibility issue with 3B models on iPhone 13 devices running iOS 17.4.1.
  • Minor inference performance enhancements for all supported models.
v1.8.4 - macOS
25 days ago
  • New 4-bit OmniQuant quantized downloadable model: Gemma 1.1 2B IT (Downloadable on all compatible Macs, also available on the iOS version of the app).
  • New 4-bit OmniQuant quantized downloadable model: Dolphin 2.6 Mixtral 8x7B (Downloadable on Apple Silicon Macs with 32GB or more RAM).
  • New 4-bit OmniQuant quantized downloadable model: Nous Hermes 2 Mixtral 8x7B DPO (Downloadable on Apple Silicon Macs with 32GB or more RAM).
  • Minor bug fixes and improvements.
v1.7.6 - iOS
30 days ago
  • New 4-bit OmniQuant quantized downloadable model: gemma-1.1-2b-it (Downloadable on all iOS devices with 8GB or more RAM).
  • New 3-bit OmniQuant quantized downloadable model: Dolphin 2.8 Mistral 7b v0.2 (Downloadable on all iOS devices with 6GB or more RAM).
  • The downloaded models directory is now marked as excluded from iCloud backups.
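
On Apple platforms, excluding a directory from iCloud backups is done by setting a resource value on its URL. A minimal Swift sketch, assuming a generic directory URL (the app's actual downloads location is not shown here):

```swift
import Foundation

// Sketch: mark a directory as excluded from iCloud/iTunes backups using
// Foundation's URLResourceValues API. `excludeFromBackup` is a hypothetical
// helper name; the directory passed in stands in for the app's models folder.
func excludeFromBackup(directory: URL) throws {
    var url = directory            // mutable copy; setResourceValues is mutating
    var values = URLResourceValues()
    values.isExcludedFromBackup = true
    try url.setResourceValues(values)
}
```

Excluding multi-gigabyte model files from backups avoids bloating users' iCloud storage, since the files can always be re-downloaded.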
v1.8.3 - macOS
about 1 month ago
  • New 4-bit OmniQuant quantized downloadable bilingual (English, Chinese) model: Yi-6B-Chat (Downloadable on all compatible Macs).
  • New 4-bit OmniQuant quantized downloadable bilingual (English, Chinese) model: Yi-34B-Chat (Downloadable on Apple Silicon Macs with 24GB or more RAM).
  • New 4-bit OmniQuant quantized downloadable model: Starling 7B Beta (Downloadable on all compatible Macs).
  • WizardLM 33B model can now be downloaded on Apple Silicon Macs with 24GB or more RAM (Previously restricted to Apple Silicon Macs with 32GB or more RAM).
  • CodeNinja and openchat-3.5-0106 models can now be downloaded on Macs running the older version of macOS (macOS 13 Ventura).
  • Configurable option to show abridged system prompt in the chat window.
  • Minor bug fixes and improvements.
v1.7.5 - iOS
about 1 month ago
  • New 4-bit OmniQuant quantized downloadable bilingual (English, Chinese) model: Yi-6B-Chat (Downloadable on all iOS devices with 6GB or more RAM).
  • New 3-bit OmniQuant quantized downloadable model: Starling 7B Beta (Downloadable on all iOS devices with 6GB or more RAM).
  • New 3-bit OmniQuant quantized downloadable model: openchat-3.5-0106 (Downloadable on all iOS devices with 6GB or more RAM).
  • New 3-bit OmniQuant quantized downloadable model: CodeNinja-1.0 (Downloadable on all iOS devices with 6GB or more RAM).
  • Configurable option to show abridged system prompt in the chat window.
  • Minor bug fixes and improvements.
v1.8.2 - macOS
about 1 month ago
  • Support for downloading the Japanese RakutenAI-7B-chat model.
  • Users can now switch models from the Chat view, without opening Settings.
  • Fixed a crash with the WizardLM 33B model.
v1.7.4 - iOS
about 1 month ago
  • Support for downloading the Japanese RakutenAI-7B-chat model.
  • Minor bug fixes and updates.
v1.8.1 - macOS
about 1 month ago
  • Minor bug-fix release.
v1.8.0 - macOS
about 1 month ago
  • The built-in model is now StableLM Zephyr 3B. The previous built-in model is still downloadable.
  • New Phi-2 3B based downloadable model: Phi-2 Orange v2
  • New Mistral 7B based downloadable model: Hermes 2 Pro
  • Slightly improved performance and reduced memory footprint with the Mixtral model.
  • Minor bug-fixes and performance improvements.
v1.7.3 - iOS
about 1 month ago
  • New 1.8B downloadable model: H2O Danube 1.8B Chat (downloadable on all devices).
  • New Phi-2 3B based downloadable model: Phi-2 Orange v2 (downloadable on all devices with 4GB or more RAM).
  • New Mistral 7B based downloadable model: Hermes 2 Pro (downloadable on all devices with 6GB or more RAM).
  • Minor bug-fixes and performance improvements.
v1.7.2 - iOS
about 2 months ago
  • Added the ability to switch models without leaving the chat interface.
  • Minor bug fixes and updates.
v1.7.1 - iOS
about 2 months ago
  • Fix for crash while loading OpenHermes 2.5 Mistral 7B model on supported devices.
  • Added support for downloading the Phi 2 Super model on devices with 4GB or more RAM.
v1.7.9 - macOS
about 2 months ago
  • New downloadable model: merlinite-7b (a Mistral 7B based model that was distilled from Mixtral-8x7B-Instruct-v0.1)
  • Minor improvements to offline grammar correction macOS service, especially for non-English Western European languages.
  • Model downloads view now highlights general purpose models that are recommended by the developers.
v1.7.0 - iOS
about 2 months ago
  • The built-in LLM is the new StableLM 2 Zephyr 1.6B.
  • Support for downloading multiple models from the following families: TinyLlama 1.1B, Stable LM 3B, Phi-2 3B, Mistral 7B, Llama 7B, and Gemma 2B.
  • All models, downloadable and built-in, are quantized with the SOTA OmniQuant quantization method.
  • The list of downloadable models varies by the amount of physical memory on the device the app is running on.
  • All models are downloadable on iPhone 15 Pro, Pro Max and Apple Silicon iPads.
  • Support for editing system prompts.
v1.7.8 - macOS
2 months ago
  • Further improved the Mixtral model with unquantized embedding and MoE gate weights; the rest of the weights are 4-bit OmniQuant quantized.
  • The old Mixtral model is now deprecated. Users who had previously downloaded it can keep using it if they wish.
  • The Mistral Instruct v0.2, Nous Hermes 2 Mistral 7B DPO, and BioMistral 7B models now load with the full 32k context length if the app finds at least 8.69GB of free memory; otherwise they're loaded with a 4k context length.
  • Grammar correction macOS service now uses the OS locale to determine the English spellings to use.
  • Experimental support for non-English European languages in macOS services (works best with Western European languages and with larger models). This needs to be enabled in app settings.
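
The free-memory check described above can be sketched as follows; the 8.69GB threshold and the 32k/4k fallback come from the release note, but the function names and the idea of passing free memory in as a parameter are assumptions, not the app's actual implementation:

```swift
import Foundation

// Sketch of memory-based context-length selection, assuming the caller
// has already measured free memory. `contextLength` is a hypothetical
// helper, not an API from the app.
func contextLength(freeMemoryBytes: UInt64) -> Int {
    // The release note says ~8.69GB of free memory is needed for the
    // full 32k context; otherwise the model loads with a 4k context.
    let required = UInt64(8.69 * 1_073_741_824)   // 8.69 GiB in bytes
    return freeMemoryBytes >= required ? 32_768 : 4_096
}
```

The same pattern (capability probing at load time, with a conservative fallback) is a common way to keep one binary working across machines with very different RAM sizes.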
v1.7.7 - macOS
2 months ago
  • Support for three new downloadable models: BioMistral-7B, Gemma 2B, Nous-Hermes-2-Mistral-7B-DPO
v1.7.6 - macOS
3 months ago
  • Fix for crash on macOS Ventura (version 13.x).
v1.7.5 - macOS
3 months ago
  • Five new OmniQuant quantized downloadable models: WhiteRabbit Neo 13B v1 cybersecurity model, Mixtral 8x7B Instruct v0.1, WizardLM 33B 1.0 Uncensored, Mistral 7B Instruct v0.2 and Phi-2 Orange 3B.
  • WhiteRabbit Neo 13B v1 needs an Apple Silicon Mac with at least 16GB of RAM; Mixtral 8x7B Instruct v0.1 and WizardLM 33B 1.0 Uncensored need Apple Silicon Macs with at least 32GB of RAM; Mistral 7B Instruct v0.2 and Phi-2 Orange 3B work on all Macs.
  • macOS services for system-wide, offline grammar correction, summarization, text shortening, and rephrasing.
  • The default base model bundled with the app is now the Dolphin 2.6 Phi-2 3B model. The older base model (Spicyboros-7b-2.2) can still be downloaded on any Mac.
  • Miscellaneous bug fixes and improvements.
v1.7.0 - macOS
4 months ago
  • Support for downloading two new models on Apple Silicon Macs: openchat-3.5-0106 and CodeNinja-1.0-OpenChat-7B (a coding model).
  • Users can now customize system prompts in app settings.
  • Minor bug fixes and improvements.
v1.6.2 - macOS
4 months ago
  • Support for the new Nous-Hermes-2-SOLAR-10.7B model on Apple Silicon Macs with 16GB or more RAM (Full 4k context support with 6.85GB of memory footprint).
  • Bugfix: Allow deleting partially downloaded models.
v1.6.7 - iOS
5 months ago
  • More memory optimizations for improved compatibility of the optional 7B parameter model on iPhone 15.
v1.6.1 - macOS
5 months ago
  • Bugfix: Allow switching to downloaded 7B models on Apple Silicon Macs with 8GB of RAM and Intel Macs.
v1.6.6 - iOS
5 months ago
  • A slew of memory optimizations.
  • Minor performance improvements stemming from the aforementioned memory optimizations.
  • Private LLM now works on older iPhones and iPads with 3GB of RAM, like the 2nd Gen iPhone SE and the 9th Gen iPad. Previously, it needed devices with at least 4GB of RAM to function.
v1.6 - macOS
6 months ago
  • The built-in model is now spicyboros-7b-2.2.
  • Support for more downloadable models:
    • Llama2 7B based models:
      • airoboros-l2-7b-3.0
      • Xwin-LM-7B-V0.1
    • Llama2 13B based models (on Apple Silicon Macs with 16GB or more RAM):
      • WizardLM-13B-V1.2
      • spicyboros-13b-2.2
      • Xwin-LM-13B-V0.1
      • MythoMax-L2-13b
    • Mistral 7B based models:
      • Mistral-7B-OpenOrca
      • Mistral-7B-Instruct-v0.1
      • zephyr-7b-beta
      • leo-mistral-hessianai-7b-chat (German model)
      • jackalope-7b
      • dolphin-2.1-mistral-7b
      • samantha-1.2-mistral-7b
      • OpenHermes-2-Mistral-7B
      • SynthIA-7B-v2.0
      • airoboros-m-7b-3.1.2
      • Mistral-Trismegistus-7B
      • openchat_3.5
v1.6.5 - iOS
6 months ago
  • Update to use an alternative mirror to download the 7B model if the primary download source is inaccessible.
  • Minor bug fixes and updates.
v1.6.4 - iOS
6 months ago
  • Fixed a memory and performance issue with the optional 7B parameter model on iPhone 14 Pro, iPhone 14 Pro Max, and iPhone 15 phones.
v1.6.3 - iOS
7 months ago
  • Minor bug fixes and improvements.
  • Improved model perplexity with the 7B model on iPhone 14 Pro and Pro Max phones.
v1.5 - macOS
7 months ago
  • The baseline 7B LLM in the app is now Mistral-7B-OpenOrca.
  • Optimizations to reduce memory footprint of LLMs.
  • 13B model now consumes ~6% less RAM.
  • 7B model now consumes ~14% less RAM (this is partly due to the new 7B model's architecture).
  • Minor bug-fixes and improvements.
v1.6.2 - iOS
7 months ago
  • Minor bug fixes and improvements.
v1.6.1 - iOS
8 months ago
  • Minor bug fix in App Intents (Shortcuts app integration).
  • Minor performance improvements.
v1.4.1 - macOS
8 months ago
  • Minor bug fix in App Intents (Shortcuts app integration).
  • Minor performance improvements.
v1.6 - iOS
8 months ago
  • Support for iPhone 15 series phones.
  • The optional 7B parameter models are now quantized with the state-of-the-art OmniQuant algorithm.
  • Support for the downloadable 7B parameter model on iPhone 13 Pro and iPhone 13 Pro Max devices (previously, only iPhone 14 series phones and Apple Silicon iPads were supported).
v1.4 - macOS
8 months ago
  • Model: All models are now quantized using the state-of-the-art OmniQuant quantization algorithm. This significantly improves model perplexity and text-generation quality.
  • Users who had previously downloaded the 13B WizardLM V1.2 model can optionally download an update with the improved version of the model.
  • UI: Allow right-clicking and text selection in the message bubbles (based on numerous users' requests).
v1.3 - macOS
8 months ago
  • Option to download the 13B parameter WizardLM-13B-V1.2 model on Apple Silicon Macs with 16GB or more RAM.
  • Minor bug fixes and improvements.
v1.2.1 - macOS
8 months ago
  • Significant performance improvement for users with Intel Macs and Apple Silicon Macs with 8GB of RAM.
  • The App Intent (for Shortcuts support) now accepts an optional system prompt along with the query. This feature is based on a user request.
  • Minor bug fixes and improvements.
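
An intent that takes a query plus an optional system prompt can be sketched with Apple's AppIntents framework. This is a minimal illustration of the shape such an intent might take; the type, titles, and parameter names are hypothetical, not the app's actual intent definitions:

```swift
import AppIntents

// Hypothetical sketch of a Shortcuts-invocable intent with a required
// query and an optional system prompt, mirroring the release note above.
struct AskModelIntent: AppIntent {
    static var title: LocalizedStringResource = "Ask Model"

    @Parameter(title: "Query")
    var query: String

    // Optional: when nil, an app would fall back to its default prompt.
    @Parameter(title: "System Prompt")
    var systemPrompt: String?

    func perform() async throws -> some IntentResult & ReturnsValue<String> {
        // A real implementation would run the prompt through the local LLM;
        // here the combined prompt is simply returned to keep the sketch
        // self-contained.
        let combined = (systemPrompt.map { $0 + "\n" } ?? "") + query
        return .result(value: combined)
    }
}
```

Making the system prompt an optional parameter keeps simple shortcuts simple while still allowing power users to override the default.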
v1.2 - macOS
8 months ago
  • The macOS version of the app now ships with a bigger and better Llama 2 based 7B parameter model with 4k context length (previous versions shipped with a 3B model with 2k context length). Intel Macs are still supported, but Apple Silicon Macs offer the best performance with this bigger model.
  • The macOS version of the app now runs in the background when invoked from the Shortcuts App and preserves any existing conversation in the UI.
  • Minor bugfixes and improvements.
v1.5.1 - iOS
9 months ago
  • Fix for a non-deterministic crash while regenerating LLM responses
  • Slightly reduced memory footprint with Llama 2 7B class models (on supported devices).
  • Minor performance improvements.
v1.1.1 - macOS
9 months ago
  • Minor bug fix release. Fixed a bug where the temperature and top-p settings could not be changed from their default values.
v1.1 - macOS
9 months ago
  • Significantly improved performance with Metal accelerated inference. ~36% faster on Apple Silicon and ~61% faster on Intel Macs.
  • All iOS app features have been back ported to the macOS app.
  • This will be the last release to support Intel Macs along with Apple Silicon; future feature releases will only support Apple Silicon.
v1.5 - iOS
9 months ago
  • Improved model quantization.
  • Option to download a larger 7 billion parameter model (in addition to the default 3B parameter model) on iPhone 14 series phones and M1, M2 iPads.
  • Edit and continue for prompts.
  • Ability to regenerate LLM responses.
  • Conversations are now persisted across app restarts.
  • The app has been renamed to Private LLM.
v1.1.4 - iOS
11 months ago
  • Shortcuts and Siri support (App intents)
  • Slightly improved inference performance (Upgrade from Metal 2.3 to Metal 3.0)
  • Improved text selection (Long press on any message)
  • Ability to share any response from the bot using a share sheet (Also long press on any message)
  • Added a help view with frequently asked questions and example prompts
  • Added support for the x-callback-url specification (Supported by Shortcuts and many other apps)
v1.1.3 - iOS
11 months ago
  • Fix for crash when resetting settings.
  • Faster text generation on older iPhone 11 and iPhone 12 devices.
v1.1.2 - iOS
12 months ago

This is an accessibility focused minor release, with the following changes:

  • Accessibility: Basic VoiceOver support (Based on user feedback).
  • Haptic feedback on iPhones (Can be disabled from settings).
  • Further reduced the model's memory footprint. The app now needs 2GB of free memory to load the LLM model.

This further improves compatibility with older devices.

v1.1.1 - iOS
12 months ago

This is a minor bug fix release, with the following changes:

  • Fixed model loading issue on iPhone 13 and iPhone 13 mini.
  • More accurate memory footprint check: on launch, the app now ensures that 2.1GB of memory is free instead of 3GB. This should help the app run on older devices like the iPhone 11 Pro and iPhone 11 Pro Max.
v1.1 - iOS
12 months ago

This release marks a huge overhaul of the Private LLM for iOS codebase.

  • New Metal backend (about 1.5x faster than the older backend in 1.0.3, although there's a slight increase in the startup time).
  • ~7.7% smaller app binary (1.54GB -> 1.42GB)
  • The language model now has a much better conversational memory.
  • The "Regenerate last response" button has been removed, based on feedback from users.
  • Fixed weird syntax highlighting of code when the language could be auto-detected.