Release Notes

v1.8.6 - macOS
about 15 hours ago
  • Support for downloading a 4-bit OmniQuant quantized version of the Meta-Llama-3-70B-Instruct model on Apple Silicon Macs with 48GB or more RAM.
  • Support for downloading a 4-bit OmniQuant quantized version of the new Phi-3-Mini based kappa-3-phi-abliterated model on all Macs.
  • Stability improvements and bug fixes.
v1.8.4 - iOS
about 15 hours ago
  • Support for downloading a 4-bit OmniQuant quantized version of the new Phi-3-Mini based kappa-3-phi-abliterated model on all devices with 6GB or more RAM.
  • Stability improvements and bug fixes.
v1.8.5 - macOS
4 days ago
  • Support for downloading a 4-bit OmniQuant quantized version of the Llama 3 8B Instruct model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Dolphin 2.9 Llama 3 8B model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Llama 3 Smaug 8B model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Llama 3 8B based OpenBioLLM-8B model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Hermes 2 Pro - Llama-3 8B model.
  • Support for downloading a 4-bit OmniQuant quantized version of the Phi-3-Mini model.
  • Support for downloading a 4-bit OmniQuant quantized version of the bilingual (Hebrew, English) DictaLM-2.0-Instruct model.
  • Stability improvements and bug fixes.
v1.8.3 - iOS
4 days ago
  • Support for downloading a 3-bit OmniQuant quantized version of the Llama 3 8B based OpenBioLLM-8B model.
  • Support for downloading a 3-bit OmniQuant quantized version of the Hermes 2 Pro - Llama-3 8B model.
  • Support for downloading a 3-bit OmniQuant quantized version of the bilingual (Hebrew, English) DictaLM-2.0-Instruct model.
  • Users on iPhone 11, iPhone 12, and iPhone 13 Pro and Pro Max devices can now download the older, faster fully quantized version of the Phi-3-Mini model.
  • Private LLM now uses the loaded model's default system prompt if the system prompt is blank when invoked from app intents (Siri and Shortcuts).
  • Fixed a bug where temperature and top-p settings were not being persisted across app restarts.
  • Stability improvements and bug fixes.
v1.8.2 - iOS
11 days ago
  • Support for downloading an improved version of the new Phi-3-mini-4k-instruct model with an unquantized embedding layer.
  • The old Phi-3-mini-4k-instruct model has been deprecated but will remain functional for the next two releases.
  • Fixed a bug where the "+" character was elided from prompts when Private LLM is invoked from iOS Shortcuts.
  • Stability improvements and bug fixes.
v1.8.1 - iOS
15 days ago
  • Support for downloading the new Phi-3-mini-4k-instruct model.
  • Support for downloading the Llama 3 based Smaug-8B model.
  • Stability improvements and bug fixes.
v1.8.0 - iOS
17 days ago
  • Support for downloading the new Dolphin 2.9 Llama 3 8B uncensored model.
v1.7.9 - iOS
19 days ago
  • Fixed issues with loading the built-in StableLM 2 1.6B model, along with stability fixes for older iOS devices.
v1.7.8 - iOS
19 days ago
  • Support for downloading the new Llama 3 8B Instruct model (Supported on all iOS and iPadOS devices with 6GB or more RAM).
v1.7.7 - iOS
24 days ago
  • Fix for a compatibility issue with Yi-6B on iPhone 13 Pro and Pro Max devices running iOS 17.4.1.
  • Fix for a compatibility issue with 3B models on iPhone 13 devices running iOS 17.4.1.
  • Minor inference performance enhancements for all supported models.
v1.8.4 - macOS
25 days ago
  • New 4-bit OmniQuant quantized downloadable model: Gemma 1.1 2B IT (Downloadable on all compatible Macs, also available on the iOS version of the app).
  • New 4-bit OmniQuant quantized downloadable model: Dolphin 2.6 Mixtral 8x7B (Downloadable on Apple Silicon Macs with 32GB or more RAM).
  • New 4-bit OmniQuant quantized downloadable model: Nous Hermes 2 Mixtral 8x7B DPO (Downloadable on Apple Silicon Macs with 32GB or more RAM).
  • Minor bug fixes and improvements.
v1.7.6 - iOS
30 days ago
  • New 4-bit OmniQuant quantized downloadable model: gemma-1.1-2b-it (Downloadable on all iOS devices with 8GB or more RAM).
  • New 3-bit OmniQuant quantized downloadable model: Dolphin 2.8 Mistral 7b v0.2 (Downloadable on all iOS devices with 6GB or more RAM).
  • The downloaded models directory is now marked as excluded from iCloud backups.
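
On Apple platforms, excluding a directory from iCloud backups is done by setting a resource value on its URL. A minimal Swift sketch, assuming a generic directory URL (the app's actual downloads location is not shown here):

```swift
import Foundation

// Sketch: mark a directory as excluded from iCloud/iTunes backups using
// Foundation's URLResourceValues API. `excludeFromBackup` is a hypothetical
// helper name; the directory passed in stands in for the app's models folder.
func excludeFromBackup(directory: URL) throws {
    var url = directory            // mutable copy; setResourceValues is mutating
    var values = URLResourceValues()
    values.isExcludedFromBackup = true
    try url.setResourceValues(values)
}
```

Excluding multi-gigabyte model files from backups avoids bloating users' iCloud storage, since the files can always be re-downloaded.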
v1.8.3 - macOS
about 1 month ago
  • New 4-bit OmniQuant quantized downloadable bilingual (English, Chinese) model: Yi-6B-Chat (Downloadable on all compatible Macs).
  • New 4-bit OmniQuant quantized downloadable bilingual (English, Chinese) model: Yi-34B-Chat (Downloadable on Apple Silicon Macs with 24GB or more RAM).
  • New 4-bit OmniQuant quantized downloadable model: Starling 7B Beta (Downloadable on all compatible Macs).
  • WizardLM 33B model can now be downloaded on Apple Silicon Macs with 24GB or more RAM (Previously restricted to Apple Silicon Macs with 32GB or more RAM).
  • CodeNinja and openchat-3.5-0106 models can now be downloaded on Macs running the older version of macOS (macOS 13 Ventura).
  • Configurable option to show abridged system prompt in the chat window.
  • Minor bug fixes and improvements.
v1.7.5 - iOS
about 1 month ago
  • New 4-bit OmniQuant quantized downloadable bilingual (English, Chinese) model: Yi-6B-Chat (Downloadable on all iOS devices with 6GB or more RAM).
  • New 3-bit OmniQuant quantized downloadable model: Starling 7B Beta (Downloadable on all iOS devices with 6GB or more RAM).
  • New 3-bit OmniQuant quantized downloadable model: openchat-3.5-0106 (Downloadable on all iOS devices with 6GB or more RAM).
  • New 3-bit OmniQuant quantized downloadable model: CodeNinja-1.0 (Downloadable on all iOS devices with 6GB or more RAM).
  • Configurable option to show abridged system prompt in the chat window.
  • Minor bug fixes and improvements.
v1.8.2 - macOS
about 1 month ago
  • Support for downloading the Japanese RakutenAI-7B-chat model.
  • Users can now switch models from the Chat view, without opening Settings.
  • Fixed a crash with the WizardLM 33B model.
v1.7.4 - iOS
about 1 month ago
  • Support for downloading the Japanese RakutenAI-7B-chat model.
  • Minor bug fixes and updates.
v1.8.1 - macOS
about 1 month ago
  • Minor bug-fix release.
v1.8.0 - macOS
about 1 month ago
  • The built-in model is now StableLM Zephyr 3B. The previous built-in model is still downloadable.
  • New Phi-2 3B based downloadable model: Phi-2 Orange v2
  • New Mistral 7B based downloadable model: Hermes 2 Pro
  • Slightly improved performance and reduced memory footprint with the Mixtral model.
  • Minor bug-fixes and performance improvements.
v1.7.3 - iOS
about 1 month ago
  • New 1.8B downloadable model: H2O Danube 1.8B Chat (downloadable on all devices).
  • New Phi-2 3B based downloadable model: Phi-2 Orange v2 (downloadable on all devices with 4GB or more RAM).
  • New Mistral 7B based downloadable model: Hermes 2 Pro (downloadable on all devices with 6GB or more RAM).
  • Minor bug-fixes and performance improvements.
v1.7.2 - iOS
about 2 months ago
  • Added the ability to switch models without leaving the chat interface.
  • Minor bug fixes and updates.
v1.7.1 - iOS
about 2 months ago
  • Fix for crash while loading OpenHermes 2.5 Mistral 7B model on supported devices.
  • Added support for downloading the Phi 2 Super model on devices with 4GB or more RAM.
v1.7.9 - macOS
about 2 months ago
  • New downloadable model: merlinite-7b (a Mistral 7B based model that was distilled from Mixtral-8x7B-Instruct-v0.1)
  • Minor improvements to offline grammar correction macOS service, especially for non-English Western European languages.
  • Model downloads view now highlights general purpose models that are recommended by the developers.
v1.7.0 - iOS
about 2 months ago
  • The built-in LLM is the new StableLM 2 Zephyr 1.6B.
  • Support for downloading multiple models from the following families: TinyLlama 1.1B, Stable LM 3B, Phi-2 3B, Mistral 7B, Llama 7B, and Gemma 2B.
  • All models, downloadable and built-in, are quantized with the SOTA OmniQuant quantization method.
  • The list of downloadable models varies by the amount of physical memory on the device the app is running on.
  • All models are downloadable on iPhone 15 Pro, Pro Max and Apple Silicon iPads.
  • Support for editing system prompts.
v1.7.8 - macOS
2 months ago
  • Further improved the Mixtral model with unquantized embedding and MoE gate weights; the rest of the weights are 4-bit OmniQuant quantized.
  • The old Mixtral model is now deprecated. Users who had previously downloaded it can keep using it if they wish.
  • The Mistral Instruct v0.2, Nous Hermes 2 Mistral 7B DPO, and BioMistral 7B models now load with the full 32k context length if the app finds at least 8.69GB of free memory; otherwise they're loaded with a 4k context length.
  • Grammar correction macOS service now uses the OS locale to determine the English spellings to use.
  • Experimental support for non-English European languages in macOS services (works best with Western European languages and with larger models). This needs to be enabled in app settings.
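
The free-memory check described above can be sketched as follows; the 8.69GB threshold and the 32k/4k fallback come from the release note, but the function names and the idea of passing free memory in as a parameter are assumptions, not the app's actual implementation:

```swift
import Foundation

// Sketch of memory-based context-length selection, assuming the caller
// has already measured free memory. `contextLength` is a hypothetical
// helper, not an API from the app.
func contextLength(freeMemoryBytes: UInt64) -> Int {
    // The release note says ~8.69GB of free memory is needed for the
    // full 32k context; otherwise the model loads with a 4k context.
    let required = UInt64(8.69 * 1_073_741_824)   // 8.69 GiB in bytes
    return freeMemoryBytes >= required ? 32_768 : 4_096
}
```

The same pattern (capability probing at load time, with a conservative fallback) is a common way to keep one binary working across machines with very different RAM sizes.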
v1.7.7 - macOS
2 months ago
  • Support for three new downloadable models: BioMistral-7B, Gemma 2B, Nous-Hermes-2-Mistral-7B-DPO
v1.7.6 - macOS
3 months ago
  • Fix for crash on macOS Ventura (version 13.x).
v1.7.5 - macOS
3 months ago
  • Five new OmniQuant quantized downloadable models: WhiteRabbit Neo 13B v1 cybersecurity model, Mixtral 8x7B Instruct v0.1, WizardLM 33B 1.0 Uncensored, Mistral 7B Instruct v0.2 and Phi-2 Orange 3B.
  • WhiteRabbit Neo 13B v1 needs an Apple Silicon Mac with at least 16GB of RAM; Mixtral 8x7B Instruct v0.1 and WizardLM 33B 1.0 Uncensored need Apple Silicon Macs with at least 32GB of RAM; Mistral 7B Instruct v0.2 and Phi-2 Orange 3B work on all Macs.
  • macOS services for system-wide, offline grammar correction, summarization, text shortening, and rephrasing.
  • The default base model bundled with the app is now the Dolphin 2.6 Phi-2 3B model. The older base model (Spicyboros-7b-2.2) can still be downloaded on any Mac.
  • Miscellaneous bug fixes and improvements.
v1.7.0 - macOS
4 months ago
  • Support for downloading two new models on Apple Silicon Macs: openchat-3.5-0106 and CodeNinja-1.0-OpenChat-7B (a coding model).
  • Users can now customize system prompts in app settings.
  • Minor bug fixes and improvements.
v1.6.2 - macOS
4 months ago
  • Support for the new Nous-Hermes-2-SOLAR-10.7B model on Apple Silicon Macs with 16GB or more RAM (Full 4k context support with 6.85GB of memory footprint).
  • Bugfix: Allow deleting partially downloaded models.
v1.6.7 - iOS
5 months ago
  • More memory optimizations for improved compatibility of the optional 7B parameter model on iPhone 15.
v1.6.1 - macOS
5 months ago
  • Bugfix: Allow switching to downloaded 7B models on Apple Silicon Macs with 8GB of RAM and Intel Macs.
v1.6.6 - iOS
5 months ago
  • A slew of memory optimizations.
  • Minor performance improvements stemming from the aforementioned memory optimizations.
  • Private LLM now works on older iPhones and iPads with 3GB of RAM, like the 2nd Gen iPhone SE and the 9th Gen iPad. Previously, it needed devices with at least 4GB of RAM to function.
v1.6 - macOS
6 months ago
  • The built-in model is now spicyboros-7b-2.2.
  • Support for more downloadable models:
    • Llama2 7B based models:
      • airoboros-l2-7b-3.0
      • Xwin-LM-7B-V0.1
    • Llama2 13B based models (on Apple Silicon Macs with 16GB or more RAM):
      • WizardLM-13B-V1.2
      • spicyboros-13b-2.2
      • Xwin-LM-13B-V0.1
      • MythoMax-L2-13b
    • Mistral 7B based models:
      • Mistral-7B-OpenOrca
      • Mistral-7B-Instruct-v0.1
      • zephyr-7b-beta
      • leo-mistral-hessianai-7b-chat (German model)
      • jackalope-7b
      • dolphin-2.1-mistral-7b
      • samantha-1.2-mistral-7b
      • OpenHermes-2-Mistral-7B
      • SynthIA-7B-v2.0
      • airoboros-m-7b-3.1.2
      • Mistral-Trismegistus-7B
      • openchat_3.5
v1.6.5 - iOS
6 months ago
  • Update to use an alternative mirror to download the 7B model if the primary download source is inaccessible.
  • Minor bug fixes and updates.
v1.6.4 - iOS
6 months ago
  • Fixed a memory and performance issue with the optional 7B parameter model on iPhone 14 Pro, iPhone 14 Pro Max, and iPhone 15 phones.
v1.6.3 - iOS
7 months ago
  • Minor bug fixes and improvements.
  • Improved model perplexity with the 7B model on iPhone 14 Pro and Pro Max phones.
v1.5 - macOS
7 months ago
  • The baseline 7B LLM in the app is now Mistral-7B-OpenOrca.
  • Optimizations to reduce memory footprint of LLMs.
  • 13B model now consumes ~6% less RAM.
  • 7B model now consumes ~14% less RAM (this is partly due to the new 7B model's architecture).
  • Minor bug-fixes and improvements.
v1.6.2 - iOS
7 months ago
  • Minor bug fixes and improvements.
v1.6.1 - iOS
8 months ago
  • Minor bug fix in App Intents (Shortcuts app integration).
  • Minor performance improvements.
v1.4.1 - macOS
8 months ago
  • Minor bug fix in App Intents (Shortcuts app integration).
  • Minor performance improvements.
v1.6 - iOS
8 months ago
  • Support for iPhone 15 series phones.
  • The optional 7B parameter models are now quantized with the state-of-the-art OmniQuant algorithm.
  • Support for the downloadable 7B parameter model on iPhone 13 Pro and iPhone 13 Pro Max devices (previously, only iPhone 14 series phones and Apple Silicon iPads were supported).
v1.4 - macOS
8 months ago
  • Model: All models are now quantized using the state-of-the-art OmniQuant quantization algorithm. This significantly improves model perplexity and text-generation quality.
  • Users who had previously downloaded the 13B WizardLM V1.2 model can optionally download an update with the improved version of the model.
  • UI: Allow right-clicking and text selection in the message bubbles (based on numerous users' requests).
v1.3 - macOS
8 months ago
  • Option to download the 13B parameter WizardLM-13B-V1.2 model on Apple Silicon Macs with 16GB or more RAM.
  • Minor bug fixes and improvements.
v1.2.1 - macOS
8 months ago
  • Significant performance improvement for users with Intel Macs and Apple Silicon Macs with 8GB of RAM.
  • The App Intent (for Shortcuts support) now accepts an optional system prompt along with the query. This feature is based on a user request.
  • Minor bug fixes and improvements.
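
An intent that takes a query plus an optional system prompt can be sketched with Apple's AppIntents framework. This is a minimal illustration of the shape such an intent might take; the type, titles, and parameter names are hypothetical, not the app's actual intent definitions:

```swift
import AppIntents

// Hypothetical sketch of a Shortcuts-invocable intent with a required
// query and an optional system prompt, mirroring the release note above.
struct AskModelIntent: AppIntent {
    static var title: LocalizedStringResource = "Ask Model"

    @Parameter(title: "Query")
    var query: String

    // Optional: when nil, an app would fall back to its default prompt.
    @Parameter(title: "System Prompt")
    var systemPrompt: String?

    func perform() async throws -> some IntentResult & ReturnsValue<String> {
        // A real implementation would run the prompt through the local LLM;
        // here the combined prompt is simply returned to keep the sketch
        // self-contained.
        let combined = (systemPrompt.map { $0 + "\n" } ?? "") + query
        return .result(value: combined)
    }
}
```

Making the system prompt an optional parameter keeps simple shortcuts simple while still allowing power users to override the default.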
v1.2 - macOS
8 months ago
  • The macOS version of the app now ships with a bigger and better Llama 2 based 7B parameter model with 4k context length (previous versions shipped with a 3B model with 2k context length). Intel Macs are still supported, but Apple Silicon Macs offer the best performance with this bigger model.
  • The macOS version of the app now runs in the background when invoked from the Shortcuts App and preserves any existing conversation in the UI.
  • Minor bugfixes and improvements.
v1.5.1 - iOS
9 months ago
  • Fix for a non-deterministic crash while regenerating LLM responses
  • Slightly reduced memory footprint with Llama 2 7B class models (on supported devices).
  • Minor performance improvements.
v1.1.1 - macOS
9 months ago
  • Minor bug fix release. Fixed a bug where the temperature and top-p settings could not be changed from their default values.
v1.1 - macOS
9 months ago
  • Significantly improved performance with Metal accelerated inference. ~36% faster on Apple Silicon and ~61% faster on Intel Macs.
  • All iOS app features have been back ported to the macOS app.
  • This will be the last release to support Intel Macs along with Apple Silicon; future feature releases will only support Apple Silicon.
v1.5 - iOS
9 months ago
  • Improved model quantization.
  • Option to download a larger 7 billion parameter model (in addition to the default 3B parameter model) on iPhone 14 series phones and M1, M2 iPads.
  • Edit and continue for prompts.
  • Ability to regenerate LLM responses.
  • Conversations are now persisted across app restarts.
  • The app has been renamed to Private LLM.
v1.1.4 - iOS
11 months ago
  • Shortcuts and Siri support (App intents)
  • Slightly improved inference performance (Upgrade from Metal 2.3 to Metal 3.0)
  • Improved text selection (Long press on any message)
  • Ability to share any response from the bot using a share sheet (Also long press on any message)
  • Added a help view with frequently asked questions and example prompts
  • Added support for the x-callback-url specification (Supported by Shortcuts and many other apps)
v1.1.3 - iOS
11 months ago
  • Fix for crash when resetting settings.
  • Faster text generation on older iPhone 11 and iPhone 12 devices.
v1.1.2 - iOS
12 months ago

This is an accessibility focused minor release, with the following changes:

  • Accessibility: Basic VoiceOver support (Based on user feedback).
  • Haptic feedback on iPhones (Can be disabled from settings).
  • Further reduced the model's memory footprint. The app now needs 2GB of free memory to load the LLM model.

This further improves compatibility with older devices.

v1.1.1 - iOS
12 months ago

This is a minor bug fix release, with the following changes:

  • Fixed model loading issue on iPhone 13 and iPhone 13 mini.
  • More accurate memory footprint check: on launch, the app now ensures that 2.1GB of memory is free instead of 3GB. This should help the app run on older devices like the iPhone 11 Pro and iPhone 11 Pro Max.
v1.1 - iOS
12 months ago

This release marks a huge overhaul of the Private LLM for iOS codebase.

  • New Metal backend (about 1.5x faster than the older backend in 1.0.3, although there's a slight increase in the startup time).
  • ~7.7% smaller app binary (1.54GB -> 1.42GB)
  • The language model now has a much better conversational memory.
  • The "Regenerate last response" button has been removed, based on feedback from users.
  • Fixed weird syntax highlighting of code when the language could be auto-detected.