Private LLM Release Notes

Every Private LLM release for iPhone, iPad, and Mac. See what's new in Local AI chat.

Private LLM iPhone & iPad Release Notes

  1. Version 1.9.12

    Latest

    - Accessibility improvements
    - Minor bug fixes and updates

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to join our Discord, email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  2. Version 1.9.11

    - Support for the Qwen3-4B-Instruct-2507-heretic abliterated model (on any iOS device with 6GB or more RAM)
    - Support for the Qwen3-4B-Instruct-2507-heretic-noslop model (on any iOS device with 6GB or more RAM)
    - The noslop model has been specially tuned on top of the abliterated model to reduce LLM slop in its generated output, and is available exclusively in Private LLM
    - Minor bug fixes and updates

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to join our Discord, email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  3. Version 1.9.10

    Minor compatibility fixes for iOS 26

  4. Version 1.9.9

    - Support for two Qwen3 4B Instruct 2507 based models: Qwen3 4B Instruct 2507 abliterated and Josiefied Qwen3 4B Instruct 2507 (on any iOS device with 6GB or more RAM)
    - Minor bug fixes and updates

  5. Version 1.9.8

    - Support for the new Qwen3 4B Instruct 2507 model (on any iOS device with 6GB or more RAM)
    - Minor bug fixes and updates

  6. Version 1.9.7

    - Added support for a 3-bit OmniQuant quantized version of the Llama-3.1-8B-UltraMedical model
    - Added support for a 3-bit OmniQuant quantized version of the Meta-Llama-3.1-8B-SurviveV3 survival specialist model
    - Added support for a 4-bit GPTQ quantized version of the Openhands 7B coding model
    - Added support for a 4-bit QAT version of the Google Gemma3 1B IT model (32k context on iPhones with 6GB or more RAM, 8k on older iPhones with 4GB of RAM)
    - Added support for 4-bit OmniQuant quantized versions of the Google Gemma3 1B based gemma-3-1b-it-abliterated and amoral-gemma3-1B-v2 models
    - Many other minor bug fixes and updates

  7. Version 1.9.6

    - Added support for 8 new models from the Dolphin 3.0 family of models
    - Added support for the unquantized version of the Llama 3.2 1B Instruct Abliterated model
    - Added support for the 4-bit quantized Gemma 2 Ifable 9B creative writing model (downloadable on M-series iPad Pros with 16GB of RAM)
    - Context length is now displayed in the model quick switcher
    - Minor bug fixes and updates

  8. Version 1.9.5

    * Support for downloading 7 new DeepSeek R1 Distill based models on Apple Silicon Macs. Support for individual models varies by device capabilities.
    * Users with Apple Silicon Macs with 16GB of RAM can now download the phi-4 model (previously restricted to Apple Silicon Macs with 24GB of RAM).
    * Minor bug fixes and updates.

  9. Version 1.9.4

    Bug-fix release: fix for a crash while loading 14B models on iPad Pros with 16GB of RAM

  10. Version 1.9.3

    - Support for downloading 12 new models (support varies by device capabilities):
      - Hermes-3-Llama-3.2-3B and Hermes-3-Llama-3.1-8B models
      - FuseChat-Llama-3.2-1B-Instruct, FuseChat-Llama-3.2-3B-Instruct, FuseChat-Llama-3.1-8B-Instruct, FuseChat-Qwen-2.5-7B-Instruct, and FuseChat-Gemma-2-9B-Instruct models
      - FuseChat-Llama-3.2-1B-Instruct also has an unquantized variant, downloadable on devices with 6GB or more RAM
      - EVA-D-Qwen2.5-1.5B-v0.0, EVA-Qwen2.5-7B-v0.1, and EVA-Qwen2.5-14B-v0.2 models
      - Llama-3.1-8B-Lexi-Uncensored-V2 model
    - Improved LaTeX rendering
    - Stability improvements and bug fixes.

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  11. Version 1.9.2

    - Support for downloading 8 new models:
      - Qwen 2.5 family of models (0.5B-14B)
      - Qwen 2.5 Coder family of models (0.5B-14B)
    - Support for individual models across both families varies by the amount of physical memory on the device.
    - Stability improvements and bug fixes.

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  12. Version 1.9.1

    - Bugfix release: fix for a crash while loading some of the older models that use the SentencePiece tokenizer.

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  13. Version 1.9.0

    - Added support for downloading a 4-bit OmniQuant quantized version of the Llama 3.2 1B Instruct abliterated model (on all iOS devices).
    - Added support for downloading a 4-bit OmniQuant quantized version of the Llama 3.2 3B Instruct abliterated model (on devices with 6GB or more RAM).
    - Added support for downloading a 4-bit OmniQuant quantized version of the Llama 3.2 3B Instruct uncensored model (on devices with 6GB or more RAM).
    - Added support for downloading a 4-bit OmniQuant quantized version of the Gemma 2 9B IT model (on M1/M2/M4 iPad Pros with 16GB of RAM).
    - Added support for downloading a 4-bit OmniQuant quantized version of the Gemma 2 9B IT SPPO Iter3 model (on M1/M2/M4 iPad Pros with 16GB of RAM).
    - Added support for downloading a 4-bit OmniQuant quantized version of the Tiger-Gemma-9B-v3 model (on M1/M2/M4 iPad Pros with 16GB of RAM).
    - Stability improvements and bug fixes.

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  14. Version 1.8.9

    - Added support for downloading a 4-bit OmniQuant quantized version of the new Llama 3.2 1B Instruct model (on all iOS devices).
    - Added support for downloading a 4-bit OmniQuant quantized version of the new Llama 3.2 3B Instruct model (on devices with 6GB or more RAM).
    - Added support for downloading the unquantized version of the Llama 3.2 1B Instruct model (on devices with 6GB or more RAM).
    - Support for rendering LaTeX math formulas in LLM generated text.
    - Users can now copy debug information and also email our support address from the help view in the app.

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.
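    As an illustration of the LaTeX rendering added in this release (the formula below is an arbitrary example, not output shipped with the app), a model reply containing markup such as the following is typeset as math rather than shown as raw source:

    ```latex
    % Hypothetical model output containing a LaTeX display-math span;
    % Private LLM renders spans like this as typeset formulas.
    The quadratic formula is
    \[
      x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
    \]
    ```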

  15. Version 1.8.8

    - Fix for a non-deterministic crash while downloading Gemma 2B based models on older devices with 4GB of RAM.

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  16. Version 1.8.7

    - Support for downloading 2 new models from the Gemma 2 family of models (on all devices with 4GB or more RAM):
      - 4-bit OmniQuant quantized version of the gemma-2-2b-it model.
      - 4-bit OmniQuant quantized version of the multilingual SauerkrautLM-gemma-2-2b-it model.
    - Stability improvements and bug fixes.

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  17. Version 1.8.6

    - Support for downloading 4 new models: two from the new Meta Llama 3.1 family of models and two Meta Llama 3 based models (support varies by device capabilities):
      - 3-bit OmniQuant quantized version of the Meta Llama 3.1 8B Instruct model.
      - 3-bit OmniQuant quantized version of the Meta Llama 3.1 8B Instruct abliterated model.
      - 3-bit OmniQuant quantized version of the Llama 3 based L3 Umbral Mind RP v3.0 model.
      - 3-bit OmniQuant quantized version of the Llama 3 based Llama 3 Instruct 8B SPPO Iter3 model.
    - Stability improvements and bug fixes.

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  18. Version 1.8.5

    - Support for downloading 9 new models (support varies by device capabilities):
      - 3-bit OmniQuant quantized version of Mistral 7B Instruct v0.3
      - 3-bit OmniQuant quantized version of Meta-Llama-3-8B-Instruct-abliterated-v3
      - 3-bit OmniQuant quantized version of Llama-3-8B-Instruct-MopeyMule
      - 3-bit OmniQuant quantized version of openchat-3.6-8b-20240522
      - 3-bit OmniQuant quantized version of Llama-3-WhiteRabbitNeo-8B-v2.0
      - 3-bit OmniQuant quantized version of Hermes-2-Theta-Llama-3-8B
      - 3-bit OmniQuant quantized version of LLaMA3-iterative-DPO-final
      - 3-bit OmniQuant quantized version of Hathor_Stable-v0.2-L3-8B
      - 3-bit OmniQuant quantized version of NeuralDaredevil-8B-abliterated
    - Minor UI improvements
    - Stability improvements and bug fixes.

    Thank you for choosing Private LLM. We are committed to continuing to improve the app and making it more useful for you. For support requests and feature suggestions, please feel free to email us at [email protected], or tweet us @private_llm. If you enjoy the app, leaving an App Store review is a great way to support us.

  19. Version 1.8.4

    - Support for downloading a 4-bit OmniQuant quantized version of the new Phi-3-Mini based kappa-3-phi-abliterated model on all devices with 6GB or more RAM.
    - Stability improvements and bug fixes.

  20. Version 1.8.3

    - Support for downloading a 3-bit OmniQuant quantized version of the Llama 3 8B based OpenBioLLM-8B model.
    - Support for downloading a 3-bit OmniQuant quantized version of the Hermes 2 Pro - Llama-3 8B model.
    - Support for downloading a 3-bit OmniQuant quantized version of the bilingual (Hebrew, English) DictaLM-2.0-Instruct model.
    - Users on iPhone 11, 12, and 13 Pro and Pro Max devices can now download the faster, older fully quantized version of the Phi-3-Mini model.
    - Private LLM now uses the loaded model's default system prompt if the system prompt is blank when invoked from App Intents (Siri and Shortcuts).
    - Fixed a bug where temperature and top-p settings were not persisted across app restarts.
    - Stability improvements and bug fixes.

  21. Version 1.8.2

    - Support for downloading an improved version of the new Phi-3-mini-4k-instruct model with an unquantized embedding layer.
    - The old Phi-3-mini-4k-instruct model has been deprecated, and will remain functional for the next two releases.
    - Fixed a bug where the "+" character was elided from prompts when Private LLM is invoked from iOS Shortcuts.
    - Stability improvements and bug fixes.

    If you have any feedback or questions, we would love to hear from you! Numen Technologies offers free tech support; you can email us at [email protected], message us on Discord, or tweet at us @private_llm. If you find Private LLM useful, we would appreciate a review on the App Store. Your review will help others discover Private LLM.

  22. Version 1.8.1

    - Support for downloading the new Phi-3-mini-4k-instruct model.
    - Support for downloading the Llama 3 based Smaug-8B model.
    - Stability improvements and bug fixes.

    If you have any feedback or questions, we would love to hear from you! Numen Technologies offers free tech support; you can email us at [email protected], message us on Discord, or tweet at us @private_llm. If you find Private LLM useful, we would appreciate a review on the App Store. Your review will help others discover Private LLM.

  23. Version 1.8.0

    - Support for downloading the new Dolphin 2.9 Llama 3 8b model.

    If you have any feedback or questions, we would love to hear from you! Numen Technologies offers free tech support; you can email us at [email protected], message us on Discord, or tweet at us @private_llm. If you find Private LLM useful, we would appreciate a review on the App Store. Your review will help others discover Private LLM.

  24. Version 1.7.9

    Bug-fix release: fix for issues with loading the builtin StableLM 2 1.6B model, and stability fixes on older iOS devices.

  25. Version 1.7.8

    - Support for downloading the new Llama 3 8B Instruct model (supported on all iOS and iPadOS devices with 6GB or more RAM).

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], message us on our Discord, or tweet at us @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  26. Version 1.7.7

    - Fix for a compatibility issue with Yi-6B on iPhone 13 Pro and Pro Max devices running iOS 17.4.1.
    - Fix for a compatibility issue with 3B models on iPhone 13 devices running iOS 17.4.1.
    - Minor inference performance enhancements for all supported models.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], message us on our Discord, or tweet at us @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  27. Version 1.7.6

    - New 4-bit OmniQuant quantized downloadable model: gemma-1.1-2b-it (downloadable on all iOS devices with 8GB or more RAM).
    - New 3-bit OmniQuant quantized downloadable model: Dolphin 2.8 Mistral 7b v0.2 (downloadable on all iOS devices with 6GB or more RAM).
    - The downloaded models directory is now excluded from iCloud backups.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], message us on our Discord, or tweet at us @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  28. Version 1.7.5

    - New 4-bit OmniQuant quantized downloadable bilingual (English, Chinese) model: Yi-6B-Chat (downloadable on all iOS devices with 6GB or more RAM).
    - New 3-bit OmniQuant quantized downloadable model: Starling 7B Beta (downloadable on all iOS devices with 6GB or more RAM).
    - New 3-bit OmniQuant quantized downloadable model: openchat-3.5-0106 (downloadable on all iOS devices with 6GB or more RAM).
    - New 3-bit OmniQuant quantized downloadable model: CodeNinja-1.0 (downloadable on all iOS devices with 6GB or more RAM).
    - Configurable option to show an abridged system prompt in the chat window.
    - Minor bug fixes and improvements.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], message us on our Discord, or tweet at us @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  29. Version 1.7.4

    * Support for downloading the Japanese RakutenAI-7B-chat model.
    * Minor bug fixes and updates.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], message us on our Discord, or tweet at us @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  30. Version 1.7.3

    * New 1.8B downloadable model: H2O Danube 1.8B Chat (downloadable on all devices).
    * New Phi-2 3B based downloadable model: Phi-2 Orange v2 (downloadable on all devices with 4GB or more RAM).
    * New Mistral 7B based downloadable model: Hermes 2 Pro (downloadable on all devices with 6GB or more RAM).
    * Minor bug fixes and performance improvements.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], message us on our Discord, or tweet at us @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  31. Version 1.7.2

    * Added the ability to switch models without leaving the chat interface.
    * Minor bug fixes and updates.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], message us on our Discord, or tweet at us @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  32. Version 1.7.1

    * Fix for a crash while loading the OpenHermes 2.5 Mistral 7B model on supported devices.
    * Added support for downloading the Phi 2 Super model on devices with 4GB or more RAM.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], message us on our Discord, or tweet at us @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  33. Version 1.7.0

    * The builtin LLM is now StableLM 2 Zephyr 1.6B.
    * Support for downloading multiple models from the following families: TinyLlama 1.1B, Stable LM 3B, Phi-2 3B, Mistral 7B, Llama 7B, and Gemma 2B.
    * All models, downloadable and builtin, are quantized with the SOTA OmniQuant quantization method.
    * The list of downloadable models varies by the amount of physical memory on the device the app is running on.
    * All models are downloadable on iPhone 15 Pro, Pro Max, and Apple Silicon iPads.
    * Support for editing system prompts.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], message us on our Discord, or tweet at us @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  34. Version 1.6.7

    More memory optimizations for improved compatibility of the optional 7B parameter model on iPhone 15.

  35. Version 1.6.6

    * A slew of memory optimizations.
    * Minor performance improvements stemming from the aforementioned memory optimizations.
    * Private LLM now works on older iPhones and iPads with 3GB of RAM, like the 2nd Gen iPhone SE and the 9th Gen iPad. Previously, it needed devices with at least 4GB of RAM to function.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], or tweet at @private_llm. If you find Private LLM useful, we'd appreciate a review on the App Store. Your review will help other people find Private LLM.

  36. Version 1.6.5

    * Update to use an alternative mirror to download the 7B model if the primary download source is inaccessible.
    * Minor bug fixes and updates.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], or tweet @private_llm.

  37. Version 1.6.4

    Fixed a memory and performance issue with the optional 7B parameter model on iPhone 14 Pro, iPhone 14 Pro Max, and iPhone 15 phones.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], or tweet @private_llm.

  38. Version 1.6.3

    * Minor bug fixes and improvements.
    * Improved model perplexity with the 7B model on iPhone 14 Pro and Pro Max phones.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], or tweet @private_llm.

  39. Version 1.6.2

    * Minor bug fixes and improvements.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], or tweet @private_llm.

  40. Version 1.6.1

    * Minor bug fix in App Intents (Shortcuts app integration).
    * Minor performance improvements.

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], or tweet @private_llm. If Private LLM empowers you, we would appreciate an App Store review. Your review will help other people find Private LLM and make them more productive too.

  41. Version 1.6

    * Support for iPhone 15 series phones.
    * The optional 7B parameter models are now quantized with the state-of-the-art OmniQuant algorithm.
    * Support for the downloadable 7B parameter model on iPhone 13 Pro and iPhone 13 Pro Max devices (previously, only iPhone 14 series phones and Apple Silicon iPads were supported).

    If you have any feedback or questions, we'd love to hear from you! Numen Technologies offers free tech support; you can email [email protected], or tweet @private_llm. If Private LLM empowers you, we would appreciate an App Store review. Your review will help other people find Private LLM and make them more productive too.

  42. Version 1.5.1

    * Fix for a non-deterministic crash while regenerating LLM responses.
    * Slightly reduced memory footprint with Llama 2 7B class models (on supported devices).
    * Minor performance improvements.

    As always, please feel free to email us ([email protected]) or DM us on X (@private_llm) if you have any issues with the app, suggestions, or feature requests.

  43. Version 1.5

    * Improved model quantization.
    * Option to download a larger 7 billion parameter model (in addition to the default 3B parameter model) on iPhone 14 series phones and M1 and M2 iPads.
    * Edit and continue for prompts.
    * Ability to regenerate LLM responses.
    * Conversations are now persisted across app restarts.
    * The app has been renamed to Private LLM.

  44. Version 1.1.4

    1. Shortcuts and Siri support (App Intents)
    2. Slightly improved inference performance (upgrade from Metal 2.3 to Metal 3.0)
    3. Improved text selection (long press on any message)
    4. Ability to share any response from the bot using a share sheet (also via long press on any message)
    5. Added a help view with frequently asked questions and example prompts
    6. Added support for the x-callback-url specification (supported by Shortcuts and many other apps)
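    The x-callback-url support mentioned above follows a plain URL convention: a caller builds a percent-encoded URL carrying its parameters plus `x-success`-style callbacks. As a minimal sketch (the `privatellm://` scheme and `ask` action names here are hypothetical illustrations, not endpoints documented in these notes), a caller might construct such a URL like this:

    ```python
    from urllib.parse import urlencode

    # Hypothetical scheme and action names for illustration only;
    # consult the app's documentation for its real x-callback-url endpoints.
    params = {
        "prompt": "Summarize this note",
        "x-success": "shortcuts://",  # callback parameter per the x-callback-url spec
    }
    url = "privatellm://x-callback-url/ask?" + urlencode(params)
    print(url)
    ```

    `urlencode` handles the percent-encoding of spaces and reserved characters, which is what makes the prompt safe to pass inside a URL.
    
    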

  45. Version 1.1.3

    * Fix for a crash when resetting settings.
    * Faster text generation on older iPhone 11 and iPhone 12 devices.

  46. Version 1.1.2

    This is an accessibility-focused minor release, with the following changes:
    * Accessibility: basic VoiceOver support (based on user feedback).
    * Haptic feedback on iPhones (can be disabled from settings).
    * Further reduced the model's memory footprint. The app now needs 2GB of free memory to load the LLM model. This further improves compatibility with older devices.

  47. Version 1.1.1

    This is a minor bug-fix release, with the following changes:
    * Fixed a model loading issue on iPhone 13 and iPhone 13 mini.
    * More accurate model footprint: upon launch, the app now ensures that 2.1GB of memory is free, instead of 3GB. This should help with running the app on older devices like the iPhone 11 Pro and iPhone 11 Pro Max.

  48. Version 1.1

    This release marks a huge overhaul of the Personal GPT for iOS codebase.
    * New Metal backend (about 1.5x faster than the older backend in v1.0.3, although there's a slight increase in startup time).
    * ~7.7% smaller app binary (1.54GB -> 1.42GB).
    * The language model now has a much better conversational memory.
    * The "Regenerate last response" button has been removed, based on feedback from users.
    * Fixed weird syntax highlighting of code when the language could be auto-detected.

    As always, please feel free to email us at [email protected] with any comments, suggestions, and bug reports.

  49. Version 1.0.3

    * Removed initial prompt hints.
    * Slightly improved model perplexity. The model is now quantized with k-quantization.
    * Minor bug fixes.

  50. Version 1.0.1

    New and improved fine-tuned model.
