## v2.4.2
#### 🚀 Features
- Integrate DeepSeek models (PR: #312)
- Update LlamaLib to v1.2.3 (llama.cpp b4688) (PR: #312)
- Drop CUDA 11.7.1 support (PR: #312)
- Add warm-up function for a provided prompt (PR: #301), see the sketch below
- Add documentation in Unity tooltips (PR: #302)

#### 🐛 Fixes
- Fix code signing on iOS (PR: #298)
- Persist debug mode and use of extras in builds (PR: #304)
- Fix dependency resolution for full CUDA and Vulkan architectures (PR: #313)
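For illustration, a minimal sketch of warming up an `LLMCharacter`, following the usage pattern shown in the project README; the class name and log message are made up, and the variant accepting a provided prompt (PR: #301) is only referenced in a comment since its exact signature is not stated here:

```csharp
using UnityEngine;
using LLMUnity;

// Illustrative sketch: assumes an LLMCharacter assigned in the editor
// and the Warmup(callback) usage shown in the LLM for Unity README.
public class WarmupExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;

    void WarmupCompleted()
    {
        Debug.Log("LLM is warmed up and ready");
    }

    void Start()
    {
        // Process the prompt ahead of time so the first user request
        // does not pay the prompt-processing latency.
        // PR #301 additionally allows warming up with a provided prompt.
        _ = llmCharacter.Warmup(WarmupCompleted);
    }
}
```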
## v2.4.1
#### 🚀 Features
- Use static library linking on mobile (fixes iOS signing) (PR: #289)

#### 🐛 Fixes
- Fix support for extras (flash attention, IQ quants) (PR: #292)

## v2.4.0
#### 🚀 Features
- iOS deployment (PR: #267)
- Improve build process (PR: #282)
- Add structured output / function calling sample (PR: #281)
- Update LlamaLib to v1.2.0 (llama.cpp b4218) (PR: #283)

#### 🐛 Fixes
- Clear temp build directory before building (PR: #278)

#### 📦 General
- Remove support for extras (flash attention, IQ quants) (PR: #284)
- Remove support for LLM base prompt (PR: #285)

## v2.3.0
#### 🚀 Features
- Implement Retrieval-Augmented Generation (RAG) in LLMUnity (PR: #246), see the sketch below

#### 🐛 Fixes
- Fix build conflict and endless import of resources (PR: #266)
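For illustration, a minimal sketch of the RAG workflow, assuming a `RAG` component assigned in the editor with `Add` and `Search` methods as shown in the project samples; the phrases, query, and class name are made up:

```csharp
using UnityEngine;
using LLMUnity;

// Illustrative sketch: assumes the RAG component API from the
// LLM for Unity samples (Add to index, Search to retrieve).
public class RAGExample : MonoBehaviour
{
    public RAG rag;

    async void Start()
    {
        // Index a few phrases ...
        string[] phrases = { "The dragon sleeps in the cave", "The shop opens at dawn" };
        foreach (string phrase in phrases) await rag.Add(phrase);

        // ... then retrieve the entries most similar to a query.
        (string[] similar, float[] distances) = await rag.Search("Where is the dragon?", 2);
        for (int i = 0; i < similar.Length; i++)
            Debug.Log($"{similar[i]} (distance {distances[i]})");
    }
}
```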
## v2.2.4
#### 🚀 Features
- Add Phi-3.5 and Llama 3.2 models (PR: #255)
- Speed up LLMCharacter warmup (PR: #257)

#### 🐛 Fixes
- Fix handling of incomplete requests (PR: #251)
- Fix Unity locking of DLLs during cross-platform builds (PR: #252)
- Allow spaces in LoRA paths (PR: #254)

#### 📦 General
- Set default context size to 8192 and allow adjusting it with a UI slider (PR: #258)

## v2.2.3
#### 🚀 Features
- Update LlamaLib to v1.1.12: SSL certificate and API key for the server, support for more AMD GPUs (PR: #241)
- Add server security with API key and SSL (PR: #238)
- Show the server command for easier deployment (PR: #239)

#### 🐛 Fixes
- Fix crash with multiple LLMs on Windows (PR: #242)
- Exclude the system prompt when saving chat history (PR: #240)

## v2.2.2
#### 🚀 Features
- Allow setting the LLMCharacter slot (PR: #231)

#### 🐛 Fixes
- Fix adding grammar from StreamingAssets (PR: #229)
- Fix library setup restart when interrupted (PR: #232)
- Remove unnecessary Android linking in IL2CPP builds (PR: #233)

## v2.2.1
#### 🐛 Fixes
- Fix model name showing the full path when loading a model (PR: #224)
- Fix parallel prompts (PR: #226)

## v2.2.0
#### 🚀 Features
- Implement embedding and LoRA adapter functionality (PR: #210)
- Read the context length and warn if it is very large (PR: #211)
- Add setup for extra features: flash attention and IQ quants (PR: #216)
- Allow HTTP request retries for the remote server (PR: #217)
- Allow setting LoRA weights at startup, add unit test (PR: #219)
- Allow relative StreamingAssets paths for models (PR: #221)

#### 🐛 Fixes
- Fix setting the template for remote setup (PR: #208)
- Fix crash when stopping the scene before LLM creation (PR: #214)

#### 📦 General
- Documentation: point to the GGUF format for LoRA (PR: #215)

## v2.1.1
#### 🐛 Fixes
- Fix build directory creation

## v2.1.0
#### 🚀 Features
- Android deployment (PR: #194)
- Allow downloading models on startup with resumable downloads (PR: #196)
- Add LLM model manager (PR: #196)
- Add Llama 3 7B and Qwen2 0.5B models (PR: #198)
- Always start the LLM asynchronously (PR: #199)
- Add contributing guidelines (PR: #201)

## v2.0.3
#### 🚀 Features
- Add LLM selector in Inspector mode (PR: #182)
- Allow saving chat history at a custom path (PR: #179)
- Use asynchronous startup by default (PR: #186)
- Assign the LLM based on the scene and hierarchy if not set (PR: #187)
- Allow setting the log level (PR: #189)
- Allow adding callback functions for error messages (PR: #190)
- Allow setting an LLM base prompt for all LLMCharacter objects (PR: #192)

#### 🐛 Fixes
- Set higher priority for the Mac build with Accelerate than without (PR: #180)
- Fix duplicate BOS warning

## v2.0.2
#### 🐛 Fixes
- Fix bugs in chat completion (PR: #176)
- Call DontDestroyOnLoad on root to remove warning (PR: #174)

## v2.0.1
#### 🚀 Features
- Implement backend with DLLs (PR: #163)
- Separate LLM from LLMClient functionality (PR: #163)
- Add sample with RAG and LLM integration (PR: #170)

## v1.2.9
#### 🐛 Fixes
- Disable GPU compilation when running on CPU (PR: #159)

## v1.2.8
#### 🚀 Features
- Switch to llamafile v0.8.6 (PR: #155)
- Add Phi-3 support (PR: #156)

## v1.2.7
#### 🚀 Features
- Add Llama 3 and Vicuna chat templates (PR: #145)

#### 📦 General
- Use the context size of the model by default for longer history (PR: #147)

## v1.2.6
#### 🚀 Features
- Add documentation (PR: #135)

#### 🐛 Fixes
- Add server security against interception by external llamafile servers (PR: #132)
- Adapt server security for macOS (PR: #137)

#### 📦 General
- Add sample demonstrating the async functionality (PR: #136)

## v1.2.5
#### 🐛 Fixes
- Add to chat history only if the response is not null (PR: #123)
- Allow the SetTemplate function at runtime (PR: #129)

## v1.2.4
#### 🚀 Features
- Use llamafile v0.6.2 (PR: #111)
- Add pure text completion functionality (PR: #115)
- Allow changing roles after starting the interaction (PR: #120)

#### 🐛 Fixes
- Use Debug.LogError instead of exceptions for more verbosity (PR: #113)
- Trim chat responses (PR: #118)
- Fall back to CPU for macOS with an unsupported GPU (PR: #119)
- Remove duplicate EditorGUI.EndChangeCheck() (PR: #110)

#### 📦 General
- Provide access to the LLMUnity version (PR: #117)
- Rename to "LLM for Unity" (PR: #121)

## v1.2.3
#### 🐛 Fixes
- Fix async server 2 (PR: #108)

## v1.2.2
#### 🐛 Fixes
- Use namespaces in all classes (PR: #104)
- Await separately in StartServer (PR: #107)

## v1.2.1
#### 🐛 Fixes
- Kill the server after a Unity crash (PR: #101)
- Persist the chat template on remote servers (PR: #103)

## v1.2.0
#### 🚀 Features
- Add LLM server unit tests (PR: #90)
- Implement chat templates (PR: #92)
- Add stop-chat functionality (PR: #95)
- Keep only the llamafile binary (PR: #97)

#### 🐛 Fixes
- Fix remote server functionality (PR: #96)
- Fix Mac issue requiring llamafile to be run manually the first time (PR: #98)

#### 📦 General
- Add async startup support (PR: #89)

## v1.1.1
#### 📦 General
- Refactoring and small enhancements (PR: #80)

## v1.0.6
#### 🐛 Fixes
- Fix spaces in Mac commands (PR: #71)

## v1.0.5
#### 🚀 Features
- Expose new llama.cpp arguments (PR: #60)
- Allow changing the prompt (PR: #64)
- Add variable sliders (PR: #65)
- Show expert options (PR: #66)
- Improve package loading (PR: #67)

#### 🐛 Fixes
- Fail if the port is already in use (PR: #62)
- Run the server without mmap on mmap crash (PR: #63)

## v1.0.4
#### 🐛 Fixes
- Fix the download function (PR: #51)

#### 📦 General
- Document how settings impact generation in the README (PR: #49)

## v1.0.3
#### 🐛 Fixes
- Fix slashes in Windows paths (PR: #42)
- Fix chmod when deploying from Windows (PR: #43)

## v1.0.2
#### 🚀 Features
- Code auto-formatting (PR: #26)
- Set up auto-formatting pre-commit hook (PR: #31)
- Start the server on Awake instead of OnEnable (PR: #28)
- AMD support, switch to llamafile v0.6 (PR: #33)
- Add release workflows (PR: #35)

#### 🐛 Fixes
- Support Unity 2021 LTS (PR: #32)
- Fix macOS command (PR: #34)
- Release fixes and README updates (PR: #36)

## v1.0.1
- Fix running commands for projects with spaces in the path (closes #8, #9)
- Fix sample scenes for different screen resolutions (closes #10)
- Allow parallel prompts