Releases: helixml/helix
0.9.19 - fix ollama cleanup bug
What's Changed
Fix a bug that was stopping Ollama servers from being cleaned up on runners. Stale servers were holding on to GPU memory, preventing it from being freed and reallocated, which slowed down model responses. After deploying this change, llama3:70b should be reliably fast on the platform, for example.
- I don't love this approach, but it seems to work in interactive testing. by @lukemarsden in #343
Full Changelog: 0.9.18...0.9.19
0.9.18 - improve llama3:70b performance
What's Changed
Improve llama3:70b performance by shutting down other models more reliably, allowing it to use the full GPU memory
- more reliable approach to shutdown process tree by @lukemarsden in #342
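The PR itself isn't included here, but the standard Unix technique for reliably shutting down a whole process tree is to launch the child in its own process group and signal the group rather than the single parent process. A minimal sketch of that idea (the helper names are illustrative, not Helix's actual code):

```python
import os
import signal
import subprocess
import time


def start_server(cmd):
    # start_new_session=True puts the child in its own process group,
    # so any grandchildren it spawns share that group id.
    return subprocess.Popen(cmd, start_new_session=True)


def stop_server(proc, timeout=5.0):
    # Signal the whole group via killpg, not just the direct child,
    # so worker subprocesses are terminated too.
    pgid = os.getpgid(proc.pid)
    os.killpg(pgid, signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Escalate if the group ignores SIGTERM.
        os.killpg(pgid, signal.SIGKILL)
        proc.wait()


if __name__ == "__main__":
    # The shell spawns a long-sleeping grandchild; killing only the shell
    # would orphan the sleep, but killing the group reaps both.
    p = start_server(["/bin/sh", "-c", "sleep 60 & wait"])
    time.sleep(0.2)
    stop_server(p)
    print("exited:", p.returncode)
```

If only the parent were signalled, the `sleep` grandchild would keep running (and, in the Ollama case, keep holding GPU memory).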
Full Changelog: 0.9.17...0.9.18
0.9.17 - optimize startup
What's Changed
Optimize startup to avoid excess copies when using a bind-mounted cache directory and pre-baked model weights.
- optimize copy by @lukemarsden in #341
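The PR isn't shown here, but the general shape of this kind of optimization is to skip the copy when the destination already holds identical content, for example by comparing size and modification time. A hedged sketch (paths and the helper name are illustrative, not Helix's):

```python
import os
import shutil


def copy_if_changed(src: str, dst: str) -> bool:
    """Copy src to dst only when dst is missing or differs by size/mtime.

    Returns True when a copy actually happened. This mirrors the idea of
    avoiding redundant copies of pre-baked model weights into a
    bind-mounted cache directory on every startup.
    """
    if os.path.exists(dst):
        s, d = os.stat(src), os.stat(dst)
        if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
            return False  # destination looks up to date; skip the copy
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    shutil.copy2(src, dst)  # copy2 preserves mtime for future comparisons
    return True
```

On the second and subsequent startups the size/mtime check short-circuits, so multi-gigabyte weight files are not re-copied.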
Full Changelog: 0.9.16...0.9.17
0.9.16 - ollama cleanups, discord bot
What's Changed
https://www.youtube.com/watch?v=Fow7iUaKrq4
- Feature/discord bot v0.1 by @rusenask in #339
- kill ollama process group, not just parent process by @lukemarsden in #340
Full Changelog: 0.9.15...0.9.16
0.9.15 - Fix warmups, dev websockets, ollama keepalive
What's Changed
- Stop loading SDXL by default, neatly sidestepping the headache of warmup models over-filling GPU memory. SDXL will still work; it just might take a bit longer on the first request while the weights download.
- Also fix a bug that was stopping llama3:instruct from being loaded as a warmup model.
- Fix a frontend websocket bug in development.
- Make Ollama keep model weights in memory forever, which is what our scheduler is designed for.
- Fix warmups, dev websockets, ollama keepalive by @lukemarsden in #329
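For context (this detail is not spelled out in the release note): Ollama unloads model weights after an idle timeout by default, and its generate/chat API accepts a `keep_alive` field where a negative value asks it to keep the model resident indefinitely. A sketch of a request with that behaviour, using only the standard library (endpoint and model names are examples):

```python
import json
from urllib import request


def build_generate_request(base_url: str, model: str, prompt: str):
    """Build an Ollama /api/generate request that pins the model in memory.

    keep_alive=-1 asks Ollama to keep the weights loaded indefinitely
    instead of unloading them after the default idle timeout.
    """
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "keep_alive": -1,  # negative => never unload
        "stream": False,
    }).encode()
    return request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )


# Sending it requires a running Ollama server, so it is not executed here:
# resp = request.urlopen(
#     build_generate_request("http://localhost:11434", "llama3:70b", "hi"))
```

Keeping weights resident matches a scheduler that assigns models to GPUs itself, rather than letting each Ollama instance decide when to evict.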
Full Changelog: 0.9.14...0.9.15
0.9.14 - Azure OpenAI compatibility, stability improvements
What's Changed
Fixed a nasty bug where inference requests could "fall down a crack" and need to be retried by the user. This could also cause the OpenAI-compatible API to hang.
Added Azure OpenAI compatibility, so users just need to set a couple of environment variables to use Helix instead of OpenAI.
- Azure openai compat by @lukemarsden in #328
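Since Helix exposes an OpenAI-compatible API, pointing existing client code at it is typically just a matter of overriding the base URL and API key. The release note doesn't name the exact variables Helix expects, so the sketch below assumes the OpenAI client's conventional `OPENAI_BASE_URL` / `OPENAI_API_KEY` variables and builds the request with only the standard library:

```python
import json
import os
from urllib import request


def chat_request(messages, model="llama3:70b"):
    """Build an OpenAI-style chat completion request.

    Reads OPENAI_BASE_URL / OPENAI_API_KEY so the same code can target
    OpenAI, Azure OpenAI, or a Helix deployment just by changing env
    vars. (These variable names are the OpenAI client's conventions,
    assumed here, not confirmed Helix configuration.)
    """
    base = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    key = os.environ.get("OPENAI_API_KEY", "")
    body = json.dumps({"model": model, "messages": messages}).encode()
    return request.Request(
        base.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
    )
```

With the base URL pointed at a Helix deployment, unmodified OpenAI client code sends its chat completions there instead.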
Full Changelog: 0.9.13...0.9.14
0.9.13 - fix regression for auto-created API keys in some cases
What's Changed
- reliably create api keys for users by @lukemarsden in #327
Full Changelog: 0.9.12...0.9.13
0.9.12 - app params for API tools, RAG in K8s, fix finetuning in openshift
What's Changed
- feat: add ability to override app query parameters in OpenAI API request by @philwinder in #318
- Local dev guide by @chocobar in #321
- Feature/llamaindex decoupling by @rusenask in #322
- Fix/rag models by @rusenask in #325
- move update & install into same layer by @rusenask in #326
Full Changelog: 0.9.11...0.9.12
0.9.11 - fix finetuning in locked down env
Fix finetuning in a locked-down OpenShift environment.
Full Changelog: 0.9.10...0.9.11
0.9.10 - ollama and cog fixes for openshift
More fixes for OpenShift: make the Ollama cache and cog-sdxl directories writable.
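For background (an assumption based on common OpenShift behaviour, not detail taken from this release): OpenShift's restricted SCC runs containers as an arbitrary UID in the root group, so directories baked into an image must be group-writable to be usable at runtime. A small sketch of applying that fix to a cache directory tree:

```python
import os
import stat


def make_group_writable(path: str) -> None:
    """Grant group read/write/execute throughout a directory tree.

    Under OpenShift's restricted SCC the container runs as a random UID
    with GID 0, so group-writable (rather than owner-writable)
    permissions are what make a pre-created cache directory usable.
    """
    grp = stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP
    for root, dirs, files in os.walk(path):
        for name in [*dirs, *files]:
            p = os.path.join(root, name)
            os.chmod(p, os.stat(p).st_mode | grp)
    os.chmod(path, os.stat(path).st_mode | grp)
```

The equivalent image-build fix is a recursive group chmod on the cache directories; the Python version above just makes the permission change explicit and testable.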
Full Changelog: 0.9.9...0.9.10