Skip to content

Releases: helixml/helix

0.9.19 - fix ollama cleanup bug

25 Jun 12:50
8fd886f
Compare
Choose a tag to compare

What's Changed

Fix bug which was stopping ollama servers getting cleaned up on runners. This was stopping GPU memory getting allocated, slowing down model responses. Effect of deploying this change should be that llama3:70b is reliably fast on the platform, for example.

  • I don't love this approach, but it seems to work in interactive testing. by @lukemarsden in #343

Full Changelog: 0.9.18...0.9.19

0.9.18 - improve llama3:70b performance

24 Jun 14:14
18587aa
Compare
Choose a tag to compare

What's Changed

Make llama3:70b performance better by improving reliability of shutdown of other models, allowing it to use the full GPU memory

Full Changelog: 0.9.17...0.9.18

0.9.17 - optimize startup

24 Jun 13:10
47c6ab7
Compare
Choose a tag to compare

What's Changed

optimize startup to avoid excess copies when using a bind-mounted cache directory and pre-baked model weights

Full Changelog: 0.9.16...0.9.17

0.9.16 - ollama cleanups, discord bot

24 Jun 12:39
a8ddbcd
Compare
Choose a tag to compare

What's Changed

image

https://www.youtube.com/watch?v=Fow7iUaKrq4

Full Changelog: 0.9.15...0.9.16

0.9.15 - Fix warmups, dev websockets, ollama keepalive

14 Jun 15:15
2cbf0b8
Compare
Choose a tag to compare

What's Changed

  • Stop loading sdxl by default, neatly sidestepping the headache that was warmup models over-filling GPU memory. SDXL will still work, it just might take a bit longer on the first request to download the weights.

  • Also fix bug that was stopping llama3:instruct getting loaded as a warmup model.

  • Fix frontend websocket bug in development

  • Make ollama keep model weights in memory forever, which is what our scheduler is designed for.

  • Fix warmups, dev websockets, ollama keepalive by @lukemarsden in #329

Full Changelog: 0.9.14...0.9.15

0.9.14 - Azure OpenAI compatibility, stability improvements

14 Jun 13:43
327f72f
Compare
Choose a tag to compare

What's Changed

Fixed nasty bug where inference requests would "fall down a crack" and need to be retried by the user. This would also cause OpenAI API to hang.

Added Azure OpenAI compatibility so users just need to set a couple of env vars to use Helix instead of OpenAI

image

Full Changelog: 0.9.13...0.9.14

0.9.13 - fix regression for auto-created API keys in some cases

14 Jun 10:53
e8c6c95
Compare
Choose a tag to compare

What's Changed

Full Changelog: 0.9.12...0.9.13

0.9.12 - app params for API tools, RAG in K8s, fix finetuning in openshift

13 Jun 21:39
Compare
Choose a tag to compare

What's Changed

Full Changelog: 0.9.11...0.9.12

0.9.11 - fix finetuning in locked down env

03 Jun 19:06
Compare
Choose a tag to compare

Fix finetuning in locked down openshift

Full Changelog: 0.9.10...0.9.11

0.9.10 - ollama and cog fixes for openshift

03 Jun 17:55
Compare
Choose a tag to compare

More fixes for OpenShift - make ollama cache and cog-sdxl directories writeable

Full Changelog: 0.9.9...0.9.10