On-device LLM Inference Powered by X-Bit Quantization
natural-language-processing
compression
self-hosted
llama
language-models
quantization
language-model
gemma
mistral
model-compression
efficient-inference
llm
llms
generative-ai
large-language-model
llama2
mixtral
llm-inference
llama3
Updated Jun 24, 2024 - Python
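The repository's own "X-bit" quantization scheme isn't detailed in this listing, but the core idea behind low-bit LLM weight compression can be sketched as symmetric integer quantization: map float weights onto a small signed-integer grid with a single scale factor, then dequantize at inference time. The sketch below is illustrative only (function names and the per-tensor scaling choice are assumptions, not the repository's actual implementation):

```python
import numpy as np

def quantize_symmetric(weights: np.ndarray, bits: int = 4):
    """Illustrative symmetric per-tensor quantization to `bits`-bit signed ints."""
    qmax = 2 ** (bits - 1) - 1                    # e.g. 7 for 4-bit
    scale = float(np.max(np.abs(weights))) / qmax  # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the integer grid."""
    return q.astype(np.float32) * scale

# Round-trip a small random weight matrix through 4-bit quantization.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_symmetric(w, bits=4)
w_hat = dequantize(q, s)
# With round-to-nearest, the per-weight error is at most half a quantization step.
assert np.max(np.abs(w - w_hat)) <= s / 2 + 1e-6
```

Real low-bit inference stacks (e.g. the GGUF/llama.cpp family used with Llama, Gemma, and Mistral models) refine this with per-group scales and packed storage, but the accuracy/size trade-off follows the same principle: fewer bits shrink the model and memory bandwidth at the cost of rounding error.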