95 inference projects ranked by GitHub stars, weekly growth, and maintenance health.
Showing 1-50 of 95 projects
| # | Project | Category | Stars | Weekly growth | Health | Language | Updated |
|---|---|---|---|---|---|---|---|
| 3 | llama.cpp LLM inference in C/C++ | ⚡ Inference | 109.6K | +1.2K | 100 | C++ | 9d ago |
| 4 | vLLM A high-throughput and memory-efficient inference and serving engine for LLMs | ⚡ Inference | 79.7K | +606 | 93 | Python | 9d ago |
| 5 | Llm Course Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. | ⚡ Inference | 79.2K | 0 | 30 | - | 3mo ago |
| 6 | Llamafactory Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) | ⚡ Inference | 71.1K | 0 | 56 | Python | 13d ago |
| 7 | Caveman 🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman | ⚡ Inference | 58.3K | 0 | 64 | JavaScript | 11d ago |
| 9 | Context7 Context7 Platform -- Up-to-date code documentation for LLMs and AI code editors | ⚡ Inference | 55.0K | 0 | 66 | TypeScript | 9d ago |
| 10 | Mempalace The best-benchmarked open-source AI memory system. And it's free. | ⚡ Inference | 52.0K | 0 | 80 | Python | 10d ago |
| 11 | Pi Mono AI agent toolkit: coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, vLLM pods | ⚡ Inference | 48.2K | 0 | 73 | TypeScript | 10d ago |
| 12 | LocalAI LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required. | ⚡ Inference | 46.2K | +130 | 91 | Go | 9d ago |
| 13 | Milvus Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search | ⚡ Inference | 44.2K | 0 | 75 | Go | 9d ago |
| 14 | Kong 🦍 The API and AI Gateway | ⚡ Inference | 43.4K | 0 | 40 | Lua | 1mo ago |
| 15 | Jan Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. | ⚡ Inference | 42.5K | +88 | 80 | TypeScript | 10d ago |
| 16 | Lightrag [EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation" | ⚡ Inference | 35.0K | 0 | 80 | Python | 10d ago |
| 17 | Graphrag A modular graph-based Retrieval-Augmented Generation (RAG) system | ⚡ Inference | 32.9K | 0 | 56 | Python | 9d ago |
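| 19 | Self Llm "A Guide to Consuming Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying domestic and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment | ⚡ Inference | 30.4K | 0 | 37 | Jupyter Notebook | 27d ago |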
| 20 | Void | ⚡ Inference | 28.7K | 0 | 36 | TypeScript | 4mo ago |
| 21 | Sglang SGLang is a high-performance serving framework for large language models and multimodal models. | ⚡ Inference | 27.7K | 0 | 77 | Python | 9d ago |
| 22 | Gitleaks Find secrets with Gitleaks 🔑 | ⚡ Inference | 26.8K | 0 | 33 | Go | 1mo ago |
| 23 | Awesome Generative Ai Guide A one stop repository for generative AI research updates, interview resources, notebooks and much more! | ⚡ Inference | 26.6K | 0 | 42 | HTML | 12d ago |
| 24 | Hands On Large Language Models Official code repo for the O'Reilly Book - "Hands-On Large Language Models" | ⚡ Inference | 26.2K | 0 | 32 | Jupyter Notebook | 27d ago |
| 25 | Llmfit Hundreds of models & providers. One command to find what runs on your hardware. | ⚡ Inference | 25.8K | 0 | 70 | Rust | 11d ago |
| 26 | Scrapegraph Ai Python scraper based on AI | ⚡ Inference | 25.0K | 0 | 60 | Python | 11d ago |
| 27 | llamafile Distribute and run LLMs with a single file. | ⚡ Inference | 24.4K | +44 | 65 | C++ | 16d ago |
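| 28 | Llm Action This project shares the technical principles and hands-on experience behind large models (LLM engineering and putting LLM applications into production) | ⚡ Inference | 24.3K | 0 | 30 | HTML | 11d ago |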
| 29 | MLC LLM Universal LLM Deployment Engine with ML Compilation | ⚡ Inference | 22.6K | +36 | 62 | Python | 9d ago |
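| 30 | Awesome Chinese LLM A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheap to train, including base models, domain-specific fine-tunes and applications, datasets, and tutorials. | ⚡ Inference | 22.6K | 0 | 41 | - | 11d ago |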
| 31 | Unilm Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities | ⚡ Inference | 22.1K | 0 | 43 | Python | 3mo ago |
| 32 | Skyvern Automate browser based workflows with AI | ⚡ Inference | 21.6K | 0 | 68 | Python | 9d ago |
| 33 | Datasets 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools | ⚡ Inference | 21.5K | 0 | 60 | Python | 10d ago |
| 34 | Free Llm Api Resources A list of free LLM inference resources accessible via API. | ⚡ Inference | 21.3K | 0 | 30 | Python | 11d ago |
| 35 | Qwen The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. | ⚡ Inference | 21.1K | 0 | 46 | Python | 2mo ago |
| 36 | Peft 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. | ⚡ Inference | 21.1K | 0 | 60 | Python | 10d ago |
| 37 | Heretic Fully automatic censorship removal for language models | ⚡ Inference | 20.8K | 0 | 54 | Python | 12d ago |
| 38 | Dyad Local, open-source AI app builder for power users ✨ v0 / Lovable / Replit / Bolt alternative 🌟 Star if you like it! | ⚡ Inference | 20.3K | 0 | 72 | TypeScript | 9d ago |
| 40 | Web Llm High-performance In-browser LLM Inference Engine | ⚡ Inference | 18.0K | 0 | 46 | TypeScript | 15d ago |
| 41 | Ml Engineering Machine Learning Engineering Open Book | ⚡ Inference | 17.9K | 0 | 36 | Python | 2mo ago |
| 42 | Airllm AirLLM 70B inference with single 4GB GPU | ⚡ Inference | 17.7K | 0 | 31 | Jupyter Notebook | 2mo ago |
| 46 | Easy Dataset A powerful tool for creating datasets for LLM fine-tuning, RAG and Eval | ⚡ Inference | 14.2K | 0 | 56 | JavaScript | 20d ago |
| 47 | Outlines Structured Outputs | ⚡ Inference | 13.8K | 0 | 52 | Python | 17d ago |
| 48 | Omlx LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar | ⚡ Inference | 13.6K | 0 | 78 | Python | 10d ago |
| 49 | Awesome Generative Ai A curated list of modern Generative Artificial Intelligence projects and services | ⚡ Inference | 12.0K | 0 | 43 | - | 15d ago |
| 50 | Tensorzero TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation. | ⚡ Inference | 11.4K | 0 | 72 | Rust | 9d ago |