🔭AI Tools Scout

Best Open Source AI Inference Projects

95 inference projects ranked by GitHub stars, weekly growth, and maintenance health.

Project data last synced 9d ago. Check before relying on time-sensitive rankings.

Showing 51-95 of 95 projects

# | Project | Category | Stars | Weekly | Trend | Health | Language | Updated
51
Llm Engineer Toolkit
A curated list of 120+ LLM libraries, organized by category.
⚡ Inference | Stars: 10.4K | Weekly: 0 | Health: 39 | Language: - | Updated: 1mo ago
52
Openvino
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
⚡ Inference | Stars: 10.2K | Weekly: 0 | Health: 72 | Language: C++ | Updated: 9d ago
53
Unity Mcp
Unity MCP acts as a bridge, allowing AI assistants (like Claude, Cursor) to interact directly with your Unity Editor via a local MCP (Model Context Protocol) Client. Give your LLM tools to manage assets, control scenes, edit scripts, and automate tasks within Unity.
⚡ Inference | Stars: 9.5K | Weekly: 0 | Health: 75 | Language: C# | Updated: 16d ago
54
Ipex Llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
⚡ Inference | Stars: 8.8K | Weekly: 0 | Health: 42 | Language: Python | Updated: 3mo ago
55
Toonflow App
Toonflow is an open-source, all-in-one AI tool that turns novels and scripts into animated short dramas. It integrates AI scriptwriting, intelligent storyboarding, and character and video generation in a lightweight cross-platform desktop app, helping creators batch-produce visual content at low cost.
⚡ Inference | Stars: 7.8K | Weekly: 0 | Health: 71 | Language: HTML | Updated: 12d ago
56
Prompt Master
A Claude skill that writes accurate prompts for any AI tool. Zero tokens or credits wasted. Full context and memory retention.
⚡ Inference | Stars: 7.4K | Weekly: 0 | Health: 42 | Language: - | Updated: 18d ago
57
Transformer Explainer
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
⚡ Inference | Stars: 7.3K | Weekly: 0 | Health: 28 | Language: JavaScript | Updated: 1mo ago
58
Local Deep Research
~95% on SimpleQA (e.g. Qwen3.6-27B on a 3090). Supports all local and cloud LLMs (llama.cpp, Ollama, Google, ...). 10+ search engines - arXiv, PubMed, your private documents. Everything Local & Encrypted.
⚡ Inference | Stars: 7.2K | Weekly: 0 | Health: 75 | Language: Python | Updated: 9d ago
59
Openllmetry
Open-source observability for your GenAI or LLM application, based on OpenTelemetry
⚡ Inference | Stars: 7.1K | Weekly: 0 | Health: 51 | Language: Python | Updated: 10d ago
60
Vespa
AI + Data, online. https://vespa.ai
⚡ Inference | Stars: 6.9K | Weekly: 0 | Health: 66 | Language: Java | Updated: 9d ago
61
Llm Wiki
LLM Wiki is a cross-platform desktop application that automatically turns your documents into an organized, interlinked knowledge base. Instead of traditional RAG (retrieve-and-answer from scratch every time), the LLM incrementally builds and maintains a persistent wiki from your sources.
⚡ Inference | Stars: 6.9K | Weekly: 0 | Health: 65 | Language: TypeScript | Updated: 10d ago
62
Learning
A log of things I'm learning
⚡ Inference | Stars: 6.9K | Weekly: 0 | Health: 30 | Language: - | Updated: 18d ago
63
LTX 2
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.
⚡ Inference | Stars: 6.6K | Weekly: 0 | Health: 31 | Language: Python | Updated: 10d ago
64
Firecrawl Mcp Server
🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
⚡ Inference | Stars: 6.3K | Weekly: 0 | Health: 37 | Language: JavaScript | Updated: 14d ago
65
Sqlbot
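🔥 An intelligent data-query system based on large models and RAG; a powerful tool for conversational data analysis. Text-to-SQL generation via LLMs using RAG.
⚡ Inference | Stars: 6.1K | Weekly: 0 | Health: 65 | Language: JavaScript | Updated: 10d ago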
66
Pgai
A suite of tools to develop RAG, semantic search, and other AI applications more easily with PostgreSQL
⚡ Inference | Stars: 5.8K | Weekly: 0 | Health: 34 | Language: PLpgSQL | Updated: 3mo ago
67
Taxhacker
Self-hosted AI accounting app. LLM analyzer for receipts, invoices, transactions with custom prompts and categories
⚡ Inference | Stars: 5.6K | Weekly: 0 | Health: 40 | Language: TypeScript | Updated: 1mo ago
68
Alignment Handbook
Robust recipes to align language models with human and AI preferences
⚡ Inference | Stars: 5.6K | Weekly: 0 | Health: 37 | Language: Python | Updated: 1mo ago
69
Ultrarag
A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines
⚡ Inference | Stars: 5.5K | Weekly: 0 | Health: 43 | Language: Python | Updated: 10d ago
70
Chronos Forecasting
Chronos: Pretrained Models for Time Series Forecasting
⚡ Inference | Stars: 5.3K | Weekly: 0 | Health: 41 | Language: Python | Updated: 1mo ago
71
5ire
5ire is a cross-platform desktop AI assistant and MCP client. It is compatible with major service providers and supports local knowledge bases and tools via Model Context Protocol servers.
⚡ Inference | Stars: 5.2K | Weekly: 0 | Health: 47 | Language: TypeScript | Updated: 2mo ago
72
Sparrow
Structured data extraction and instruction calling with ML, LLM and Vision LLM
⚡ Inference | Stars: 5.2K | Weekly: 0 | Health: 43 | Language: Python | Updated: 12d ago
73
Transformerlab App
The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU clusters.
⚡ Inference | Stars: 4.9K | Weekly: 0 | Health: 74 | Language: Python | Updated: 9d ago
74
Bifrost
Fastest enterprise AI gateway (50x faster than LiteLLM) with adaptive load balancer, cluster mode, guardrails, 1000+ models support & <100 µs overhead at 5k RPS.
⚡ Inference | Stars: 4.8K | Weekly: 0 | Health: 74 | Language: Go | Updated: 9d ago
75
Shimmy
⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
⚡ Inference | Stars: 4.8K | Weekly: 0 | Health: 44 | Language: Rust | Updated: 1mo ago
76
Claude Obsidian
Claude + Obsidian knowledge companion. Persistent, compounding wiki vault based on Karpathy's LLM Wiki pattern. /wiki /save /autoresearch
⚡ Inference | Stars: 4.8K | Weekly: 0 | Health: 54 | Language: Python | Updated: 27d ago
77
Mlx Vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
⚡ Inference | Stars: 4.7K | Weekly: 0 | Health: 68 | Language: Python | Updated: 9d ago
78
Vllm Omni
A framework for efficient model inference with omni-modality models
⚡ Inference | Stars: 4.7K | Weekly: 0 | Health: 74 | Language: Python | Updated: 10d ago
79
Llm Twin Course
🤖 Learn for free how to build an end-to-end, production-ready LLM & RAG system using LLMOps best practices: source code + 12 hands-on lessons.
⚡ Inference | Stars: 4.3K | Weekly: 0 | Health: 28 | Language: Python | Updated: 1mo ago
80
LLM RL Visualized
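🌟 100+ original LLM / RL principle diagrams 📚, presented by the author of the book 《大模型算法》 (Large Model Algorithms). 💥 (100+ LLM/RL Algorithm Maps)
⚡ Inference | Stars: 4.3K | Weekly: 0 | Health: 34 | Language: Python | Updated: 12d ago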
81
Spark Nlp
State of the Art Natural Language Processing
⚡ Inference | Stars: 4.1K | Weekly: 0 | Health: 48 | Language: Scala | Updated: 11d ago
82
Lemonade
Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our Discord: https://discord.gg/5xXzkMu8Zk
⚡ Inference | Stars: 3.9K | Weekly: 0 | Health: 69 | Language: C++ | Updated: 10d ago
83
Scikit Llm
Seamlessly integrate LLMs into scikit-learn.
⚡ Inference | Stars: 3.5K | Weekly: 0 | Health: 29 | Language: Python | Updated: 19d ago
84
Optimum
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
⚡ Inference | Stars: 3.4K | Weekly: 0 | Health: 42 | Language: Python | Updated: 14d ago
85
Horizon
📡 Your own AI-powered news radar. Generates daily briefings in English & Chinese.
⚡ Inference | Stars: 3.4K | Weekly: 0 | Health: 54 | Language: Python | Updated: 9d ago
86
Hallucination Leaderboard
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
⚡ Inference | Stars: 3.2K | Weekly: 0 | Health: 32 | Language: Python | Updated: 10d ago
87
Landppt
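An LLM-based presentation generation platform that automatically converts document content into professional PPT presentations. The platform supports multiple AI models and offers a rich selection of templates and styles, so users can create high-quality presentations.
⚡ Inference | Stars: 3.2K | Weekly: 0 | Health: 38 | Language: Python | Updated: 25d ago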
88
Xturing
Build, personalize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our Discord community: https://discord.gg/TgHXuSJEk6
⚡ Inference | Stars: 2.7K | Weekly: 0 | Health: 47 | Language: Python | Updated: 2mo ago
89
Aix DB
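Aix-DB is built on the LangChain/LangGraph framework and combines it with an MCP Skills multi-agent collaboration architecture to deliver end-to-end conversion from natural language to data insights.
⚡ Inference | Stars: 2.1K | Weekly: 0 | Health: 49 | Language: JavaScript | Updated: 1mo ago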
90
Rapid MLX
The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool parsers, prompt cache, reasoning separation, cloud routing. Drop-in OpenAI replacement. Works with Claude Code, Cursor, Aider.
⚡ Inference | Stars: 2.1K | Weekly: 0 | Health: 76 | Language: Python | Updated: 9d ago
91
Lucebox Hub
Lucebox optimization hub: hand-tuned LLM inference, built for specific consumer hardware.
⚡ Inference | Stars: 1.9K | Weekly: 0 | Health: 63 | Language: C++ | Updated: 10d ago
92
Detikzify
Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ.
⚡ Inference | Stars: 1.8K | Weekly: 0 | Health: 18 | Language: Python | Updated: 3mo ago
93
Mindpipe
A powerful model compression framework for LLMs and LVLMs, adapted for NVIDIA GPUs and Huawei Ascend NPUs.
⚡ Inference | Stars: 1.0K | Weekly: 0 | Health: 43 | Language: Python | Updated: 10d ago
94
Llm Internals
Learn LLM internals step by step - from tokenization to attention to inference optimization.
⚡ Inference | Stars: 978 | Weekly: 0 | Health: 21 | Language: - | Updated: 10d ago
95
Vllm Studio
Control panel for vLLM, SGLang, llama.cpp, and ExLlamaV3.
⚡ Inference | Stars: 908 | Weekly: 0 | Health: 45 | Language: TypeScript | Updated: 10d ago