The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
Updated Feb 11, 2026 · Python
Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application (see the patching sketch after this list)
oneAPI Data Analytics Library (oneDAL)
The easiest way to use Machine Learning. Mix and match underlying ML libraries and data set sources. Generate new datasets or modify existing ones with ease.
Cross-platform C++ SDK and model hub for AI inference. Ready-to-deploy models include Segment Anything 3, Depth Anything 2, and Gemma.
High-Performance AI-Native Web Server — built in C & Assembly for ultra-fast AI inference and streaming.
Client library to interact with various APIs used within Philips in a simple and uniform way
GPU-aware inference mesh for large-scale AI serving
Unity TTS plugin: Piper neural synthesis + OpenJTalk Japanese + Unity AI Inference Engine. Windows/Mac/Linux/Android/iOS ready. High-quality voices for games & apps.
MOTO - Autonomous ASI Deep Research Harness by Intrafere: a creative, novelty-seeking mathematics researcher for STEM users. Press start once and it runs for days at a time with no interaction needed. MOTO runs multiple agents in parallel against a local LM Studio host, OpenRouter, or both. Star and follow the repo; there's more to come soon.
A development framework for Fully Homomorphic Encryption (FHE)
Customized version of Google's tflite-micro
A powerful, fast, scalable full-stack boilerplate for AI inference using Node.js, Python, Redis, and Docker
KaiROS AI: Intelligence, Precisely When It Matters.
Apache 2.0-licensed open source operations stack for private AI inference with open models. Run LLMs (7B-70B) locally with vLLM, OpenAI-compatible API, web dashboard, chat UI, admin panel, and hardware monitoring (see the client sketch after this list).
Open-source developer tool for testing deAPI.ai endpoints — unified AI inference API for image, video, audio, transcription, OCR and more
A personal demo project for Flutter + ONNX Runtime integration. Not related to any company work. A comprehensive on-device face recognition SDK for Flutter
Dockerized Yawcam-AI, Edge-ready AI NVR with CPU and CUDA builds, RTSP support, persistent storage, YOLO inference, and EdgePulse optimization.
Arbitrary Numbers
Distributed Inference Key-Value Cache in a Cloud Setting
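The Extension for Scikit-learn entry above accelerates existing scikit-learn code by patching it at import time. A minimal sketch of that pattern, assuming scikit-learn-intelex is installed; the estimator and the synthetic dataset are illustrative choices, not part of the listed project:

```python
# Minimal sketch: accelerating scikit-learn with Extension for Scikit-learn
# (scikit-learn-intelex). Assumes `pip install scikit-learn-intelex`.
from sklearnex import patch_sklearn

# The patch must run before scikit-learn estimators are imported,
# so that subsequent imports resolve to the accelerated implementations.
patch_sklearn()

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative synthetic data; an existing scikit-learn workflow would run unchanged.
X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))
```

Only the import order matters: call patch_sklearn() before importing the estimators you want accelerated.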
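Several entries above, including the vLLM-based private-inference stack, expose an OpenAI-compatible HTTP API. A minimal client sketch follows; the base URL, API key, and model name are assumptions for illustration and should be replaced with whatever your server was actually started with:

```python
# Minimal sketch of calling an OpenAI-compatible chat endpoint, such as the one
# served by a local vLLM instance. BASE_URL and MODEL are assumed values.
import requests

BASE_URL = "http://localhost:8000/v1"            # assumed local server address
MODEL = "meta-llama/Llama-3.1-8B-Instruct"       # assumed model identifier

response = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": "Bearer not-needed-for-local"},  # many local servers ignore the key
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "Summarize what AI inference serving means."}],
        "max_tokens": 128,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Because the endpoint mirrors the OpenAI REST schema, an OpenAI-style client library can generally be pointed at the same base URL instead of hand-rolling the request.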