EXPLORE
RunanywhereAI/runanywhere-sdks (mirror)
Production-ready toolkit to run AI locally
Topics: android, apple-intelligence, cpp, diffusion-models, edge
C++ · 10385 stars · 0 forks
triton-inference-server/server (mirror)
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Topics: cloud, datacenter, deep-learning, edge, gpu
Python · 10467 stars · 0 forks
huggingface/text-generation-inference (mirror)
Large Language Model Text Generation Inference
Topics: bloom, deep-learning, falcon, gpt, inference
Python · 10809 stars · 0 forks
aws/amazon-sagemaker-examples (mirror)
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
Topics: aws, data-science, deep-learning, examples, inference
Jupyter Notebook · 10896 stars · 0 forks
google-ai-edge/mediapipe (mirror)
Cross-platform, customizable ML solutions for live and streaming media.
Topics: android, audio-processing, calculator, computer-vision, c-plus-plus
C++ · 0 stars · 0 forks
hpcaitech/ColossalAI (mirror)
Making large AI models cheaper, faster, and more accessible
Topics: ai, big-model, data-parallelism, deep-learning, distributed-computing
Python · 0 stars · 0 forks
deepspeedai/DeepSpeed (mirror)
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Topics: billion-parameters, compression, data-parallelism, deep-learning, gpu
Python · 0 stars · 0 forks
ggml-org/whisper.cpp (mirror)
Port of OpenAI's Whisper model in C/C++
Topics: inference, openai, speech-recognition, speech-to-text, transformer
C++ · 0 stars · 0 forks
vllm-project/vllm (mirror)
A high-throughput and memory-efficient inference and serving engine for LLMs
Topics: amd, blackwell, cuda, deepseek, deepseek-v3
Python · 0 stars · 0 forks