llama.cpp
The industry-standard C/C++ inference engine for high-performance, local LLM execution across a wide range of hardware backends (CPU, CUDA, Metal, Vulkan, and more).
Best for: LLM Inference Engine
Has API
Pricing: Free
Quantized LLM Inference
Model Fine-tuning (LoRA)
Text Embeddings
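As a minimal sketch of the API capability listed above: llama.cpp ships an HTTP server (`llama-server`) that exposes an OpenAI-compatible `/v1/chat/completions` endpoint. The snippet below builds a request for it using only the Python standard library; the port, model name, and prompt are placeholder assumptions, and actually sending the request requires a locally running `llama-server` instance.

```python
import json
import urllib.request

# Assumption: a server was started locally with something like
#   llama-server -m model.gguf --port 8080
# and is reachable at this placeholder URL.
URL = "http://localhost:8080/v1/chat/completions"

# OpenAI-compatible chat payload; llama-server serves the single
# loaded model regardless of the "model" field's value.
payload = {
    "model": "local-model",  # placeholder name
    "messages": [
        {"role": "user", "content": "Explain quantization in one sentence."}
    ],
    "max_tokens": 64,
}

def send(url: str, body: dict) -> dict:
    """POST the chat request; only works against a running llama-server."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Print the request body instead of sending it, so the sketch
# runs without a live server.
print(json.dumps(payload, indent=2))
```

Because the endpoint follows the OpenAI chat schema, existing OpenAI client libraries can usually be pointed at a local `llama-server` by overriding the base URL.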