Pure Rust LLM inference engine with 1.58-bit ternary support and Test-Time Training
$ winget install --id imonoonoko.BitLlama --exact --version 1.0.0
Run in Command Prompt, PowerShell, or Windows Terminal. Prompts for any agreements.
BitLlama is a Pure Rust LLM inference engine featuring 1.58-bit ternary quantization,
Test-Time Training (TTT), Soul learning system, MCP server/client, and private RAG.
Supports Llama, Gemma, Mistral, Qwen, and BitNet models.
OpenAI-compatible API server included.
| Architecture | Type | Scope | Install | Download |
|---|---|---|---|---|
| x64 | Portable | - | | Direct |
The Install column provides a command tailored to each row's architecture, type, and scope, which is useful when winget would otherwise pick a different default.
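The tailored command for the x64 portable row can be sketched as follows. The package ID and version are taken from this listing, and `--architecture` and `--installer-type` are standard `winget install` options. The exact command the page generates is not shown here, so treat this as an assumption. `build_winget_cmd` is a hypothetical helper name; the script only prints the command and does not run winget. No `--scope` flag is added because the table lists no scope for this installer.

```shell
#!/bin/sh
# Assemble (but do not run) a winget command pinned to one installer variant.
# Package ID and version come from the listing above; build_winget_cmd is a
# hypothetical helper used only for this sketch.
build_winget_cmd() {
  arch="$1"            # architecture from the table, e.g. x64
  installer_type="$2"  # installer type from the table, e.g. portable
  printf 'winget install --id imonoonoko.BitLlama --exact --version 1.0.0 --architecture %s --installer-type %s\n' \
    "$arch" "$installer_type"
}

# Print the command for the single row in the table (x64, portable).
build_winget_cmd x64 portable
```

Paste the printed line into Command Prompt, PowerShell, or Windows Terminal to run the install.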
No known CVEs for BitLlama.
Coverage is best-effort and depends on a winget package mapping to an NVD CPE entry. Absence here is not a guarantee of safety.