Local AI Agentic Engine for secure, sovereign, offline environments
Krionis delivers multimodal Retrieval-Augmented Generation (RAG) that runs locally. Two parts work together: the Pipeline for instant agents and the Orchestrator for massive local scale.
Quick Start — Orchestrator
Install from PyPI, prepare a shared working directory (config/system.yaml, data/manuals/), and run as a local service.
# 1) Install
pip install krionis-orchestrator
# 2) Prepare working directory
config/system.yaml # system + model config
data/manuals/ # curated manuals / docs / PDFs
# 3) Start the service
krionis-orchestrator start --host 0.0.0.0 --port 8080
# 4) Manage
krionis-orchestrator status
krionis-orchestrator stop
krionis-orchestrator restart
Dev mode: krionis-orchestrator dev --host 127.0.0.1 --port 8080
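Once the service is running, a quick reachability check from Python might look like the sketch below. The root path and response shape are assumptions, not documented Krionis routes; consult the Web UI or API reference of your deployment for the real endpoints.

# Smoke test: confirm the Orchestrator answers on the configured host/port.
# The "/" path is an assumption for illustration only.
import requests

resp = requests.get("http://127.0.0.1:8080/", timeout=5)
print(resp.status_code)  # any HTTP response means the service is reachable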
- Scheduling, batching & micro-batching for multi-user throughput
- Built-in agents
- REST API and minimal Web UI
- Fully offline after setup
High-level Capabilities
Local-first RAG
Embed and index with FAISS/HNSW; retrieve context and generate grounded answers entirely on your machines (see the sketch below).
Multimodal
Ingest PDFs, text, images (OCR), audio, and video; query via one interface.
Flexible Models
Run open models (Qwen, Mistral, LLaMA, etc.) with YAML-controlled precision (fp32/fp16/bf16).
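The local-first retrieval pattern referenced above, sketched with off-the-shelf components. sentence-transformers and FAISS are used here purely for illustration; they are assumptions, not necessarily the libraries Krionis uses internally.

# Illustrative local RAG retrieval: embed a tiny corpus, build an HNSW
# index, and fetch the nearest chunks as grounding context for an LLM.
import faiss
from sentence_transformers import SentenceTransformer

docs = [
    "Prime the pump before first start-up.",
    "Replace the filter cartridge every 500 operating hours.",
    "Error code E12 indicates a blocked intake valve.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")      # runs locally, CPU is fine
vectors = embedder.encode(docs, normalize_embeddings=True)

index = faiss.IndexHNSWFlat(vectors.shape[1], 32)       # HNSW graph index
index.add(vectors)

query = embedder.encode(["How often is the filter changed?"],
                        normalize_embeddings=True)
_, hits = index.search(query, 2)
context = [docs[i] for i in hits[0]]                    # pass to the LLM as grounding
print(context)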
Design Choices (why Krionis feels fast & practical)
Pipeline — instant agent with telemetry
- Immediate utility: one install gives you a ready-to-use agent that searches, retrieves, and answers.
- Observability-first: built-in telemetry for retrieval quality, latency, and model behavior.
- Safe defaults: sensible chunking, indexing, and reranking out of the box.
- Swappable pieces: embeddings, rerankers, and LLMs are pluggable via YAML.
Orchestrator — massive scale, locally
- Throughput by design: batching/micro-batching + queueing for high concurrency on modest hardware (see the sketch after this list)
- Agent runtime: compose multi-step chains (retrieval → drafting → validation) with backpressure.
- Resilience: timeouts, retries, and cancellation primitives built in.
- Interfaces: REST API + minimal Web UI for operators.
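A rough sketch of the micro-batching idea behind the throughput bullet above: requests accumulate in a queue and are flushed to the model either when the batch fills or after a short wait. Names, thresholds, and queue layout are assumptions for illustration, not the Orchestrator's internals.

# Micro-batching sketch: collect requests until the batch is full or a
# short deadline passes, run one batched inference, then resolve each
# caller's future.
import asyncio

MAX_BATCH = 8       # flush when this many requests are queued
MAX_WAIT_S = 0.02   # or after 20 ms, whichever comes first

async def micro_batcher(queue: asyncio.Queue, run_model):
    loop = asyncio.get_running_loop()
    while True:
        first = await queue.get()                     # wait for the first request
        batch = [first]
        deadline = loop.time() + MAX_WAIT_S
        while len(batch) < MAX_BATCH:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(queue.get(), remaining))
            except asyncio.TimeoutError:
                break
        outputs = run_model([req["prompt"] for req in batch])  # one batched pass
        for req, out in zip(batch, outputs):
            req["future"].set_result(out)             # wake each waiting caller

A caller enqueues {"prompt": ..., "future": loop.create_future()} and awaits the future; bounding the queue size is what provides the backpressure mentioned above.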
Architecture (at a glance)
config/system.yaml # system + model configuration
data/manuals/ # curated knowledge base
The Pipeline builds indexes and handles retrieval. The Orchestrator layers scheduling, batching, and agent coordination on top — all local.
Interfaces
- CLI tools for operators
- REST API for integrations (see the sketch below)
- Lightweight Web UI for monitoring and control
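As a taste of what an integration call might look like: the /query route and JSON payload below are hypothetical placeholders, not the documented API; check the running service for its actual routes.

# Hypothetical REST integration call; route and payload are assumptions.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/query",                  # assumed route for illustration
    json={"question": "How do I reset the controller?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())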
After initial model setup, no external calls are required — suitable for air-gapped and regulated environments.
FAQ
Does Krionis require a GPU?
No. It is designed for GPU-poor environments. If present, GPUs accelerate inference; CPUs are sufficient for many use cases.
Can I use my own models?
Yes. Select models and precision in YAML; swap in open-source LLMs and embedding models.
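Under the hood, loading an open model locally at reduced precision follows the standard Hugging Face pattern sketched below; Krionis exposes the equivalent choice through YAML rather than code. The model id is only an example.

# Generic transformers loading pattern, not Krionis configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-1.5B-Instruct"              # any open model works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,                      # fp32 / fp16 / bf16 trade-off
    device_map="auto",                               # GPU if present, else CPU (needs accelerate)
)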
Does it call external APIs?
No. After setup, Krionis runs fully offline.
How do I start building?
Install the Pipeline to get an immediate agent with telemetry, then add the Orchestrator when you need multi-user scale.