Local AI Agentic Engine for secure, sovereign, offline environments
Krionis delivers multimodal Retrieval-Augmented Generation (RAG) that runs locally. Two parts work together: the Pipeline for instant agents and the Orchestrator for massive local scale.
Quick Start — Orchestrator
Install from PyPI, prepare a shared working directory (config/system.yaml, data/manuals/), and run as a local service.
# 1) Install
pip install krionis-orchestrator
# 2) Prepare working directory
config/system.yaml # system + model config
data/manuals/ # curated manuals / docs / PDFs
# 3) Start the service
krionis-orchestrator start --host 0.0.0.0 --port 8080
# 4) Manage
krionis-orchestrator status
krionis-orchestrator stop
krionis-orchestrator restart
Dev mode: krionis-orchestrator dev --host 127.0.0.1 --port 8080
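Once the service is running, a quick reachability check from Python might look like the sketch below. The root path and response shape are assumptions, not documented Krionis routes; consult the Web UI or API reference of your deployment for the real endpoints.

# Smoke test: confirm the Orchestrator answers on the configured host/port.
# The "/" path is an assumption for illustration only.
import requests

resp = requests.get("http://127.0.0.1:8080/", timeout=5)
print(resp.status_code)  # any HTTP response means the service is reachable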
- Scheduling, batching & micro-batching for multi-user throughput
- Built-in agents
- REST API and minimal Web UI
- Fully offline after setup
High-level Capabilities
Local-first RAG
Embed and index with FAISS/HNSW; retrieve context and generate grounded answers entirely on your machines (see the sketch below).
Multimodal
Ingest PDFs, text, images (OCR), audio, and video; query via one interface.
Flexible Models
Run open models (Qwen, Mistral, LLaMA, etc.) with YAML-controlled precision (fp32/fp16/bf16).
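The local-first retrieval pattern referenced above, sketched with off-the-shelf components. sentence-transformers and FAISS are used here purely for illustration; they are assumptions, not necessarily the libraries Krionis uses internally.

# Illustrative local RAG retrieval: embed a tiny corpus, build an HNSW
# index, and fetch the nearest chunks as grounding context for an LLM.
import faiss
from sentence_transformers import SentenceTransformer

docs = [
    "Prime the pump before first start-up.",
    "Replace the filter cartridge every 500 operating hours.",
    "Error code E12 indicates a blocked intake valve.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")      # runs locally, CPU is fine
vectors = embedder.encode(docs, normalize_embeddings=True)

index = faiss.IndexHNSWFlat(vectors.shape[1], 32)       # HNSW graph index
index.add(vectors)

query = embedder.encode(["How often is the filter changed?"],
                        normalize_embeddings=True)
_, hits = index.search(query, 2)
context = [docs[i] for i in hits[0]]                    # pass to the LLM as grounding
print(context)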
Design Choices (why Krionis feels fast & practical)
Pipeline — instant agent with telemetry
- Immediate utility: one install gives you a ready-to-use agent that searches, retrieves, and answers.
- Observability-first: built-in telemetry for retrieval quality, latency, and model behavior.
- Safe defaults: sensible chunking, indexing, and reranking out of the box.
- Swappable pieces: embeddings, rerankers, and LLMs are pluggable via YAML.
Orchestrator — massive scale, locally
- Throughput by design: batching/micro-batching + queueing for high concurrency on modest hardware (see the sketch after this list)
- Agent runtime: compose multi-step chains (retrieval → drafting → validation) with backpressure.
- Resilience: timeouts, retries, and cancellation primitives built in.
- Interfaces: REST API + minimal Web UI for operators.
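A rough sketch of the micro-batching idea behind the throughput bullet above: requests accumulate in a queue and are flushed to the model either when the batch fills or after a short wait. Names, thresholds, and queue layout are assumptions for illustration, not the Orchestrator's internals.

# Micro-batching sketch: collect requests until the batch is full or a
# short deadline passes, run one batched inference, then resolve each
# caller's future.
import asyncio

MAX_BATCH = 8       # flush when this many requests are queued
MAX_WAIT_S = 0.02   # or after 20 ms, whichever comes first

async def micro_batcher(queue: asyncio.Queue, run_model):
    loop = asyncio.get_running_loop()
    while True:
        first = await queue.get()                     # wait for the first request
        batch = [first]
        deadline = loop.time() + MAX_WAIT_S
        while len(batch) < MAX_BATCH:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(queue.get(), remaining))
            except asyncio.TimeoutError:
                break
        outputs = run_model([req["prompt"] for req in batch])  # one batched pass
        for req, out in zip(batch, outputs):
            req["future"].set_result(out)             # wake each waiting caller

A caller enqueues {"prompt": ..., "future": loop.create_future()} and awaits the future; bounding the queue size is what provides the backpressure mentioned above.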
Architecture (at a glance)
config/system.yaml # system + model configuration
data/manuals/ # curated knowledge base
The Pipeline builds indexes and handles retrieval. The Orchestrator layers scheduling, batching, and agent coordination on top — all local.
Interfaces
- CLI tools for operators
- REST API for integrations (see the sketch below)
- Lightweight Web UI for monitoring and control
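As a taste of what an integration call might look like: the /query route and JSON payload below are hypothetical placeholders, not the documented API; check the running service for its actual routes.

# Hypothetical REST integration call; route and payload are assumptions.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/query",                  # assumed route for illustration
    json={"question": "How do I reset the controller?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())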
After initial model setup, no external calls are required — suitable for air-gapped and regulated environments.
FAQ
Does Krionis require a GPU?
No. It is designed for GPU-poor environments. If present, GPUs accelerate inference; CPUs are sufficient for many use cases.
Can I use my own models?
Yes. Select models and precision in YAML; swap in open-source LLMs and embedding models.
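Under the hood, loading an open model locally at reduced precision follows the standard Hugging Face pattern sketched below; Krionis exposes the equivalent choice through YAML rather than code. The model id is only an example.

# Generic transformers loading pattern, not Krionis configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-1.5B-Instruct"              # any open model works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,                      # fp32 / fp16 / bf16 trade-off
    device_map="auto",                               # GPU if present, else CPU (needs accelerate)
)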
Does it call external APIs?
No. After setup, Krionis runs fully offline.
How do I start building?
Install the Pipeline to get an immediate agent with telemetry, then add the Orchestrator when you need multi-user scale.