# Architecture
The deployment is a single Docker Compose stack: one FastAPI backend, one static-bundled frontend, one persistent worker, and a GPU-bound vLLM server. State is split between SQLite (structural data) and Qdrant (vectors).
## Topology
```mermaid
flowchart LR
    subgraph host [Host, GPUs 0..4]
        subgraph compose [docker compose]
            frontend[frontend<br/>nginx :8080]
            backend[backend<br/>FastAPI :8000]
            worker[worker<br/>enrich-loop]
            vllm["vllm<br/>OpenAI API :8765"]
            qdrant[qdrant :6333]
            prom[prometheus<br/>profile=observability]
            dcgm[dcgm-exporter<br/>profile=observability]
            graf[grafana :3000<br/>profile=observability]
        end
        gpus[["NVIDIA GPUs"]]
    end
    frontend -->|/api/*| backend
    backend -->|HTTP| qdrant
    backend -->|HTTP| vllm
    worker -->|HTTP| vllm
    worker -->|HTTP| qdrant
    worker -->|sqlite| vol[("./data bind mount")]
    backend --> vol
    vllm -->|"nvidia runtime"| gpus
    dcgm -->|NVML| gpus
    prom --> backend
    prom --> dcgm
    graf --> prom
```
## Services at a glance
| Service | Container image | What it does |
|---|---|---|
| `frontend` | `docker/frontend.Dockerfile` (nginx) | Serves the deck.gl atlas bundle and proxies `/api/*` to `backend`. |
| `backend` | `docker/backend.Dockerfile` | FastAPI app exposing map/data/route/candidate endpoints + `/api/metrics`. |
| `worker` | `docker/backend.Dockerfile` | Runs `infolake-services enrich-loop`; writes summaries, claims, embeddings. |
| `vllm` | `docker/vllm.Dockerfile` | OpenAI-compatible LLM endpoint; consumes the `models/` bind mount. |
| `qdrant` | `qdrant/qdrant` | Vector DB for documents, passages, claims. |
| `prometheus` | official | Scrapes `backend` + `dcgm-exporter` (observability profile only). |
| `grafana` | official | Pre-provisioned Infolake dashboard (observability profile only). |
| `dcgm-exporter` | NVIDIA | Per-GPU utilisation, VRAM, temperature (observability profile only). |
## Data stores
- **SQLite + FTS5** — `data/infolake.db`. Everything structural: documents, domains, passages, claims, regions, pipeline runs, stars. Forward-only Alembic migrations.
- **Qdrant** — three collections: `documents`, `passages`, `claims`. Vectors only; row IDs refer back to SQLite.
- **Filesystem artefacts** — `data/mapping/` holds eigenvectors, UMAP coordinates, and region labels consumed by `/api/map-data`; `checkpoints/<session_id>.json` holds resumable crawl state.
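Because Qdrant stores vectors only, search results must be hydrated from SQLite by row ID. A minimal sketch, assuming a `documents(id, title)` table (the real schema is whatever the Alembic migrations define):

```python
import sqlite3


def hydrate(db_path: str, ids: list[int]) -> dict[int, str]:
    """Fetch titles for the row IDs a Qdrant search returned.

    Table and column names here are illustrative assumptions.
    """
    if not ids:
        return {}
    conn = sqlite3.connect(db_path)
    try:
        qmarks = ",".join("?" * len(ids))
        rows = conn.execute(
            f"SELECT id, title FROM documents WHERE id IN ({qmarks})", ids
        ).fetchall()
    finally:
        conn.close()
    return dict(rows)
```

The design keeps one source of truth: vectors can be rebuilt from SQLite at any time, while SQLite rows never depend on Qdrant.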
## Extensibility seams
Every cross-cutting capability is a `Protocol` discovered through `importlib.metadata.entry_points`. Third-party packages register implementations via `pyproject.toml`; Infolake picks them up without code changes.
| Entry-point group | Protocol | Default implementation |
|---|---|---|
| `infolake.llm_backends` | `LLMBackend` | `OutlinesLLMClient` |
| `infolake.embedders` | `Embedder` | `TextEmbeddingClient` (BGE-small) |
| `infolake.fetchers` | `Fetcher` | `Crawl4AIClient` |
| `infolake.pipeline_stages` | `PipelineStage` | `CrawlStage`, `EnrichDocumentsStage`, … |
| `infolake.projections` | `Projector` | `UMAPProjector` |
| `infolake.graph_edges` | `GraphEdgeSource` | `citations`, `co_citations`, `semantic`, `claim_overlap`, `domain_links` |
Example registration in another package:
```toml
[project.entry-points."infolake.llm_backends"]
vertex_ai = "my_pkg.llm:VertexAIBackend"
```
Then set `llm.constrained_decoding = "vertex_ai"` in `config.json` — no edits to Infolake required.
## Configuration
One Pydantic-validated JSON file: `config/default.json`. Every key is declared in `src/infolake/core/schemas.py` with `model_config = ConfigDict(extra="forbid")`, so a typo or stale key fails loudly at boot. See the full README for the complete knob table.
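The fail-loudly behaviour falls out of Pydantic's `extra="forbid"`. A toy sketch of the mechanism (this model is illustrative, not the real schema in `src/infolake/core/schemas.py`):

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LLMConfig(BaseModel):
    # extra="forbid" makes unknown keys a hard validation error.
    model_config = ConfigDict(extra="forbid")
    constrained_decoding: str


LLMConfig.model_validate({"constrained_decoding": "vertex_ai"})  # accepted

try:
    # A misspelled key is rejected instead of being silently ignored.
    LLMConfig.model_validate({"constraind_decoding": "vertex_ai"})
except ValidationError as e:
    print("rejected:", e.errors()[0]["type"])  # rejected: extra_forbidden
```

The alternative default (`extra="ignore"`) would let a stale key sit in `config.json` doing nothing, which is exactly the failure mode this design avoids.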
## Observability
- `infolake-doctor --json` — config + DB + GPU (pynvml) + runtime (psutil) + service probes as one `SystemReport`.
- `/api/metrics` — Prometheus endpoint (when `diagnostics.telemetry.enabled = true`).
- `make up-obs` — start Prometheus (:9090), Grafana (:3000) with the pre-provisioned dashboard, and `dcgm-exporter` (:9400).
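Since `/api/metrics` speaks the standard Prometheus text exposition format, samples can be inspected without a Prometheus server. A minimal sketch (the metric name below is hypothetical, and this ignores `# HELP`/`# TYPE` comment lines):

```python
def parse_sample(line: str) -> tuple[str, float]:
    """Split one exposition line into (metric name + labels, value)."""
    name, value = line.rsplit(" ", 1)
    return name, float(value)


print(parse_sample("infolake_documents_total 1234"))
# -> ('infolake_documents_total', 1234.0)
```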
## The development loop
```bash
make lint       # ruff check
make fmt        # ruff format + autofix
make typecheck  # mypy on core + backend
make check      # lint + typecheck + test
make precommit  # run every pre-commit hook against all files
```
CI mirrors this in `.github/workflows/ci.yml`: `lint-python`, `test-python`, `frontend`, and a non-GPU `docker-smoke` build.