
# Architecture

The deployed shape is a single Docker Compose stack: one FastAPI backend, one statically bundled frontend, one persistent worker, and a GPU-bound vLLM server. State is split between SQLite (structural data) and Qdrant (vectors).

## Topology

```mermaid
flowchart LR
  subgraph host [Host, GPUs 0..4]
    subgraph compose [docker compose]
      frontend[frontend<br/>nginx :8080]
      backend[backend<br/>FastAPI :8000]
      worker[worker<br/>enrich-loop]
      vllm["vllm<br/>OpenAI API :8765"]
      qdrant[qdrant :6333]
      prom[prometheus<br/>profile=observability]
      dcgm[dcgm-exporter<br/>profile=observability]
      graf[grafana :3000<br/>profile=observability]
    end
    gpus[["NVIDIA GPUs"]]
  end
  frontend -->|/api/*| backend
  backend  -->|HTTP| qdrant
  backend  -->|HTTP| vllm
  worker   -->|HTTP| vllm
  worker   -->|HTTP| qdrant
  worker   -->|sqlite| vol[("./data bind mount")]
  backend  --> vol
  vllm     -->|"nvidia runtime"| gpus
  dcgm     -->|NVML| gpus
  prom     --> backend
  prom     --> dcgm
  graf     --> prom
```

## Services at a glance

| Service | Container image | What it does |
|---|---|---|
| frontend | docker/frontend.Dockerfile (nginx) | Serves the deck.gl atlas bundle and proxies /api/* to the backend. |
| backend | docker/backend.Dockerfile | FastAPI app exposing map/data/route/candidate endpoints + /api/metrics. |
| worker | docker/backend.Dockerfile | Runs infolake-services enrich-loop; writes summaries, claims, embeddings. |
| vllm | docker/vllm.Dockerfile | OpenAI-compatible LLM endpoint; consumes the models/ bind mount. |
| qdrant | qdrant/qdrant | Vector DB for documents, passages, claims. |
| prometheus | official | Scrapes backend + dcgm-exporter (observability profile only). |
| grafana | official | Pre-provisioned Infolake dashboard (observability profile only). |
| dcgm-exporter | NVIDIA | Per-GPU utilisation, VRAM, temperature (observability profile only). |

## Data stores

  • SQLite + FTS5 — data/infolake.db. Everything structural: documents, domains, passages, claims, regions, pipeline runs, stars. Forward-only Alembic migrations.
  • Qdrant — three collections: documents, passages, claims. Vectors only; row IDs refer back to SQLite.
  • Filesystem artefacts — data/mapping/ holds eigenvectors, UMAP coordinates, and region labels consumed by /api/map-data; checkpoints/<session_id>.json holds resumable crawl state.
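A minimal sketch of the SQLite/Qdrant split described above: the authoritative row lives in SQLite (with an FTS5 virtual table for full-text search), and a vector store would hold only the embedding keyed by the same row ID. Table and column names here are illustrative, not Infolake's real schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Structural row: the authoritative record lives in SQLite.
con.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, url TEXT, title TEXT)")
# FTS5 virtual table indexes the text for full-text search.
con.execute("CREATE VIRTUAL TABLE documents_fts USING fts5(title)")

doc_id = con.execute(
    "INSERT INTO documents (url, title) VALUES (?, ?)",
    ("https://example.org", "Eigenvector maps of the web"),
).lastrowid
con.execute(
    "INSERT INTO documents_fts (rowid, title) VALUES (?, ?)",
    (doc_id, "Eigenvector maps of the web"),
)

# Full-text lookup returns the structural row ID...
hit = con.execute(
    "SELECT rowid FROM documents_fts WHERE documents_fts MATCH 'eigenvector'"
).fetchone()
print(hit[0] == doc_id)  # True

# ...and that same ID would key the vector in Qdrant, along the lines of:
#   client.upsert("documents", points=[PointStruct(id=doc_id, vector=emb)])
```

Keeping vectors out of SQLite means either store can be rebuilt independently: re-embedding only rewrites Qdrant points, while the structural rows stay put.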

## Extensibility seams

Every cross-cutting capability is a Protocol discovered through importlib.metadata.entry_points. Third-party packages register implementations via pyproject.toml; Infolake picks them up without code changes.

| Entry-point group | Protocol | Default implementation |
|---|---|---|
| infolake.llm_backends | LLMBackend | OutlinesLLMClient |
| infolake.embedders | Embedder | TextEmbeddingClient (BGE-small) |
| infolake.fetchers | Fetcher | Crawl4AIClient |
| infolake.pipeline_stages | PipelineStage | CrawlStage, EnrichDocumentsStage, … |
| infolake.projections | Projector | UMAPProjector |
| infolake.graph_edges | GraphEdgeSource | citations, co_citations, semantic, claim_overlap, domain_links |

Example registration in another package:

```toml
[project.entry-points."infolake.llm_backends"]
vertex_ai = "my_pkg.llm:VertexAIBackend"
```

Then set llm.constrained_decoding = "vertex_ai" in config.json — no edits to Infolake required.

## Configuration

One Pydantic-validated JSON file: config/default.json. Every key is declared in src/infolake/core/schemas.py with model_config = ConfigDict(extra="forbid"), so a typo or stale key fails loudly at boot. See the full README for the complete knob table.
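The fail-loudly behaviour comes from Pydantic's extra="forbid" setting. A minimal sketch (LLMConfig and its field are illustrative, not Infolake's real schema) of how an unknown key becomes a hard error instead of being silently dropped:

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class LLMConfig(BaseModel):
    # extra="forbid" rejects any key not declared on the model,
    # so typos and stale keys surface at load time.
    model_config = ConfigDict(extra="forbid")
    constrained_decoding: str = "outlines"

# A valid payload parses normally.
ok = LLMConfig.model_validate({"constrained_decoding": "vertex_ai"})
print(ok.constrained_decoding)  # vertex_ai

# A typo ("constrained_decodng") fails loudly at boot.
try:
    LLMConfig.model_validate({"constrained_decodng": "outlines"})
except ValidationError as exc:
    print("rejected:", exc.error_count(), "error(s)")
```

Without extra="forbid", Pydantic's default ("ignore") would discard the misspelled key and the model would silently fall back to its default value.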

## Observability

  • infolake-doctor --json — config + DB + GPU (pynvml) + runtime (psutil) + service probes as one SystemReport.
  • /api/metrics — Prometheus endpoint (when diagnostics.telemetry.enabled = true).
  • make up-obs — start Prometheus (:9090), Grafana (:3000) with the pre-provisioned dashboard, and dcgm-exporter (:9400).

## The development loop

```bash
make lint      # ruff check
make fmt       # ruff format + autofix
make typecheck # mypy on core + backend
make check     # lint + typecheck + test
make precommit # run every pre-commit hook against all files
```

CI mirrors this in .github/workflows/ci.yml: lint-python, test-python, frontend, and a non-GPU docker-smoke build.