
Canonical source

This page mirrors CHANGELOG.md at the repository root. Edit that file, not this one.

Changelog

All notable changes to Infolake (Truth Atlas).

The format is based on Keep a Changelog, and the project adheres to Semantic Versioning.

[5.2.0] — 2026-04-22

Added

  • src/ layout with one installable distribution (infolake), declared in pyproject.toml.
  • Plugin system under infolake.extensions: Protocol interfaces and an entry-point backed Registry for LLM backends, embedders, fetchers, pipeline stages, projectors, and graph edge sources.
  • infolake.diagnostics package with pynvml-based per-GPU snapshots, psutil-based host runtime snapshots, service probes, OpenTelemetry + Prometheus wiring, and a unified SystemReport model.
  • infolake-doctor CLI (JSON / text / --watch / --gpu-only / --no-gpu).
  • infolake.core.container.Container DI dataclass; get_config() / get_db() lazy accessors with PEP 562 __getattr__ for backward-compatible config / db imports.
  • Full compose.yml with qdrant, vllm (GPU), worker (GPU), backend, frontend, plus an observability profile adding prometheus, grafana, and dcgm-exporter. compose.override.yml for dev (bind-mounted source, hot reload, optional Vite dev server).
  • Multi-stage Dockerfiles under docker/ (CUDA base, vLLM, worker, backend, nginx frontend) with real HTTP healthchecks, GPU device reservations, and bind-mounted state dirs.
  • pyproject.toml-driven tool config: ruff (lint + format), mypy (strict on core / backend), pytest + coverage, deptry. Root .pre-commit-config.yaml with ruff/ruff-format/mypy/detect-secrets hooks. GitHub Actions CI (lint, test, frontend, docker-smoke).
  • Makefile with up, up-obs, down, ps, logs, build, test, lint, fmt, typecheck, check, doctor targets.
  • /api/ready readiness endpoint returning a distilled SystemReport.
  • config/default.json + config/local.example.json; AppConfig extended with diagnostics.telemetry and pipeline.stages sections (both extra="forbid").
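The lazy get_config() accessor with a PEP 562 module-level __getattr__ mentioned above can be sketched as follows. This is a minimal illustration, not the code in infolake.core: the Config stand-in, the legacy_settings module name, and the use of lru_cache as the caching mechanism are all assumptions. A synthetic module is built with types.ModuleType so the example runs as a single file.

```python
import types
from functools import lru_cache


class Config:
    """Hypothetical stand-in for the real configuration object."""

    instances = 0  # counts constructions so laziness is observable

    def __init__(self) -> None:
        Config.instances += 1
        self.name = "default"


@lru_cache(maxsize=1)
def get_config() -> Config:
    """Build the Config on first call and reuse it afterwards."""
    return Config()


def _module_getattr(name: str):
    """PEP 562 hook: called only when normal module attribute lookup fails."""
    if name == "config":
        return get_config()
    raise AttributeError(f"module 'legacy_settings' has no attribute {name!r}")


# Synthetic module standing in for a package __init__ (name is an assumption).
legacy_settings = types.ModuleType("legacy_settings")
legacy_settings.get_config = get_config
legacy_settings.__getattr__ = _module_getattr

# Nothing has been constructed yet: defining the module has no side effects.
assert Config.instances == 0

# Legacy-style attribute access triggers construction exactly once.
first = legacy_settings.config
second = legacy_settings.config
```

In a real package the hook would simply be a function named __getattr__ at the top level of the module, so that an existing `from package import config` keeps working while construction is deferred to first use.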

Changed

  • atlas_core → infolake.core; atlas_backend → infolake.backend; crawling/enrichment/mapping → infolake.pipelines.*; database → infolake.db; scripts → infolake.cli; atlas_core.services → infolake.services. All Python imports were rewritten via a one-shot AST-aware rewriter (only from ... import / import ... statements were modified).
  • Oversized modules converted to facade packages: repositories.py (1 179 LOC) → infolake.core.repositories.{documents,domains,fts,pipeline_runs,regions,stars}; llm_client.py (700 LOC) → infolake.core.llm.{protocol,outlines,instructor,legacy,factory}; mapping/pipeline.py (996 LOC) → infolake.pipelines.mapping.{pipeline,build_graph,spectral,project,color,label}; crawling/orchestrator.py (710 LOC) → infolake.pipelines.crawling.{orchestrator,scheduler,fetcher,writer}.
  • Module-level import side effects removed: config = Config() and db = Database() no longer instantiate at import time; lazy PEP 562 __getattr__ preserves the legacy names.
  • tests/conftest.py rewritten: INFOLAKE_CONFIG_PATH now points at a temporary fixture JSON, so tests no longer monkey-patch Config._instance.
  • scripts/services.py (583 LOC) replaced by infolake-services CLI: orchestration verbs (up, down, ps, logs, build, status) proxy to docker compose; in-container runners (enrich, enrich-loop, crawl, pipeline, smoke) keep their Python-level behaviour.
  • Backend readiness: FastAPI.on_startup wires optional Prometheus + OpenTelemetry via infolake.diagnostics.telemetry.install_all.
  • Environment variables renamed: ATLAS_CONFIG_PATH → INFOLAKE_CONFIG_PATH; ATLAS_LOG_LEVEL → INFOLAKE_LOG_LEVEL; ATLAS_ALEMBIC_DATABASE_URL → INFOLAKE_ALEMBIC_DATABASE_URL. Legacy names are still read as fall-through.
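The fall-through from the renamed environment variables to their legacy ATLAS_* names might look like the sketch below. The read_env helper and its signature are hypothetical, and the precedence shown (the new name wins when both are set) is an assumption inferred from the word "fall-through", not something confirmed by the code base.

```python
import os
from typing import Mapping, Optional

# New name -> legacy name, taken from the rename list above.
RENAMED_ENV_VARS = {
    "INFOLAKE_CONFIG_PATH": "ATLAS_CONFIG_PATH",
    "INFOLAKE_LOG_LEVEL": "ATLAS_LOG_LEVEL",
    "INFOLAKE_ALEMBIC_DATABASE_URL": "ATLAS_ALEMBIC_DATABASE_URL",
}


def read_env(
    name: str,
    default: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the value of `name`, falling back to its legacy name if unset.

    Hypothetical helper: assumes the new name takes precedence.
    """
    env = os.environ if environ is None else environ
    if name in env:
        return env[name]
    legacy = RENAMED_ENV_VARS.get(name)
    if legacy is not None and legacy in env:
        return env[legacy]
    return default


# Example with an explicit mapping instead of the real process environment.
level = read_env("INFOLAKE_LOG_LEVEL", default="INFO",
                 environ={"ATLAS_LOG_LEVEL": "DEBUG"})
```

Passing the environment as a mapping keeps the lookup easy to test without mutating os.environ.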

Removed

  • Root node_modules/ and stray package.json / package-lock.json. The shadcn dev dependency lives under frontend/package.json.
  • Committed runtime artefacts at the repo root (run.log, database_inspection_report.txt, test_seed copy.txt) and stray frontend/{logs,data,checkpoints} directories.
  • docker-compose.yml (superseded by compose.yml + compose.override.yml).

[5.1.x] — Pre-refactor history

Version | Date | Changes
5.1.0 | Apr 2026 | Schema trimmed to v5.1 table/column set, Meilisearch removed, FTS moved to SQLite FTS5, Pydantic+SQLModel boundary stack, Outlines/instructor LLM clients.
5.1.1 | Apr 2026 | Environment/config/services cleanup: single config.json, Alembic two-revision cutover applied, deprecated scripts/modules removed, unified scripts/services.py launcher added, operator docs refreshed.
5.1.2 | Apr 2026 | OutlinesLLMClient migrated to the outlines ≥1.0 API (from_openai / from_transformers + Generator); unblocks enrichment against the installed outlines 1.2.x.