# Architecture: My Independent AI

## Overview
My Independent AI is a modular, privacy-first RAG (Retrieval-Augmented Generation) system. It lets you ask natural language questions over your own personal data — emails, documents, chats — entirely on your own hardware, with no data sent to external AI providers.
## Core Principles
- Privacy First — PII is scrubbed locally before data is stored. A local SQLite mapping table allows de-anonymization only at read time.
- Fully Local by Default — LLMs and embeddings run via Ollama. The vector database is Qdrant running on your machine.
- Modular Importers — Each data source (Gmail, WhatsApp, Dropbox, etc.) is an independent plugin following the `BaseImporter` contract.
- Cloud-Optional — GCP (Cloud Run, GCS) is supported for running importers on a schedule, but is not required.
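The plugin contract can be pictured as a small abstract base class. This is a minimal sketch, not the repo's actual `BaseImporter`; the method names (`fetch_records`, `run`, `source_name`) and the injected `scrub`/`upsert` callables are illustrative assumptions:

```python
from abc import ABC, abstractmethod
from typing import Callable, Iterable


class BaseImporter(ABC):
    """Sketch of an importer plugin contract (names are illustrative)."""

    #: identifier used to tag records from this source
    source_name: str

    @abstractmethod
    def fetch_records(self) -> Iterable[str]:
        """Yield raw text records from the data source."""

    def run(self, scrub: Callable[[str], str],
            upsert: Callable[[str, str], None]) -> int:
        """Scrub PII from each record, upsert it, and return the count."""
        count = 0
        for record in self.fetch_records():
            upsert(self.source_name, scrub(record))
            count += 1
        return count


# Hypothetical importer, purely for illustration
class MemoImporter(BaseImporter):
    source_name = "memo"

    def fetch_records(self):
        yield "Call Alice about the invoice"


stored = []
n = MemoImporter().run(
    scrub=lambda text: text.replace("Alice", "<PERSON_a1b2c3>"),
    upsert=lambda source, text: stored.append((source, text)),
)
```

The key design point is that scrubbing happens inside `run`, before anything reaches storage, so no plugin can accidentally persist raw PII.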
## Repository Structure
/
├── apps/
│ ├── admin-dashboard/ # Streamlit UI — chat interface
│ ├── importers/ # Data ingestion plugins
│ ├── orchestrator/ # Ingestion pipeline runner
│ └── qdrant-sidecar/ # GCP VM sidecar for cloud Qdrant
├── libs/
│ ├── embedding/ # Ollama-based embedding client
│ ├── vector-storage/ # Qdrant client wrapper
│ ├── privacy-core/ # PII scrubbing (Presidio) + SQLite mapping
│ ├── content-processing/ # PDF/text extraction helpers
│ └── db-sync/ # SQLite + Litestream utilities
├── infrastructure/
│ └── gcp/ # Terraform for Cloud Run, GCS, VM (optional)
├── scripts/ # Operational helpers (deploy, maintenance)
├── docker-compose.yml # Local full-stack (Ollama + Qdrant + Dashboard)
└── docs/ # Architecture, ADRs, contributing guide
## Component Diagram
┌─────────────────────────────────────────────────────────┐
│ User Browser │
└───────────────────┬─────────────────────────────────────┘
│ http://localhost:8501
┌───────────────────▼─────────────────────────────────────┐
│ Admin Dashboard (Streamlit) │
│ • Chat UI with streaming responses │
│ • Multi-turn conversation memory │
│ • De-anonymization via privacy-core │
└──────┬────────────────────────────────┬─────────────────┘
│ embed query │ chat completion
┌──────▼──────┐ ┌─────────▼───────┐
│ Ollama │ │ Ollama │
│ nomic-embed │ │ llama3.2 │
│ :11434 │ │ :11434 │
└─────────────┘ └─────────────────┘
│ vector
┌──────▼──────────┐
│ Qdrant │
│ personal_data │
│ collection │
│ :6333 │
└─────────────────┘
▲
│ upsert embeddings
┌──────┴──────────────────────────────────────────────────┐
│ Importers (run on schedule or manually) │
│ gmail · whatsapp · synology-nas · dropbox · calendar │
└─────────────────────────────────────────────────────────┘
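The query path in the diagram can be sketched against Ollama's and Qdrant's documented REST endpoints using only the standard library. The model names, ports, and `personal_data` collection come from the diagram; the payload field name `"text"` and the prompt wording are assumptions, not the repo's actual code:

```python
import json
import urllib.request

OLLAMA = "http://localhost:11434"
QDRANT = "http://localhost:6333"


def _post(url: str, payload: dict) -> dict:
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def embed(question: str) -> list:
    # Ollama embeddings endpoint
    return _post(f"{OLLAMA}/api/embeddings",
                 {"model": "nomic-embed-text", "prompt": question})["embedding"]


def retrieve(vector: list, k: int = 5) -> list:
    # Qdrant points search over the personal_data collection
    hits = _post(
        f"{QDRANT}/collections/personal_data/points/search",
        {"vector": vector, "limit": k, "with_payload": True},
    )["result"]
    return [hit["payload"].get("text", "") for hit in hits]


def build_prompt(question: str, chunks: list) -> str:
    """Pure helper: ground the question in the retrieved context."""
    context = "\n---\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    q = "When did I last email Alice?"
    chunks = retrieve(embed(q))
    answer = _post(f"{OLLAMA}/api/chat", {
        "model": "llama3.2",
        "messages": [{"role": "user", "content": build_prompt(q, chunks)}],
        "stream": False,
    })["message"]["content"]
    print(answer)
```

Running the `__main__` block requires the local Ollama and Qdrant services from the diagram to be up.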
## Data Flow

- Import — An importer fetches data from a source (Gmail API, WhatsApp export, etc.), scrubs PII using `privacy-core`, and upserts embeddings into Qdrant via `libs/embedding` + `libs/vector-storage`.
- Query — The dashboard embeds the user's question with `nomic-embed-text`, retrieves the top-k relevant chunks from Qdrant, then passes those chunks + conversation history to `llama3.2` to generate a grounded answer.
- De-anonymize — The response is passed through `privacy-core`'s `MappingDB` to replace anonymized entity IDs (e.g. `<PERSON_a1b2c3>`) with the original names before display.
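The de-anonymization step relies on the local SQLite mapping table mentioned under Core Principles. A minimal sketch of the idea, assuming an in-memory table and a `token → original` schema (the real `MappingDB` schema and API may differ, and scrubbing itself is done by Presidio, not shown here):

```python
import re
import sqlite3

# In-memory stand-in for the local SQLite mapping table (schema is illustrative)
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE mapping (token TEXT PRIMARY KEY, original TEXT)")
db.execute("INSERT INTO mapping VALUES ('<PERSON_a1b2c3>', 'Alice Smith')")


def deanonymize(text: str) -> str:
    """Replace anonymized entity tokens like <PERSON_a1b2c3> with originals."""
    def lookup(match: re.Match) -> str:
        row = db.execute(
            "SELECT original FROM mapping WHERE token = ?", (match.group(0),)
        ).fetchone()
        return row[0] if row else match.group(0)  # unknown tokens pass through

    return re.sub(r"<[A-Z_]+_[0-9a-f]+>", lookup, text)


answer = deanonymize("You emailed <PERSON_a1b2c3> on Tuesday.")
```

Because the mapping table never leaves the machine, the LLM and vector store only ever see the anonymized tokens.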
## Local Stack (Docker Compose)

| Service | Image | Role |
|---|---|---|
| `ollama` | `ollama/ollama` | LLM + embedding inference |
| `ollama-init` | `ollama/ollama` | One-shot model pull (`llama3.2`, `nomic-embed-text`) |
| `qdrant` | `qdrant/qdrant` | Vector database |
| `dashboard` | Built from `apps/admin-dashboard/Dockerfile` | Web UI |
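The repo's `docker-compose.yml` is authoritative; purely as an orientation aid, the table above might wire together roughly like this (ports from the component diagram; volumes, healthchecks, and the init entrypoint are assumptions):

```yaml
services:
  ollama:
    image: ollama/ollama
    ports: ["11434:11434"]
  ollama-init:
    image: ollama/ollama
    depends_on: [ollama]
    environment:
      - OLLAMA_HOST=http://ollama:11434
    entrypoint: >
      sh -c "ollama pull llama3.2 && ollama pull nomic-embed-text"
  qdrant:
    image: qdrant/qdrant
    ports: ["6333:6333"]
  dashboard:
    build: apps/admin-dashboard
    ports: ["8501:8501"]
    depends_on: [ollama, qdrant]
```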
## Cloud Deployment (Optional)

For running importers on a schedule without leaving your laptop on, see `infrastructure/gcp/`. This provisions:
- Cloud Run Jobs — one per cloud importer (Gmail, Calendar, Dropbox)
- GCS bucket — intermediate staging area for imported data
- GCP VM — hosts a persistent Qdrant instance
- Cloud Scheduler — triggers importers on cron schedules