
Contributing to My Independent AI

Thank you for your interest! This guide covers everything you need to contribute — from running the project locally to adding a new data source importer.


Local Development Setup

Prerequisites

  • uv — manages the workspace and all Python dependencies
  • Docker — runs Qdrant (and, with Compose, Ollama) locally
  • Ollama — only needed as a native install if you skip the Docker option

Install dependencies

uv sync --all-packages --group dev

Start backing services

# Option A: full Docker Compose (recommended)
docker compose up ollama qdrant

# Option B: individually
ollama serve
docker run -p 6333:6333 qdrant/qdrant

Run the dashboard

uv run streamlit run apps/admin-dashboard/app.py

Run tests

uv run pytest                         # all tests
uv run pytest apps/importers          # importer tests only
uv run pytest apps/admin-dashboard    # dashboard tests only

Monorepo Structure

This is a uv workspace. Each app and lib has its own pyproject.toml, and uv manages all dependencies from the root pyproject.toml / uv.lock.

Path                     Type  Description
apps/admin-dashboard     App   Streamlit chat UI
apps/importers           App   All data source importers
apps/orchestrator        App   Ingestion pipeline runner
libs/embedding           Lib   Ollama embedding client
libs/vector-storage      Lib   Qdrant client wrapper
libs/privacy-core        Lib   PII scrubbing + de-anonymization
libs/content-processing  Lib   PDF/text extraction

When adding a new lib dependency to an app:

# From the repo root, add to a specific workspace member
uv add --package admin-dashboard some-library
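After that command, the member's own pyproject.toml gains the dependency. Workspace libs are wired the same way, plus a workspace source entry at the member level. An illustrative fragment (the exact entries are assumptions about this repo's layout, not copied from it):

```toml
# apps/admin-dashboard/pyproject.toml (illustrative)
[project]
dependencies = [
    "some-library",
    "privacy-core",    # a workspace lib
]

[tool.uv.sources]
privacy-core = { workspace = true }
```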

Adding a New Importer

All importers live in apps/importers/ and follow the BaseImporter contract.

1. Create the importer file

apps/importers/
└── your_source/
    ├── __init__.py
    └── importer.py

# apps/importers/your_source/importer.py
from typing import Any, Dict, List

from importers.base import BaseImporter

class YourSourceImporter(BaseImporter):
    """Imports data from YourSource."""

    def authenticate_source(self) -> bool:
        # Set up API client, load credentials, etc.
        return True

    def extract(self) -> List[Dict[str, Any]]:
        # Fetch raw records from the source.
        # Return a list of dicts — each dict is one document.
        return []

    def transform(self, raw_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # Convert raw records to the standard format:
        # { "content": str, "metadata": { "source": str, "timestamp": str, ... } }
        return raw_data
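As an illustration, a transform for a hypothetical JSON-style export could map fields like this. The raw field names (body, created_at) are invented for the example; only the content/metadata shape comes from the contract above:

```python
from typing import Any, Dict, List


def transform(raw_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Map hypothetical raw records to the standard document format."""
    docs = []
    for record in raw_data:
        docs.append({
            "content": record.get("body", ""),
            "metadata": {
                "source": "your_source",
                "timestamp": record.get("created_at", ""),
            },
        })
    return docs
```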

2. Register it in the orchestrator config

# apps/importers/config.yaml
importers:
    local:
        - name: your_source
          description: "What this imports"
          enabled: true
          schedule: "0 3 * * *" # daily at 3 AM
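Once loaded, the orchestrator presumably iterates over this structure and skips disabled entries. A minimal sketch of that selection logic, assuming the YAML has already been parsed into a dict (the second entry is invented for contrast):

```python
# Parsed form of a config like the one above (structure assumed from the example).
config = {
    "importers": {
        "local": [
            {"name": "your_source", "enabled": True, "schedule": "0 3 * * *"},
            {"name": "old_source", "enabled": False, "schedule": "0 4 * * *"},
        ]
    }
}


def enabled_importers(cfg: dict) -> list[str]:
    """Collect the names of all enabled importers across every group."""
    return [
        imp["name"]
        for group in cfg["importers"].values()
        for imp in group
        if imp.get("enabled")
    ]
```

Here `enabled_importers(config)` returns `["your_source"]`, since the second entry is disabled.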

3. Add tests

apps/importers/tests/
└── test_your_source_importer.py

Follow the existing test patterns in apps/importers/tests/. At minimum, test:

  • authenticate_source() returns True with valid config
  • extract() returns a list
  • transform() produces the expected schema
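A sketch of such a test module, using a stand-in class so the example is self-contained (in the repo you would import the real YourSourceImporter from your package instead):

```python
from typing import Any, Dict, List


class YourSourceImporter:
    """Stand-in for the real importer so this sketch runs on its own."""

    def authenticate_source(self) -> bool:
        return True

    def extract(self) -> List[Dict[str, Any]]:
        return [{"body": "hi", "created_at": "2024-01-01"}]

    def transform(self, raw_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        return [
            {
                "content": r["body"],
                "metadata": {"source": "your_source", "timestamp": r["created_at"]},
            }
            for r in raw_data
        ]


def test_authenticate_source():
    assert YourSourceImporter().authenticate_source() is True


def test_extract_returns_list():
    assert isinstance(YourSourceImporter().extract(), list)


def test_transform_schema():
    imp = YourSourceImporter()
    docs = imp.transform(imp.extract())
    # Every document must carry both required top-level keys.
    assert all({"content", "metadata"} <= set(d) for d in docs)
```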

4. Run your importer manually

uv run python -m importers.orchestrator --importer your_source

Code Style

  • Formatter: ruff format (via uv run ruff format .)
  • Linter: ruff check (via uv run ruff check .)
  • Type hints: Required for all public methods in libs/
  • Docstrings: Google-style preferred
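For reference, a Google-style docstring on a typed public method looks like this (the function and its behavior are invented for illustration):

```python
def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Split text into fixed-size chunks.

    Args:
        text: The input text to split.
        max_chars: Maximum length of each chunk.

    Returns:
        A list of chunks in original order; empty input yields an empty list.
    """
    return [text[i : i + max_chars] for i in range(0, len(text), max_chars)]
```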

Submitting a PR

  1. Fork the repo and create a feature branch: git checkout -b feat/your-importer
  2. Make your changes, add tests
  3. Run uv run pytest — all tests must pass
  4. Open a PR with a clear description of what data source you're adding and why