# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

# NewsDatabase - AI-Powered News Vector Database

## 1. Project Overview

- **Vision:** AI-Powered Personal News Management with Vector Database
- **Current Phase:** Production-ready with advanced content management features
- **Key Architecture:** FastAPI + ChromaDB + Sentence Transformers + MCP Server with enterprise security
- **Status:** Complete implementation with 955+ AI news documents, featuring duplicate detection, automatic categorization, and semantic search
- **Core Features:** Duplicate prevention, AI-powered categorization (10 categories), sub-50ms semantic search, real-time analytics
- **Security Features:** API key authentication, token bucket rate limiting, CORS configuration, comprehensive error boundaries
- **MCP Integration:** Fully operational with Claude Code knowledge base integration via 3 specialized tools
- **Content Management:** Automatic organization with duplicate detection and intelligent categorization

## 2. Project Structure

**⚠️ CRITICAL: AI agents MUST read the [Project Structure documentation](/docs/ai-context/project-structure.md) before attempting any task to understand the complete technology stack, file tree and project organization.**

NewsDatabase is a production-ready personal news management system with advanced content intelligence features. Built on FastAPI + ChromaDB + Sentence Transformers, it provides automatic duplicate detection, AI-powered categorization into 10 content types, and sub-50ms semantic search. The system includes enterprise-grade security (API authentication, rate limiting, CORS), comprehensive error handling, and seamless Claude Code integration. Currently operational with 955+ AI news documents, featuring intelligent content organization and real-time analytics. For the complete tech stack, architecture, and new features, see [docs/ai-context/project-structure.md](/docs/ai-context/project-structure.md).

## 3. Coding Standards & AI Instructions

### General Instructions

- Your most important job is to manage your own context. Always read any relevant files BEFORE planning changes.
- When updating documentation, keep updates concise and on point to prevent bloat.
- Write code following KISS, YAGNI, and DRY principles.
- When in doubt, follow proven best practices for implementation.
- Do not commit to git without user approval.
- Do not run any servers; instead, ask the user to run them for testing.
- Always consider industry standard libraries/frameworks first over custom implementations.
- Never mock anything. Never use placeholders. Never omit code.
- Apply SOLID principles where relevant. Use modern framework features rather than reinventing solutions.
- Be brutally honest about whether an idea is good or bad.
- Make side effects explicit and minimal.
- Design database schema to be evolution-friendly (avoid breaking changes).


### File Organization & Modularity

- Default to creating multiple small, focused files rather than large monolithic ones
- Each file should have a single responsibility and clear purpose
- Keep files under 350 lines when possible - split larger files by extracting utilities, constants, types, or logical components into separate modules
- Separate concerns: utilities, constants, types, components, and business logic into different files
- Prefer composition over inheritance - use inheritance only for true 'is-a' relationships, favor composition for 'has-a' or behavior mixing

- Follow existing project structure and conventions - place files in appropriate directories. Create new directories and move files if deemed appropriate.
- Use well defined sub-directories to keep things organized and scalable
- Structure projects with clear folder hierarchies and consistent naming conventions
- Import/export properly - design for reusability and maintainability

### Type Hints (REQUIRED)

- **Always** use type hints for function parameters and return values
- Use `from typing import` for complex types
- Prefer `Optional[T]` over `Union[T, None]`
- Use Pydantic models for data structures

```python
# Good
from typing import Any, Dict, Optional, Tuple

async def ingest_news_file(
    file_path: str,
    metadata: Dict[str, Any],
    embedding_model: Optional[str] = None
) -> Tuple[str, Dict[str, Any]]:
    """Ingest a news markdown file into the vector database."""
    pass
```

### Naming Conventions

- **Classes**: PascalCase (e.g., `NewsDatabase`, `EmbeddingService`)
- **Functions/Methods**: snake_case (e.g., `ingest_news_file`, `search_similar`)
- **Constants**: UPPER_SNAKE_CASE (e.g., `MAX_FILE_SIZE`, `DEFAULT_MODEL`)
- **Private methods**: Leading underscore (e.g., `_validate_markdown`, `_generate_embedding`)
- **Pydantic Models**: PascalCase with `Schema` suffix (e.g., `NewsFileSchema`, `SearchRequestSchema`)


### Documentation Requirements

- Every module needs a docstring
- Every public function needs a docstring
- Use Google-style docstrings
- Include type information in docstrings

```python
def calculate_similarity(query: str, document: str) -> float:
    """Calculate semantic similarity between query and document.

    Args:
        query: Search query text
        document: Document content to compare against

    Returns:
        Similarity score between 0 and 1

    Raises:
        ValueError: If either text is empty
    """
    pass
```

### Security First (Production Implemented)

- **API Key Authentication**: Bearer token authentication implemented for all protected endpoints
- **Rate Limiting**: Token bucket algorithm with per-client tracking and burst protection
- **Input Validation**: Multi-layer validation at API boundaries and service layers
- **CORS Configuration**: Configurable cross-origin policies for production security
- **Error Boundaries**: Comprehensive error handling with security-aware logging
- **Secrets Management**: All secrets in environment variables, never in code
- **Security Logging**: Log security events without exposing sensitive data
- **Request Sanitization**: Comprehensive input sanitization before processing
- **Graceful Degradation**: Security failures handled with appropriate fallbacks
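
The token bucket idea behind the rate-limiting bullet can be sketched as follows. This is a minimal, standard-library illustration, not the code in `rate_limiter.py`: the class names `TokenBucket` and `PerClientRateLimiter`, the parameters, and the injectable `clock` are all assumptions made for the example.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Bucket holding up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(
        self,
        capacity: float,
        refill_rate: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._capacity = capacity
        self._refill_rate = refill_rate
        self._clock = clock
        self._tokens = capacity  # start full so an initial burst is allowed
        self._last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._last_refill
        self._tokens = min(self._capacity, self._tokens + elapsed * self._refill_rate)
        self._last_refill = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class PerClientRateLimiter:
    """Keeps one bucket per client identifier (hypothetically an API key or IP address)."""

    def __init__(
        self,
        capacity: float,
        refill_rate: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._capacity = capacity
        self._refill_rate = refill_rate
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str, cost: float = 1.0) -> bool:
        """Return True if `client_id` may proceed, creating its bucket on first use."""
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill_rate, self._clock)
            self._buckets[client_id] = bucket
        return bucket.allow(cost)
```

Here `capacity` bounds the burst size and `refill_rate` sets the sustained request rate. Passing a larger `cost` for expensive operations is one possible way to model different tiers.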

### Error Handling (Production Implemented)

- **Custom Exception Types**: VectorDBError, EmbeddingError, ProcessingError with categorization
- **Error Boundaries**: Decorators for comprehensive error catching and handling
- **Structured Logging**: JSON-formatted logs with correlation IDs and context
- **Security-Aware Messages**: Error responses that don't reveal system internals
- **Retry Mechanisms**: Exponential backoff for transient failures
- **Graceful Fallbacks**: Fallback values and degraded service modes
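
As a rough sketch of how the custom exception types and exponential backoff could fit together, consider the decorator below. The exception names come from the list above; the shared base class `NewsDBError`, the decorator name `retry_with_backoff`, its parameters, and the choice of which errors count as transient are assumptions for illustration only.

```python
import functools
import time
from typing import Any, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class NewsDBError(Exception):
    """Hypothetical shared base class (not confirmed by this document)."""


class VectorDBError(NewsDBError):
    """Raised when a vector database operation fails."""


class EmbeddingError(NewsDBError):
    """Raised when embedding generation fails."""


class ProcessingError(NewsDBError):
    """Raised when a document cannot be processed."""


def retry_with_backoff(
    retry_on: Tuple[Type[Exception], ...] = (VectorDBError, EmbeddingError),
    max_attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[Callable[..., T]], Callable[..., T]]:
    """Retry the wrapped function on `retry_on`, doubling the delay after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> T:
            attempt = 1
            while True:
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt >= max_attempts:
                        raise  # out of attempts: surface the original error
                    sleep(base_delay * 2 ** (attempt - 1))
                    attempt += 1

        return wrapper

    return decorator
```

Exceptions outside `retry_on` (for example `ProcessingError` with the defaults above) propagate immediately, since retrying a non-transient failure only adds latency.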

### Observable Systems & Logging Standards (Production Implemented)

- **Structured Logging**: JSON format with timestamp, level, correlation_id, event, context
- **Security Event Logging**: Authentication attempts, rate limit violations, error boundaries
- **Performance Monitoring**: Request timing, embedding generation metrics, vector search performance
- **Health Monitoring**: Real-time system health checks with component status
- **Production Debugging**: Correlation IDs for tracing requests across service boundaries
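
One way to produce the JSON log shape described above (timestamp, level, correlation_id, event, context) with only the standard `logging` module is sketched below. The names `JsonFormatter` and `build_logger` are hypothetical, and the project's actual logging setup may differ.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Any, Dict


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload: Dict[str, Any] = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            # Both fields are optional and supplied by the caller via `extra=`.
            "correlation_id": getattr(record, "correlation_id", None),
            "event": record.getMessage(),
            "context": getattr(record, "context", {}),
        }
        return json.dumps(payload, default=str)


def build_logger(name: str = "newsdb") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger().info("search_completed", extra={"correlation_id": "req-123", "context": {"duration_ms": 12}})` would then emit one JSON line carrying the correlation ID. Keep secrets and raw API keys out of `context`.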

### State Management

- Have one source of truth for each piece of state
- Make state changes explicit and traceable
- Design for distributed vector operations - use request IDs for tracking, avoid storing large embeddings in memory
- Keep file metadata lightweight and cache-friendly

### API Design Principles (Production Implemented)

- **RESTful design** with consistent URL patterns and HTTP status codes
- **Authentication required** for all data-modifying and sensitive endpoints
- **Rate limiting** with different tiers for heavy vs. light operations
- **CORS support** with configurable origin policies
- **Consistent JSON response format**:
  - Success: `{ "data": {...}, "error": null }`
  - Error: `{ "data": null, "error": {"message": "...", "code": "..."} }`
- **Security headers** including rate limit status in responses
- **Input validation** at multiple layers with detailed error messages
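
The response envelope above can be captured in two small helpers. This is only an illustrative sketch: the function names and the sample error code are hypothetical, and the real endpoints may build these payloads through Pydantic models instead.

```python
from typing import Any, Dict


def success_response(data: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a payload in the success envelope."""
    return {"data": data, "error": None}


def error_response(message: str, code: str) -> Dict[str, Any]:
    """Wrap an error in the error envelope without leaking internals."""
    return {"data": None, "error": {"message": message, "code": code}}
```

Routing every endpoint through helpers like these keeps the two shapes consistent, so clients can always branch on whether `error` is `null`.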


## 4. Claude Code Development Kit Commands

This project uses the Claude Code Development Kit, which provides sophisticated command workflows for development tasks. These commands auto-load documentation and coordinate multi-agent workflows.

### Available Commands

#### `/full-context` - Comprehensive Analysis

**Usage**: `@commands/full-context.md "analyze authentication system"`
**Purpose**: Deep understanding before implementing changes
**When to use**:

- Starting work on new features
- Understanding system interconnections
- Planning architectural changes

#### `/code-review` - Multi-Agent Review

**Usage**: `@commands/code-review.md "review news ingestion module"`
**Purpose**: Get security, performance, and architecture insights
**When to use**:

- After implementing features
- Before merging changes
- When you need confidence in code quality

#### `/update-docs` - Documentation Sync

**Usage**: `@commands/update-docs.md "document new search API"`
**Purpose**: Keep documentation current with code changes
**When to use**:

- After modifying code
- After adding features
- When project structure changes

#### `/refactor` - Intelligent Restructuring

**Usage**: `@commands/refactor.md "break up large database module"`
**Purpose**: Restructure code while maintaining functionality
**When to use**:

- Breaking up large files
- Improving code organization
- Extracting reusable components

#### `/handoff` - Session Continuity

**Usage**: `@commands/handoff.md "completed vector search implementation"`
**Purpose**: Preserve context between sessions
**When to use**:

- Ending work sessions
- Context limit approaching
- Switching between major tasks

### Typical Development Workflow

```bash
/full-context "implement news ingestion API"    # Understand requirements
# ... implement the feature ...
/code-review "review ingestion system"         # Validate quality
/update-docs "document ingestion API"          # Keep docs current
/handoff "completed news ingestion feature"    # Preserve context
```

## 5. Multi-Agent Workflows & Context Injection

### Automatic Context Injection for Sub-Agents

When you use the Task tool to spawn sub-agents, the subagent-context-injector hook automatically injects the core project context (CLAUDE.md, project-structure.md, docs-overview.md) into their prompts. All sub-agents therefore have immediate access to essential project documentation, with no need to specify it manually in each Task prompt.


## 6. MCP Server Integrations

### 🌟 NewsDatabase Knowledge Base Server (Production Ready)

**Repository**: This project - `/mcp_newsdb_server.py`
**Status**: **🟢 FULLY OPERATIONAL** with 955 AI news documents

**Current Integration Status:**

- **✅ MCP Server**: Successfully integrated with Claude Code
- **✅ Knowledge Base**: 955 AI news content chunks indexed and searchable
- **✅ Performance**: Sub-50ms semantic search with similarity ranking
- **✅ Health Monitoring**: Real-time system status and performance tracking
- **✅ Production Stability**: Tested, documented, and operationally verified

**When to use:**

- Searching for AI and technology news information
- Getting context on AI developments, companies, and trends
- Finding relevant background information for AI-related tasks
- Accessing curated AI news content with semantic search
- Research on machine learning, neural networks, and AI breakthroughs

**Available MCP Tools:**

```python
# Search 955+ AI news documents with enhanced semantic similarity
mcp__newsdb-knowledge__search_news_knowledge(
    query="OpenAI GPT-4 developments",
    limit=5,
    similarity_threshold=0.3  # Automatic duplicate exclusion and category awareness
)

# Get comprehensive knowledge base and feature statistics
mcp__newsdb-knowledge__get_knowledge_stats()  # Includes category distribution and duplicate stats

# Check system health including new content management features
mcp__newsdb-knowledge__check_knowledge_health()
```

**Key capabilities:**

- **955+ AI News Documents**: Comprehensive curated knowledge base with intelligent organization
- **Sub-50ms Search**: High-performance semantic search with category filtering and duplicate exclusion
- **Duplicate Prevention**: Automatic hash-based and semantic duplicate detection
- **Smart Categorization**: AI-powered classification into 10 content categories with topic extraction
- **Enhanced Analytics**: Category distribution, duplicate statistics, and content insights
- **Real-time Health Monitoring**: System status, performance tracking, and feature health
- **Similarity Scoring**: Results ranked 0.0-1.0 for relevance with metadata enrichment
- **Production Stability**: Comprehensive testing including advanced features, security, and integration

**Example Search Queries:**

```python
# AI company developments
search_news_knowledge("OpenAI ChatGPT GPT-4")

# Technology trends
search_news_knowledge("machine learning breakthroughs")

# Industry analysis
search_news_knowledge("AI startup funding venture capital")

# Research developments
search_news_knowledge("neural networks deep learning research")

# Specific technologies
search_news_knowledge("transformer architecture attention mechanisms")

# Policy and regulation
search_news_knowledge("AI regulation policy government")
```

**Integration Architecture:**

```
NewsDatabase API ↔ MCP Server ↔ Claude Code
      ↓              ↓            ↓
  FastAPI         HTTP/JSON    Tool Calls
  ChromaDB        Async        search_news_knowledge
  955 Docs        Bridge       get_knowledge_stats
  Embeddings      Real-time    check_knowledge_health
```

### Gemini Consultation Server

**When to use:**

- Complex coding problems requiring deep analysis or multiple approaches
- Code reviews and architecture discussions
- Debugging complex issues across multiple files
- Performance optimization and refactoring guidance
- Detailed explanations of complex implementations
- Highly security-relevant tasks

**Automatic Context Injection:**

- The kit's `gemini-context-injector.sh` hook automatically includes two key files for new sessions:
  - `/docs/ai-context/project-structure.md` - Complete project structure and tech stack
  - `/MCP-ASSISTANT-RULES.md` - Your project-specific coding standards and guidelines
- This ensures Gemini always has comprehensive understanding of your technology stack, architecture, and project standards

**Usage patterns:**

```python
# New consultation session (project structure auto-attached by hooks)
mcp__gemini__consult_gemini(
    specific_question="How should I optimize the vector search performance?",
    problem_description="Need to reduce latency in similarity search operations",
    code_context="Current system processes embeddings sequentially...",
    attached_files=[
        "src/database/vector_search.py"  # Your specific files
    ],
    preferred_approach="optimize"
)

# Follow-up in existing session
mcp__gemini__consult_gemini(
    specific_question="What about memory usage with large collections?",
    session_id="session_123",
    additional_context="Implemented your suggestions, now seeing high memory usage with ChromaDB"
)
```

**Key capabilities:**

- Persistent conversation sessions with context retention
- File attachment and caching for multi-file analysis
- Specialized assistance modes (solution, review, debug, optimize, explain)
- Session management for complex, multi-step problems

**Important:** Treat Gemini's responses as advisory feedback. Evaluate the suggestions critically, incorporate valuable insights into your solution, then proceed with your implementation.

### Context7 Documentation Server

**Repository**: [Context7 MCP Server](https://github.com/upstash/context7)

**When to use:**

- Working with external libraries/frameworks (React, FastAPI, Next.js, etc.)
- Needing current documentation beyond the training cutoff
- Implementing new integrations or features with third-party tools
- Troubleshooting library-specific issues

**Usage patterns:**

```python
# Resolve library name to Context7 ID for relevant libraries
mcp__context7__resolve_library_id(libraryName="fastapi")
mcp__context7__resolve_library_id(libraryName="chromadb")

# Fetch focused documentation
mcp__context7__get_library_docs(
    context7CompatibleLibraryID="/tiangolo/fastapi",
    topic="async-endpoints",
    tokens=8000
)

mcp__context7__get_library_docs(
    context7CompatibleLibraryID="/chroma-core/chroma",
    topic="collections",
    tokens=8000
)
```

**Key capabilities:**

- Up-to-date library documentation access
- Topic-focused documentation retrieval
- Support for specific library versions
- Integration with current development practices


## 7. Development Commands & Verification

### Essential Development Commands

```bash
# Start the main application
python main.py

# Run security feature tests
python test_security.py

# Run comprehensive functionality tests
python test_newsdb.py

# Test MCP integration
python test_mcp_integration.py

# Verify MCP setup
./verify_mcp_setup.sh

# Type checking
mypy src/

# Health check (when server running)
curl http://localhost:8000/health

# Test authenticated endpoint (requires API_KEY_VALUE set)
curl -H "Authorization: Bearer your-api-key" http://localhost:8000/stats
```

### Security Configuration Commands

```bash
# Copy environment template
cp .env.template .env

# Edit environment variables for security
# Set API_KEY_VALUE, CORS_ORIGINS, rate limits, etc.
vim .env

# Test security features
python test_security.py

# Monitor security logs (if running)
tail -f logs/app.log | grep -E "auth|rate|error"
```

### Post-Task Completion Protocol

After completing any coding task, follow this checklist:

#### 1. Security & Quality Assurance

- **Security testing**: Run `python test_security.py` to verify all security features
- **Type checking**: Run `mypy src/` to ensure all type hints are valid
- **Core functionality**: Execute `python test_newsdb.py` to verify core functionality
- **MCP integration**: Run `python test_mcp_integration.py` to test Claude Code integration
- **API health**: Verify `curl http://localhost:8000/health` returns healthy status

#### 2. Security Feature Verification

- **Authentication**: Test API key authentication on protected endpoints
- **Rate limiting**: Verify rate limits are enforced correctly
- **CORS policy**: Test cross-origin request handling
- **Error boundaries**: Confirm proper error handling and logging
- **Input validation**: Test with malformed inputs to verify validation

#### 3. Functional Verification

- Ensure all API endpoints respond correctly with authentication
- Test vector operations with real data samples
- Validate MCP tool functionality in Claude Code
- Confirm search performance meets sub-50ms targets
- Test error scenarios and recovery mechanisms
- If errors are found, fix them before marking the task complete
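
To check the sub-50ms search target, a small timing helper along these lines may be useful. It is a sketch under assumptions: the helper names are hypothetical, the callable you pass in stands for whatever search function or client call you want to measure, and comparing the median to the target is one reasonable choice rather than the project's defined metric.

```python
import statistics
import time
from typing import Any, Callable, List

SEARCH_LATENCY_TARGET_MS = 50.0  # target taken from the checklist above


def measure_latency_ms(
    func: Callable[..., Any],
    *args: Any,
    repeats: int = 20,
    **kwargs: Any,
) -> List[float]:
    """Call `func` `repeats` times and return each wall-clock duration in milliseconds."""
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def meets_target(samples: List[float], target_ms: float = SEARCH_LATENCY_TARGET_MS) -> bool:
    """Return True if the median sample is below `target_ms`."""
    return statistics.median(samples) < target_ms
```

The median is less sensitive than the mean to a single slow first call, such as a cold model load. If tail latency matters, inspect `max(samples)` as well.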

#### 4. Production Readiness

- Verify security configuration is production-appropriate
- Test with realistic load patterns and rate limits
- Confirm monitoring and logging are working correctly
- Validate error handling doesn't expose sensitive information
- Ensure all secrets are properly externalized to environment variables
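
As a sketch of what externalized secrets and Bearer-token checking can look like, the helpers below read the key from the environment and compare it in constant time. `API_KEY_VALUE` is the variable named in this document; the function names and the exact header parsing are assumptions, not the contents of `auth.py`.

```python
import hmac
import os
from typing import Mapping, Optional


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the API key from the environment and fail fast if it is missing."""
    key = env.get("API_KEY_VALUE", "")
    if not key:
        raise RuntimeError("API_KEY_VALUE is not set")
    return key


def is_authorized(authorization_header: Optional[str], expected_key: str) -> bool:
    """Validate an `Authorization: Bearer <token>` header value."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking the match position through timing differences.
    return hmac.compare_digest(token.strip().encode("utf-8"), expected_key.encode("utf-8"))
```

Failing at startup when the key is missing is a deliberate choice here: a server that silently runs without authentication is harder to notice than one that refuses to start.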

## 8. Implemented System Architecture

### Core Modules (Production Ready with Advanced Features)

```
src/
├── api/              # FastAPI endpoints with security and content management ✅ PRODUCTION READY
│   ├── main.py       # 13 API endpoints including duplicate detection and categorization
│   ├── auth.py       # API key authentication and security middleware
│   └── rate_limiter.py # Token bucket rate limiting with burst protection
├── database/         # ChromaDB integration and vector operations ✅ PRODUCTION READY
├── ingestion/        # Markdown processing, embeddings, and content analysis ✅ PRODUCTION READY
├── models/           # Comprehensive Pydantic schemas with new feature models ✅ PRODUCTION READY
├── services/         # Business logic with intelligent content management ✅ PRODUCTION READY
│   ├── news_service.py       # Core orchestration and business logic
│   ├── duplicate_detector.py # Hash and semantic duplicate detection
│   └── categorizer.py        # AI-powered content categorization (10 categories)
└── utils/            # Configuration and comprehensive error handling ✅ PRODUCTION READY
    ├── config.py     # Secure configuration management
    └── error_handler.py # Comprehensive error boundaries and logging
```

### Operational Components

1. **Intelligent Ingestion Pipeline**: ✅ Process MD files → Generate embeddings → Detect duplicates → Categorize content → Store in ChromaDB
2. **Advanced Search API**: ✅ Semantic search with category filtering, duplicate exclusion, and metadata inclusion
3. **Duplicate Detection System**: ✅ Hash-based exact matching + semantic similarity analysis with configurable thresholds
4. **Content Categorization**: ✅ AI-powered categorization into 10 content types with topic extraction
5. **File Management**: ✅ Upload, validation, metadata extraction, and intelligent content analysis
6. **Enhanced API Layer**: ✅ 13 RESTful endpoints with authentication, rate limiting, CORS, and content management
7. **Security Layer**: ✅ API key authentication, rate limiting, comprehensive error boundaries
8. **MCP Integration**: ✅ Real-time knowledge base for Claude Code with enhanced search capabilities
9. **Analytics & Monitoring**: ✅ Content statistics, category distribution, duplicate tracking, and system health
10. **Error Handling**: ✅ Comprehensive error boundaries with structured security-aware logging
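
The two-stage duplicate check named above (hash-based exact matching, then semantic similarity against a configurable threshold) can be sketched in plain Python as follows. This is not the code in `duplicate_detector.py`: the class name `DuplicateIndex`, the whitespace and case normalization, the default threshold of 0.95, and the linear scan are all assumptions. Embeddings are passed in as plain lists of floats, whereas the real system generates them with Sentence Transformers and would presumably query ChromaDB for nearest neighbors.

```python
import hashlib
import math
from typing import Dict, List, Optional, Sequence, Tuple


def content_hash(text: str) -> str:
    """Hash text after collapsing whitespace and lowercasing (an assumed normalization)."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class DuplicateIndex:
    """In-memory index that checks exact hashes first, then embedding similarity."""

    def __init__(self, semantic_threshold: float = 0.95) -> None:
        self._threshold = semantic_threshold
        self._hashes: Dict[str, str] = {}  # content hash -> doc_id
        self._embeddings: List[Tuple[str, List[float]]] = []

    def add(self, doc_id: str, text: str, embedding: Sequence[float]) -> None:
        """Register a stored document."""
        self._hashes[content_hash(text)] = doc_id
        self._embeddings.append((doc_id, list(embedding)))

    def find_duplicate(
        self, text: str, embedding: Sequence[float]
    ) -> Optional[Tuple[str, str]]:
        """Return (doc_id, "exact" | "semantic") for the first match, else None."""
        exact = self._hashes.get(content_hash(text))
        if exact is not None:
            return exact, "exact"
        for doc_id, stored in self._embeddings:
            if cosine_similarity(embedding, stored) >= self._threshold:
                return doc_id, "semantic"
        return None
```

Running the cheap hash lookup first means the similarity comparison only runs for content that is not a byte-for-byte (after normalization) repeat.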

### Production Technology Stack

- **FastAPI**: ✅ Async API framework with 13 endpoints, security middleware, sub-50ms responses
- **ChromaDB**: ✅ Vector database with 955+ documents, metadata, and intelligent organization
- **Sentence Transformers**: ✅ all-MiniLM-L6-v2 model for embeddings and similarity analysis
- **Content Intelligence**: ✅ Duplicate detection (hash + semantic) and AI categorization (10 categories)
- **Pydantic**: ✅ Complete data validation with advanced feature models and security-aware schemas
- **Security Features**: ✅ API authentication, rate limiting, CORS, comprehensive error boundaries
- **Testing Suite**: ✅ Security, functionality, feature integration, and MCP integration coverage
- **MCP Protocol**: ✅ Claude Code integration with 3 operational tools and enhanced search
- **Analytics**: ✅ Real-time content statistics, category distribution, and duplicate tracking