Skip to content

abbaouiAchraf/advanced-sqlagent-cookbook

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

1 Commit
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ€– Advanced PydanticAI SQL Agent

Python 3.9+ License: MIT Docker PydanticAI

A production-ready, enterprise-grade SQL Agent that converts natural language questions into safe SQL queries using PydanticAI and Azure OpenAI. Built with comprehensive security mechanisms, type-safe responses, and multi-interface support.

✨ Key Features

πŸ›‘οΈ Security & Safety

  • Multi-layered SQL injection prevention with pattern matching and validation
  • Dangerous operation blocking (INSERT, UPDATE, DELETE, DROP, etc.)
  • Query complexity analysis and performance optimization
  • Rate limiting (30 queries/minute) with abuse prevention
  • Comprehensive audit logging with security monitoring

🎯 Type Safety & Validation

  • Type-safe responses with Pydantic models (Success, InvalidRequest, DangerousQuery)
  • Structured output validation ensuring consistent API responses
  • Input sanitization and query normalization
  • Comprehensive error handling with user-friendly messages

πŸ”§ Enterprise Ready

  • Azure OpenAI integration with retry logic and failover
  • Multi-database support (SQLite, PostgreSQL, MySQL, SQL Server)
  • Docker deployment with production-ready configurations
  • Connection pooling and performance optimization
  • Environment-based configuration with secrets management

πŸ–₯️ Multiple Interfaces

  • Interactive CLI with natural language processing
  • Streamlit web interface with real-time query visualization
  • REST API ready architecture for integration
  • Comprehensive examples and usage patterns

πŸ“Š Advanced Analytics

  • Structured analytics queries with pre-built templates
  • Dynamic query generation based on business metrics
  • Data visualization support with export capabilities
  • Sample data with realistic e-commerce scenarios

πŸš€ Quick Start

Option 1: Python Installation

Prerequisites

  • Python 3.9 or higher
  • Azure OpenAI account with deployment

Setup

# 1. Clone the repository
git clone https://github.com/abbaouiAchraf/advanced-sqlagent-cookbook.git
cd advanced-sqlagent-cookbook

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -e .

# 4. Configure environment
cp .env.example .env
# Edit .env with your Azure OpenAI credentials

# 5. Run the application
python main.py

Option 2: Docker Installation (Recommended)

Prerequisites

  • Docker and Docker Compose
  • Azure OpenAI credentials

Quick Start

# 1. Clone and configure
git clone https://github.com/abbaouiAchraf/advanced-sqlagent-cookbook.git
cd advanced-sqlagent-cookbook
cp .env.example .env
# Edit .env with your Azure OpenAI credentials

# 2. Start with Docker
docker-compose up --build

# 3. Access applications
# - Web Interface: http://localhost:8502
# - Interactive CLI: docker exec -it advanced-sql-agent python main.py

βš™οΈ Configuration

Required Environment Variables

# Azure OpenAI Configuration (Required)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-api-key-here
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4
AZURE_OPENAI_API_VERSION=2024-06-01

# Database Configuration (Optional - defaults to SQLite)
DATABASE_URL=sqlite:///./sql_agent.db

# Security Configuration (Optional)
ENABLE_SAFETY_CHECKS=true
RATE_LIMIT_QPM=30
MAX_RETRIES=5

# Logging Configuration (Optional)
LOG_LEVEL=INFO

Database Options

# SQLite (Default - no setup required)
DATABASE_URL=sqlite:///./data/sql_agent.db

# PostgreSQL
DATABASE_URL=postgresql://username:password@localhost:5432/database_name

# SQL Server
DATABASE_URL=mssql+pyodbc://username:password@server/database?driver=ODBC+Driver+17+for+SQL+Server

# MySQL
DATABASE_URL=mysql+pymysql://username:password@localhost:3306/database_name

πŸ’» Usage Examples

Interactive CLI Mode

python main.py

# Choose option 1 for interactive mode
# Example questions:
# - "How many customers do we have?"
# - "What is the total revenue by country?"
# - "Show me the top 5 products by sales"
# - "Which customers haven't placed orders recently?"

Web Interface

streamlit run examples/streamlit_app.py
# Or with Docker: http://localhost:8502

# Features:
# - Real-time query execution
# - Visual results with charts
# - Query history and export
# - Security metrics dashboard

Programmatic Usage

from src import EnhancedPydanticSQLAgent, DatabaseManager, settings

# Initialize
db_manager = DatabaseManager(settings.database_url)
agent = EnhancedPydanticSQLAgent(db_manager, user_id="demo")

# Query
result = await agent.query("How many orders were placed this month?")

if isinstance(result, Success):
    print(f"SQL: {result.sql_query}")
    print(f"Data: {result.data}")
    print(f"Rows: {result.row_count}")

Sample Queries You Can Try

# Business Intelligence
"What's our total revenue this year?"
"Show me top customers by order value"
"Which products are running low on stock?"

# Analytics
"Average order value by country"
"Monthly sales trends"
"Customer retention analysis"

# Security Test (These will be blocked)
"DELETE FROM customers"  # πŸ›‘οΈ Blocked
"DROP TABLE orders"      # πŸ›‘οΈ Blocked

🐳 Docker Deployment

Development Environment

# Start all services with SQLite
docker-compose up --build

# Access points:
# - Web UI: http://localhost:8502
# - CLI: docker exec -it advanced-sql-agent python main.py

Production with PostgreSQL

# Start with PostgreSQL backend
docker-compose --profile postgres up -d --build

# Services:
# - PostgreSQL: localhost:5432
# - Web UI: http://localhost:8503
# - API ready for integration

Available Services

Service Port Description
sql-agent 8501 Interactive CLI application
streamlit-web 8502 Web interface (SQLite)
postgres 5432 PostgreSQL database
sql-agent-postgres 8503 Web interface (PostgreSQL)

For detailed Docker documentation, see docker/README.md.

πŸ“ Project Structure

Advanced-SqlAgent-cookbook/
β”œβ”€β”€ src/                          # Core application code
β”‚   β”œβ”€β”€ agents/                   # SQL Agent implementations
β”‚   β”‚   └── sql_agent.py         # Main PydanticAI agent
β”‚   β”œβ”€β”€ config/                   # Configuration management
β”‚   β”‚   β”œβ”€β”€ azure_config.py      # Azure OpenAI setup
β”‚   β”‚   └── settings.py          # Environment settings
β”‚   β”œβ”€β”€ database/                 # Database management
β”‚   β”‚   β”œβ”€β”€ connection.py        # Connection pooling
β”‚   β”‚   β”œβ”€β”€ manager.py           # Database operations
β”‚   β”‚   └── schema.py            # Schema introspection
β”‚   β”œβ”€β”€ models/                   # Pydantic models
β”‚   β”‚   β”œβ”€β”€ database_models.py   # Database schemas
β”‚   β”‚   β”œβ”€β”€ request_models.py    # Request validation
β”‚   β”‚   └── response_models.py   # Response types
β”‚   β”œβ”€β”€ safety/                   # Security & validation
β”‚   β”‚   β”œβ”€β”€ security.py          # Audit & rate limiting
β”‚   β”‚   └── validators.py        # SQL safety checks
β”‚   └── utils/                    # Utilities
β”‚       β”œβ”€β”€ helpers.py           # SQL formatting & analysis
β”‚       └── logging.py           # Logging configuration
β”œβ”€β”€ examples/                     # Usage examples
β”‚   β”œβ”€β”€ basic_usage.py           # Simple queries demo
β”‚   β”œβ”€β”€ advanced_analytics.py   # Analytics examples
β”‚   └── streamlit_app.py         # Web interface
β”œβ”€β”€ scripts/                      # Database initialization
β”‚   β”œβ”€β”€ init_database.py         # SQLite sample data
β”‚   └── init_postgres.sql        # PostgreSQL setup
β”œβ”€β”€ docker/                       # Docker documentation
β”‚   └── README.md                # Deployment guide
β”œβ”€β”€ tests/                        # Test suite
β”œβ”€β”€ main.py                       # Interactive application
β”œβ”€β”€ docker-compose.yml           # Multi-service setup
β”œβ”€β”€ Dockerfile                   # Container definition
β”œβ”€β”€ pyproject.toml               # Dependencies & config
└── .env.example                 # Environment template

πŸ—οΈ Architecture Overview

Core Components

  • EnhancedPydanticSQLAgent: Main agent using PydanticAI framework
  • DatabaseManager: Handles connections and query execution
  • SQLSafetyValidator: Multi-layer security validation
  • Response Models: Type-safe response handling (Success, InvalidRequest, DangerousQuery)

Security Layers

  1. Input Validation: Pydantic model validation
  2. Pattern Matching: Dangerous operation detection
  3. SQL Parsing: Syntax and structure validation
  4. Rate Limiting: Abuse prevention (30 queries/minute)
  5. Audit Logging: Security monitoring and tracking

Data Flow

User Question β†’ Input Validation β†’ Azure OpenAI β†’ SQL Generation β†’ 
Safety Validation β†’ Database Execution β†’ Response Formatting β†’ User

πŸ› οΈ Development

Setup Development Environment

# Install development dependencies
pip install -e .[dev]

# Run tests
pytest --cov=src --cov-report=term-missing

# Code formatting
black src/ tests/ examples/
isort src/ tests/ examples/

# Type checking
mypy src/

# Linting
ruff check src/ tests/ examples/

Sample Data

The project includes comprehensive sample data:

  • 15 customers from various countries
  • 25 products across 5 categories (Electronics, Footwear, Clothing, Furniture, Appliances)
  • 40 orders with realistic statuses and pricing
  • E-commerce relationships with proper foreign keys

🚨 Troubleshooting

Common Issues

Azure OpenAI Connection

# Error: "Azure OpenAI connection failed"
# Solution: Verify your .env configuration
cat .env | grep AZURE_OPENAI

# Check endpoint format (should end with /)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/

Database Connection

# SQLite permission error
chmod 644 ./data/sql_agent.db

# PostgreSQL connection timeout
# Solution: Increase connection timeout in settings
CONNECTION_POOL_TIMEOUT=60

Docker Issues

# Port already in use
docker-compose down  # Stop existing services
lsof -i :8502        # Find process using port
kill -9 <PID>        # Kill the process

# Container build fails
docker system prune -a  # Clean Docker cache
docker-compose build --no-cache

Rate Limiting

# Error: "Rate limit exceeded"
# Solution: Wait 1 minute or adjust rate limit
RATE_LIMIT_QPM=60  # Increase from default 30

Getting Help

  1. Check the logs: docker-compose logs sql-agent
  2. Verify configuration: Ensure .env file has correct values
  3. Test database: Run python scripts/init_database.py
  4. Security metrics: Check python main.py β†’ option 1 β†’ security status

πŸ“š Additional Resources

Documentation

Examples and Tutorials

  • examples/basic_usage.py - Simple query examples
  • examples/advanced_analytics.py - Business intelligence queries
  • examples/streamlit_app.py - Web interface implementation

Security

  • SQL injection prevention patterns
  • Rate limiting implementation
  • Audit logging and monitoring
  • Query complexity analysis

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes and add tests
  4. Run the test suite: pytest
  5. Format code: black . && isort .
  6. Commit changes: git commit -m 'Add amazing feature'
  7. Push to branch: git push origin feature/amazing-feature
  8. Open a Pull Request

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

πŸ“§ Contact

ABBAOUI Achraf


⭐ Star this repository if you find it useful! ⭐

About

Enterprise-grade SQL Agent using PydanticAI and Azure OpenAI for secure natural language to SQL conversion with comprehensive safety validation

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors