
🤖 Chat with PDF locally using Ollama + LangChain

A powerful local RAG (Retrieval Augmented Generation) application that lets you chat with your PDF documents using Ollama and LangChain. This project includes multiple interfaces: a modern Next.js web app, a Streamlit interface, and Jupyter notebooks for experimentation.


✨ Features

  • 🔒 100% Local - All processing happens on your machine; no data ever leaves it
  • 📄 Multi-PDF Support - Upload and query across multiple documents
  • 🧠 Multi-Query RAG - Intelligent retrieval with source citations
  • 🎯 Advanced RAG - LangChain-powered pipeline with ChromaDB
  • 🖥️ Two Modern UIs - Next.js (primary) and Streamlit interfaces
  • 🔌 REST API - FastAPI backend for programmatic access
  • 📓 Jupyter Notebooks - For experimentation and learning
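
The multi-query idea can be illustrated with a small sketch: the pipeline asks the LLM to rephrase your question into several variants, retrieves chunks for each, and merges the results. Below is a toy version of the merge step only; `multi_query_retrieve`, `toy`, and the chunk names are all illustrative, not the project's actual API.

```python
def multi_query_retrieve(retrieve, variants, k=3):
    """Merge retrieval results across query variants.

    `retrieve` maps a query string to a ranked list of chunks;
    results are deduplicated while preserving rank order.
    """
    seen, merged = set(), []
    for query in variants:
        for chunk in retrieve(query):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged[:k]

# Toy retriever standing in for the ChromaDB-backed one:
toy = {"what is RAG?": ["c1", "c2"], "define RAG": ["c2", "c3"]}.get
print(multi_query_retrieve(toy, ["what is RAG?", "define RAG"]))
# → ['c1', 'c2', 'c3']
```

Asking the same question several ways makes retrieval less sensitive to the exact wording of a single query.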

๐Ÿ–ผ๏ธ Screenshots

Next.js Interface (Recommended)

Next.js UI Modern chat interface with PDF management, source citations, and reasoning steps

Streamlit Interface

Streamlit UI Classic Streamlit interface with PDF viewer and chat functionality

📺 Video Tutorial

Watch the video

๐Ÿ—๏ธ Project Structure

ollama_pdf_rag/
โ”œโ”€โ”€ src/
โ”‚   โ”œโ”€โ”€ api/                  # FastAPI REST API
โ”‚   โ”‚   โ”œโ”€โ”€ routers/          # API endpoints
โ”‚   โ”‚   โ”œโ”€โ”€ services/         # Business logic
โ”‚   โ”‚   โ””โ”€โ”€ main.py           # API entry point
โ”‚   โ”œโ”€โ”€ app/                  # Streamlit application
โ”‚   โ”‚   โ”œโ”€โ”€ components/       # UI components
โ”‚   โ”‚   โ””โ”€โ”€ main.py           # Streamlit entry point
โ”‚   โ””โ”€โ”€ core/                 # Core RAG functionality
โ”‚       โ”œโ”€โ”€ document.py       # PDF processing
โ”‚       โ”œโ”€โ”€ embeddings.py     # Vector embeddings
โ”‚       โ”œโ”€โ”€ llm.py            # LLM configuration
โ”‚       โ””โ”€โ”€ rag.py            # RAG pipeline
โ”œโ”€โ”€ web-ui/                   # Next.js frontend
โ”‚   โ”œโ”€โ”€ app/                  # Next.js app router
โ”‚   โ”œโ”€โ”€ components/           # React components
โ”‚   โ””โ”€โ”€ lib/                  # Utilities & AI integration
โ”œโ”€โ”€ data/
โ”‚   โ”œโ”€โ”€ pdfs/                 # PDF storage
โ”‚   โ””โ”€โ”€ vectors/              # ChromaDB storage
โ”œโ”€โ”€ notebooks/                # Jupyter notebooks
โ”œโ”€โ”€ tests/                    # Unit tests
โ”œโ”€โ”€ docs/                     # Documentation
โ”œโ”€โ”€ run.py                    # Streamlit runner
โ”œโ”€โ”€ run_api.py                # FastAPI runner
โ””โ”€โ”€ start_all.sh              # Start all services
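
To see how the pieces in src/core fit together, here is a self-contained sketch of the retrieve step. It uses a toy bag-of-words "embedding" and cosine similarity purely for illustration; the real pipeline embeds chunks with nomic-embed-text and stores vectors in ChromaDB, and all function names here are my own.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the project uses nomic-embed-text vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    # Rank stored chunks by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:k]

chunks = [
    "Ollama runs large language models locally.",
    "ChromaDB stores vector embeddings on disk.",
    "Streamlit renders the chat interface.",
]
print(retrieve(chunks, "where are embeddings stored?", k=1))
# → ['ChromaDB stores vector embeddings on disk.']
```

The retrieved chunks are then passed to the LLM as context, which is the "augmented generation" half of RAG.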

🚀 Getting Started

Prerequisites

  1. Install Ollama

    • Visit Ollama's website to download and install
    • Pull required models:
      ollama pull llama3.2  # or your preferred chat model
      ollama pull nomic-embed-text  # for embeddings
  2. Clone Repository

    git clone https://github.com/tonykipkemboi/ollama_pdf_rag.git
    cd ollama_pdf_rag
  3. Set Up Python Environment

    python -m venv venv
    source venv/bin/activate  # On Windows: .\venv\Scripts\activate
    pip install -r requirements.txt
  4. Set Up Next.js Frontend (for the modern UI)

    cd web-ui
    pnpm install
    pnpm db:migrate
    cd ..
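
Before starting the app it can help to confirm the models were actually pulled. Ollama serves a local HTTP API on port 11434 whose /api/tags endpoint lists installed models; the helper names below are my own, and `has_model` accounts for the `model:tag` form the names come back in (e.g. `llama3.2:latest`).

```python
import json
from urllib.request import urlopen

def installed_models(base_url: str = "http://localhost:11434") -> list[str]:
    # Query Ollama's /api/tags endpoint for locally pulled models.
    with urlopen(f"{base_url}/api/tags") as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]

def has_model(names: list[str], wanted: str) -> bool:
    # Names come back as "model:tag", so also match the part before the colon.
    return any(n == wanted or n.split(":")[0] == wanted for n in names)
```

For example, `has_model(installed_models(), "llama3.2")` should be true after the `ollama pull llama3.2` step above (with Ollama running).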

🎮 Running the Application

Option 1: Next.js + FastAPI (Recommended)

Start both services:

# Terminal 1: Start the FastAPI backend
python run_api.py
# Runs on http://localhost:8001

# Terminal 2: Start the Next.js frontend
cd web-ui && pnpm dev
# Runs on http://localhost:3000

Or use the convenience script:

./start_all.sh

Service URLs:

| Service | URL | Description |
| --- | --- | --- |
| Next.js Frontend | http://localhost:3000 | Modern chat interface |
| FastAPI Backend | http://localhost:8001 | REST API |
| API Documentation | http://localhost:8001/docs | Swagger UI |

Option 2: Streamlit Interface

python run.py
# Runs on http://localhost:8501

Option 3: Jupyter Notebook

jupyter notebook

Open notebooks/experiments/updated_rag_notebook.ipynb to experiment with the code.

💡 Usage

Next.js Interface

  1. Upload PDFs - Click the 📎 button or drag & drop files
  2. View PDFs - Uploaded PDFs appear in the sidebar with chunk counts
  3. Select Model - Choose from your locally available Ollama models
  4. Ask Questions - Type your question and get answers with source citations
  5. View Reasoning - See the AI's thinking process and retrieved chunks

Streamlit Interface

  1. Upload PDF - Use the file uploader or toggle "Use sample PDF"
  2. Select Model - Choose from available Ollama models
  3. Ask Questions - Chat with your PDF through the interface
  4. Adjust Display - Use the zoom slider for PDF visibility
  5. Clean Up - Delete collections when switching documents

🔌 API Reference

The FastAPI backend provides these endpoints:

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/v1/pdfs/upload | Upload and process a PDF |
| GET | /api/v1/pdfs | List all uploaded PDFs |
| DELETE | /api/v1/pdfs/{pdf_id} | Delete a PDF |
| POST | /api/v1/query | Query PDFs with RAG |
| GET | /api/v1/models | List available Ollama models |
| GET | /api/v1/health | Health check |

See full documentation at http://localhost:8001/docs when running.
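
A minimal Python client for the query endpoint might look like the sketch below. The request field names (`question`, `model`) are assumptions for illustration; check the Swagger UI at /docs for the real schema before relying on them.

```python
import json
from urllib.request import Request, urlopen

API_BASE = "http://localhost:8001/api/v1"

def build_query_body(question: str, model: str = "llama3.2") -> bytes:
    # Field names are illustrative; verify the actual schema at /docs.
    return json.dumps({"question": question, "model": model}).encode()

def query_pdfs(question: str, model: str = "llama3.2") -> dict:
    # POST the question to the RAG query endpoint and decode the JSON reply.
    req = Request(
        f"{API_BASE}/query",
        data=build_query_body(question, model),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:  # requires the FastAPI backend to be running
        return json.load(resp)
```

With the backend up, `query_pdfs("What does chapter 2 cover?")` would return the decoded JSON response.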

🧪 Testing

# Run all tests
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ --cov=src

Pre-commit Hooks

pip install pre-commit
pre-commit install

โš ๏ธ Troubleshooting

  • Ollama not responding: Ensure Ollama is running (ollama serve)
  • Model not found: Pull models with ollama pull <model-name>
  • No chunks retrieved: Re-upload PDFs to rebuild the vector database
  • Port conflicts: Check if ports 3000, 8001, or 8501 are in use

Common Errors

ONNX DLL Error (Windows)

DLL load failed while importing onnx_copy2py_export

Install Microsoft Visual C++ Redistributable and restart.

CPU-Only Systems

Reduce chunk size if experiencing memory issues:

  • Modify chunk_size to 500-1000 in src/core/document.py
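
The project's splitter lives in src/core/document.py (LangChain text splitters under the hood); this standalone sketch just illustrates the trade-off: a smaller chunk_size lowers peak memory on CPU-only machines at the cost of less context per retrieved chunk, while the overlap keeps sentences from being cut cleanly in half at boundaries.

```python
def chunk_text(text: str, chunk_size: int = 750, overlap: int = 100) -> list[str]:
    # Slide a chunk_size window over the text, stepping by
    # chunk_size - overlap so adjacent chunks share some context.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

For a 2,000-character document, `chunk_text(doc, 750, 100)` yields four chunks, each no longer than 750 characters, with each pair of neighbors sharing a 100-character overlap.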

๐Ÿค Contributing

  • Open issues for bugs or suggestions
  • Submit pull requests
  • Comment on the YouTube video for questions
  • โญ Star the repository if you find it useful!

๐Ÿ“ License

This project is open source and available under the MIT License.


โญ๏ธ Star History

Star History Chart

Built with โค๏ธ by Tony Kipkemboi

Follow me on X | LinkedIn | YouTube | GitHub
