This project is a video context search API that allows users to search for specific moments or scenes within a collection of videos using natural language queries. It leverages machine learning embeddings and FAISS (Facebook AI Similarity Search) for efficient similarity search over video frame representations. The API is built with FastAPI and uses Pydantic for data validation.
- Video Embeddings: Each video is processed to extract frame-level embeddings using the CLIP image encoder.
- Indexing: The embeddings are indexed with FAISS for cosine-similarity search.
- Metadata: Video metadata (such as video name and timestamp) is stored in a pickle file for quick lookup.
- API: The FastAPI backend exposes a `/api/search` endpoint that takes a natural language query and returns the most relevant video segments.
- Search: When a query is received, it is embedded, searched against the FAISS index, and the top results are returned with video names and timestamps.
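The search step above can be sketched in plain NumPy (a hypothetical stand-in for the FAISS index; function and field names here are illustrative, not the project's actual API). With L2-normalized vectors, an inner-product search is equivalent to cosine similarity:

```python
import numpy as np

def normalize(v):
    # L2-normalize rows so that inner product == cosine similarity
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def search(query_emb, frame_embs, metadata, top_k=5):
    """Return the top_k most similar frames with their metadata.

    metadata[i] is assumed to be a (video_name, timestamp) pair aligned
    with row i of the embedding matrix (hypothetical layout).
    """
    scores = normalize(frame_embs) @ normalize(query_emb)
    top = np.argsort(scores)[::-1][:top_k]
    return [
        {"video": metadata[i][0], "timestamp": metadata[i][1], "score": float(scores[i])}
        for i in top
    ]
```

FAISS performs the same ranking at scale; this sketch only shows the math the index computes.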
- Python 3.10 or higher
- pip
- (Optional) virtualenv for isolated environments
```
git clone http://github.com/abdulhakkeempa/video-context-search.git
cd video-context-search
```

Create and activate a virtual environment:

```
python -m venv venv
.\venv\Scripts\activate   # Windows
source venv/bin/activate  # Mac/Linux
```

Install the dependencies:

```
pip install -r requirements.txt
pip install git+https://github.com/openai/CLIP.git
```

- Place your videos in `data/videos/`.
- Run the indexing script:

```
python scripts/main.py
```

This generates:

- `video_frames.index` (FAISS index file)
- `video_metadata.pkl` (pickle file with video metadata)
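The metadata pickle produced by the script can be illustrated with a minimal sketch (the entry layout below is an assumption for illustration; the actual fields are defined by the indexing code). Each entry is aligned with a row of the FAISS index, so a search result's row id maps directly to a video name and timestamp:

```python
import pickle

# Hypothetical layout: one (video_name, timestamp_seconds) entry per
# indexed frame, in the same order as the rows of video_frames.index.
metadata = [
    ("beach.mp4", 0.0),
    ("beach.mp4", 2.0),
    ("city.mp4", 0.0),
]

with open("video_metadata.pkl", "wb") as f:
    pickle.dump(metadata, f)

# At search time the API loads the file once and looks entries up by
# the row id returned from the FAISS search.
with open("video_metadata.pkl", "rb") as f:
    loaded = pickle.load(f)

video, ts = loaded[1]  # e.g. row id 1 from a FAISS result
```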
```
uvicorn main:app --reload
```

The API will be available at http://127.0.0.1:8000.
You can use the Swagger UI (served at `/docs`) to interact with the `/api/search` endpoint.
`POST /api/search`

Request body:

```json
{
  "query": "A woman in water",
  "top_k": 5
}
```

Project structure:

- `main.py` - FastAPI app entry point
- `api/routes.py` - API route definitions
- `services/` - Embedding, search, and indexing logic
- `models/schema.py` - Pydantic models
- `data/videos/` - Video files
- `video_frames.index` - FAISS index
- `video_metadata.pkl` - Video metadata
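A response from the endpoint might be consumed as follows. This is a sketch only: the exact response field names depend on the Pydantic models in `models/schema.py` and are assumed here for illustration:

```python
import json

# Hypothetical response body from POST /api/search; actual field names
# are defined by the project's Pydantic response models.
raw = """
{
  "results": [
    {"video": "beach.mp4", "timestamp": 12.5, "score": 0.91},
    {"video": "city.mp4", "timestamp": 3.0, "score": 0.84}
  ]
}
"""

results = json.loads(raw)["results"]
best = results[0]
print(f"{best['video']} @ {best['timestamp']}s (score {best['score']})")
```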