ingestr is a CLI tool to copy data between any databases with a single command seamlessly.
Updated Apr 21, 2026 - Python
Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database
From data to vector database effortlessly
AzLogDcrIngestPS - Unleashing the power of Log Ingestion API with Azure LogAnalytics custom table v2, Azure Data Collection Rules and Azure Data Ingestion Pipeline
Google Cloud Storage connector, pre-processor and model for predicting user search intent based on keywords
Google Analytics connector, pre-processor and model for predicting churning users for digital publishers.
Extract, transform, and load unstructured data into Clarifai's AI platform
An Elasticsearch ingest pipeline processor that calls an HTTP(S) endpoint and adds the response back to the ingest document for further processing.
Real-time flight data fetching, cleaning, and analytics API using FastAPI, Pandas, PostgreSQL, and Python.
My experiments with Apache Spark for Humans ⭐
DAUT – Documentation Auto Updater - AI-powered documentation generator for your codebase. MCP-Connector
Enterprise-grade ingestion blueprint for Postgres to Databricks powered by dlt. Features dual-mode operation (Full Load + CDC Load) and robust CI/CD via Databricks Asset Bundles.
A question answering (Q/A) chatbot for insurance documents. Powered by Retrieval Augmented Generation (RAG), LlamaIndex and LangGraph. Inspired by my Upgrad_IIITB PG Course.
Multi-disease segmentation of chest X-rays using YOLO, DenseNet121, and CoAtNet models
Sample Azure Data Factory pipeline for ingesting Data Packages directly from the Download API of the Ordnance Survey Data Hub into Azure Storage.
DataStax or Cassandra Ingest from Relational Databases with StreamSets
Fast Python web crawler for RAG and AI ingestion. Extracts clean Markdown from any site for LLMs and vector stores.
Data stack for WeLearn LPI projects. This pipeline can collect, vectorize and store data from various sources.
A question answering (Q/A) chatbot for insurance documents. Powered by Retrieval Augmented Generation (RAG), LlamaIndex and LangChain. Inspired by my Upgrad_IIITB PG Course.
An active engineering lab: a real-time portfolio tracking my decisions, commits, and projects as they happen.