    Repositories list

    • TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optim…
      Python
      Other
      2.3k001 Updated Apr 23, 2026
    • docs

      Public
      DeepInfra platform documentation
      MDX
      MIT License
      0000 Updated Apr 21, 2026
    • dynamo

      Public
      A Datacenter Scale Distributed Inference Serving Framework
      Rust
      Other
      1k002 Updated Apr 17, 2026
    • hub-docs

      Public
      Docs of the Hugging Face Hub
      Handlebars
      Apache License 2.0
      438000 Updated Apr 14, 2026
    • vllm-omni

      Public
      A framework for efficient model inference with omni-modality models
      Python
      Apache License 2.0
      819000 Updated Mar 6, 2026
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      16k001 Updated Feb 27, 2026
    • Use Hugging Face with JavaScript
      TypeScript
      MIT License
      682000 Updated Feb 23, 2026
    • tiktoken

      Public
      tiktoken is a fast BPE tokeniser for use with OpenAI's models.
      Python
      MIT License
      1.4k000 Updated Feb 8, 2026
    • A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models …
      Python
      Apache License 2.0
      366000 Updated Jan 12, 2026
    • cookbooks

      Public
      A collection of cookbooks, tutorials, and examples for using AI models on DeepInfra. This repository provides practical guides, performance benchmarks, and prod…
      Jupyter Notebook
      0000 Updated Dec 15, 2025
    • openbench

      Public
      Provider-agnostic, open-source evaluation infrastructure for language models
      Python
      MIT License
      98000 Updated Nov 13, 2025
    • sglang

      Public
      SGLang is a fast serving framework for large language models and vision language models.
      Python
      Apache License 2.0
      5.5k000 Updated Oct 14, 2025
    • SpecForge

      Public
      Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
      Python
      MIT License
      213000 Updated Oct 8, 2025
    • Roo-Code

      Public
      Roo Code gives you a whole dev team of AI agents in your code editor.
      TypeScript
      Apache License 2.0
      3.1k000 Updated Sep 4, 2025
    • kilocode

      Public
      Open Source AI coding assistant for planning, building, and fixing code. We're a superset of Roo, Cline, and our own features. Follow us: kilocode.ai/social
      TypeScript
      Apache License 2.0
      2.4k000 Updated Aug 28, 2025
    • ocr-tools

      Public
      Python
      2510 Updated Aug 2, 2025
    • 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      33k000 Updated Jul 29, 2025
    • olmocr

      Public
      Toolkit for linearizing PDFs for LLM datasets/training
      Python
      Apache License 2.0
      1.4k000 Updated Jul 9, 2025
    • Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching
      Python
      Apache License 2.0
      788000 Updated May 28, 2025
    • The Triton TensorRT-LLM Backend
      Python
      Apache License 2.0
      138000 Updated May 8, 2025
    • Sample Next.js AI chat app using Deep Infra inference and the Vercel AI SDK
      TypeScript
      2100 Updated Mar 17, 2025
    • cutlass

      Public
      CUDA Templates for Linear Algebra Subroutines
      C++
      Other
      1.8k000 Updated Mar 15, 2025
    • Fast and memory-efficient exact attention
      Python
      BSD 3-Clause "New" or "Revised" License
      2.6k000 Updated Feb 20, 2025
    • Zonos

      Public
      Python
      Apache License 2.0
      815000 Updated Feb 12, 2025
    • Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
      Python
      MIT License
      300000 Updated Oct 21, 2024
    • Model components of the Llama Stack APIs
      Python
      MIT License
      1.3k000 Updated Oct 10, 2024
    • Secure your NGINX locations with JWT
      Shell
      MIT License
      134000 Updated Jun 17, 2024
    • deepctl

      Public
      Command line tool for Deep Infra cloud ML inference service
      Rust
      Apache License 2.0
      33420 Updated Jun 10, 2024
    • 🦜🔗 Build context-aware reasoning applications 🦜🔗
      TypeScript
      MIT License
      3.1k000 Updated May 31, 2024
    • Official TypeScript wrapper for DeepInfra Inference API
      TypeScript
      MIT License
      32062 Updated May 13, 2024