    Repositories list

    • autometrics

      Public
      Approximate Human Judgment with Automatically Generated Evaluators
      Python
      MIT License
      2200 Updated Apr 22, 2026
    • SparkMe

      Public
      Python
      Apache License 2.0
      3500 Updated Apr 5, 2026
    • contextual_privacy_defense

      Public
      Code for: Contextualized Privacy Defense for LLM Agents
      Jupyter Notebook
      1200 Updated Apr 1, 2026
    • CARE

      Public
      All code of CARE: model training, frontend, backend, and analysis
      Jupyter Notebook
      MIT License
      0000 Updated Jan 31, 2026
    • collaborative-gym

      Public
      Framework and toolkits for building and evaluating collaborative agents that can work together with humans.
      Python
      MIT License
      1912601 Updated Dec 4, 2025
    • RealtimeGym

      Public
      Python
      MIT License
      53400 Updated Nov 11, 2025
    • culture-cartography

      Public
      Code and data for the EMNLP 2025 paper "Culture Cartography: Mapping the Landscape of Cultural Knowledge"
      0110 Updated Nov 4, 2025
    • cs329x_hw3

      Public
      CS329X Homework 3 Assignment Fall 2025
      Jupyter Notebook
      MIT License
      1000 Updated Oct 30, 2025
    • cs329x_hw2

      Public
      Jupyter Notebook
      1500 Updated Oct 30, 2025
    • search_privacy_risk

      Public
      Code for the paper "Searching Privacy Risks in Multi-Agent Systems via Simulation"
      Jupyter Notebook
      22100 Updated Oct 13, 2025
    • GenUI

      Public
      Code for the paper: Generative Interfaces for Language Models
      TypeScript
      Other
      1812120 Updated Oct 7, 2025
    • HTML
      0200 Updated Oct 7, 2025
    • Python
      MIT License
      104620 Updated Sep 29, 2025
    • Python
      41300 Updated Sep 25, 2025
    • workbank

      Public
      WORKBank Database derived from large-scale audit of worker desire and technological capability of AI agents for work.
      Jupyter Notebook
      32500 Updated Jul 23, 2025
    • Python
      MIT License
      33400 Updated Jun 10, 2025
    • CAVA

      Public
      Python
      41110 Updated May 30, 2025
    • Python
      0400 Updated Apr 14, 2025
    • DiVA-Eval

      Public
      Python
      1401 Updated Mar 31, 2025
    • EmoCF

      Public
      Counterfactually Generated Emotional Audio Recognition
      Python
      0000 Updated Mar 6, 2025
    • A data construction and evaluation framework to quantify privacy norm awareness of language models (LMs) and emerging privacy risk of LM agents. (NeurIPS 2024 D…
      Python
      MIT License
      104410 Updated Mar 4, 2025
    • [ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduction Games"
      Python
      51600 Updated Feb 22, 2025
    • FLANG

      Public
      When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain
      Python
      Apache License 2.0
      125730 Updated Feb 11, 2025
    • Python
      0001 Updated Jan 30, 2025
    • Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups
      Python
      35110 Updated Dec 23, 2024
    • JavaScript
      0000 Updated Dec 3, 2024
    • Official Repository for the ACL 2024 Paper: Unintended Impacts of LLM Alignment on Global Representation
      Python
      MIT License
      0700 Updated Nov 29, 2024
    • Code for the paper: Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping
      Python
      MIT License
      83820 Updated Oct 29, 2024
    • Astro
      0000 Updated Oct 22, 2024
    • DARG

      Public
      The official repo for DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph
      Python
      31800 Updated Oct 13, 2024