Survey of Small Language Models from Penn State, ...
Updated Nov 6, 2025
Deep Fact Validation
Real-time trustworthiness evaluation and safety interception for AI agents. Semantic analysis, safe alternative suggestions, multi-step attack chain detection, and LLM-as-Judge.
Provides web credibility models (Likert scale) to assign a trustworthiness score to a given website.
[USENIX Security 2025] Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval-Augmented Generation Models
A matrix clarifying the definitions of trustworthiness characteristics and the relationships between them across AI/ML standards.
A list of tools and methods for building trustworthy software following TrustOps principles.
This study explores the vulnerabilities of the Pathology Language-Image Pretraining (PLIP) model, a vision-language foundation model for medical AI, under targeted attacks such as the PGD adversarial attack.
In this paper, we introduce SAShA, a new attack strategy that leverages semantic features extracted from a knowledge graph to strengthen attacks against standard CF models. We performed an extensive experimental evaluation to investigate whether SAShA is more effective than baseline attacks against CF models by ta…
Squeeze your model with pressure prompts to see if its behavior leaks.
Trustworthiness Monitoring & Assessment Framework
Codes and Datasets for our WSDM 2022 Paper: "MTLTS: A Multi-Task Framework To Obtain Trustworthy Summaries From Crisis-Related Microblogs"
Proof-Carrying Numbers (PCN): Trust is earned only by proof — the absence of a verification mark communicates uncertainty.
Visualization and embedding of large datasets using various Dimensionality Reduction (DR) techniques such as t-SNE, UMAP, PaCMAP & IVHD. Includes custom metrics to assess DR quality, with a complete explanation and workflow.
CodeGenLink is a Visual Studio Code extension that interacts with GitHub Copilot Chat to generate code, analyze its origin, and identify the associated license.
Independent continuation of a project from AstonHack 2017
Emotion architecture derived from Reddit comments: rater behavior, semantic clusters, and contradiction mapping in GoEmotions.
Website for health data science at KDD 2021
Which LLM do you actually trust? Blind-test 100+ AI models with truth scoring and reasoning failure classification. No branding, no marketing — just data.
Secure and trustworthy mobile AI.