Universal and Transferable Attacks on Aligned Language Models
CLI tool that uses the Lakera API to perform security checks on LLM inputs
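A minimal sketch of what such a CLI check might look like. The endpoint path, request/response shape, and the `LAKERA_GUARD_API_KEY` variable name are assumptions based on Lakera's public v1 API, not taken from this repo; verify against its README.

```python
# Hedged sketch of a CLI-style prompt check against the Lakera Guard API.
# Assumed (not from the repo): the v1 prompt_injection endpoint, its
# JSON request/response shape, and the LAKERA_GUARD_API_KEY env var.
import os
import sys

import requests

def check_prompt(prompt: str) -> bool:
    """Return True if Lakera flags the prompt as an attack."""
    resp = requests.post(
        "https://api.lakera.ai/v1/prompt_injection",
        json={"input": prompt},
        headers={"Authorization": f"Bearer {os.environ['LAKERA_GUARD_API_KEY']}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["results"][0]["flagged"]

if __name__ == "__main__":
    flagged = check_prompt(sys.argv[1])
    print("flagged" if flagged else "clean")
    sys.exit(1 if flagged else 0)
```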
The Security Toolkit for LLM Interactions (TS version)
Evaluation of Google's instruction-tuned Gemma-2B, an open-source Large Language Model (LLM). This project presents a systematic assessment across a diverse array of domains, aimed at understanding the breadth of the model's knowledge, its reasoning capabilities, and its adherence to ethical guardrails.
Example of running last_layer with FastAPI on Vercel
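A minimal sketch of wrapping last_layer in a FastAPI endpoint, as the repo's title suggests. It assumes last_layer exposes a `scan_prompt()` function returning an object with `passed` and `score` fields (per its README); the `/scan` route name is illustrative.

```python
# Hedged sketch: expose last_layer's prompt scanner over HTTP with FastAPI.
# Assumed: scan_prompt() and its result fields; the route name is made up.
from fastapi import FastAPI
from last_layer import scan_prompt
from pydantic import BaseModel

app = FastAPI()

class ScanRequest(BaseModel):
    prompt: str

@app.post("/scan")
def scan(req: ScanRequest) -> dict:
    result = scan_prompt(req.prompt)
    return {"passed": result.passed, "score": result.score}
```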
Trained Without My Consent (TraWiC): Detecting Code Inclusion In Language Models Trained on Code
User prompt attack detection system
Repository for our paper "Frustratingly Easy Jailbreak of Large Language Models via Output Prefix Attacks". https://www.researchsquare.com/article/rs-4385503/latest
This repo focuses on how to deal with the prompt injection problem faced by LLMs
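As a generic illustration (not this repo's specific method), one common first-line mitigation combines input delimiting with heuristic pattern matching. Heuristics like these are easily bypassed and should be layered with model-based classifiers.

```python
# Generic prompt-injection mitigation sketch, not tied to any repo above:
# wrap untrusted input in delimiters and flag phrases that try to override
# the system prompt.
import re

SUSPICIOUS_PATTERNS = [
    r"ignore (all|any|previous|above) (instructions|rules)",
    r"disregard (the|your) (system prompt|instructions)",
    r"you are now",
]

def looks_like_injection(user_input: str) -> bool:
    """Cheap heuristic screen; a hit warrants closer inspection, not proof."""
    text = user_input.lower()
    return any(re.search(p, text) for p in SUSPICIOUS_PATTERNS)

def build_prompt(system: str, user_input: str) -> str:
    # Delimiters make explicit to the model where untrusted data begins.
    return f"{system}\n\n<user_data>\n{user_input}\n</user_data>"
```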
MINOTAUR: a prompt-security challenge on FlowGPT built around a hardened system prompt. Covers secure prompting, prompting vulnerabilities, system-prompt leak prevention, and LLM security.
AiShields is an open-source Artificial Intelligence Data Input and Output Sanitizer
LLM Security Project with Llama Guard
MER is software that identifies and highlights manipulative communication in text from human conversations and AI-generated responses. It benchmarks language models for manipulative expressions, fostering the development of transparency and safety in AI, and supports victims of manipulation by detecting manipulative patterns in human communication.
Vulnerable LLM Application
Guard your LangChain applications against prompt injection with Lakera ChainGuard.
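A hedged sketch of what guarding a LangChain LLM with ChainGuard might look like. The `LakeraChainGuard`, `get_guarded_llm()`, and `LakeraGuardError` names follow the package's announced API as best I recall; confirm them against the repo's README before relying on this.

```python
# Hedged sketch of Lakera ChainGuard wrapping a LangChain LLM.
# Assumed: the lakera_chainguard class/method names below and the
# langchain_openai import path.
import os

from lakera_chainguard import LakeraChainGuard, LakeraGuardError
from langchain_openai import OpenAI

guard = LakeraChainGuard(api_key=os.environ["LAKERA_GUARD_API_KEY"])
GuardedOpenAI = guard.get_guarded_llm(OpenAI)
llm = GuardedOpenAI()

try:
    print(llm.invoke("Summarize this document for me."))
except LakeraGuardError as exc:
    # Raised when Lakera Guard flags the input as a prompt attack.
    print(f"Blocked: {exc}")
```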
A comprehensive resource hub compiling all LLM papers accepted at the International Conference on Learning Representations (ICLR) 2024.
Security handbook: a knowledge base of enterprise security practices, offensive and defensive techniques, and security research
Whispers in the Machine: Confidentiality in LLM-integrated Systems