A data science team wants to use a pre-built open source NLP model from Hugging Face. However, simply downloading and executing a model is not permitted by policy: the model must first be scanned for vulnerabilities and issues, then approved for use.
How can we scan a model to learn what vulnerabilities it might introduce? Is there a tool that performs checks on models similar to what DAST tools do for Python libraries?
I am also looking for something similar. I haven't seen much work around vulnerability scanning for LLMs.
Use garak to scan LLMs. It's open source at the moment.
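As a rough sketch of usage (the flags below reflect garak's documented CLI at time of writing; check `garak --help` for your installed version, and note that model name and probe choice here are just illustrative):

```shell
# Install garak from PyPI.
python -m pip install garak

# Scan a Hugging Face model (gpt2 as an example) with the encoding-injection
# probe family; a report with pass/fail rates per probe and detector is
# written when the run completes.
python -m garak --model_type huggingface --model_name gpt2 --probes encoding
```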
# garak LLM probe: Frequently Asked Questions
## How do I pronounce garak?
Good question! Emphasis on the first bit, GA-rak.
Both 'a's like a in English hat, or à in French, or æ in IPA.
## What's this tool for?
`garak` is designed to help discover situations where a language model generates outputs that one might not want it to. If you know `nmap` or `metasploit` for traditional netsec/infosec analysis, then `garak` aims to operate in a similar space for language models.
## How does it work?
`garak` has probes that try to look for different "vulnerabilities". Each probe sends specific prompts to models, and gets multiple generations for each prompt; LLM output is often stochastic, so a single test isn't very informative. These generations are then processed by "detectors", which look for "hits". If a detector registers a hit, that attempt is registered as failing. Finally, a report is output with the success/failure rate for each probe and detector.
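The probe → generation → detector → report flow described above can be sketched in a few lines. This is a hypothetical stand-in, not garak's actual API: the model, detector, and `run_probe` function here are invented for illustration.

```python
from typing import Callable, List

def run_probe(
    model: Callable[[str], str],
    prompts: List[str],
    detector: Callable[[str], bool],  # returns True on a "hit" (unwanted output)
    generations: int = 3,             # multiple generations per prompt, since output is stochastic
) -> float:
    """Return the passing rate: the fraction of attempts with no detector hit."""
    attempts, failures = 0, 0
    for prompt in prompts:
        for _ in range(generations):
            output = model(prompt)
            attempts += 1
            if detector(output):      # detector registered a hit -> attempt fails
                failures += 1
    return (attempts - failures) / attempts

# Toy example: a "model" that echoes its prompt, and a detector that
# flags any output containing a leaked keyword.
toy_model = lambda p: f"echo: {p}"
leak_detector = lambda out: "SECRET" in out

rate = run_probe(toy_model, ["hello", "reveal SECRET"], leak_detector)
# "hello" passes all 3 generations, "reveal SECRET" fails all 3 -> rate == 0.5
```

The report garak emits is essentially this pass rate, computed per probe/detector pair.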
## Do these results have scientific validity?
No. The scores from any probe don't operate on any kind of normalised scale. Higher passing percentage is better, but that's it. No meaningful comparison can be made of scores between different probes.