Need Help Finding Appropriate Dataset(s)

AIdrive · August 16, 2023, 12:26pm

Hi all,

I have recently started working on an AI app to detect vulnerabilities in C/C++ source code. My understanding is that, for training and validation, I need to feed the model both safe and unsafe code examples. The problem is that I cannot find a dataset that clearly separates the two. The ones I have found either contain nothing but unsafe code examples, or consist of a single file (pkl or json) in which safe and unsafe examples are merged together.

I was hoping to find a dataset laid out with, say, one directory (or file) that contains only safe examples and another that contains only unsafe ones.
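Failing that, I suppose a merged file could be split by its label after the fact. Here is a minimal sketch of what I have in mind, assuming each record is a JSON object with a `code` string and a `label` field where 1 marks unsafe and 0 marks safe. Those field names, the `split_by_label` helper, and the output file naming are my own assumptions for illustration; the datasets I have looked at each use their own schema, so they would need adapting.

```python
import json
import tempfile
from pathlib import Path


def split_by_label(records, out_dir, code_key="code", label_key="label"):
    """Write each record's code into out_dir/safe or out_dir/unsafe.

    Assumes a label of 1 (or True) means unsafe and 0 means safe.
    The key names are hypothetical and must match the actual dataset.
    """
    out_dir = Path(out_dir)
    counts = {"safe": 0, "unsafe": 0}
    for i, rec in enumerate(records):
        bucket = "unsafe" if int(rec[label_key]) == 1 else "safe"
        target = out_dir / bucket
        target.mkdir(parents=True, exist_ok=True)
        # One source file per sample, numbered by its position in the merged file
        (target / f"sample_{i:06d}.c").write_text(rec[code_key], encoding="utf-8")
        counts[bucket] += 1
    return counts


if __name__ == "__main__":
    # Tiny in-memory stand-in for a merged JSON file (made-up samples)
    merged = json.loads(
        '[{"code": "int main(void) { return 0; }", "label": 0},'
        ' {"code": "void f(char *s) { char b[4]; strcpy(b, s); }", "label": 1}]'
    )
    with tempfile.TemporaryDirectory() as tmp:
        print(split_by_label(merged, tmp))
```

For a real merged file, the list passed in would come from `json.load` on the opened file (or the equivalent for a pkl), provided it holds a flat list of records like the one above.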

Any help here would be appreciated.

Thanks.
