Self-Admitted Security Debt Dataset


I am looking for datasets of code comments (I am particularly interested in Java comments) that contain self-admitted security debt (SASD). SASD is a subset of self-admitted technical debt (SATD), which arises when short-term design expediencies are accepted for productivity gains. In the case of SASD, the short-term design expediencies result in vulnerabilities, and these are denoted by a code comment, e.g. "TODO fix this SQL injection vulnerability".

Any help would be much appreciated.

Maybe you can check bigcode/the-stack?
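A general source-code corpus like that will not come labelled for SASD, so you would probably have to filter the comments yourself. Below is a minimal sketch of one way to do that, assuming you already have each Java file's text as a string. The function name `find_sasd_comments` and both keyword lists are hypothetical and only illustrative; a keyword match is a rough heuristic, not a validated SASD classifier.

```python
import re
from typing import List

# Matches Java line comments (// ...) and block comments (/* ... */).
# Simplification: comment-like text inside string literals is matched too.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)

# Hypothetical keyword lists (assumptions, not from any published study).
DEBT_MARKERS = re.compile(r"\b(TODO|FIXME|HACK|XXX)\b", re.IGNORECASE)
SECURITY_TERMS = re.compile(
    r"\b(vulnerab\w*|injection|xss|csrf|insecure|unsafe|sanitiz\w*|exploit\w*)\b",
    re.IGNORECASE,
)


def find_sasd_comments(java_source: str) -> List[str]:
    """Return comments containing both a debt marker and a security term."""
    hits = []
    for match in COMMENT_RE.finditer(java_source):
        comment = match.group(0)
        if DEBT_MARKERS.search(comment) and SECURITY_TERMS.search(comment):
            hits.append(comment.strip())
    return hits


sample = (
    "class A {\n"
    "  // TODO fix this SQL injection vulnerability\n"
    "  void q() {}\n"
    "  // TODO rename this method\n"
    "  /* FIXME: password hashing is insecure */\n"
    "}\n"
)
for comment in find_sasd_comments(sample):
    print(comment)
```

The candidates this returns would still need manual review, since a comment can mention a security term without admitting any debt.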