


In a preprint paper titled "Exploring Security Commits in Python," Shiyu Sun, Shu Wang, Xinda Wang, Yunlong Xing, and Kun Sun from George Mason University, and Elisa Zhang from Dougherty Valley High School, all in the United States, propose a remedy: a database of security commits called PySecDB to make Python code repairs more visible to the community.

PySecDB has three parts: a base dataset, a pilot dataset, and an augmented dataset. The base dataset consists of security commits associated with CVE identifiers. For example, CVE-2021-27213 includes a link to the actual code change in the relevant project's GitHub repo, a fix of CWE-502, Deserialization of Untrusted Data. Together, these form PySecDB, which the academics say represents the first security commit dataset in Python. It contains 1,258 security commits and 2,791 non-security commits culled from more than 351 popular GitHub projects, covering 119 more CWEs.

"Since the CVE records on Python programs are limited, we observe that only 46 percent of them provide the corresponding security commits and more security commits fall in the wild silently, without being indexed by CVE," the group concluded in their paper, which was accepted for the 2023 ICSME conference.
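For readers unfamiliar with CWE-502, the weakness class named above, the sketch below shows the general pattern in Python. It is an illustration only and is not the code from the CVE-2021-27213 fix or from PySecDB; the names `Payload`, `load_unsafe`, and `load_safe` are hypothetical, and the payload calls a harmless builtin where a real attacker would pick something destructive.

```python
import json
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled data."""

    def __reduce__(self):
        # On load, pickle calls the returned callable with these arguments.
        # len() is harmless; an attacker could name a dangerous callable instead.
        return (len, ("abc",))


def load_unsafe(blob: bytes):
    # Vulnerable pattern: unpickling untrusted bytes lets the data choose
    # which callable runs inside the loading process.
    return pickle.loads(blob)


def load_safe(text: str):
    # Safer pattern: JSON is a data-only format and cannot name callables.
    return json.loads(text)


untrusted = pickle.dumps(Payload())
result_unsafe = load_unsafe(untrusted)  # result of the call made during loading
result_safe = load_safe('{"user": "alice"}')  # plain dict, no code executed
```

A security commit for this class of bug typically swaps the unsafe loader for a data-only one, which is the kind of small, easily overlooked change a security commit dataset is meant to surface.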
