Learning from Vulnerabilities Dataset
Dataset Supporting the ESORICS CyberICPS 2020 Workshop Paper "Learning From Vulnerabilities - Categorising, Understanding and Detecting Weaknesses in Industrial Control Systems" by Richard J. Thomas and Tom Chothia.
Referencing this Dataset
We encourage the use of our Dataset by the research community. If you do use it, we ask that you cite the Dataset and credit the University of Birmingham.
The citation and BibTeX entry are given below.
R.J. Thomas and T. Chothia. (2020) "Learning from Vulnerabilities - Categorising, Understanding and
Detecting Weaknesses in Industrial Control Systems" in: Katsikas S. et al. (eds) Computer
Security. CyberICPS 2020. Lecture Notes in Computer Science. Springer, Cham.
@InProceedings{uob-esorics2020,
author="Richard J. Thomas and Tom Chothia",
title="Learning from Vulnerabilities - Categorising, Understanding and Detecting Weaknesses in Industrial
Control Systems",
booktitle="Computer Security",
year="2020",
publisher="Springer International Publishing",
address="Cham"}
The Dataset
Everything you need to know about this Dataset.
Dataset Information:
The 'Learning from Vulnerabilities' Dataset was curated by scraping CISA ICS-CERT Advisories, the NIST NVD CVE feeds, MITRE CVE exports and the MITRE CWE list. The workflow that imports the data held in these sources to form our Dataset is given in our paper.
This Dataset contains all ICS advisories between 2011 and March 2020. Some key statistics are given below:
- 2,566 ICS CVEs
- 1,240 ICS Advisories scraped
- 283 distinct CWE references
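As a rough illustration of how headline figures like these could be recomputed, the sketch below counts distinct cve_id, icsa_id and cwe_id values in rows shaped like a statistics table export. The sample rows and identifiers are invented for illustration, and treating a single statistics table as the source of the published counts is an assumption; the paper describes the actual workflow.

```python
import csv
import io

# Hypothetical sample in the shape of a statistics CSV export.
# The identifiers are made up; a header row is assumed to be present.
SAMPLE_CSV = """icsa_id,cwe_id,cve_id
ICSA-00-001-01,CWE-200,CVE-0000-0001
ICSA-00-001-01,CWE-79,CVE-0000-0002
ICSA-00-002-01,CWE-200,CVE-0000-0003
ICSA-00-002-01,CWE-200,CVE-0000-0003
"""


def key_statistics(csv_text):
    """Count distinct CVEs, advisories and CWE references in the rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        "cves": len({row["cve_id"] for row in rows}),
        "advisories": len({row["icsa_id"] for row in rows}),
        "cwes": len({row["cwe_id"] for row in rows}),
    }


stats = key_statistics(SAMPLE_CSV)
print(stats)
```

Counting distinct values rather than rows matters here because one advisory can list several CVEs and the same CVE may appear in more than one row.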
Data Schema
The Dataset has been broken down into a set of tables for simple referencing and to provide 'single sources of truth'. The schema is given below, with a description of the fields contained in each table. Each schema has matching SQL and CSV files. Where text is given in '[]', it refers to the prefix of the corresponding Dataset file.
- [validation_|merged_]icsa_vendors - Directory of ICS vendors with vulnerabilities (from ICS-CERT vendor lists)
  - vendor_id - Unique identifier for the vendor
  - name - The vendor name
- icsa_cwe - CWE Dictionary mapping ID to name and contextual information (from MITRE CWE)
  - cwe_id - The CWE Identifier (e.g. CWE-200)
  - name - The name of the CWE
  - weakness_abstraction - The type of CWE (e.g. Class, Category)
  - status - The state of the CWE (e.g. Incomplete, Draft)
  - description - A brief description of the CWE
  - extended_description - A fuller outline of the vulnerability, including its effects
  - background_details - Technical details about the CWE (e.g. what should be expected and the reason for a vulnerability existing from this CWE).
  - comments - Birmingham-contributed comments
- cwe_groups - Defining groups of CWEs (e.g. CWE Top 25 and SFP Clusters)
  - category_id - The internal ID of the category
  - cwe_category_name - Simple name for the category (e.g. CWE Top 25, or OWASP Theme)
  - comments - Birmingham-contributed comments
- cwe_group_member - defines mappings of CWE to appropriate groups in cwe_groups
  - cwe_category - The category_id that the CWE should be in
  - cwe_member - The cwe_id to be mapped into this group
- [validation_|merged_]icsa_alert
  - icsa_id - The ICS Advisory number (from ICS Advisory)
  - icsa_url - The URL to the ICS Advisory
  - icsa_release - The date the ICS Advisory was released (from ICS Advisory)
  - icsa_update - The date the ICS Advisory was updated (if set) (from ICS Advisory)
  - icsa_description - Description from the ICS Advisory of the issues identified (from ICS Advisory)
  - icsa_is_update - Flag set if an ICS Advisory was updated since publication
  - icsa_vendor - The vendor affected (from ICS Advisory)
  - icsa_oneliner - A one-liner of the ICS Advisory (used in older ICS Advisories)
- [validation_|merged_]icsa_statistics_paper_dev - table containing all our analysed data (except product type)
  - icsa_id - The ICS Advisory number (from ICS Advisory)
  - cwe_id - The CWE number assigned to the vulnerability in the ICS Advisory
  - cve_id - The stated CVE for the vulnerability
  - cvss_version - The CVSS version (from the NVD CVE) - either 2 or 3
  - cve_description - The CVE description (from the NVD CVE)
  - cvss_base_score - The CVSS base score component (from the NVD CVE)
  - cvss_impact_score - The CVSS impact score component (only in CVSS v3) (from the NVD CVE)
  - cvss_exploitability_score - The CVSS exploitability score (only in CVSS v3)
  - cvss_severity - The CVSS severity (LOW, MEDIUM, HIGH | LOW, MEDIUM, HIGH, CRITICAL) (from the NVD CVE)
  - nvd_cwe_id - The NVD-assigned CWE ID (from the NVD CVE)
  - cvss_vector - The full CVSS vector (from the NVD CVE)
  - accessVector - The stated access vector for the CVE (from the NVD CVE)
  - complexity - The stated attack complexity, taken from the CVSS vector (from the NVD CVE)
  - availabilityImpact - The stated availability impact, taken from the CVSS vector (from the NVD CVE)
  - integrityImpact - The stated integrity impact, taken from the CVSS vector (from the NVD CVE)
  - confidentialityImpact - The stated confidentiality impact, taken from the CVSS vector (from the NVD CVE)
  - allPriv - CVSS v2 'All Privileges Required' (from the NVD CVE)
  - userPriv - CVSS v2 'User Privileges Required' (from the NVD CVE)
  - otherPriv - CVSS v2 'Other Privileges Required' (from the NVD CVE)
  - userInteractionRequired - CVSS v3 'User Interaction Required' (from the NVD CVE)
  - privilegesRequired - CVSS v3 'Privileges Required' (from the NVD CVE)
  - u_sys_created - ICS Advisory creation date (from ICS Advisory)
  - u_sys_updated - ICS Advisory update date (from ICS Advisory)
  - u_sfp_cluster - SFP cluster the CWE belongs to
  - u_old_cat - The old category (SFP Cluster Name) - this generated the TfL-style Map in our Paper
  - u_new_cat - Our new detectable category the CVE belongs to
  - u_other_cat - If u_new_cat is Other, we state what the 'Other' category should be
  - u_product_type - The type of product affected (Birmingham-contributed) - exists in [extended_] tables
- icsa_statistics - table containing initial data imported
  - icsa_id - The ICS Advisory number (from ICS Advisory)
  - cwe_id - The CWE number assigned to the vulnerability in the ICS Advisory
  - cve_id - The stated CVE for the vulnerability
  - cvss_version - The CVSS version (from the NVD CVE) - either 2 or 3
  - cve_description - The CVE description (from the NVD CVE)
  - cvss_base_score - The CVSS base score component (from the NVD CVE)
  - cvss_impact_score - The CVSS impact score component (only in CVSS v3) (from the NVD CVE)
  - cvss_exploitability_score - The CVSS exploitability score (only in CVSS v3)
  - cvss_severity - The CVSS severity (LOW, MEDIUM, HIGH | LOW, MEDIUM, HIGH, CRITICAL) (from the NVD CVE)
  - nvd_cwe_id - The NVD-assigned CWE ID (from the NVD CVE)
  - cvss_vector - The full CVSS vector (from the NVD CVE)
  - accessVector - The stated access vector for the CVE (from the NVD CVE)
  - complexity - The stated attack complexity, taken from the CVSS vector (from the NVD CVE)
  - availabilityImpact - The stated availability impact, taken from the CVSS vector (from the NVD CVE)
  - integrityImpact - The stated integrity impact, taken from the CVSS vector (from the NVD CVE)
  - confidentialityImpact - The stated confidentiality impact, taken from the CVSS vector (from the NVD CVE)
  - allPriv - CVSS v2 'All Privileges Required' (from the NVD CVE)
  - userPriv - CVSS v2 'User Privileges Required' (from the NVD CVE)
  - otherPriv - CVSS v2 'Other Privileges Required' (from the NVD CVE)
  - userInteractionRequired - CVSS v3 'User Interaction Required' (from the NVD CVE)
  - privilegesRequired - CVSS v3 'Privileges Required' (from the NVD CVE)
  - u_sys_created - ICS Advisory creation date (from ICS Advisory)
  - u_sys_updated - ICS Advisory update date (from ICS Advisory)
  - u_sfp_cluster - SFP cluster the CWE belongs to
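To show how the tables in this schema might be queried together, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The column types, the sample rows, the category names and the assumption that cwe_id in the statistics table uses the same format as cwe_member (e.g. CWE-200) are all illustrative guesses, not taken from the released SQL files. The second query resolves the effective category by falling back to u_other_cat when u_new_cat is 'Other'; the exact spelling of that value in the data is also assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed, simplified table definitions; the released SQL files may differ.
conn.executescript("""
CREATE TABLE cwe_groups (category_id INTEGER PRIMARY KEY, cwe_category_name TEXT);
CREATE TABLE cwe_group_member (cwe_category INTEGER, cwe_member TEXT);
CREATE TABLE icsa_statistics_paper_dev (
    icsa_id TEXT, cwe_id TEXT, cve_id TEXT, u_new_cat TEXT, u_other_cat TEXT
);
""")

# Hypothetical rows, invented purely for illustration.
conn.executemany("INSERT INTO cwe_groups VALUES (?, ?)", [(1, "CWE Top 25")])
conn.executemany(
    "INSERT INTO cwe_group_member VALUES (?, ?)",
    [(1, "CWE-79"), (1, "CWE-200")],
)
conn.executemany(
    "INSERT INTO icsa_statistics_paper_dev VALUES (?, ?, ?, ?, ?)",
    [
        ("ICSA-00-001-01", "CWE-79", "CVE-0000-0001", "Example Category", None),
        ("ICSA-00-001-01", "CWE-200", "CVE-0000-0002", "Other", "Example Other"),
        ("ICSA-00-002-01", "CWE-999", "CVE-0000-0003", "Example Category", None),
    ],
)

# Distinct CVEs per CWE group, joined through cwe_group_member.
group_counts = conn.execute("""
    SELECT g.cwe_category_name, COUNT(DISTINCT s.cve_id)
    FROM icsa_statistics_paper_dev AS s
    JOIN cwe_group_member AS m ON m.cwe_member = s.cwe_id
    JOIN cwe_groups AS g ON g.category_id = m.cwe_category
    GROUP BY g.cwe_category_name
""").fetchall()

# Effective category: use u_other_cat only when u_new_cat is 'Other'.
effective = conn.execute("""
    SELECT cve_id,
           CASE WHEN u_new_cat = 'Other' THEN u_other_cat ELSE u_new_cat END
    FROM icsa_statistics_paper_dev
    ORDER BY cve_id
""").fetchall()

print(group_counts)
print(effective)
```

The inner joins drop CVEs whose CWE is not mapped to any group; a LEFT JOIN from the statistics table would keep them with a NULL group name instead.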
Dataset Releases
This Dataset is available as a set of SQL, CSV and JSON files for database servers and integration with other data analysis tools and software.
Base Tables
These tables define common data that do not change (e.g. Vendors, CWEs and mappings). These are used as foreign keys for the original, validation and full datasets.
Schema File | SQL | CSV | JSON |
---|---|---|---|
esorics2020-icsa_vendors | SQL | CSV | JSON |
esorics2020-icsa_cwe | SQL | CSV | JSON |
esorics2020-cwe_groups | SQL | CSV | JSON |
esorics2020-cwe_group_member | SQL | CSV | JSON |
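Because the base tables act as foreign keys for the other datasets, it can be worth checking references after loading the CSV files. The sketch below is one possible check, assuming each CSV has a header row named after the schema fields; the inline CSV text stands in for the real files and its rows are invented. It reports cwe_group_member rows whose cwe_category or cwe_member has no match in cwe_groups or icsa_cwe.

```python
import csv
import io

# Hypothetical CSV contents; headers follow the schema field names.
CWE_GROUPS_CSV = """category_id,cwe_category_name
1,CWE Top 25
2,SFP Clusters
"""
GROUP_MEMBER_CSV = """cwe_category,cwe_member
1,CWE-200
2,CWE-200
3,CWE-79
"""
ICSA_CWE_CSV = """cwe_id,name
CWE-200,Example weakness name
"""


def read_rows(csv_text):
    """Parse CSV text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def dangling_members(groups, cwes, members):
    """Return member rows whose group or CWE reference has no match."""
    group_ids = {group["category_id"] for group in groups}
    cwe_ids = {cwe["cwe_id"] for cwe in cwes}
    return [
        member
        for member in members
        if member["cwe_category"] not in group_ids
        or member["cwe_member"] not in cwe_ids
    ]


dangling = dangling_members(
    read_rows(CWE_GROUPS_CSV),
    read_rows(ICSA_CWE_CSV),
    read_rows(GROUP_MEMBER_CSV),
)
print(dangling)
```

With real files, read_rows could be replaced by opening each CSV with open(path, newline="") and passing the file object to csv.DictReader.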
Original Dataset
The Dataset containing ICS advisories from 2011 to August 2019.
Schema File | SQL | CSV | JSON |
---|---|---|---|
esorics2020-icsa_alert | SQL | CSV | JSON |
esorics2020-icsa_statistics | SQL | CSV | JSON |
esorics2020-icsa_statistics_paper_dev | SQL | CSV | JSON |
esorics2020-extended_icsa_statistics_paper_dev | SQL | CSV | JSON |
Validation Dataset
The Dataset of ICS Advisories and data released between September 2019 and March 2020, which was used to validate the categorisation in our paper.
Schema File | SQL | CSV | JSON |
---|---|---|---|
esorics2020-validation_icsa_vendors | SQL | CSV | JSON |
esorics2020-validation_icsa_alert | SQL | CSV | JSON |
esorics2020-validation_icsa_statistics_paper_dev | SQL | CSV | JSON |
esorics2020-extended_validation_icsa_statistics_paper_dev | SQL | CSV | JSON |
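The split between the Original and Validation Datasets is defined by release date, which the following sketch expresses as a small helper. It assumes that icsa_release is stored as an ISO 8601 date string (YYYY-MM-DD) and that the boundaries fall on whole calendar months; neither is stated on this page, and the function name is hypothetical.

```python
from datetime import date

# Assumed boundaries: the page gives months only, so whole months are used.
ORIGINAL_START = date(2011, 1, 1)
ORIGINAL_END = date(2019, 8, 31)
VALIDATION_END = date(2020, 3, 31)


def release_split(icsa_release):
    """Classify an advisory by release date (ISO 8601 string assumed)."""
    released = date.fromisoformat(icsa_release)
    if ORIGINAL_START <= released <= ORIGINAL_END:
        return "original"
    if ORIGINAL_END < released <= VALIDATION_END:
        return "validation"
    return "out_of_range"


print(release_split("2019-08-31"))
print(release_split("2019-09-01"))
```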
Full Dataset
The full dataset of ICS Advisories and data through to March 2020.
Schema File | SQL | CSV | JSON |
---|---|---|---|
esorics2020-merged_icsa_vendors | SQL | CSV | JSON |
esorics2020-merged_icsa_alert | SQL | CSV | JSON |
esorics2020-merged_icsa_statistics_paper_dev | SQL | CSV | JSON |
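If the merged tables are the union of the Original and Validation Datasets (an assumption; this page does not describe how they were built), a simple sanity check after loading is to compare the sets of icsa_id values. The helper below is a hypothetical sketch and the identifiers in the usage example are invented.

```python
def check_merged(original_ids, validation_ids, merged_ids):
    """Compare advisory IDs across the three datasets using set arithmetic."""
    original = set(original_ids)
    validation = set(validation_ids)
    merged = set(merged_ids)
    combined = original | validation
    return {
        # Advisories that appear in both the original and validation sets.
        "overlap": sorted(original & validation),
        # Advisories expected in the merged set but absent from it.
        "missing_from_merged": sorted(combined - merged),
        # Advisories in the merged set that neither source contains.
        "unexpected_in_merged": sorted(merged - combined),
    }


report = check_merged(
    ["ICSA-00-001-01", "ICSA-00-002-01"],
    ["ICSA-00-003-01"],
    ["ICSA-00-001-01", "ICSA-00-003-01", "ICSA-00-004-01"],
)
print(report)
```

A non-empty entry in the report does not necessarily indicate an error; for example, an advisory updated during the validation period could legitimately appear in both source sets.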
Have Questions?
If you have any questions, please feel free to get in touch with us. Our contact addresses are in the paper.