HPC Cloud traces for better cloud service reliability
This dataset contains anonymised system metrics, such as CPU and memory utilisation, together with hard drive metrics from SMART (Self-Monitoring, Analysis, and Reporting Technology). The data were collected from more than 100 cloud servers and are used in our editor's choice study “A Combined System Metrics Approach to Cloud Service Reliability Using Artificial Intelligence”. If you use this dataset in your study, please cite [1, 2].
[1]. Chhetri, T.R., Dehury, C.K., Lind, A., Srirama, S.N. and Fensel, A., 2022. A Combined System Metrics Approach to Cloud Service Reliability Using Artificial Intelligence. Big Data and Cognitive Computing, 6(1), p.26.
[2]. Dehury, C.K., Chhetri, T.R., Lind, A., Srirama, S.N. and Fensel, A., 2021. HPC Cloud traces for better cloud service reliability.
The dataset contains 20 feature columns, which are described in the table below.
Metric name | Description |
CPU utilization | Host CPU usage in % |
Memory utilization | Memory usage in bytes |
IO utilization | IO usage in time |
Network overhead | Network usage in bytes |
Bits read | Data read from disk in bytes |
Bits write | Data written to disk in bytes |
Smart 188 | Command timeout |
Smart 197 | Current pending sector count |
Smart 198 | Uncorrectable sector count |
Smart 9 | Power-on hours |
Smart 1 | Read error rate |
Smart 5 | Reallocated sectors count |
Smart 187 | Reported uncorrectable errors |
Smart 7 | Seek error rate |
Smart 3 | Spin up time |
Smart 4 | Start/stop count |
Smart 194 | Temperature |
Smart 199 | UltraDMA CRC error count |
Time | Timestamp |
id | Anonymised server identifier |
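
To illustrate how the columns above might be used together, the following minimal Python sketch groups rows by the anonymised server `id` and summarises one system metric and one SMART attribute per server. It rests on several assumptions that are not confirmed by the dataset itself: that the data is available as CSV, that the header names match the metric names in the table exactly, and that the values parse as plain numbers. The inline sample rows and the server identifiers `srv-a` and `srv-b` are made up for illustration only.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. The real file's header names, timestamp format,
# and value formats may differ from what is shown here.
SAMPLE_CSV = """id,Time,CPU utilization,Smart 5,Smart 194
srv-a,2021-01-01 00:00:00,10.0,0,35
srv-a,2021-01-01 00:05:00,30.0,2,37
srv-b,2021-01-01 00:00:00,50.0,0,40
"""


def summarise_by_server(csv_file):
    """Return per-server mean CPU utilisation, maximum Smart 5 value, and row count."""
    cpu_sum = defaultdict(float)
    max_reallocated = defaultdict(int)
    row_count = defaultdict(int)

    for row in csv.DictReader(csv_file):
        server = row["id"]
        cpu_sum[server] += float(row["CPU utilization"])
        # Smart 5 is the reallocated sectors count; track the highest value seen.
        max_reallocated[server] = max(max_reallocated[server], int(row["Smart 5"]))
        row_count[server] += 1

    return {
        server: {
            "mean_cpu": cpu_sum[server] / row_count[server],
            "max_smart_5": max_reallocated[server],
            "rows": row_count[server],
        }
        for server in row_count
    }


summary = summarise_by_server(io.StringIO(SAMPLE_CSV))
for server, stats in sorted(summary.items()):
    print(server, stats)
```

To run the same summary on the downloaded data, pass an open file handle instead of the in-memory sample, for example `summarise_by_server(open(path, newline=""))`, where `path` points to wherever the file was saved, and adjust the column names if the actual headers differ.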