HPC Cloud traces for better cloud service reliability
Overview
This dataset contains anonymised system metrics, such as CPU and memory utilisation, together with hard drive metrics from SMART (Self-Monitoring, Analysis, and Reporting Technology). The data was collected from more than 100 cloud servers and is used in our study “A Combined System Metrics Approach to Cloud Service Reliability Using Artificial Intelligence”.
The dataset contains 20 feature columns, which are described in the table below.
No. | Metric name | Description |
---|---|---|
1 | CPU utilisation | Host CPU usage in %. |
2 | Memory utilisation | Memory usage in bytes |
3 | IO utilisation | IO usage in time |
4 | Network overhead | Network usage in bytes |
5 | Bits read | Data read from disk in bytes |
6 | Bits write | Data written to disk in bytes |
7 | Smart 188 | Command timeout |
8 | Smart 197 | Current pending sector count |
9 | Smart 198 | Uncorrectable sector count |
10 | Smart 9 | Power-on hours |
11 | Smart 1 | Read error rate |
12 | Smart 5 | Reallocated sectors count |
13 | Smart 187 | Reported uncorrectable errors |
14 | Smart 7 | Seek error rate |
15 | Smart 3 | Spin-up time |
16 | Smart 4 | Start/stop count |
17 | Smart 194 | Temperature |
18 | Smart 199 | UltraDMA CRC error count |
19 | Time | Timestamp |
20 | id | Anonymised server identifier |
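
When working with the SMART columns, it can help to keep the attribute IDs and their meanings in one lookup. The sketch below copies the twelve SMART attributes from the table into a Python dictionary. The names `SMART_ATTRIBUTES` and `describe_smart` are illustrative and are not part of the dataset or of `anonymised.py`.

```python
# SMART attribute IDs listed in the table, mapped to their meanings.
# This mapping is an illustrative helper, not part of the dataset itself.
SMART_ATTRIBUTES = {
    188: "Command timeout",
    197: "Current pending sector count",
    198: "Uncorrectable sector count",
    9: "Power-on hours",
    1: "Read error rate",
    5: "Reallocated sectors count",
    187: "Reported uncorrectable errors",
    7: "Seek error rate",
    3: "Spin-up time",
    4: "Start/stop count",
    194: "Temperature",
    199: "UltraDMA CRC error count",
}


def describe_smart(attribute_id: int) -> str:
    """Return the meaning of a SMART attribute ID, or a fallback label."""
    return SMART_ATTRIBUTES.get(
        attribute_id, f"Unknown SMART attribute {attribute_id}"
    )
```

For example, `describe_smart(194)` returns the temperature label, while an ID that is not in the table yields the fallback string.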
Dataset link
Directory structure
- Root
  - README.md
  - anonymised.py - The code used for anonymisation.
  - data - The directory that contains the actual data (total 101 files).
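
A minimal sketch of how the files in the data directory might be read is given below. This README does not state the file format, so the sketch assumes that each file is a comma-separated file with a header row and a `.csv` extension. The function name `load_traces` is hypothetical. Adjust the glob pattern and the parser if the files are stored differently.

```python
import csv
from pathlib import Path


def load_traces(data_dir):
    """Read every CSV file in data_dir into a list of row dicts.

    Assumption: each file is comma-separated and starts with a header row.
    Returns a dict that maps each file name to its rows.
    """
    traces = {}
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            # DictReader uses the header row as the keys of each row dict.
            traces[path.name] = list(csv.DictReader(handle))
    return traces
```

Calling `load_traces("data")` from the repository root would then return one entry per file. All values are read as strings, so numeric columns need to be converted before analysis.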