Commit 432baf8

Update download links
1 parent a4dad7f commit 432baf8

2 files changed, with 27 additions and 29 deletions.

.github/workflows/crawl_downloads.yml

Lines changed: 2 additions & 1 deletion
@@ -1,4 +1,4 @@
-name: crawl-zenodo-downloads
+name: Crawl-zenodo-downloads

 on:
   push:
@@ -7,6 +7,7 @@ on:
       - zenodo
   schedule:
     - cron: '0 4 * * *'
+  workflow_dispatch:

 jobs:
   build:

README.md

Lines changed: 25 additions & 28 deletions
@@ -9,45 +9,42 @@

 Loghub maintains a collection of system logs, which are freely accessible for AI-driven log analytics research. Some of the logs are production data released from previous studies, while some others are collected from real systems in our lab environment. Wherever possible, the logs are NOT sanitized, anonymized or modified in any way. These log datasets are freely available for research or academic work.

+🤗 We proudly announce that the loghub datasets have attained total <a href="https://doi.org/10.5281/zenodo.1144100"><img src="https://img.shields.io/endpoint?&url=https://cdn.jsdelivr.net/gh/logpai/loghub@zenodo/downloads.json&labelColor=1AF&color=DDEEFF&style=flat&label=Downloads"></a> by more than [**450 organizations**](https://github.com/logpai/loghub/wiki/Loghub-download-list) from both industry and academia.
+
 **Logs currently available**:

-| Dataset | Description | Labeled | Time Span | #Lines | Raw Size | Contributed By |
+| Dataset | Description | Labeled | Time Span | #Lines | Raw Size | Download |
 | :---------------------------- | :--------| :--------: | --------: | ---------: | ------: | :------: |
 |<tr><th colspan=7 align="center">:open_file_folder: **Distributed systems**</th></tr>|
-| [HDFS_v1](./HDFS#hdfs_v1) | Hadoop distributed file system log | :heavy_check_mark: | 38.7 hours | 11,175,629 | 1.47GB | [Link](https://www.sigops.org/sosp/sosp09/papers/xu-sosp09.pdf) |
-| [HDFS_v2](./HDFS#hdfs_v2) | Hadoop distributed file system log| | N.A. | 71,118,073 | 16.06GB | |
-| [HDFS_v3](./HDFS#hdfs_v3_tracebench) | Instrumented HDFS trace log (TraceBench) | :heavy_check_mark: | N.A. | 14,778,079 | 2.96GB | [Link](http://zbchen.github.io/Papers_files/cloudcom2014.pdf) |
-| [Hadoop](./Hadoop) | Hadoop mapreduce job log | :heavy_check_mark: | N.A. | 394,308 | 48.61MB | [Link](http://ieeexplore.ieee.org/document/7883294/) |
-| [Spark](./Spark) | Spark job log || N.A. | 33,236,604 | 2.75GB | |
-| [Zookeeper](./Zookeeper) | ZooKeeper service log | | 26.7 days | 74,380 | 9.95MB | |
-| [OpenStack](./OpenStack) | OpenStack infrastructure log | :heavy_check_mark: | N.A. | 207,820 | 58.61MB | [Link](https://acmccs.github.io/papers/p1285-duA.pdf) |
+| [HDFS_v1](./HDFS#hdfs_v1) | Hadoop distributed file system log | :heavy_check_mark: | 38.7 hours | 11,175,629 | 1.47GB | [:link:](https://zenodo.org/records/8196385/files/HDFS_v1.zip?download=1) |
+| [HDFS_v2](./HDFS#hdfs_v2) | Hadoop distributed file system log| | N.A. | 71,118,073 | 16.06GB | [:link:](https://zenodo.org/records/8196385/files/HDFS_v2.zip?download=1) |
+| [HDFS_v3](./HDFS#hdfs_v3_tracebench) | Instrumented HDFS trace log (TraceBench) | :heavy_check_mark: | N.A. | 14,778,079 | 2.96GB | [:link:](https://zenodo.org/records/8196385/files/HDFS_v3_TraceBench.zip?download=1) |
+| [Hadoop](./Hadoop) | Hadoop mapreduce job log | :heavy_check_mark: | N.A. | 394,308 | 48.61MB | [:link:](https://zenodo.org/records/8196385/files/Hadoop.zip?download=1) |
+| [Spark](./Spark) | Spark job log || N.A. | 33,236,604 | 2.75GB | [:link:](https://zenodo.org/records/8196385/files/Spark.tar.gz?download=1) |
+| [Zookeeper](./Zookeeper) | ZooKeeper service log | | 26.7 days | 74,380 | 9.95MB | [:link:](https://zenodo.org/records/8196385/files/Zookeeper.tar.gz?download=1) |
+| [OpenStack](./OpenStack) | OpenStack infrastructure log | :heavy_check_mark: | N.A. | 207,820 | 58.61MB | [:link:](https://zenodo.org/records/8196385/files/OpenStack.tar.gz?download=1) |
 |<tr><th colspan=7 align="center">:open_file_folder: **Super computers**</th></tr>|
-| [BGL](./BGL) | Blue Gene/L supercomputer log | :heavy_check_mark: | 214.7 days | 4,747,963 | 708.76MB | [Link](http://ieeexplore.ieee.org/document/4273008/) |
-| [HPC](./HPC) | High performance cluster log | | N.A. | 433,489 | 32.00MB | [Link](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.503.7668&rep=rep1&type=pdf) |
-| [Thunderbird](./Thunderbird) | Thunderbird supercomputer log | :heavy_check_mark: | 244 days | 211,212,192 | 29.60GB | [Link](http://ieeexplore.ieee.org/document/4273008/) |
+| [BGL](./BGL) | Blue Gene/L supercomputer log | :heavy_check_mark: | 214.7 days | 4,747,963 | 708.76MB | [:link:](https://zenodo.org/records/8196385/files/BGL.zip?download=1) |
+| [HPC](./HPC) | High performance cluster log | | N.A. | 433,489 | 32.00MB | [:link:](https://zenodo.org/records/8196385/files/HPC.zip?download=1) |
+| [Thunderbird](./Thunderbird) | Thunderbird supercomputer log | :heavy_check_mark: | 244 days | 211,212,192 | 29.60GB | [:link:](https://zenodo.org/records/8196385/files/Thunderbird.tar.gz?download=1) |
 |<tr><th colspan=7 align="center">:open_file_folder: **Operating systems**</th></tr>|
-| [Windows](./Windows) | Windows event log | | 226.7 days | 114,608,388 | 26.09GB | |
-| [Linux](./Linux) | Linux system log | | 263.9 days | 25,567 | 2.25MB | [Link](http://log-sharing.dreamhosters.com) |
-| [Mac](./Mac) | Mac OS log | | 7.0 days | 117,283 | 16.09MB | |
+| [Windows](./Windows) | Windows event log | | 226.7 days | 114,608,388 | 26.09GB | [:link:](https://zenodo.org/records/8196385/files/Windows.tar.gz?download=1) |
+| [Linux](./Linux) | Linux system log | | 263.9 days | 25,567 | 2.25MB | [:link:](https://zenodo.org/records/8196385/files/Linux.tar.gz?download=1) |
+| [Mac](./Mac) | Mac OS log | | 7.0 days | 117,283 | 16.09MB | [:link:](https://zenodo.org/records/8196385/files/Mac.tar.gz?download=1) |
 |<tr><th colspan=7 align="center">:open_file_folder: **Mobile systems**</th></tr>|
-| [Android_v1](./Android#android_v1) | Android framework log | | N.A. | 1,555,005 | 183.37MB | |
-| [Android_v2](./Android#android_v2) | Android framework log | | N.A. | 30,348,042 | 3.38GB | |
-| [HealthApp](./HealthApp) | Health app log | | 10.5 days | 253,395 | 22.44MB | |
+| [Android_v1](./Android#android_v1) | Android framework log | | N.A. | 1,555,005 | 183.37MB | [:link:](https://zenodo.org/records/8196385/files/Android_v1.zip?download=1) |
+| [Android_v2](./Android#android_v2) | Android framework log | | N.A. | 30,348,042 | 3.38GB | [:link:](https://zenodo.org/records/8196385/files/Android_v2.zip?download=1) |
+| [HealthApp](./HealthApp) | Health app log | | 10.5 days | 253,395 | 22.44MB | [:link:](https://zenodo.org/records/8196385/files/HealthApp.tar.gz?download=1) |
 |<tr><th colspan=7 align="center">:open_file_folder: **Server applications**</th></tr>|
-| [Apache](./Apache) | Apache web server error log | | 263.9 days | 56,481 | 4.90MB | [Link](http://log-sharing.dreamhosters.com) |
-| [OpenSSH](./OpenSSH) | OpenSSH server log | | 28.4 days | 655,146 | 70.02MB | |
+| [Apache](./Apache) | Apache web server error log | | 263.9 days | 56,481 | 4.90MB | [:link:](https://zenodo.org/records/8196385/files/Apache.tar.gz?download=1) |
+| [OpenSSH](./OpenSSH) | OpenSSH server log | | 28.4 days | 655,146 | 70.02MB | [:link:](https://zenodo.org/records/8196385/files/SSH.tar.gz?download=1) |
 |<tr><th colspan=7 align="center">:open_file_folder: **Standalone software**</th></tr>|
-| [Proxifier](./Proxifier) | Proxifier software log | | N.A. | 21,329 | 2.42MB | |
-
-
-### Datasets download
-We host only a small sample (2k lines) of each log dataset on Github. If you are interested in these raw datasets, please download them [via Zenodo](https://doi.org/10.5281/zenodo.1144100).
+| [Proxifier](./Proxifier) | Proxifier software log | | N.A. | 21,329 | 2.42MB | [:link:](https://zenodo.org/records/8196385/files/Proxifier.tar.gz?download=1) |

-:bell: We proudly announce that the loghub datasets have attained total <a href="https://doi.org/10.5281/zenodo.1144100"><img src="https://img.shields.io/endpoint?&url=https://cdn.jsdelivr.net/gh/logpai/loghub@zenodo/downloads.json&labelColor=grey&color=4EB999&style=flat&label=Downloads"></a> by more than [**450 organizations**](https://github.com/logpai/loghub/wiki/Loghub-download-list) from both industry and academia.

-### 🌈 Citation
+### 🔥 Citation

-Please cite the following paper if you use the loghub datasets for research.
+Please cite the following paper if you use the loghub datasets in your research.

 + Jieming Zhu, Shilin He, Pinjia He, Jinyang Liu, Michael R. Lyu. [Loghub: A Large Collection of System Log Datasets for AI-driven Log Analytics](https://arxiv.org/abs/2008.06448). IEEE International Symposium on Software Reliability Engineering (ISSRE), 2023.

@@ -82,5 +79,5 @@ Welcome to join our WeChat group for any question and discussion. Alternatively,

 ![Scan QR code](https://cdn.jsdelivr.net/gh/logpai/logpai.github.io@master/img/wechat.png)

-### License
+### 🌈 License
 The datasets are freely available for research or academic work. For any usage or distribution of the datasets, please refer to the loghub repository URL https://github.com/logpai/loghub and cite [the loghub paper](https://github.com/logpai/loghub/blob/master/CITATION) where applicable.
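
The download links added to the README table in this commit all follow one pattern: `https://zenodo.org/records/8196385/files/<archive>?download=1`. The sketch below shows how such a link could be rebuilt programmatically. It is only an illustration: the `ARCHIVES` mapping and the `download_url` helper are hypothetical names, not part of the repository, and the mapping lists just a few of the archives from the table.

```python
from urllib.parse import quote

# Zenodo record ID as it appears in the links added by this commit.
ZENODO_RECORD_ID = "8196385"

# Hypothetical mapping from dataset name to archive file name,
# copied from a few of the README table rows (not the full list).
ARCHIVES = {
    "HDFS_v1": "HDFS_v1.zip",
    "BGL": "BGL.zip",
    "Spark": "Spark.tar.gz",
    "OpenSSH": "SSH.tar.gz",  # archive name differs from the dataset name
}


def download_url(dataset: str, record_id: str = ZENODO_RECORD_ID) -> str:
    """Build the direct-download URL for a dataset listed in ARCHIVES."""
    filename = ARCHIVES[dataset]  # raises KeyError for unknown datasets
    return (
        f"https://zenodo.org/records/{record_id}/files/"
        f"{quote(filename)}?download=1"
    )


if __name__ == "__main__":
    # Print the links only; nothing is downloaded here.
    for name in ARCHIVES:
        print(name, download_url(name))
```

The example only prints URLs rather than fetching them, since several archives in the table are multiple gigabytes in size (Thunderbird is listed at 29.60GB raw).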
