Spelling fixes and CI updates (#31)
* updated to test the latest HDF5 release
* fixed spelling and added spell checker to GH actions
brtnfld authored Sep 3, 2024
1 parent 3e52d5a commit 9c90cc3
Showing 20 changed files with 71 additions and 51 deletions.
6 changes: 6 additions & 0 deletions .codespellrc
@@ -0,0 +1,6 @@
# Ref: https://github.com/codespell-project/codespell#using-a-config-file
[codespell]
skip = .git,.codespellrc
check-hidden = true
#ignore-regex =
#ignore-words-list =
14 changes: 14 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,14 @@
# GitHub Action to automate the identification of common misspellings in text files
# https://github.com/codespell-project/codespell
# https://github.com/codespell-project/actions-codespell
name: codespell
on: [push, pull_request]
permissions:
contents: read
jobs:
codespell:
name: Check for spelling errors
runs-on: ubuntu-latest
steps:
- uses: actions/[email protected]
- uses: codespell-project/actions-codespell@master
8 changes: 4 additions & 4 deletions .github/workflows/hdf5-latest.yml
@@ -14,19 +14,19 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout vol-cache
-uses: actions/checkout@v4.1.1
+uses: actions/checkout@v3
- name: Checkout latest HDF5 release
run: |
wget https://github.com/HDFGroup/hdf5/releases/latest/download/hdf5.tar.gz
tar xzf hdf5.tar.gz
ln -sf hdf5-* hdf5
- name: Checkout Argobots
-uses: actions/checkout@v4.1.1
+uses: actions/checkout@v3
with:
repository: pmodels/argobots
path: abt
- name: Checkout vol-async
-uses: actions/checkout@v4.1.1
+uses: actions/checkout@v3
with:
repository: hpc-io/vol-async
path: vol-async
@@ -84,7 +84,7 @@ jobs:
ctest --output-on-failure
- name: Upload
-uses: actions/upload-artifact@v4
+uses: actions/upload-artifact@v3
with:
name: git.txt
path: ${{ runner.workspace }}/vol-cache/hdf5/git.txt
6 changes: 3 additions & 3 deletions README.md
@@ -50,7 +50,7 @@ cd hdf5
make all install
```

-When running configure, ake sure you **DO NOT** have the option "--disable-shared".
+When running configure, make sure you **DO NOT** have the option "--disable-shared".

### Build Argobots library

@@ -118,7 +118,7 @@ Currently, we use environmental variables to enable and disable the cache functi
### Parallel write

* **write_cache.cpp** is the benchmark code for evaluating the parallel write performance. In this testing case, each MPI rank has a local
-buffer BI to be written into a HDF5 file organized in the following way: [B0|B1|B2|B3]|[B0|B1|B2|B3]|...|[B0|B1|B2|B3]. The repeatition of [B0|B1|B2|B3] is the number of iterations
+buffer BI to be written into a HDF5 file organized in the following way: [B0|B1|B2|B3]|[B0|B1|B2|B3]|...|[B0|B1|B2|B3]. The repetition of [B0|B1|B2|B3] is the number of iterations
* --dim D1 D2: dimension of the 2D array [BI] // this is the local buffer size
* --niter NITER: number of iterations. Notice that the data is accumulately written to the file.
* --scratch PATH: the location of the raw data
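
The layout above implies a simple offset rule. As a quick illustration (this helper is not code from write_cache.cpp; the names and the element-count convention are assumed), rank `rank`'s D1 x D2 block for iteration `iter` starts at:

```cpp
#include <cstddef>

// Sketch only: element offset of one rank's block in a file laid out as
// [B0|B1|...|B(nproc-1)] repeated once per iteration.
std::size_t block_offset_elements(int iter, int rank, int nproc,
                                  std::size_t d1, std::size_t d2) {
  std::size_t block = d1 * d2;  // elements written by one rank per iteration
  return (static_cast<std::size_t>(iter) * nproc + rank) * block;
}
```

Multiplying by the element size gives the byte offset into the file.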
@@ -146,7 +146,7 @@ This will generate a hdf5 file, images.h5, which contains 8192 samples. Each 224

For the read benchmark, it is important to isolate the DRAM caching effect. By default, during the first iteration, the system will cache all the data on the memory (RSS), unless the memory capacity is not big enough to cache all the data. This ends up with a very high bandwidth at second iteration, and it is independent of where the node-local storage are.

-To remove the cache / buffering effect for read benchmarks, one can allocate a big array that is close to the size of the RAM, so that it does not have any extra space to cache the input HDF5 file. This can be achieve by setting ```MEMORY_PER_PROC``` (memory per process in Giga Byte). **However, this might cause the compute node to crash.** The other way is to read dummpy files by seeting ```CACHE_NUM_FILES``` (number of dummpy files to read per process).
+To remove the cache / buffering effect for read benchmarks, one can allocate a big array that is close to the size of the RAM, so that it does not have any extra space to cache the input HDF5 file. This can be achieve by setting ```MEMORY_PER_PROC``` (memory per process in Giga Byte). **However, this might cause the compute node to crash.** The other way is to read dummpy files by setting ```CACHE_NUM_FILES``` (number of dummpy files to read per process).
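
As a rough sketch of the ```MEMORY_PER_PROC``` idea (illustrative only; the actual benchmark may implement it differently), each process allocates and touches a buffer close to its share of RAM so the page cache has nowhere left to keep the input file:

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Sketch only: reserve and touch ~gigabytes GiB so the OS page cache cannot
// hold the input HDF5 file between read iterations.
char *occupy_memory(std::size_t gigabytes) {
  std::size_t nbytes = gigabytes << 30;                // GiB -> bytes
  char *p = static_cast<char *>(std::malloc(nbytes));
  if (p != nullptr)
    std::memset(p, 1, nbytes);                         // fault in every page
  return p;                                            // keep alive for the whole run
}
```

As the paragraph above warns, oversizing this allocation can bring down the compute node.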

## Citation
If you use Cache VOL, please cite the following paper
4 changes: 2 additions & 2 deletions benchmarks/prepare_dataset.cpp
@@ -9,7 +9,7 @@
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

/*
-This file is for testing reading the data set in parallel in data paralle
+This file is for testing reading the data set in parallel in data parallel
training. We assume that the dataset is in a single HDF5 file. Each dataset is
stored in the following way:
@@ -19,7 +19,7 @@
When we read the data, each rank will read a batch of sample randomly or
contiguously from the HDF5 file. Each sample has a unique id associate with
-it. At the begining of epoch, we mannually partition the entire dataset with
+it. At the beginning of epoch, we manually partition the entire dataset with
nproc pieces - where nproc is the number of workers.
*/
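
A minimal sketch of the per-epoch partitioning this comment describes (illustrative only; the helper name and the even-split policy are assumptions, not code from prepare_dataset.cpp):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch only: split num_samples sample ids into nproc near-equal contiguous
// pieces and return the ids owned by `rank`; shuffling the result would give
// the "random" read mode.
std::vector<std::size_t> my_sample_ids(std::size_t num_samples, int nproc, int rank) {
  std::size_t base  = num_samples / nproc;
  std::size_t extra = num_samples % nproc;
  std::size_t begin = static_cast<std::size_t>(rank) * base +
                      std::min<std::size_t>(rank, extra);
  std::size_t count = base + (static_cast<std::size_t>(rank) < extra ? 1 : 0);
  std::vector<std::size_t> ids(count);
  for (std::size_t i = 0; i < count; ++i)
    ids[i] = begin + i;
  return ids;
}
```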
12 changes: 6 additions & 6 deletions benchmarks/read_cache.cpp
@@ -9,7 +9,7 @@
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

/*
-This code is to prototying the idea of incorparating node-local storage
+This code is for prototyping the idea of incorporating node-local storage
into repeatedly read workflow. We assume that the application is reading
the same dataset periodically from the file system. Out idea is to bring
the data to the node-local storage in the first iteration, and read from
@@ -119,17 +119,17 @@ void clear_cache(char *rank) {

using namespace std;

-int msleep(long miliseconds) {
+int msleep(long milliseconds) {
struct timespec req, rem;

-if (miliseconds > 999) {
-req.tv_sec = (int)(miliseconds / 1000); /* Must be Non-Negative */
-req.tv_nsec = (miliseconds - ((long)req.tv_sec * 1000)) *
+if (milliseconds > 999) {
+req.tv_sec = (int)(milliseconds / 1000); /* Must be Non-Negative */
+req.tv_nsec = (milliseconds - ((long)req.tv_sec * 1000)) *
1000000; /* Must be in range of 0 to 999999999 */
} else {
req.tv_sec = 0; /* Must be Non-Negative */
req.tv_nsec =
-miliseconds * 1000000; /* Must be in range of 0 to 999999999 */
+milliseconds * 1000000; /* Must be in range of 0 to 999999999 */
}
return nanosleep(&req, &rem);
}
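
One caveat worth noting about the helper above (an observation, not a change made in this commit): nanosleep() can return early with errno set to EINTR if a signal arrives, in which case rem holds the unslept time. A hedged sketch of a variant that resumes until the full delay has elapsed:

```cpp
#include <cerrno>
#include <ctime>

// Sketch only (not part of the benchmarks): resume sleeping from the
// remainder whenever nanosleep() is interrupted by a signal.
int msleep_uninterrupted(long milliseconds) {
  struct timespec req;
  req.tv_sec  = milliseconds / 1000;                /* whole seconds */
  req.tv_nsec = (milliseconds % 1000) * 1000000L;   /* 0 .. 999999999 */
  while (nanosleep(&req, &req) == -1 && errno == EINTR)
    ;                                               /* req now holds the remainder */
  return 0;
}
```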
10 changes: 5 additions & 5 deletions benchmarks/write.cpp
@@ -44,17 +44,17 @@ void mkdirRecursive(const char *path, mode_t mode) {
mkdir(opath, mode);
}

-int msleep(long miliseconds) {
+int msleep(long milliseconds) {
struct timespec req, rem;

-if (miliseconds > 999) {
-req.tv_sec = (int)(miliseconds / 1000); /* Must be Non-Negative */
-req.tv_nsec = (miliseconds - ((long)req.tv_sec * 1000)) *
+if (milliseconds > 999) {
+req.tv_sec = (int)(milliseconds / 1000); /* Must be Non-Negative */
+req.tv_nsec = (milliseconds - ((long)req.tv_sec * 1000)) *
1000000; /* Must be in range of 0 to 999999999 */
} else {
req.tv_sec = 0; /* Must be Non-Negative */
req.tv_nsec =
-miliseconds * 1000000; /* Must be in range of 0 to 999999999 */
+milliseconds * 1000000; /* Must be in range of 0 to 999999999 */
}
return nanosleep(&req, &rem);
}
10 changes: 5 additions & 5 deletions benchmarks/write_cache.cpp
@@ -31,17 +31,17 @@
#include <unistd.h>
//#include "h5_async_lib.h"

-int msleep(long miliseconds) {
+int msleep(long milliseconds) {
struct timespec req, rem;

-if (miliseconds > 999) {
-req.tv_sec = (int)(miliseconds / 1000); /* Must be Non-Negative */
-req.tv_nsec = (miliseconds - ((long)req.tv_sec * 1000)) *
+if (milliseconds > 999) {
+req.tv_sec = (int)(milliseconds / 1000); /* Must be Non-Negative */
+req.tv_nsec = (milliseconds - ((long)req.tv_sec * 1000)) *
1000000; /* Must be in range of 0 to 999999999 */
} else {
req.tv_sec = 0; /* Must be Non-Negative */
req.tv_nsec =
-miliseconds * 1000000; /* Must be in range of 0 to 999999999 */
+milliseconds * 1000000; /* Must be in range of 0 to 999999999 */
}
return nanosleep(&req, &rem);
}
2 changes: 1 addition & 1 deletion docs/pdf-docs/cache_vol.tex
@@ -159,7 +159,7 @@ \subsection{Cache management policy and APIs}
\item \function{H5Fcache\_create} -- creating a cache in the system’s local storage if it is not yet been created. This is for parallel write.
\item \function{H5Fcache\_remove} -- removing the cache associated with the file in the system's local storage (This will call \function{H5LSremove\_cache}). After \function{H5Fcache\_remove} is called, \function{H5Dwrite} will directly write data to the parallel file system.
\end{itemize}
-\item Dataset cace related functions (for read only)
+\item Dataset cache related functions (for read only)
\begin{itemize}
\item \function{H5Dcache\_create}-- reserving space for the data
\item \function{H5Dcache\_remove} -- clearing the cache on the local storage related to the dataset Besides these, we will also have the following two functions for prefetching / reading data from the cache
2 changes: 1 addition & 1 deletion docs/source/bestpractices.rst
@@ -9,7 +9,7 @@ Write workloads
1) MPI Thread multiple should be enabled for optimal performance;
2) There should be enough compute work after the H5Dwrite calls to overlap with the data migration from the fast storage layer to the parallel file system;
3) The compute work should be inserted in between H5Dwrite and H5Dclose. For iterative checkpointing workloads, one can postpone the dataset close and group close calls after next iteration of compute. The API functions are provided to do this.
-4) If there are multiple H5Dwrite calls issued consecutatively, one should pause the async excution first and then restart the async execution after all the H5Dwrite calls were issued.
+4) If there are multiple H5Dwrite calls issued consecutatively, one should pause the async execution first and then restart the async execution after all the H5Dwrite calls were issued.
5) For check pointing workloads, it is better to open / close the file only once to avoid unnecessary overhead on setting and removing file caches.
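
A minimal serial sketch of the ordering suggested in item 3 (standard HDF5 calls only; the Cache VOL itself would be enabled separately, e.g. through the file access property list or environment variables, and the file and dataset names here are made up):

```cpp
#include <hdf5.h>
#include <unistd.h>
#include <vector>

int main() {
  std::vector<double> buf(1 << 20, 3.14);  // one checkpoint buffer
  hsize_t dims[1] = {buf.size()};

  hid_t file  = H5Fcreate("ckpt.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
  hid_t space = H5Screate_simple(1, dims, NULL);
  hid_t dset  = H5Dcreate(file, "step0", H5T_NATIVE_DOUBLE, space,
                          H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

  H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf.data());

  sleep(1);        // stand-in for the next iteration's compute work

  H5Dclose(dset);  // close only after the compute step, per item 3
  H5Sclose(space);
  H5Fclose(file);
  return 0;
}
```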

An application may have the following HDF5 operations to write check point data:
2 changes: 1 addition & 1 deletion docs/source/cacheapi.rst
@@ -16,7 +16,7 @@ Beside using environment variable setup, the Cache VOL connector provides a set

This enable the finer control of the cache effect for any specific file through the file access property list. The environment variable "HDF5_CACHE_WR" and "HDF5_CACHE_RD" will enable or disable the cache effect for all the files. In our design, the environment variable override the specific setting from the file access property list.

-* Pause/restart all async data migration operations. This is particular useful for the cases when we have multiple writes lauched consecutively. One can pause the async execution before the dataset write calls, and then start the async execution. This allows the main thread to stage all the data from different writes at once and then the I/O thread starts migrating them to the parallel file system, to avoid potential contension effect between the main thread and the I/O thread.
+* Pause/restart all async data migration operations. This is particular useful for the cases when we have multiple writes launched consecutively. One can pause the async execution before the dataset write calls, and then start the async execution. This allows the main thread to stage all the data from different writes at once and then the I/O thread starts migrating them to the parallel file system, to avoid potential contension effect between the main thread and the I/O thread.

.. code-block::
2 changes: 1 addition & 1 deletion docs/source/gettingstarted.rst
@@ -11,7 +11,7 @@ Some configuration parameters used in the instructions:
export HDF5_VOL_DIR=/path/to/vols/install/dir
export ABT_DIR=/path/to/argobots/install/dir
-We suggest the user to put all the VOL dynamic libraries (such as async, cache_ext, daos, ect) into the same folder: HDF5_VOL_DIR to allow stacking multiple connectors.
+We suggest the user to put all the VOL dynamic libraries (such as async, cache_ext, daos, etc) into the same folder: HDF5_VOL_DIR to allow stacking multiple connectors.

Installation
============
2 changes: 1 addition & 1 deletion docs/source/index.rst
@@ -3,7 +3,7 @@
HDF5 Cache VOL Connector
===============================================================

-As the scientific computing enters in the to exascale and big dataera, the amount of data produced by the simulation is significantly increased. Meanwhile, data analytics and artificial intelligence have risen up to become two important pillars in scientific computing, both of which are data intensive workloads. Therefore, being able to store or load data efficiently to and from the storage system becomes increasingly important to scientific computing. On the hardware level, many pre-exascale and exascale systems are designed to be equiped with fast storage layer in between the compute node memory and the parallel file system. Examples include burst buffer NVMes SSDs on Summit, Theta, Polaris and the upcoming Frontier system. It is a challenging problem to effectively incorparate these fast storage layer to improve the parallel I/O performance.
+As the scientific computing enters in the to exascale and big dataera, the amount of data produced by the simulation is significantly increased. Meanwhile, data analytics and artificial intelligence have risen up to become two important pillars in scientific computing, both of which are data intensive workloads. Therefore, being able to store or load data efficiently to and from the storage system becomes increasingly important to scientific computing. On the hardware level, many pre-exascale and exascale systems are designed to be equipped with fast storage layer in between the compute node memory and the parallel file system. Examples include burst buffer NVMes SSDs on Summit, Theta, Polaris and the upcoming Frontier system. It is a challenging problem to effectively incorporate these fast storage layer to improve the parallel I/O performance.

We design a HDF5 Cache Virtual Object Layer (VOL) connector that provides support for reading and writing data directly from / to the fast storage layer, while performing the data migration between the fast storage layer and permanent global parallel file system in the background, to allow hiding majority of the I/O overhead behind the computation of the application. The VOL framework provides an easy-to-use programming interface for the application to adapt.
