Fix cori outdir and add architecture argument #26
rly
commented
Feb 11, 2021
- Expand $CSCRATCH env variable on cori
- Allow setting of cori architecture
With these changes, I can run:
deep-index train-job --cori -P m3513 -a haswell -t 00:15:00 --debug ar122_r95.genomic.medium.deep_index.input.h5 test.sh
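For readers skimming the thread, here is a rough sketch of the two changes in this PR: expanding $CSCRATCH in the output directory and accepting an architecture argument. It is not the code from the diff. The long option names, the default path, and the helper names are assumptions made for illustration; only the -a haswell flag and the $CSCRATCH variable come from the discussion above.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser; only the -a flag mirrors the command shown above."""
    parser = argparse.ArgumentParser(prog="train-job-sketch")
    # Cori has Haswell and KNL nodes; restrict the choice to those two.
    parser.add_argument("-a", "--arch", choices=["haswell", "knl"],
                        default="haswell",
                        help="node architecture passed to '#SBATCH -C'")
    # Assumed default: a path that still contains the unexpanded variable.
    parser.add_argument("--outdir", default="$CSCRATCH/exabiome/deep-index",
                        help="output directory; env variables are expanded")
    return parser


def resolve_outdir(path):
    """Expand $VARS (and ~) so the job script gets an absolute path."""
    return os.path.expanduser(os.path.expandvars(path))


def constraint_line(arch):
    """Render the Slurm constraint directive for the chosen architecture."""
    return f"#SBATCH -C {arch}"


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(resolve_outdir(args.outdir))
    print(constraint_line(args.arch))
```

Note that os.path.expandvars leaves unknown variables untouched, so if CSCRATCH is not set the literal string $CSCRATCH stays in the path.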
Never mind. Solved that.
Now I am getting an h5py error:
Any ideas? It seems like a configuration error. I tried adding [...]. Here is my script:
#!/bin/bash
#SBATCH -q debug
#SBATCH -A m3513
#SBATCH -t 00:15:00
#SBATCH -n 1
#SBATCH -o /global/cscratch1/sd/rly/exabiome/deep-index/train/datasets/default/chunks_W4000_S4000/roznet/M/n1_g4_A1_b64_r0.001_o256/train.%j.lsf_log
#SBATCH -e /global/cscratch1/sd/rly/exabiome/deep-index/train/datasets/default/chunks_W4000_S4000/roznet/M/n1_g4_A1_b64_r0.001_o256/train.%j.lsf_log
#SBATCH -C haswell
conda activate /global/cscratch1/sd/rly/env/deeptaxon
module load cray-hdf5
JOB="$SLURM_JOB_ID"
OPTIONS="-d -M -b 64 -g 4 -n 1 -o 256 -W 4000 -S 4000 -r 0.001 -A 1 -e 10 -s 1101233524 -E n1_g4_A1_b64_r0.001_o256"
OUTDIR="/global/cscratch1/sd/rly/exabiome/deep-index/train/datasets/default/chunks_W4000_S4000/roznet/M/n1_g4_A1_b64_r0.001_o256/train.$JOB"
INPUT="/global/u1/r/rly/ar122_r95.genomic.medium.deep_index.input.h5"
LOG="$OUTDIR.log"
CMD="deep-index train --slurm $OPTIONS roznet $INPUT $OUTDIR"
cp $0 $OUTDIR.sh
mkdir -p $OUTDIR
srun $CMD > $LOG 2>&1
I'm not sure what that is. Can you try moving the input file to CSCRATCH?