This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
@Feywell if you're still interested, there's a tutorial being worked on at the moment which clearly documents the steps required for distributed training. Check out the PR here, and the tutorial file is mxnet/example/distributed_training/README.md.
@yzhliu can you please close the issue? @Feywell if you have issues setting up distributed training, please create a post on https://discuss.mxnet.io Thanks!
Description
I want to train a CNN on multiple machines.
Hardware: cluster (Tesla K20)
System: Red Hat Enterprise Linux Server release 6.4 (Santiago)
When I try to run example/image-classification/train_cifar10.py using the command:
I checked my Python packages, and argparse is installed.
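One thing that may be worth ruling out (this is an assumption, not something confirmed in this thread) is that the Python interpreter launched on the other machines is not the same one that was checked locally. A minimal sketch of such a check follows; `module_report` is a hypothetical helper name, and the script only uses the standard library so it can run on older interpreters too. Running it on every machine in the cluster and comparing the output would show whether argparse is visible to each interpreter.

```python
import sys


def module_report(name):
    """Report which interpreter is running and whether it can import `name`.

    Hypothetical helper for illustration; not part of MXNet.
    """
    try:
        module = __import__(name)
        available = True
        location = getattr(module, "__file__", None)
    except ImportError:
        available = False
        location = None
    return {
        "interpreter": sys.executable,
        "version": sys.version.split()[0],
        "module": name,
        "available": available,
        "location": location,
    }


if __name__ == "__main__":
    # Print one "key: value" line per field so outputs from
    # different machines are easy to compare side by side.
    report = module_report("argparse")
    for key in sorted(report):
        print("%s: %s" % (key, report[key]))
```

If the `interpreter` or `version` lines differ between machines, the remote processes are probably picking up a different Python than the one where argparse was verified.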
Environment info (Required)
Package used (Python/R/Scala/Julia):
(I'm using Python)
Build info (Required if built from source)
Compiler: GCC (5.4.0)
Build config:
command:
I don't know what is causing this error. How can I fix it?
Thank you!