
MXNET R, GPU speed-up much less for regression example than classification #5052

Closed
khalida opened this issue Feb 18, 2017 · 4 comments
khalida commented Feb 18, 2017

In the example code below I run simple classification and regression examples using MXNet from R. In both cases I train first on the CPU and then on the GPU.

In the classification example I get a roughly 32x speed-up when using the GPU; however, the speed-up for the regression example is much smaller (about 3x).

Details are given below; my main questions are:

  1. Is there a reason the speed-up is much less significant for the regression example?
  2. Do other mxnet users get similar/comparable results? (Alternatively, is there a benchmark I can run and compare against reference performance figures?)

The output I get:

Classification CPU time:: 50.104 sec elapsed
Classification GPU time:: 1.54 sec elapsed
Regression CPU time:: 50.39 sec elapsed
Regression GPU time:: 15.762 sec elapsed

Example code:

## Load required packages
require(mlbench)
require(mxnet)
require(tictoc)

## Options:
nHidden <- 100
nRounds <- 200
batchSize <- 32

## Classification VS Regression GPU-speedup example
data(Sonar, package="mlbench")
Sonar[,61] = as.numeric(Sonar[,61])-1
train.ind = c(1:50, 100:150)
train.x = data.matrix(Sonar[train.ind, 1:60])
train.y = Sonar[train.ind, 61]
test.x = data.matrix(Sonar[-train.ind, 1:60])
test.y = Sonar[-train.ind, 61]

tic("Classification CPU time:")
mx.set.seed(0)
model <- mx.mlp(train.x, train.y, hidden_node=nHidden, out_node=2,
                out_activation="softmax", num.round=nRounds,
                array.batch.size=batchSize, learning.rate=0.07, momentum=0.9, 
                eval.metric=mx.metric.accuracy, array.layout="rowmajor",
                device=mx.cpu(), verbose=FALSE
)
toc()

tic("Classification GPU time:")
mx.set.seed(0)
model <- mx.mlp(train.x, train.y, hidden_node=nHidden, out_node=2,
                out_activation="softmax", num.round=nRounds,
                array.batch.size=batchSize, learning.rate=0.07, momentum=0.9, 
                eval.metric=mx.metric.accuracy, array.layout="rowmajor",
                device=mx.gpu(), verbose=FALSE
)
toc()

tic("Regression CPU time:")
mx.set.seed(0)
model <- mx.mlp(train.x, train.y, hidden_node=5*nHidden, out_node=1,
                out_activation="rmse", num.round=10*nRounds,
                array.batch.size=batchSize, learning.rate=0.07, momentum=0.9, 
                eval.metric=mx.metric.rmse, array.layout="rowmajor",
                device=mx.cpu(), verbose=FALSE
)
toc()

tic("Regression GPU time:")
mx.set.seed(0)
model <- mx.mlp(train.x, train.y, hidden_node=5*nHidden, out_node=1,
                out_activation="rmse", num.round=10*nRounds,
                array.batch.size=batchSize, learning.rate=0.07, momentum=0.9, 
                eval.metric=mx.metric.rmse, array.layout="rowmajor",
                device=mx.gpu(), verbose=FALSE
)
toc()

Details of my set-up

Environment info

Operating System: Ubuntu 16.04
R details:

platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          3.2                         
year           2016                        
month          10                          
day            31                          
svn rev        71607                       
language       R                           
version.string R version 3.3.2 (2016-10-31)
nickname       Sincere Pumpkin Patch  
khalida changed the title from "MXNET R, no GPU speed-up for regression example" to "MXNET R, GPU speed-up much less for regression example than classification" on Feb 18, 2017
@matt32106

My results with
CPU: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
GPU: GTX1060

> source('~/.active-rstudio-document')
Classification CPU time:: 1.045 sec elapsed
Classification GPU time:: 1.038 sec elapsed
Regression CPU time:: 11.341 sec elapsed
Regression GPU time:: 6.812 sec elapsed
> source('~/.active-rstudio-document')
Classification CPU time:: 0.988 sec elapsed
Classification GPU time:: 0.818 sec elapsed
Regression CPU time:: 11.484 sec elapsed
Regression GPU time:: 6.915 sec elapsed

@ankkhedia
Contributor

@matt32106 @khalida
With the latest MXNetR version (Windows GPU, CUDA 8.0), I got the following results:

Run1:
Classification CPU time:: 1.88 sec elapsed
Classification GPU time:: 2.03 sec elapsed
Regression CPU time:: 19.58 sec elapsed
Regression GPU time:: 19.45 sec elapsed

Run2:
Classification GPU time:: 1.97 sec elapsed
Classification CPU time:: 1.75 sec elapsed
Regression GPU time:: 19.73 sec elapsed
Regression CPU time:: 20.83 sec elapsed

These results were obtained on a Windows Server EC2 p2.xlarge instance (NVIDIA Tesla K80).

It doesn't seem like we have a benchmark to compare performance against, but the numbers will depend heavily on the GPU/CPU/OS, system load, and other factors such as cold start. I am trying to get results on a few more configurations to analyze the numbers.
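For reference, a minimal sketch of a repeatable timing harness is below. It is not an official MXNet benchmark: it reuses train.x/train.y and the mx.mlp settings from the example at the top of this issue, the helper name time_mlp is arbitrary, and the first run on each device is discarded as a warm-up to reduce cold-start effects.

## Unofficial timing sketch: repeat a fixed mx.mlp fit on one device and
## report the median time, ignoring the first (warm-up) run.
require(mxnet)

time_mlp <- function(device, label, runs = 3) {
  elapsed <- numeric(runs)
  for (i in seq_len(runs)) {
    mx.set.seed(0)
    elapsed[i] <- system.time(
      mx.mlp(train.x, train.y, hidden_node = 100, out_node = 2,
             out_activation = "softmax", num.round = 200,
             array.batch.size = 32, learning.rate = 0.07, momentum = 0.9,
             eval.metric = mx.metric.accuracy, array.layout = "rowmajor",
             device = device, verbose = FALSE)
    )["elapsed"]
  }
  cat(label, "median of timed runs:", median(elapsed[-1]), "sec\n")
}

time_mlp(mx.cpu(), "Classification CPU")
time_mlp(mx.gpu(), "Classification GPU")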


ankkhedia commented Sep 6, 2018

@khalida Please try running the examples again to check the speed-up. In my opinion you are seeing a cold-start effect, which is why the classification CPU time is significantly higher than the other numbers. I ran into a similar issue. When I compared again, I did not find much difference in speed-up between the classification and regression examples when running on a Windows EC2 server.
Classification GPU time:: 1.65 sec elapsed
Classification CPU time:: 1.74 sec elapsed
Regression GPU time:: 16.11 sec elapsed
Regression CPU time:: 16.19 sec elapsed
Hope that answers your query.
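One rough way to rule out cold start with the original script (a sketch reusing train.x/train.y from the example above; the tiny one-round network and the helper name warmup are arbitrary) is to run a short untimed fit on each device before the tic()/toc() pairs, so that one-off initialisation is not charged to the first measurement:

## Untimed warm-up fits: run once per device before the timed sections so that
## one-off costs (library loading, CUDA context creation) are excluded.
warmup <- function(device) {
  invisible(mx.mlp(train.x, train.y, hidden_node = 10, out_node = 2,
                   out_activation = "softmax", num.round = 1,
                   array.batch.size = 32, learning.rate = 0.07,
                   array.layout = "rowmajor", device = device, verbose = FALSE))
}
warmup(mx.cpu())
warmup(mx.gpu())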

@ankkhedia
Contributor

@sandeep-krishnamurthy Could you please close this issue, as the query has been answered?

@khalida Please feel free to reopen if you have more questions or if this was closed in error.
