
Adapt to tensorforce#memory #6

Closed
lefnire opened this issue Jan 28, 2018 · 5 comments


@lefnire
Owner

lefnire commented Jan 28, 2018

TensorForce's memory branch supposedly fixes a major flaw in the PPO implementation. It's a pretty wild-west branch for now, and I've started my own branch to follow it. I'm gonna put that on hold until they merge their branch into master and cut a release. We'll need to adapt to the new hyperparameters introduced (adding them to our hypersearch framework).

@lefnire
Owner Author

lefnire commented Feb 3, 2018

BTW @TalhaAsmal, I'm doing my recent work on the #memory branch, which tracks TensorForce's #memory branch. In case you're diving in deep, you might benefit from their recent updates.

@lefnire
Owner Author

lefnire commented Feb 8, 2018

@TalhaAsmal I did a heavy-handed force-push to #memory. I was in flux using that as a personal dev branch (I'm now using #tmp as my dirty branch). I won't be force-pushing to #memory anymore, so you can work from it safely - sorry for that.

You'll want to rebase your work onto it (easiest is probably: (1) create a new branch from #memory, (2) cherry-pick your own branch's commits one by one). After following some conversation on TensorForce's Gitter channel, it looks like 0.3.5.1 is pretty much non-functional (you won't get any models to converge) and that the commits in tensorforce#memory fix many of those problems. That is to say, our tforce_btc_trader#master is non-functional.

One big, notable recent change: tensorforce#memory switches Policy Gradient models' batching from timestep-batching to episode-batching. Timestep-batching gave us control over explicit batch sizes (1024 means 1024 timesteps, aka 1024 rows/datapoints sent to TensorFlow). Episodes, on the other hand, are however many timesteps the agent stays alive (doesn't dip below 0 value/cash, doesn't HODL too long): could be 10k timesteps, could be 100k. It turned out this buffer would fill up pretty fast and crash my GPU (a 1080 Ti, which ain't nothing to shake a stick at).

To make episode-batching work, I had to reduce a lot of dimensional parameters: batch_size, max steps per episode (from 20k to 6k, with punish_repeats reduced correspondingly), and, importantly, the dimensionality of each timestep's state (aka the features). I added an AutoEncoder (roughly sketched after the list below) to reduce a timestep's dimensions from ~22 (plus/minus a bunch depending on arbitrage & indicators) to 6. Pretty damn destructive. I've got some ideas to improve this situation going forward:

  1. Improve the autoencoder's performance. It currently has a ~2% MSE; if we can whittle that down, maybe it's fine to keep doing what we're doing.
  2. Switch from PPO to DDPG, which I believe uses timestep-batching instead of episode-batching. It's a new model in tensorforce#memory, and actually one I've seen used for trading. An important bit is to stick with continuous actions, so it can buy/sell arbitrary amounts; DQNs, for example, require discrete actions.
  3. Move to a different framework (see Try ray/RLlib #11)
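
For anyone curious what that state-compression step looks like, here's a minimal sketch, assuming tf.keras; the layer sizes, feature count, and random training data are illustrative placeholders, not the project's actual code.

```python
# Minimal sketch (illustrative only): compress a ~22-dim timestep state to 6 dims
# with a small autoencoder, then feed the encoded states to the agent.
import numpy as np
import tensorflow as tf

n_features = 22  # raw per-timestep features (varies with arbitrage & indicators)
n_latent = 6     # compressed state size fed to the agent

inputs = tf.keras.layers.Input(shape=(n_features,))
x = tf.keras.layers.Dense(12, activation='tanh')(inputs)
encoded = tf.keras.layers.Dense(n_latent, activation='tanh')(x)
x = tf.keras.layers.Dense(12, activation='tanh')(encoded)
decoded = tf.keras.layers.Dense(n_features, activation='linear')(x)

autoencoder = tf.keras.models.Model(inputs, decoded)  # trained to reconstruct its input
encoder = tf.keras.models.Model(inputs, encoded)      # used at runtime to compress states

autoencoder.compile(optimizer='adam', loss='mse')

# states: (n_timesteps, n_features) array of already-scaled features
states = np.random.randn(10000, n_features).astype(np.float32)
autoencoder.fit(states, states, epochs=10, batch_size=256, verbose=0)

compressed = encoder.predict(states)  # shape (n_timesteps, 6)
print(compressed.shape)
```

The reconstruction MSE on held-out data is the "~2% MSE" figure mentioned in (1); the lower it gets, the less information the compression throws away.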

lefnire changed the title from "Adapt to tensorforce#memory when merged" to "Adapt to tensorforce#memory" on Feb 8, 2018
@TalhaAsmal
Contributor

I've been testing out the memory branch of this and tensorforce since last night, so not much to say yet.

I've seen the following warning message a couple of times on the first run of hypersearch.py, but not after:

2018-02-10 02:51:38.030610: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 516.70MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.

It's also taking REALLY long on my GTX 1070: only 3 runs in about 12 hours, but I'll leave it running until at least Monday to see how it performs.

Also, I'm running with a custom data set (pulled from the Bitfinex API; only OHLCV data) with 133,986 rows (just about 3 months of data). Is this enough, or should I pump more data into it?

@lefnire
Owner Author

lefnire commented Feb 10, 2018

@TalhaAsmal do a pull; I added some stuff (changed some DB structure, mind). Also pull the latest lefnire/tensorforce#memory (or reinforceio/tensorforce#memory; they're 1-1 currently). Note, you won't be able to do gpu-split right now; they've temporarily disabled a single runner's session_config. That actually might fix your memory issue! I have a commit manually splitting the GPU in two for my 1080 Ti, but you'd likely want to avoid that given the issue.

The issue being RAM-maxing: both the warning and the amount of time it takes. Indeed, the way tensorforce#memory handles batches with PG models is as episode-batches rather than step-batches, which massively increases the load vs #master (hence my big rant above), so I'm not surprised you're facing that issue. This commit was a heavy-handed attempt to mitigate that; I'm surprised it didn't cut it for your case. This commit reverts the auto-encoder, so if you're still playing with the trader & facing memory issues, uncomment the lines here.
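
For reference, the gpu-split trick is basically capping each process at a fraction of GPU memory via TensorFlow's session config (the session_config that tensorforce#memory has temporarily disabled for a single runner). Something roughly like this, assuming TensorFlow 1.x; a sketch, not the project's exact code, and the 0.49 fraction is just an example:

```python
# Rough sketch: cap this process at roughly half the card's memory so two
# workers can share one GPU. Values are examples, not the project's settings.
import tensorflow as tf

session_config = tf.ConfigProto()
session_config.gpu_options.per_process_gpu_memory_fraction = 0.49  # ~half the GPU
session_config.gpu_options.allow_growth = True  # allocate only as needed

with tf.Session(config=session_config) as sess:
    # build and run the agent's graph here
    pass
```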

Since my comment above, I tried DDPG in hopes of switching to timestep-batching (and saving RAM). The agent doesn't currently support multi-state configs (dict-based states), so I'll need to revisit that in the future. Will continue posting updates.

@TalhaAsmal
Contributor

I noticed very strange GPU usage behaviour, shown below:

[screenshot: GPU usage over time]

My usage will be ~8%-10% for a long while, then suddenly spike and just as suddenly decrease. I've never seen this before in my NN training, and I wonder if that's why performance is so bad. Is there any way to profile or debug this behaviour?

I'm running another hypersearch run tonight to see if the memory error persists.

lefnire closed this as completed Feb 28, 2018