
Network Topology #3

Closed
ZhengyaoJiang opened this issue Sep 19, 2017 · 10 comments

@ZhengyaoJiang

Hi, I'm quite happy to see that our work is being replicated!

One problem I found is with the network topology.
In tensorforce-VPG.ipynb, cell In [11], it seems that a dense layer is added to the network, which differs from the original work:

```python
x = dense(x, size=env.action_space.shape[0], activation='relu', l2_regularization=1e-8)
```

The "Ensemble of Identical Independent Evaluators" (EIIE) does not include any dense layer. The outputs of the last convolutional layer are fed into the softmax function directly; that's why we say the evaluators are "independent".

@wassname
Owner

wassname commented Sep 19, 2017

Hi, good to hear from an author of that paper!

That's one part of the paper I was unsure about. After looking at Figure 2, I wasn't sure whether the cash bias was hard-coded or introduced by a single neuron in a head layer. I guess it was hard-coded?

@ZhengyaoJiang
Author

Yes, it is a constant in our work.
However, we haven't tested whether using a trainable variable would be better.
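
For illustration, a minimal sketch of that idea, reusing the hypothetical `scores` tensor from the sketch above (the function name and default value are illustrative, not our exact code):

```python
# Prepend a hard-coded cash bias to the per-asset scores before the
# softmax, so that cash competes with the other assets for weight.
import tensorflow as tf

def add_cash_bias(scores, cash_bias=0.0):
    # scores: (batch, n_assets); the bias is a constant, as in the paper.
    bias = cash_bias * tf.ones_like(scores[:, :1])  # -> (batch, 1)
    # An untested alternative: make the bias a trainable tf.Variable.
    return tf.nn.softmax(tf.concat([bias, scores], axis=1), axis=-1)
```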

@wassname
Owner

wassname commented Sep 20, 2017

Ah, good to know, that makes sense. That way you can set it to modulate how risky you want the model to be at deployment time.

Don't hesitate to point out anything else you notice; it's all interesting.

@akaniklaus

@ZhengyaoJiang First of all, I am also a fan of your paper and very interested in the activity in this repository. Could you please share your exact dataset so that the results here can be compared with your publication? Otherwise, I guess it would be difficult to evaluate whether the implementation is correct.

@wassname I am not yet knowledgeable in deep learning, but could you please add me on Skype (ID: akaniklaus) so we can see whether I can also contribute to the project somehow? Thank you!

@ZhengyaoJiang
Author

@akaniklaus Our data is stored in a database, so to make use of it, the data-processing code would also have to be shared.
A community version of the code will actually be released in a few months; it will be easy to test the implementation then.

Besides, there is no guarantee that our data pre-processing is bug-free. I think it would be great to also replicate the data-access part, to double-check our results.

@wassname
Owner

wassname commented Oct 13, 2017

@akaniklaus Maybe we can Skype next week; I like to be a digital "hermit" on weekends to unwind. For now, check out #2 and #4 for ideas on how to contribute.

Personally, I am researching reinforcement-learning algorithms that converge even with noisy observations, since many of the standard test environments differ from trading markets in that their observations are not very noisy. I found the Rainbow paper interesting, since its authors managed to combine the latest RL tricks into one agent that converges roughly 4x faster. You might like to have a read if it's not too advanced for you.

I would like to add that there is also no guarantee that my code is bug-free :) Let's be honest, it probably has some bugs in it! So if you notice anything, please point it out, especially in the environment code.

@ZhengyaoJiang It will be good to see your implementation! Can I ask: did it converge on most runs, or did you have to try a few times to get it to converge? I ask because RL is notoriously finicky at the moment.

@ZhengyaoJiang
Author

@wassname Yes, it can converge on most runs.

@akaniklaus

@ZhengyaoJiang Have you ever tried your method with hourly data? In my experience, OLMAR and RMR perform even better in terms of returns with it. I can also point out that the set of selected tokens affects their performance dramatically; I don't know whether that would also hold for yours.

@wassname I will read the paper and might understand it, but as I said, I don't yet have enough experience with deep learning to help you implement it. Have you checked the following repository:
https://github.com/Kaixhin/Rainbow

@ZhengyaoJiang
Author

@akaniklaus

> Have you ever tried your method with hourly data? In my experience, OLMAR and RMR perform even better in terms of returns with it.

It seems worth trying. Did you take the commission fee into consideration?

> I can also point out that the set of selected tokens affects their performance dramatically.

I suppose you mean the selection of assets? This is done by automatically selecting the top-volume assets at the end of the training data; selecting them by hand might introduce survivorship bias.
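
A minimal pandas sketch of that selection rule (the column layout, window length, and asset count are illustrative assumptions, not our exact code):

```python
# Rank assets by traded volume over a window ending where the
# training data ends, and keep the top n.
import pandas as pd

def top_volume_assets(volume: pd.DataFrame, train_end: str,
                      window: int = 30, n: int = 11) -> list:
    # volume: DataFrame indexed by time, one column per asset.
    recent = volume.loc[:train_end].tail(window)
    return recent.sum().nlargest(n).index.tolist()
```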

And guys, this issue has gone off topic. If you want to communicate further, it's best to e-mail me or use Hangouts.
My Gmail: [email protected]

@akaniklaus

@ZhengyaoJiang Yes, I did take the commission fee into consideration. One disturbing thing about OLPS was that the portfolios have very low entropy, meaning that they bet on only a few tokens (often just one or two) in each round and then shift completely.

I tried to reduce this behavior by using the lowest epsilon value or adding a Kalman filter, but neither gave better backtest results. I am curious whether yours shows similar behavior when distributing the wealth across tokens.
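
(For reference, that concentration can be quantified with the Shannon entropy of the weight vector; a minimal sketch, not tied to any particular OLPS implementation:)

```python
# Shannon entropy of a portfolio weight vector: low entropy means the
# wealth is concentrated in a few assets, high entropy means it is
# spread out evenly.
import numpy as np

def portfolio_entropy(weights) -> float:
    w = np.asarray(weights, dtype=float)
    w = w[w > 0]  # treat 0 * log(0) as 0
    return float(-np.sum(w * np.log(w)))

portfolio_entropy([0.95, 0.05])  # ~0.20: concentrated
portfolio_entropy([0.25] * 4)    # ~1.39: evenly spread
```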

I am not talking about picking the best-performing assets in the training data; that would certainly cause a bias. However, in real-world use I would select assets based on expert knowledge, as that is also what people would normally do when buying and holding.

OK, thank you very much. I will continue the conversation via your Gmail. Have a nice weekend.
