How to not search feature correlation with all y target? #25

wanga10000 · 2022-06-08T03:10:11Z

Hi,
First of all, thanks for developing this tool; it's excellent for feature selection,
not only for the algorithm but also for the integration and processing of all the indicators.

Here's my situation:
I have a strategy, and I want to search for features that correlate with winning or losing.
That is, only a few points in my y target are "activated", instead of using an n-point return or something similar.
Therefore I tried to make the y target like the following:
Assume there's a 10-day OHLCV series, and the strategy is activated on the third and seventh days.
y = {0,0,1,0,0,0,-1,0,0,0} where 1 stands for win and -1 stands for loss.
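A minimal sketch of building such a target, assuming a hypothetical `trades` mapping from 0-based day index to outcome (the names here are illustrative and not part of the tool):

```python
n_days = 10

# Hypothetical trade log: 0-based day index -> outcome (1 = win, -1 = loss).
# The third and seventh days are indices 2 and 6.
trades = {2: 1, 6: -1}

# Days without a trade get 0.
y = [trades.get(day, 0) for day in range(n_days)]
print(y)
```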

This is probably not a reasonable way to do it,
because the tool prints 5-6 features whose correlation to the target is over 0.9.
And I realized that those features the tool found only correlated to "activated" points instead of win or lose.
So I think it would be good if the algorithm could search only the points that are "activated" and mask the other points.

Do you have any suggestions for implementing this kind of usage? Thanks!

jmrichardson · 2022-06-08T18:11:43Z

Hi @wanga10000 ,

I am not sure I completely follow your example. It looks like the "activated" points are the third and seventh days, which are win and lose (1, -1). But you said,

tool found only correlated to "activated" points instead of win or lose.

I think I would have tried the same thing as you, with your target y being as you described. However, as I think you pointed out, there are indicators that are highly correlated by virtue of being close to 0, which is most of your data points. So, I assume what you are looking for is a way to include a mask on "0"s after the indicators have been calculated and only use dcor on the -1, 1 values?

That does sound like a good feature if I understand correctly. However, I am not sure I can get to it soon, as I am very busy with another project. Perhaps you could describe how you would architect the solution? I am thinking that you may want to include a fit parameter such as "mask" that is the same size as y that could be used to filter the observations prior to dcor.
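To make the idea concrete, here is a rough sketch of filtering observations with a mask before computing distance correlation. It is not the tool's current API: `masked_distance_correlation` and its `mask` argument are hypothetical, and the small pure-Python `distance_correlation` is only a stand-in for the dcor library so the example runs on its own.

```python
from math import sqrt


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute distances for a 1-D sequence."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]
    grand = sum(row) / n
    # The distance matrix is symmetric, so row means equal column means.
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1-D sequences."""
    n = len(x)
    A, B = _centered_distances(x), _centered_distances(y)

    def mean_prod(P, Q):
        return sum(P[i][j] * Q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dvar = mean_prod(A, A) * mean_prod(B, B)
    if dvar <= 0:
        return 0.0  # one input is constant, so the correlation is defined as 0
    return sqrt(max(mean_prod(A, B), 0.0) / sqrt(dvar))


def masked_distance_correlation(x, y, mask=None):
    """Hypothetical helper: score only the rows where mask is truthy.

    mask=None keeps every row, so existing behaviour is unchanged.
    """
    if mask is None:
        mask = [True] * len(y)
    xs = [a for a, keep in zip(x, mask) if keep]
    ys = [b for b, keep in zip(y, mask) if keep]
    return distance_correlation(xs, ys)


y = [0, 0, 1, 0, 0, 0, -1, 0, 0, 0]
activation_only = [abs(v) for v in y]  # fires on trade days, ignores the outcome
mask = [v != 0 for v in y]             # keep only the "activated" days

full = masked_distance_correlation(activation_only, y)
masked = masked_distance_correlation(activation_only, y, mask)
print(full, masked)
```

Without the mask, a feature that only tracks activation still scores above zero against y; with the mask it is constant on the kept rows and scores 0. Note that with only two activated days any non-constant feature would score 1, so a real target would need many more activated points for the masked score to be meaningful.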

Also happy to merge a PR if you would like to implement yourself.

wanga10000 · 2022-06-09T02:36:48Z

I assume what you are looking for is a way to include a mask on "0"s after the indicators have been calculated and only use dcor on the -1, 1 values?

Yes, exactly. With that, I think this tool would be more practical for algo trading, which would be really good.

I am thinking that you may want to include a fit parameter such as "mask" that is the same size as y that could be used to filter the observations prior to dcor.

That sounds feasible. And you could make the mask input default to all 0 so it wouldn't affect the original usage.

Happy to see you think this is not a bad idea. I'll look forward to this feature coming online :)

jmrichardson mentioned this issue Jul 15, 2022

Can y.index be a subset of X.index? #26

Open
