invalid index to scalar variable @ mode(days_diffs).mode[0] #404

Open
BCAA50000 opened this issue Feb 3, 2024 · 11 comments

Comments

@BCAA50000

BCAA50000 commented Feb 3, 2024

Problem Summary:
I tried code from several different authors and always hit the same error.
Error trace: alphalens.utils -> get_clean_factor_and_forward_returns -> compute_forward_returns, at mode(days_diffs).mode[0]: IndexError: invalid index to scalar variable

```python
# -*- coding: utf-8 -*-
import alphalens
import pandas as pd
import random
import warnings
warnings.filterwarnings('ignore')

if __name__ == '__main__':
    # Simulated sequence of trading dates
    trade_date_ls = pd.date_range('1/1/2010', '31/3/2020').tolist()
    # Simulated sequence of stock codes
    stock_id_ls = [f"{'0' * (6 - len(str(i)))}{i}.SZ" for i in range(2000)]

    # Input factor matrix
    factor_ls = []
    for trade_date in trade_date_ls:
        for stock_id in stock_id_ls:
            factor_ls.append([trade_date, stock_id, random.random() / 100])
    factor = pd.DataFrame(factor_ls, columns=['trade_date', 'stock_id', 'factor1'])
    factor = factor.set_index(['trade_date', 'stock_id'])

    # Input price matrix
    prices_ls = []
    for trade_date in trade_date_ls:
        tmp = [random.random() / 100 for _ in range(len(stock_id_ls))]
        tmp.append(trade_date)
        prices_ls.append(tmp)
    prices = pd.DataFrame(prices_ls, columns=['trade_date' if i == len(stock_id_ls) else stock_id_ls[i] for i in range(len(stock_id_ls) + 1)])
    prices = prices.set_index(['trade_date'])

    # periods is the rebalancing period
    # bins is the number of groups
    input_df = alphalens.utils.get_clean_factor_and_forward_returns(factor, prices, periods=(1, 5, ), bins=10, quantiles=None)

    alphalens.tears.create_information_tear_sheet(input_df)
    alphalens.tears.create_returns_tear_sheet(input_df)
```

**Traceback:**
Traceback (most recent call last):
File "D:\Quant\Projects\ALPHALENS_TEST\CSDN范例.py", line 36, in <module>
input_df = alphalens.utils.get_clean_factor_and_forward_returns(factor, prices, periods=(1, 5, ), bins=10, quantiles=None)
File "D:\Quant\Projects\ALPHALENS_TEST\venv\lib\site-packages\alphalens\utils.py", line 827, in get_clean_factor_and_forward_returns
forward_returns = compute_forward_returns(
File "D:\Quant\Projects\ALPHALENS_TEST\venv\lib\site-packages\alphalens\utils.py", line 319, in compute_forward_returns
delta_days = period_len.components.days - mode(days_diffs).mode[0]
IndexError: invalid index to scalar variable.


**Please provide any additional information below:**
The same code runs fine in my friend's environment.

## Versions
Python 3.10
alphalens 0.4.0
pandas 1.5.3
numpy 1.23.1

@JiwenZ

JiwenZ commented Feb 7, 2024

set keepdims=True
delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0]
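
A likely reason this one-line change helps, stated as an assumption because the thread never lists the installed SciPy version: `mode` in `alphalens/utils.py` is `scipy.stats.mode`, and since SciPy 1.11 its default is `keepdims=False`, so for a 1-D input `.mode` is a scalar and `.mode[0]` raises the `IndexError` above. The sketch below uses a hypothetical helper, `first_mode_value` (not part of alphalens or SciPy), that tolerates both result shapes. Stand-in objects are used so that it runs without SciPy installed.

```python
from types import SimpleNamespace


def first_mode_value(mode_result):
    """Return the first modal value from a scipy.stats.mode-style result.

    Hypothetical helper for illustration. It accepts a result whose ``.mode``
    is either an indexable array (legacy default, or ``keepdims=True``) or a
    bare scalar (``keepdims=False`` on 1-D input).
    """
    value = mode_result.mode
    try:
        return value[0]
    except (IndexError, TypeError):
        # NumPy scalars raise IndexError and plain Python numbers raise
        # TypeError when indexed; in both cases the value is already scalar.
        return value


# Stand-ins for the two result shapes (assumed, not real SciPy objects).
array_style = SimpleNamespace(mode=[1])   # shape seen with keepdims=True
scalar_style = SimpleNamespace(mode=1)    # shape seen with keepdims=False

print(first_mode_value(array_style), first_mode_value(scalar_style))  # prints: 1 1


def demo_with_scipy():
    """Same check against the real function; requires NumPy and SciPy >= 1.9."""
    import numpy as np
    from scipy.stats import mode

    days_diffs = np.array([1, 1, 3, 1])
    return (
        first_mode_value(mode(days_diffs)),
        first_mode_value(mode(days_diffs, keepdims=True)),
    )
```

Calling `demo_with_scipy()` in an environment that has SciPy should return the same modal value for both calls, whichever default the installed version uses.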

@BCAA50000
Author

> set keepdims=True
> delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0]

Thanks, JiwenZ! I haven't tried your solution yet. I solved it by reinstalling different versions of pandas, NumPy, or Python, though I don't know which of them made the difference. Here is an environment I tested that runs without the error:
alphalens 0.4.0
numpy 1.24.4
pandas 1.3.4
Python 3.8.18

@ljztrust

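Hi, I'm now running into exactly the same problem as you. I have both the factor and price tables prepared, but calling get_clean_factor_and_forward_returns always raises IndexError: invalid index to scalar variable. I've tried many different tables and they all raise this error. Can this only be solved by downgrading versions?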

@BCAA50000
Author

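It seems the author has another approach, but I solved it by changing versions. My environment is listed below.
That said, matching the environment doesn't always work either. When I first hit this problem, I aligned my environment with someone else's, but strangely it still wouldn't run even with all versions matched. Later I aligned my environment with another person's and it ran. The environment that finally worked for me is: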
alphalens 0.4.0
numpy 1.24.4
pandas 1.3.4
Python 3.8.18

@shandonguzi

pandas 1.4.4 also works

@liangcaihua

Changing it to mode([days_diffs]).mode also works. Truly speechless.

@AnthonyTremblayy

I am having the same error. Changing to: delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0] or mode([days_diffs]).mode did not work. I get the following error:

AssertionError: Length of new_levels (3) must be <= self.nlevels (2)

Any idea how to solve?

@liangcaihua

> I am having the same error. Changing to: delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0] or mode([days_diffs]).mode did not work. I get the following error:
>
> AssertionError: Length of new_levels (3) must be <= self.nlevels (2)
>
> Any idea how to solve?

Your error is different from mine. Try something else.

@schweik6

schweik6 commented Apr 11, 2024

> I am having the same error. Changing to: delta_days = period_len.components.days - mode(days_diffs, keepdims=True).mode[0] or mode([days_diffs]).mode did not work. I get the following error:
>
> AssertionError: Length of new_levels (3) must be <= self.nlevels (2)
>
> Any idea how to solve?

I made the same modification and hit that error too. It seems the pandas version needs to be < 2.1 (or the alphalens source code needs fixing as well).
With pandas 2.0.2, it raises a different error: "TypeError: incompatible index of inserted column with frame index".

In the end I used pandas 1.3.4, which works fine.

@AnthonyTremblayy

Make sure you’re on alphalens-reloaded and not alphalens (not supported anymore). The former supports the latest version of pandas, but the latter doesn’t.
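
To check which of the two distributions is actually installed, together with the pandas, NumPy, and SciPy versions discussed in this thread, a small sketch using only the standard library (`importlib.metadata`, Python 3.8+) could look like this. The helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names as published on PyPI; both alphalens variants are
# checked because they install the same importable package name.
for name in ("alphalens", "alphalens-reloaded", "pandas", "numpy", "scipy"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If the output shows `alphalens` rather than `alphalens-reloaded`, the environment is on the unmaintained package described above.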

@schweik6

> Make sure you’re on alphalens-reloaded and not alphalens (not supported anymore). The former supports the latest version of pandas, but the latter doesn’t.

You're right, I was on plain alphalens. I've now found the forked project alphalens-reloaded and will try it in the future, thanks.
