convert lightgbm to onnx has an error #1138

Open
ArlanCooper opened this issue Nov 13, 2024 · 1 comment

Comments

@ArlanCooper

my code:

from random import randint
import pandas as pd
import onnx
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType, Int64TensorType,StringTensorType

a = [i for i in range(1000)]
b = [1,2,3,4,5,6]
c = [b[randint(0,5)] for i in range(1000)]
d = [randint(0,1) for i in range(1000)]
tmp = []
for i in range(1000):
    tmp.append([a[i],c[i],d[i]])
df = pd.DataFrame(tmp,columns=["a","b","label"])   # build synthetic data



# Train a LightGBM model inside a scikit-learn pipeline
from lightgbm import LGBMClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Create a Pipeline that contains the LightGBM model
pipe = Pipeline([
    ("scaler", StandardScaler()),  # standardization
    ("lgbm", LGBMClassifier())     # LightGBM classifier
])

# Fit the pipeline on the training features and labels
pipe.fit(df[["a","b"]],df["label"])

# Define the input types; here the first input is assumed to be float and the second int
input_types = [
    ('float_input', FloatTensorType([None, 1])),  # float input (1 feature)
    ('int_input', Int64TensorType([None, 1]))      # int input (1 feature)
]

# Convert the model
onnx_model = convert_sklearn(pipe, initial_types=input_types)

# Save the ONNX model
with open("pipeline_lightgbm.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# Validate the ONNX model
onnx.checker.check_model(onnx_model)
print("The ONNX model was converted and validated successfully.")


the error:

---------------------------------------------------------------------------
MissingShapeCalculator                    Traceback (most recent call last)
Cell In[17], line 26
     20 input_types = [
     21     ('float_input', FloatTensorType([None, 1])),  # float input (1 feature)
     22     ('int_input', Int64TensorType([None, 1]))      # int input (1 feature)
     23 ]
     25 # Convert the model
---> 26 onnx_model = convert_sklearn(pipe, initial_types=input_types)
     28 # Save the ONNX model
     29 with open("pipeline_lightgbm.onnx", "wb") as f:

File C:\ProgramData\miniforge3\lib\site-packages\skl2onnx\convert.py:206, in convert_sklearn(model, name, initial_types, doc_string, target_opset, custom_conversion_functions, custom_shape_calculators, custom_parsers, options, intermediate, white_op, black_op, final_types, dtype, naming, model_optim, verbose)
    204 if verbose >= 1:
    205     print("[convert_sklearn] convert_topology")
--> 206 onnx_model = convert_topology(
    207     topology,
    208     name,
    209     doc_string,
    210     target_opset,
    211     options=options,
    212     remove_identity=model_optim and not intermediate,
    213     verbose=verbose,
    214 )
    215 if verbose >= 1:
    216     print("[convert_sklearn] end")

File C:\ProgramData\miniforge3\lib\site-packages\skl2onnx\common\_topology.py:1533, in convert_topology(topology, model_name, doc_string, target_opset, options, remove_identity, verbose)
   1522 container = ModelComponentContainer(
   1523     target_opset,
   1524     options=options,
   (...)
   1528     verbose=verbose,
   1529 )
   1531 # Traverse the graph from roots to leaves
   1532 # This loop could eventually be parallelized.
-> 1533 topology.convert_operators(container=container, verbose=verbose)
   1534 container.ensure_topological_order()
   1536 if len(container.inputs) == 0:

File C:\ProgramData\miniforge3\lib\site-packages\skl2onnx\common\_topology.py:1350, in Topology.convert_operators(self, container, verbose)
   1347 for variable in operator.outputs:
   1348     _check_variable_out_(variable, operator)
-> 1350 self.call_shape_calculator(operator)
   1351 self.call_converter(operator, container, verbose=verbose)
   1353 # If an operator contains a sequence of operators,
   1354 # output variables are not necessarily known at this stage.

File C:\ProgramData\miniforge3\lib\site-packages\skl2onnx\common\_topology.py:1165, in Topology.call_shape_calculator(self, operator)
   1163 else:
   1164     logger.debug("[Shape2] call infer_types for %r", operator)
-> 1165     operator.infer_types()

File C:\ProgramData\miniforge3\lib\site-packages\skl2onnx\common\_topology.py:631, in Operator.infer_types(self)
    628 def infer_types(self):
    629     # Invoke a core inference function
    630     if self.type is None:
--> 631         raise MissingShapeCalculator(
    632             "Unable to find a shape calculator for type '{}'.".format(
    633                 type(self.raw_operator)
    634             )
    635         )
    636     try:
    637         shape_calc = _registration.get_shape_calculator(self.type)

MissingShapeCalculator: Unable to find a shape calculator for type '<class 'lightgbm.sklearn.LGBMClassifier'>'.
It usually means the pipeline being converted contains a
transformer or a predictor with no corresponding converter
implemented in sklearn-onnx. If the converted is implemented
in another library, you need to register
the converted so that it can be used by sklearn-onnx (function
update_registered_converter). If the model is not yet covered
by sklearn-onnx, you may raise an issue to
https://github.com/onnx/sklearn-onnx/issues
to get the converter implemented or even contribute to the
project. If the model is a custom model, a new converter must
be implemented. Examples can be found in the gallery.


How can I fix it?

@xadupre
Collaborator

xadupre commented Nov 14, 2024

LightGBM is not part of scikit-learn, so sklearn-onnx has no built-in converter for it. You need to register a converter for the classifier before calling convert_sklearn: https://onnx.ai/sklearn-onnx/auto_tutorial/plot_gexternal_lightgbm.html#register-the-converter-for-lgbmclassifier.
