xgboost 1.1.1 pred failed, while 0.90 pred success #5841
Comments
So, it seems your model was trained on a dataset with a different shape?
@trivialfis but the prediction instance only has one feature (index 999), and both models were trained with a max feature index of 2000+.
There were some heuristics around this that never got documented or tested. I'm trying to re-establish them in a way that can be precisely documented.
Also, I think there's something wrong here.
From the error message, your model has 1104 features while your training dataset has 2000+ features.
From #5856 (comment):
I agree that the heuristic has issues, but we need to make a provision for reading LIBSVM files with a deficient number of features. LIBSVM files unfortunately do not store the number of columns, so the column dimension is inferred from the maximum feature ID that occurs. For example, it should be possible to feed in the LIBSVM file
into a model trained with 10 features. The correct way to interpret such a LIBSVM file is to assign 100 to the first feature and mark all the other features as missing. IMHO, rejecting LIBSVM files like this would make the built-in LIBSVM reader unusable.
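To make the interpretation described above concrete, here is a minimal plain-Python sketch (an illustration only, not XGBoost's actual LIBSVM reader) of parsing one deficient instance against a feature count supplied by the model rather than inferred from the file:

```python
import math

def parse_libsvm_line(line, num_features):
    """Parse one LIBSVM-format instance into a dense row of length
    `num_features`, treating absent feature IDs as missing (NaN).

    Assumes zero-based feature indexing; `num_features` comes from
    the trained model, not from the file itself.
    """
    parts = line.split()
    label = float(parts[0])
    row = [math.nan] * num_features
    for kv in parts[1:]:
        idx, val = kv.split(":")
        row[int(idx)] = float(val)
    return label, row

# A line whose largest feature ID is 0, fed to a 10-feature model:
label, row = parse_libsvm_line("1 0:100", num_features=10)
# row[0] is 100.0; the other nine entries are NaN (missing)
```

The key point is that the row's width is fixed by the model, so a file that never mentions the higher feature IDs still yields a valid instance.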
I also encountered this issue; the same model works on the same data with 0.9, as described here. It might be related to the following issue I opened, which is happening more often (to me at least):
@hcho3 Could you please help establish the heuristics we need to support? I'm quite confused by the old heuristics.
Also, on the Python side there's a feature-validation option; should we move it down to C++ and make it a parameter?
Yes.
No. We should raise an error in this case.
No. If we only adopt the first heuristic, the behavior should be fairly predictable, I think.
Now that feature names and types are in the C++ layer, we can. I'll let you decide.
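As a rough sketch of what such Python-side feature validation amounts to (a hypothetical helper for illustration, not XGBoost's actual implementation):

```python
def validate_features(model_names, data_names):
    """Raise if the prediction data's feature names don't match the
    booster's. Hypothetical sketch of the name check discussed above;
    order matters, mirroring a strict validation mode.
    """
    if list(model_names) != list(data_names):
        missing = set(model_names) - set(data_names)
        extra = set(data_names) - set(model_names)
        raise ValueError(
            f"feature mismatch: missing={sorted(missing)}, extra={sorted(extra)}"
        )

validate_features(["f0", "f1"], ["f0", "f1"])  # matching names pass silently
```

Moving a check like this into C++ would make it apply uniformly across all language bindings instead of only the Python wrapper.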
I've just encountered this issue as well. The newer xgboost (> 1.0.0) prints a lot of warnings to the screen:
I agree that we should support this case. If it helps, we never saw this issue on xgboost=0.82
@lucagiovagnoli I think you have a different issue. XGBoost uses zero-based indexing for features by default, so the diabetes dataset would be recognized as having 9 features. To use one-based indexing, append ?indexing_mode=1 to the URI: dtrain = xgb.DMatrix('./diabetes.libsvm?indexing_mode=1'). Alternatively, you may use
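The difference between the two indexing modes can be shown with a small parser sketch (pure Python, for illustration only):

```python
def parse_indices(line, one_based=False):
    """Return the column indices encoded in one LIBSVM line.

    With zero-based indexing (XGBoost's default), feature ID `k` maps
    to column `k`, so the inferred width is max_id + 1. With one-based
    indexing (indexing_mode=1 semantics), ID `k` maps to column `k - 1`.
    """
    ids = [int(kv.split(":")[0]) for kv in line.split()[1:]]
    offset = 1 if one_based else 0
    return [i - offset for i in ids]

line = "1 1:0.5 9:2.0"
print(parse_indices(line))                  # [1, 9] -> inferred width 10
print(parse_indices(line, one_based=True))  # [0, 8] -> inferred width 9
```

This is why a dataset written with one-based IDs appears to have one extra (always-missing) column when read under the zero-based default.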
Hi @hcho3, sorry, I had deleted the comment before you replied because I noticed it was my mistake :) Original comment for reference:
Will look into this.
@trivialfis has the bug been fixed? Should xgb 1.1.1 now support predicting on a DMatrix with fewer features than the model, treating those features as missing? Or is the fix in xgb 1.2?
Yup, please try the 1.2 RC.
I am getting this error when I try to make predictions using version 1.5.1, where the original model was created on 0.9. I have followed the steps to use save_model so that I can load the binary format in a newer version. Is there something I am missing?
XGBoostError: [12:48:23] /Users/runner/work/xgboost/xgboost/src/learner.cc:1257: Check failed: learner_model_param_.num_feature >= p_fmat->Info().num_col_ (120 vs. 122) : Number of columns does not match number of features in booster.
@mshergad Your data must have all the features that the model has. The original issue here was having more features in the data than in the model, which was clarified and tested after this issue. But having fewer features in the data than in the model is invalid.
Ideally one should match the datasets for test and train in the ETL process. |
Thank you so much for the quick reply. My data has a total of 122 columns: 120 are features that the model is trained on, plus two columns holding the target and the target_probability. When I pass the entire dataset, it throws the error I showed. If I understand correctly, learner_model_param_.num_feature is 120 and p_fmat->Info().num_col_ is 122, and the dataframe has 122 columns. So I dropped the two columns and passed the 120 features that the model was originally trained on, and then it says it expects the two other column names. So I'm going in circles currently. When I use the same code with version 0.90, where the model was originally created, the same command gives me the predictions. Please advise!
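One common way out of that circle is to align the scoring frame to the model's feature list explicitly, in the model's order, before building the DMatrix. A sketch with pandas (all column names here are hypothetical; the feature list would typically come from the booster, e.g. its feature_names):

```python
import pandas as pd

# Hypothetical: the features the booster was trained on, in order.
model_features = ["f0", "f1", "f2"]

# A raw scoring frame with extra non-feature columns and shuffled order.
df = pd.DataFrame({
    "f1": [0.5], "target": [1], "f0": [0.1],
    "target_probability": [0.9], "f2": [0.3],
})

# Select exactly the model's features in the model's order. Extra
# columns such as the target are dropped; a genuinely missing feature
# would surface as a column of NaN here instead of a late C++ error.
X = df.reindex(columns=model_features)
print(list(X.columns))  # ['f0', 'f1', 'f2']
```

Passing X (rather than the full frame) to the DMatrix keeps the column count and ordering consistent with the booster.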
@mshergad Could you please share your code and data (maybe synthesized) so that I can take a closer look?
Sounds good. I will have to seek permissions. I will synthesize and share this soon. Thanks again!
One-line instance:
0 999:2000.000000
# model1.bin trained with xgb 0.90
# model2.bin trained with xgb 1.1.1
CODE1
OUTPUT1
CODE2
OUTPUT2