how to print the second best pipeline? #1229

m-alshehri · 2021-09-27T11:10:07Z

Hello,
I was just wondering if there's any way to print out the confusion matrix, classification report and the pipeline for the second-best pipeline?

The model currently prints the best pipeline, as shown below, but I would also like to print the second-best pipeline.

from tpot import TPOTClassifier

model = TPOTClassifier(generations=10, scoring='balanced_accuracy', verbosity=2)
model.fit(X_train, y_train)

Optimization Progress: 48%
530/1100 [2:07:27<3:41:41, 23.34s/pipeline]

Generation 1 - Current best internal CV score: 0.8820838802533277
Generation 2 - Current best internal CV score: 0.8828284663262757
Generation 3 - Current best internal CV score: 0.8828284663262757
Generation 4 - Current best internal CV score: 0.8842320902149032
Generation 5 - Current best internal CV score: 0.8842320902149032
Generation 6 - Current best internal CV score: 0.8842320902149032
Generation 7 - Current best internal CV score: 0.8842320902149032
Generation 8 - Current best internal CV score: 0.8842320902149032
Generation 9 - Current best internal CV score: 0.8842320902149032
Generation 10 - Current best internal CV score: 0.8842320902149032

Best pipeline: BernoulliNB(KNeighborsClassifier(input_matrix, n_neighbors=41, p=1, weights=uniform), alpha=0.01, fit_prior=True)
TPOTClassifier(config_dict=None, crossover_rate=0.1, cv=5,
               disable_update_check=False, early_stop=None, generations=10,
               log_file=None, max_eval_time_mins=5, max_time_mins=None,
               memory=None, mutation_rate=0.9, n_jobs=1, offspring_size=None,
               periodic_checkpoint_folder=None, population_size=100,
               random_state=None, scoring='balanced_accuracy', subsample=1.0,
               template=None, use_dask=False, verbosity=2, warm_start=False)

Acc.: 0.8521771865980675
              precision    recall  f1-score   support

           0       1.00      0.85      0.92      7902
           1       0.05      0.97      0.10        67

    accuracy                           0.85      7969
   macro avg       0.53      0.91      0.51      7969
weighted avg       0.99      0.85      0.91      7969

Confusion Matrix:
[[6726 1176]
 [   2   65]]

Apologies if this was asked before, but searching for "Second Best" returned nothing.

Thanks,
m-alshehri

wayneking517 · 2021-11-03T17:25:17Z

Give this a try:

import pandas as pd

# `tpot` is the fitted TPOTClassifier (named `model` in the question).
# evaluated_individuals_ maps each pipeline string to a dict of its stats.
rows = []
for model_name, model_info in tpot.evaluated_individuals_.items():
    rows.append({'model': model_name,
                 # Pull out cv_score as its own column (i.e., sortable)
                 'cv_score': model_info.get('internal_cv_score'),
                 'model_info': model_info})

# DataFrame.append was removed in pandas 2.0, so build the frame in one step
model_scores = pd.DataFrame(rows)
model_scores = model_scores.sort_values('cv_score', ascending=False)
top_models = model_scores.iloc[0:5, :]  # row 0 is the best pipeline, row 1 the second best
top_models.to_csv('top_models.csv', index=False)

See https://github.com/EpistasisLab/tpot/issues/703
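
If you only need the name and score of the second-best pipeline, a plain-Python version of the same ranking idea could look like the sketch below. It assumes the dict has the shape used above (pipeline string mapped to a dict with an 'internal_cv_score' key). The helper name rank_pipelines and the example dict are made up for illustration; in practice you would pass the evaluated_individuals_ attribute of your fitted TPOT object instead. Entries without a finite score are skipped so they cannot distort the ordering.

```python
import math


def rank_pipelines(evaluated, top_n=2):
    """Return the top_n (pipeline_string, cv_score) pairs, best first.

    `evaluated` is assumed to look like TPOT's evaluated_individuals_:
    {pipeline_string: {"internal_cv_score": float, ...}, ...}
    """
    scored = []
    for name, info in evaluated.items():
        score = info.get("internal_cv_score")
        # Skip entries with no usable score (missing, NaN, or infinite).
        if score is None or not math.isfinite(score):
            continue
        scored.append((name, score))
    # Higher CV score is better, so sort in descending order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical stand-in for a fitted model's evaluated_individuals_.
example = {
    "BernoulliNB(input_matrix)": {"internal_cv_score": 0.88},
    "GaussianNB(input_matrix)": {"internal_cv_score": 0.87},
    "LinearSVC(input_matrix)": {"internal_cv_score": float("-inf")},
}

ranked = rank_pipelines(example, top_n=2)
second_best_name, second_best_score = ranked[1]
print(second_best_name, second_best_score)
```

Note that this only gives you the pipeline description and its internal CV score. To get a confusion matrix and classification report for the second-best pipeline, you would still need to rebuild and fit that pipeline on your training data and predict on your test set.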

perib mentioned this issue Sep 21, 2023

TPOT2 and the future of TPOT development -- From the Devs #1322
