Quickstart example

A quick example of using the library to optimize a decision tree classifier. First, import the estimator, the dataset loader, and the mloptimizer search interfaces.

from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from mloptimizer.interfaces import HyperparameterSpaceBuilder, GeneticSearch

Load the iris dataset to obtain a feature matrix X and a label vector y. Any other dataset, including a custom one, can be used instead.

X, y = load_iris(return_X_y=True)
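As a minimal sketch of swapping in other data, the snippet below loads a different bundled scikit-learn dataset and builds a synthetic one as a stand-in for a custom dataset. The variable names (X_wine, X_custom, and so on) and the make_classification settings are illustrative choices, not part of the original example. Any pair of arrays with one row of features per sample and one label per sample should work the same way.

```python
from sklearn.datasets import load_wine, make_classification

# Option 1: another dataset bundled with scikit-learn
X_wine, y_wine = load_wine(return_X_y=True)

# Option 2: a synthetic stand-in for a custom dataset
# (sizes and seed are arbitrary, chosen only for illustration)
X_custom, y_custom = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_classes=3, random_state=0
)

print(X_wine.shape, y_wine.shape)
print(X_custom.shape, y_custom.shape)
```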

Split the dataset into training and test sets.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
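The split above is random on every run, so the class counts in the test set vary (the report further down shows supports of 9, 13 and 8). If a repeatable, class-balanced split is preferred, one possible variant uses the stratify and random_state arguments of train_test_split. The seed value 42 is an arbitrary choice, not taken from the original example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    stratify=y,       # keep the class proportions equal in both splits
    random_state=42,  # arbitrary seed, used only to make the split repeatable
)

print(X_train.shape, X_test.shape)
print(np.bincount(y_test))  # number of test samples per class
```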

Define the hyperparameter space. You can use the default space that the library provides for the machine learning model you want to optimize; here we use the default hyperparameter space for a DecisionTreeClassifier.

hyperparam_space = HyperparameterSpaceBuilder.get_default_space(estimator_class=DecisionTreeClassifier)
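To get a feel for what such a space can contain, it may help to list the constructor parameters that scikit-learn itself exposes for the estimator. This sketch uses only scikit-learn's get_params; it does not inspect the mloptimizer object, and the default space may well cover only a subset of these names.

```python
from sklearn.tree import DecisionTreeClassifier

# All constructor parameters of the estimator with their scikit-learn defaults
params = DecisionTreeClassifier().get_params()
for name, default in sorted(params.items()):
    print(f"{name} = {default!r}")
```

The parameters that appear in the best estimator printed later in this example (ccp_alpha, max_depth, min_impurity_decrease, min_samples_split) are all in this list.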

The GeneticSearch class is the main wrapper for optimizing a machine learning model. Here it is configured to run 30 generations with a population size of 100.

opt = GeneticSearch(
    estimator_class=DecisionTreeClassifier,
    hyperparam_space=hyperparam_space,
    generations=30,
    population_size=100,
)
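For readers new to genetic search, the toy loop below sketches the general idea behind the generations and population_size settings: a population of candidate hyperparameter sets is scored, the better half survives, and mutated offspring refill the population. This is a generic illustration written from scratch, not mloptimizer's implementation. The search ranges, the selection scheme, the mutation rate, and all function names are assumptions made for the sketch.

```python
import random

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = random.Random(0)

# Hypothetical search ranges for two integer hyperparameters
RANGES = {"max_depth": (1, 15), "min_samples_split": (2, 30)}


def random_individual():
    """Draw one candidate hyperparameter set uniformly from RANGES."""
    return {name: rng.randint(lo, hi) for name, (lo, hi) in RANGES.items()}


def fitness(individual):
    """Mean 3-fold cross-validated accuracy of a tree built from the candidate."""
    clf = DecisionTreeClassifier(random_state=0, **individual)
    return cross_val_score(clf, X, y, cv=3).mean()


def crossover(a, b):
    """Take each hyperparameter from one of the two parents at random."""
    return {name: rng.choice((a[name], b[name])) for name in RANGES}


def mutate(individual, rate=0.3):
    """Resample each hyperparameter with probability `rate`."""
    return {
        name: rng.randint(*RANGES[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


population = [random_individual() for _ in range(8)]
for generation in range(5):
    # Keep the better half, then refill with mutated offspring of the survivors
    survivors = sorted(population, key=fitness, reverse=True)[:4]
    children = [mutate(crossover(*rng.sample(survivors, 2))) for _ in range(4)]
    population = survivors + children

best = max(population, key=fitness)
print(best, round(fitness(best), 3))
```

Because the survivors are carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next.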

To optimize the classifier, call the fit method. Once the search finishes, the best classifier found, configured with the best hyperparameters, is available as the best_estimator_ attribute.

# Fit on the training split only, so the test set stays unseen until evaluation
opt.fit(X_train, y_train)

print(opt.best_estimator_)
WARNING:root:The folder . already exists and it will be used
INFO:mloptimizer.log:Initiating genetic optimization...
INFO:mloptimizer.log:Algorithm: Optimizer

Genetic execution:   0%|          | 0/31 [00:00<?, ?it/s, best fitness=?]
Genetic execution:   3%|▎         | 1/31 [00:00<00:00, 146.20it/s, best fitness=0.96]
Genetic execution:   3%|▎         | 1/31 [00:00<00:00, 31.73it/s, best fitness=0.973]
Genetic execution:  16%|█▌        | 5/31 [00:00<00:03,  7.13it/s, best fitness=0.98]
Genetic execution: 100%|██████████| 31/31 [00:04<00:00,  6.37it/s, best fitness=0.98]
DecisionTreeClassifier(ccp_alpha=0.00135, max_depth=13,
                       min_impurity_decrease=0.005, min_samples_split=15,
                       random_state=141580)
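One way to sanity-check the result independently of the search is to rebuild the printed estimator in plain scikit-learn and cross-validate it. The hyperparameter values are copied from the output above; the use of 5-fold cross-validation is an assumption of this sketch and need not match how the library computes its fitness, so the score may differ from the 0.98 reported by the progress bar.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hyperparameters copied from the estimator printed above
clf = DecisionTreeClassifier(
    ccp_alpha=0.00135,
    max_depth=13,
    min_impurity_decrease=0.005,
    min_samples_split=15,
    random_state=141580,
)

# 5-fold cross-validated accuracy (fold count chosen for illustration)
scores = cross_val_score(clf, X, y, cv=5)
print(f"accuracy per fold: {scores.round(3)}")
print(f"mean accuracy: {scores.mean():.3f}")
```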

Use the best classifier found to predict on the test set, then show the classification report and the confusion matrix.

from sklearn.metrics import classification_report, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

# Predict on the test set with the best estimator found
y_pred = opt.predict(X_test)
print(classification_report(y_test, y_pred))

# from_predictions computes and draws the confusion matrix in one call,
# so no separate confusion_matrix() or disp.plot() call is needed
ConfusionMatrixDisplay.from_predictions(
    y_test, y_pred, display_labels=opt.best_estimator_.classes_,
    cmap=plt.cm.Blues
)
plt.show()

del opt
[Figure: confusion matrix plot]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         9
           1       1.00      0.92      0.96        13
           2       0.89      1.00      0.94         8

    accuracy                           0.97        30
   macro avg       0.96      0.97      0.97        30
weighted avg       0.97      0.97      0.97        30
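The figures in the report follow directly from the confusion matrix. The matrix in the sketch below is inferred from the printed supports, precision, and recall (a single class 1 sample predicted as class 2); it is a reconstruction, not output captured from the original run. The code recomputes the per-class metrics with plain Python arithmetic.

```python
# Confusion matrix inferred from the report above
# (rows: true class, columns: predicted class). This is a reconstruction.
cm = [
    [9, 0, 0],
    [0, 12, 1],
    [0, 0, 8],
]

n_classes = len(cm)
rows = []
for k in range(n_classes):
    true_positives = cm[k][k]
    predicted_k = sum(cm[i][k] for i in range(n_classes))  # column sum
    support_k = sum(cm[k])                                  # row sum
    precision = true_positives / predicted_k
    recall = true_positives / support_k
    f1 = 2 * precision * recall / (precision + recall)
    rows.append((precision, recall, f1, support_k))
    print(f"class {k}: precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1:.2f} support={support_k}")

# Accuracy is the share of samples on the diagonal
accuracy = sum(cm[k][k] for k in range(n_classes)) / sum(map(sum, cm))
print(f"accuracy={accuracy:.2f}")
```

Rounded to two decimals, these values match the report: recall 0.92 for class 1 is 12/13, precision 0.89 for class 2 is 8/9, and accuracy 0.97 is 29/30.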

Total running time of the script: (0 minutes 6.367 seconds)
