Quickstart example

A quick example of using the library to optimize a decision tree classifier. First, we import the estimator, the dataset loader, and the mloptimizer interfaces.

from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from mloptimizer.interfaces import HyperparameterSpaceBuilder, GeneticSearch

Load the iris dataset to obtain a feature matrix X and a label vector y. Any other dataset, including a custom one, can be used instead.

X, y = load_iris(return_X_y=True)
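As a minimal sketch of swapping in other data, the snippet below loads a different bundled scikit-learn dataset and builds a tiny custom one from NumPy arrays. The names `X_wine`, `X_custom` and the numeric values are made up for illustration; the only requirement assumed here is the usual scikit-learn convention of a 2-D feature array and a 1-D label array of matching length.

```python
import numpy as np
from sklearn.datasets import load_wine

# Option 1: another bundled scikit-learn dataset.
X_wine, y_wine = load_wine(return_X_y=True)
print(X_wine.shape, y_wine.shape)  # (178, 13) (178,)

# Option 2: a custom dataset (made-up values, for illustration only).
# Rows are samples, columns are features; y holds one label per row.
X_custom = np.array(
    [[5.1, 3.5], [4.9, 3.0], [6.7, 3.1], [6.3, 2.5], [5.8, 2.7], [7.1, 3.0]]
)
y_custom = np.array([0, 0, 1, 1, 2, 2])
print(X_custom.shape, y_custom.shape)  # (6, 2) (6,)
```

Either pair can take the place of X and y in the rest of the example.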

Split the dataset into training and test sets, holding out 20% of the samples for testing.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
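The split above is random, so the test set changes between runs and its class counts can be uneven. A possible variant, shown here as a sketch rather than as part of the original example, fixes the seed and stratifies by label; the seed value 42 is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# random_state makes the split repeatable; stratify=y keeps class
# proportions equal in the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
print(np.bincount(y_test))          # [10 10 10]
```

Because iris has 50 samples per class, a stratified 20% test set holds exactly 10 samples of each class.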

Define the HyperparameterSpace. You can use the default hyperparameter space for the machine learning model that you want to optimize. In this case we use the default space for a DecisionTreeClassifier.

hyperparam_space = HyperparameterSpaceBuilder.get_default_space(estimator_class=DecisionTreeClassifier)
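Conceptually, a hyperparameter space maps each parameter name to a type and a range from which candidate values are drawn. The sketch below illustrates that idea in plain Python. It is not the mloptimizer API: `SPACE`, `sample` and the ranges are hypothetical and are not the library's defaults.

```python
import random

# Hypothetical ranges, for illustration only (not mloptimizer's defaults).
SPACE = {
    "max_depth": ("int", 2, 20),
    "min_samples_split": ("int", 2, 50),
    "ccp_alpha": ("float", 0.0, 0.01),
}


def sample(space, rng):
    """Draw one candidate configuration uniformly from the space."""
    params = {}
    for name, (kind, low, high) in space.items():
        if kind == "int":
            params[name] = rng.randint(low, high)  # inclusive bounds
        else:
            params[name] = rng.uniform(low, high)
    return params


rng = random.Random(0)
candidate = sample(SPACE, rng)
print(candidate)
```

Each sampled dictionary could be passed as keyword arguments to an estimator constructor, which is roughly what a search over the space has to do for every candidate it evaluates.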

The GeneticSearch class is the main wrapper for the optimization of a machine learning model. Here it is configured to run 30 generations with a population of 100 individuals.

opt = GeneticSearch(
    estimator_class=DecisionTreeClassifier,
    hyperparam_space=hyperparam_space,
    generations=30,
    population_size=100,
)
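To give a feel for what `generations` and `population_size` control, here is a minimal generic genetic algorithm in plain Python. It is a toy sketch and not mloptimizer's implementation: the fitness function is a stand-in for a cross-validated score, and all names and bounds (`toy_fitness`, `BOUNDS`, `evolve`) are hypothetical.

```python
import random

# Hypothetical integer bounds for two tree hyperparameters.
BOUNDS = {"max_depth": (1, 30), "min_samples_split": (2, 60)}


def toy_fitness(ind):
    # Stand-in for a model score: best at max_depth=8, min_samples_split=10.
    return -abs(ind["max_depth"] - 8) - abs(ind["min_samples_split"] - 10)


def random_individual(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def crossover(a, b, rng):
    # Each gene is inherited from one of the two parents at random.
    return {k: rng.choice([a[k], b[k]]) for k in a}


def mutate(ind, rng):
    # Nudge one randomly chosen gene and clamp it to its bounds.
    child = dict(ind)
    key = rng.choice(sorted(child))
    lo, hi = BOUNDS[key]
    child[key] = min(hi, max(lo, child[key] + rng.choice([-2, -1, 1, 2])))
    return child


def evolve(generations, population_size, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    history = [max(map(toy_fitness, population))]  # initial population
    for _ in range(generations):
        ranked = sorted(population, key=toy_fitness, reverse=True)
        parents = ranked[: population_size // 2]  # keep the better half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
        history.append(max(map(toy_fitness, population)))
    return max(population, key=toy_fitness), history


best, history = evolve(generations=30, population_size=100)
print(best, history[-1])
```

A larger population explores more candidates per generation, and more generations give the search more rounds of selection, crossover, and mutation, at the cost of more fitness evaluations. Because the better half always survives in this sketch, the best fitness never decreases from one generation to the next.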

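To optimize the classifier, call the fit method. After fitting, the best classifier found, with its hyperparameters, is available as the best_estimator_ attribute. This example fits on the full dataset (X, y); pass X_train and y_train instead to keep the test set unseen during the search.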

opt.fit(X, y)

print(opt.best_estimator_)
WARNING:root:The folder . already exists and it will be used
INFO:mloptimizer.log:Initiating genetic optimization...
INFO:mloptimizer.log:Algorithm: Optimizer

Genetic execution: 100%|██████████| 31/31 [00:06<00:00,  4.86it/s, best fitness=0.98]
DecisionTreeClassifier(ccp_alpha=0.0025, max_depth=12,
                       min_impurity_decrease=0.006, min_samples_split=18,
                       random_state=619697)
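scikit-learn prints only the parameters that differ from their defaults, so the estimator shown above has more hyperparameters than the five listed. As a small sketch, the full set can be inspected with `get_params()`; the constructor call below simply copies the values from the printed output.

```python
from sklearn.tree import DecisionTreeClassifier

# Rebuild the estimator from the values printed above.
clf = DecisionTreeClassifier(
    ccp_alpha=0.0025,
    max_depth=12,
    min_impurity_decrease=0.006,
    min_samples_split=18,
    random_state=619697,
)

# get_params() returns every constructor parameter, including defaults.
params = clf.get_params()
for name in sorted(params):
    print(f"{name} = {params[name]!r}")
```

In the example itself, the same call is available as `opt.best_estimator_.get_params()` after fitting.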

Predict on the test set with the best estimator found, then show the classification report and the confusion matrix.

from sklearn.metrics import classification_report, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

# Predict on the test set through the fitted search object
y_pred = opt.predict(X_test)
print(classification_report(y_test, y_pred))

# from_predictions computes the confusion matrix and draws it,
# so no separate plot() call is needed
ConfusionMatrixDisplay.from_predictions(
    y_test, y_pred, display_labels=opt.best_estimator_.classes_,
    cmap=plt.cm.Blues
)
plt.show()

del opt
[Figure: confusion matrix of the test-set predictions]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      0.88      0.93         8
           2       0.92      1.00      0.96        12

    accuracy                           0.97        30
   macro avg       0.97      0.96      0.96        30
weighted avg       0.97      0.97      0.97        30
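The numbers in the report follow directly from the confusion matrix. The sketch below recomputes them in plain Python. The matrix is inferred from the report (one class 1 sample predicted as class 2), which is an assumption since the plotted matrix is not reproduced here; `per_class_metrics` is a hypothetical helper, not a scikit-learn function.

```python
# Rows are true classes, columns are predicted classes.
# Inferred from the report above, not copied from the plot.
cm = [
    [10, 0, 0],
    [0, 7, 1],
    [0, 0, 12],
]


def per_class_metrics(cm):
    """Return (precision, recall, f1, support) for each class."""
    n = len(cm)
    rows = []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[i][k] for i in range(n))  # column sum
        actual = sum(cm[k])                          # row sum (support)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((precision, recall, f1, actual))
    return rows


rows = per_class_metrics(cm)
total = sum(r[3] for r in rows)
accuracy = sum(cm[k][k] for k in range(len(cm))) / total
macro_f1 = sum(r[2] for r in rows) / len(rows)
weighted_f1 = sum(r[2] * r[3] for r in rows) / total

for label, (p, r, f1, support) in enumerate(rows):
    print(f"{label}  {p:.2f}  {r:.2f}  {f1:.2f}  {support}")
print(f"accuracy {accuracy:.2f}  macro f1 {macro_f1:.2f}  weighted f1 {weighted_f1:.2f}")
```

For class 1, recall is 7/8 = 0.875, and for class 2, precision is 12/13 ≈ 0.92, which matches the rounded values in the report. The macro average weights every class equally, while the weighted average weights each class by its support.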

Total running time of the script: (0 minutes 8.023 seconds)
