Quickstart example
A quick example of using the library to optimize a decision tree classifier. First, we import the estimator, the dataset loader, and the mloptimizer interfaces.
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from mloptimizer.interfaces import HyperparameterSpaceBuilder, GeneticSearch
Load the iris dataset to obtain a feature matrix X and a label vector y. Another dataset, including a custom one, can be used instead.
Split the dataset into training and test sets.
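A minimal sketch of these two steps with scikit-learn. The 80/20 split is an assumption, chosen because it is consistent with the 30 test samples in the classification report at the end of this example; the `random_state` value is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Feature matrix X (150 samples x 4 features) and label vector y (150 labels)
X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples for testing.
# Assumptions: test_size=0.2 and random_state=42 are illustrative choices.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```

Fixing `random_state` makes the split reproducible between runs; a different seed gives a different split and therefore slightly different scores.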
Define the HyperparameterSpace. You can use the default hyperparameters for the machine learning model that you want to optimize; in this case we use the defaults for a DecisionTreeClassifier. A custom hyperparameter space can be used instead.
hyperparam_space = HyperparameterSpaceBuilder.get_default_space(estimator_class=DecisionTreeClassifier)
The GeneticSearch class is the main wrapper for the optimization of a machine learning model.
opt = GeneticSearch(
    estimator_class=DecisionTreeClassifier, hyperparam_space=hyperparam_space,
    **{"generations": 30, "population_size": 100}
)
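For intuition about what `generations` and `population_size` control, here is a toy genetic search written with the standard library only. It is not mloptimizer's implementation: the function `toy_genetic_search`, the selection and mutation scheme, and the fitness function are all hypothetical and chosen for brevity.

```python
import random


def toy_genetic_search(fitness, bounds, generations=30, population_size=100, seed=0):
    """Toy genetic search over integer hyperparameters (illustrative only)."""
    rng = random.Random(seed)
    names = list(bounds)

    def random_individual():
        # One candidate: a value drawn uniformly from each parameter's range
        return {n: rng.randint(*bounds[n]) for n in names}

    def crossover(a, b):
        # Each parameter is inherited from one of the two parents
        return {n: rng.choice((a[n], b[n])) for n in names}

    def mutate(ind):
        # Resample a single, randomly chosen parameter
        child = dict(ind)
        n = rng.choice(names)
        child[n] = rng.randint(*bounds[n])
        return child

    population = [random_individual() for _ in range(population_size)]
    for _ in range(generations):
        # Keep the fitter half as parents (truncation selection)
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        # Refill the population with mutated crossovers of random parent pairs
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents)))
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


# Hypothetical fitness that peaks at max_depth=5, min_samples_split=10
def fitness(ind):
    return -abs(ind["max_depth"] - 5) - abs(ind["min_samples_split"] - 10)


bounds = {"max_depth": (1, 20), "min_samples_split": (2, 50)}
best = toy_genetic_search(fitness, bounds)
print(best, fitness(best))
```

In a real search the fitness of an individual would be a cross-validated score of the estimator built with those hyperparameters, so the number of model evaluations grows roughly with `generations` times `population_size`.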
To optimize the classifier we need to call the optimize_clf method. The method returns the classifier configured with the best hyperparameters found.
WARNING:root:The folder . already exists and it will be used
INFO:mloptimizer.log:Initiating genetic optimization...
INFO:mloptimizer.log:Algorithm: Optimizer
Genetic execution: 100%|██████████| 31/31 [00:06<00:00, 4.86it/s, best fitness=0.98]
DecisionTreeClassifier(ccp_alpha=0.0025, max_depth=12,
                       min_impurity_decrease=0.006, min_samples_split=18,
                       random_state=619697)
Use the classifier with the best hyperparameters found to predict on the test set, then show the classification report and the confusion matrix.
from sklearn.metrics import classification_report, ConfusionMatrixDisplay
import matplotlib.pyplot as plt
# Predict on the held-out test set with the best estimator found
y_pred = opt.predict(X_test)
print(classification_report(y_test, y_pred))
# from_predictions computes the confusion matrix and draws it, so a separate
# confusion_matrix() call or a second disp.plot() is not needed
disp = ConfusionMatrixDisplay.from_predictions(
    y_test, y_pred, display_labels=opt.best_estimator_.classes_,
    cmap=plt.cm.Blues
)
plt.show()
del opt
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      0.88      0.93         8
           2       0.92      1.00      0.96        12

    accuracy                           0.97        30
   macro avg       0.97      0.96      0.96        30
weighted avg       0.97      0.97      0.97        30
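To see where the numbers in the report come from, the sketch below recomputes them from a confusion matrix using plain Python. The matrix is inferred from the report itself (the supports, the class 1 recall of 7/8, and the class 2 precision of 12/13), not read from the plot, so treat it as an assumption.

```python
# Confusion matrix inferred from the report above
# (rows: true class, columns: predicted class).
cm = [
    [10, 0, 0],
    [0, 7, 1],
    [0, 0, 12],
]

n_classes = len(cm)
total = sum(sum(row) for row in cm)

rows = []
for k in range(n_classes):
    true_pos = cm[k][k]
    support = sum(cm[k])                                 # samples whose true class is k
    predicted = sum(cm[i][k] for i in range(n_classes))  # samples predicted as class k
    precision = true_pos / predicted
    recall = true_pos / support
    f1 = 2 * precision * recall / (precision + recall)
    rows.append((precision, recall, f1, support))
    print(f"class {k}: precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1:.2f} support={support}")

# Accuracy is the share of samples on the diagonal
accuracy = sum(cm[k][k] for k in range(n_classes)) / total
# Macro average: unweighted mean over classes
macro_f1 = sum(r[2] for r in rows) / n_classes
# Weighted average: mean over classes weighted by support
weighted_f1 = sum(r[2] * r[3] for r in rows) / total
print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f} weighted_f1={weighted_f1:.2f}")
```

The single off-diagonal entry, one class 1 sample predicted as class 2, accounts for every value in the report that is below 1.00.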
Total running time of the script: (0 minutes 8.023 seconds)

