Evolution (logbook) graph

mloptimizer provides a function to plot the evolution of the fitness function.

from mloptimizer.core import Optimizer
from mloptimizer.hyperparams import HyperparameterSpace
from sklearn.tree import DecisionTreeClassifier
from mloptimizer.aux.plots import plotly_logbook
import plotly
import os
from sklearn.datasets import load_iris

Load the iris dataset to obtain a feature matrix X and a vector of labels y. Any other dataset, including a custom one, can be used instead.

X, y = load_iris(return_X_y=True)

Define the HyperparameterSpace. You can use the default hyperparameters for the machine learning model that you want to optimize; in this case we use the default hyperparameters for a DecisionTreeClassifier.

hyperparam_space = HyperparameterSpace.get_default_hyperparameter_space(DecisionTreeClassifier)

We use the Optimizer class, passing DecisionTreeClassifier as the estimator_class, to optimize a decision tree classifier.

opt = Optimizer(estimator_class=DecisionTreeClassifier, features=X, labels=y,
                hyperparam_space=hyperparam_space, folder="Evolution_example")
ERROR:root:The folder Evolution_example could not be created.

To optimize the classifier we need to call the optimize_clf method. The first argument is the number of generations and the second is the number of individuals in each generation.

clf = opt.optimize_clf(10, 10)
INFO:mloptimizer.log:Initiating genetic optimization...
INFO:mloptimizer.log:Algorithm: Optimizer

Genetic execution:   0%|          | 0/11 [00:00<?, ?it/s]
Genetic execution:  55%|█████▍    | 6/11 [00:00<00:00, 51.16it/s]
Genetic execution: 100%|██████████| 11/11 [00:00<00:00, 45.31it/s]
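
To make the two arguments concrete, here is a toy generational loop in plain Python. It is only an illustrative sketch and not mloptimizer's algorithm: the function toy_evolution, its selection rule and its mutation step are all hypothetical, and each individual is represented directly by its fitness value. The progress bar above counts 11 steps for 10 generations, which would be consistent with an initial population plus one step per generation; that reading is an assumption, and the sketch mirrors it.

```python
import random


def toy_evolution(generations, population_size, seed=0):
    """Toy loop (hypothetical, not mloptimizer's implementation)."""
    rng = random.Random(seed)
    # Generation 0: a random initial population of fitness values in [0, 1).
    population = [rng.random() for _ in range(population_size)]
    history = [list(population)]
    for _ in range(generations):
        # Keep the better half of the population unchanged ...
        survivors = sorted(population, reverse=True)[: max(1, population_size // 2)]
        # ... and refill the rest with slightly mutated copies, clipped to [0, 1].
        children = [
            min(1.0, max(0.0, rng.choice(survivors) + rng.uniform(-0.05, 0.05)))
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
        history.append(list(population))
    return history


history = toy_evolution(generations=10, population_size=10)
print(len(history))  # 11: the initial population plus 10 generations
```

Because the best individuals are carried over unchanged, the best fitness in this toy loop never decreases from one generation to the next.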

We can plot the evolution of the fitness function. The black lines represent the max and min fitness values across all generations. The green, red and blue lines are, respectively, the max, min and avg fitness values for each generation. Each grey point in the graph represents an individual.

population_df = opt.runs[-1].population_2_df()
g_logbook = plotly_logbook(opt.logbook, population_df)
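
The quantities drawn in the graph can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the code plotly_logbook runs: the helper fitness_summary and the (generation, fitness) records are hypothetical. It computes the per-generation max, min and avg (the coloured lines) and the overall max and min (the black lines).

```python
from collections import defaultdict
from statistics import mean


def fitness_summary(records):
    """Summarise (generation, fitness) pairs per generation and overall."""
    by_generation = defaultdict(list)
    for generation, fitness in records:
        by_generation[generation].append(fitness)

    # Per-generation statistics: one max/min/avg triple for each generation.
    per_generation = {
        gen: {"max": max(vals), "min": min(vals), "avg": mean(vals)}
        for gen, vals in sorted(by_generation.items())
    }
    # Overall bounds: max and min across every individual in every generation.
    all_values = [fitness for _, fitness in records]
    overall = {"max": max(all_values), "min": min(all_values)}
    return per_generation, overall


# Hypothetical fitness values for two generations of three individuals each.
records = [(0, 0.5), (0, 0.75), (0, 1.0), (1, 0.75), (1, 1.0), (1, 1.0)]
per_generation, overall = fitness_summary(records)
print(per_generation[0])
print(overall)
```

Each record corresponds to one grey point in the graph; the per-generation dictionary holds the values that the three coloured lines would connect.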

At the end of the evolution, the graph is saved as an HTML file at the path:

['logbook.html', 'search_space.html']

The data used to generate the graph is available at the path:


del opt
['populations.csv', 'logbook.csv']

Total running time of the script: (0 minutes 3.966 seconds)

