Linear Models Optimization

Hyperparameter optimization for Ridge and ElasticNet regression on the Diabetes dataset, using a genetic search.

from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge, Lasso, ElasticNet
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np
import plotly.io  # import the submodule explicitly; plotly.io.show is used below
from mloptimizer.interfaces import HyperparameterSpaceBuilder, GeneticSearch
from mloptimizer.application.reporting.plots import plotly_search_space, plotly_logbook

Load and prepare the dataset

print("Loading Diabetes dataset...")
data = load_diabetes()
X, y = data.data, data.target

print(f"Dataset shape: {X.shape}")
Loading Diabetes dataset...
Dataset shape: (442, 10)

Split the data

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

Optimize Ridge regression

print("\n=== Ridge Regression ===")
hyperparam_space = HyperparameterSpaceBuilder.get_default_space(
    estimator_class=Ridge
)

opt = GeneticSearch(
    estimator_class=Ridge,
    hyperparam_space=hyperparam_space,
    generations=5,
    population_size=8,
    seed=42,
    use_mlflow=False,
    use_parallel=False
)

opt.fit(X_train, y_train)
y_pred = opt.best_estimator_.predict(X_test)
print(f"Ridge - Best params: {opt.best_params_}")
print(f"Ridge - Test R2: {r2_score(y_test, y_pred):.4f}")
=== Ridge Regression ===
/home/docs/checkouts/readthedocs.org/user_builds/mloptimizer/checkouts/master/examples/plot_linear_models.py:37: UserWarning: Expected mutations per offspring is very low (0.32). With mutpb=0.8, indpb=0.2, and 2 hyperparameters, the population will converge prematurely. Recommended: mutpb >= 0.8, indpb >= 0.2 (gives ~0.3 mutations/offspring).
  opt = GeneticSearch(
/home/docs/checkouts/readthedocs.org/user_builds/mloptimizer/checkouts/master/examples/plot_linear_models.py:37: UserWarning: Some hyperparameters have very small integer ranges (< 10 distinct values): 'solver' (6 values: 0 to 5). Small ranges limit search granularity. Consider increasing the range or scale for float types.
  opt = GeneticSearch(

Genetic execution:   0%|          | 0/6 [00:00<?, ?it/s, best fitness=?]
Genetic execution:  17%|█▋        | 1/6 [00:00<00:00, 645.97it/s, best fitness=58.2]
Genetic execution:  17%|█▋        | 1/6 [00:00<00:00, 226.76it/s, best fitness=54.6]
Genetic execution: 100%|██████████| 6/6 [00:00<00:00, 119.07it/s, best fitness=54.6]
Ridge - Best params: {'alpha': 0.26, 'copy_X': True, 'fit_intercept': True, 'max_iter': None, 'positive': False, 'random_state': 42, 'solver': 'saga', 'tol': 0.0001}
Ridge - Test R2: 0.4599
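
The first warning above reports 0.32 expected mutations per offspring. That figure is consistent with the product mutpb × indpb × number of hyperparameters (0.8 × 0.2 × 2), and the 0.48 reported for the ElasticNet run matches the same product with 3 hyperparameters. The following is a minimal sketch of that arithmetic, assuming this is how the value is derived. The helper names `expected_mutations` and `indpb_for_target` are hypothetical and not part of mloptimizer.

```python
def expected_mutations(mutpb: float, indpb: float, n_hyperparams: int) -> float:
    """Expected number of mutated hyperparameters per offspring (assumed formula)."""
    return mutpb * indpb * n_hyperparams


def indpb_for_target(target: float, mutpb: float, n_hyperparams: int) -> float:
    """Per-hyperparameter mutation probability needed for `target` mutations, capped at 1."""
    return min(1.0, target / (mutpb * n_hyperparams))


# Values taken from the two warnings in this example
ridge_rate = expected_mutations(mutpb=0.8, indpb=0.2, n_hyperparams=2)
elasticnet_rate = expected_mutations(mutpb=0.8, indpb=0.2, n_hyperparams=3)
print(f"Ridge: {ridge_rate:.2f}, ElasticNet: {elasticnet_rate:.2f}")

# indpb that would give about one mutation per offspring with mutpb=0.8
ridge_indpb = indpb_for_target(1.0, mutpb=0.8, n_hyperparams=2)
print(f"indpb for ~1 mutation with 2 hyperparameters: {ridge_indpb:.3f}")
```

How mutpb and indpb are passed to GeneticSearch is not shown in this example, so check the mloptimizer documentation before changing them.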

Visualize Ridge optimization

population_df = opt.populations_
g_logbook = plotly_logbook(opt.logbook_, population_df)
g_logbook.update_layout(
    title="Ridge Regression Optimization Evolution",
    autosize=True,
    width=None,
    height=500
)
plotly.io.show(g_logbook, config={'responsive': True})

Optimize ElasticNet

print("\n=== ElasticNet Regression ===")
hyperparam_space = HyperparameterSpaceBuilder.get_default_space(
    estimator_class=ElasticNet
)

opt = GeneticSearch(
    estimator_class=ElasticNet,
    hyperparam_space=hyperparam_space,
    generations=5,
    population_size=8,
    seed=42,
    use_mlflow=False,
    use_parallel=False
)

opt.fit(X_train, y_train)
y_pred = opt.best_estimator_.predict(X_test)
print(f"ElasticNet - Best params: {opt.best_params_}")
print(f"ElasticNet - Test R2: {r2_score(y_test, y_pred):.4f}")
=== ElasticNet Regression ===
/home/docs/checkouts/readthedocs.org/user_builds/mloptimizer/checkouts/master/examples/plot_linear_models.py:71: UserWarning: Expected mutations per offspring is very low (0.48). With mutpb=0.8, indpb=0.2, and 3 hyperparameters, the population will converge prematurely. Recommended: mutpb >= 0.8, indpb >= 0.2 (gives ~0.5 mutations/offspring).
  opt = GeneticSearch(
/home/docs/checkouts/readthedocs.org/user_builds/mloptimizer/checkouts/master/examples/plot_linear_models.py:71: UserWarning: Some hyperparameters have very small integer ranges (< 10 distinct values): 'selection' (2 values: 0 to 1). Small ranges limit search granularity. Consider increasing the range or scale for float types.
  opt = GeneticSearch(

Genetic execution:   0%|          | 0/6 [00:00<?, ?it/s, best fitness=?]
Genetic execution:  17%|█▋        | 1/6 [00:00<00:00, 373.26it/s, best fitness=73.8]
Genetic execution:  17%|█▋        | 1/6 [00:00<00:00, 156.00it/s, best fitness=69.4]
Genetic execution:  17%|█▋        | 1/6 [00:00<00:00, 129.98it/s, best fitness=68.7]
Genetic execution:  17%|█▋        | 1/6 [00:00<00:00, 98.35it/s, best fitness=58.2]
Genetic execution:  33%|███▎      | 2/6 [00:00<00:00, 137.47it/s, best fitness=53.8]
Genetic execution: 100%|██████████| 6/6 [00:00<00:00, 140.47it/s, best fitness=53.8]
ElasticNet - Best params: {'alpha': 0.0001, 'copy_X': True, 'fit_intercept': True, 'l1_ratio': 0.095, 'max_iter': 2000, 'positive': False, 'precompute': False, 'random_state': 42, 'selection': 'cyclic', 'tol': 0.0001, 'warm_start': False}
ElasticNet - Test R2: 0.4579
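
The test R2 values reported above (0.4599 for Ridge, 0.4579 for ElasticNet) are easier to interpret next to an untuned baseline. The sketch below is one way to get that baseline with scikit-learn alone, fitting each estimator with its default hyperparameters on the same split. The `baseline_scores` dictionary is a name introduced here for illustration, and no baseline numbers are claimed.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import ElasticNet, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Same dataset and split as in the example above
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

baseline_scores = {}
for estimator_class in (Ridge, ElasticNet):
    # Default hyperparameters, no search
    model = estimator_class()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # RMSE is computed as the square root of the mean squared error
    rmse = mean_squared_error(y_test, y_pred) ** 0.5
    baseline_scores[estimator_class.__name__] = {
        "r2": r2_score(y_test, y_pred),
        "rmse": rmse,
    }
    print(
        f"{estimator_class.__name__} baseline - "
        f"Test R2: {baseline_scores[estimator_class.__name__]['r2']:.4f}, "
        f"RMSE: {rmse:.2f}"
    )
```

A single 80/20 split of 442 samples leaves a small test set, so small differences between tuned and baseline scores should be read with caution.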

Visualize ElasticNet optimization

population_df = opt.populations_
g_logbook = plotly_logbook(opt.logbook_, population_df)
g_logbook.update_layout(
    title="ElasticNet Regression Optimization Evolution",
    autosize=True,
    width=None,
    height=500
)
plotly.io.show(g_logbook, config={'responsive': True})

Analyze optimization performance of the ElasticNet run (`opt` now refers to the most recent search)

print("\n=== Optimization Performance ===")
print(f"Unique evaluations performed: {opt.n_trials_}")
print(f"Total individuals in population history: {len(opt.populations_)}")
print(f"Optimization time: {opt.optimization_time_:.4f} seconds")
print(f"Time per evaluation: {opt.optimization_time_ / opt.n_trials_:.4f} seconds")
=== Optimization Performance ===
Unique evaluations performed: 28
Total individuals in population history: 48
Optimization time: 0.0440 seconds
Time per evaluation: 0.0016 seconds
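
The reported figures can be cross-checked with a little arithmetic. The sketch below assumes the population history holds the initial population plus one population per generation, which is consistent with the 0/6 progress bar and the 48 individuals reported. Reading the gap between 48 and 28 as individuals that did not need a new evaluation is an interpretation, not something the output confirms.

```python
generations = 5
population_size = 8

# Assumption: initial population plus one population per generation
history_size = population_size * (generations + 1)

n_trials = 28               # "Unique evaluations performed" from the output above
optimization_time = 0.0440  # seconds, from the output above

# Individuals in the history that were not counted as unique evaluations
repeated = history_size - n_trials
time_per_eval = optimization_time / n_trials

print(f"History size: {history_size}")
print(f"Individuals not counted as unique evaluations: {repeated}")
print(f"Time per evaluation: {time_per_eval:.4f} seconds")
```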

Total running time of the script: (0 minutes 0.330 seconds)
