=========================== Metrics and score functions =========================== .. note:: This section provides an overview of metrics and score functions and how they are used in the context of genetic optimization. It also provides examples of how to use the score functions provided by the library. Introduction ------------ Metrics and score functions are used to evaluate the performance of machine learning algorithms. They take as input the true labels\values of the data, the predicted labels\values of the data, a metric (e.g. accuracy, precision, rmse, mse, etc.) and a scoring function or strategy (train/test split, k-fold cross validation, stratified cross validation, time series split, etc.). The score function then calculates a score that quantifies how well the machine learning algorithm performed on the given data. Two main types of metrics are commonly used in machine learning: - Classification metrics: These are used to evaluate the performance of classification algorithms. They include metrics such as accuracy, precision, recall, F1 score, etc. - Regression metrics: These are used to evaluate the performance of regression algorithms. They include metrics such as mean squared error, mean absolute error, R-squared, etc. In the context of genetic optimization, the fitness score is the metric aimed to be maximized. When passing a classification or regression estimator mloptimizer will use by default the following score functions: - Classification: balanced_accuracy_score - Regression: rmse In the case of the regression metrics, we maximize the negative of the metric, so the optimization is done in the same way as in the classification case. However, the user can pass any score function. Metrics ------- The `metrics` input argument in the `Optimizer` class is a dictionary that maps a metric name to a metric function. This function can be one of the metrics provided by the `sklearn.metrics` module, or a custom metric function that shoutd comply with the sklearn library metric functions. Here's an example of how to use the `Optimizer` class with custom metrics: .. code-block:: python from sklearn.metrics import balanced_accuracy_score, mean_squared_error from mloptimizer.application import Optimizer from mloptimizer.domain.hyperspace import HyperparameterSpace from sklearn.ensemble import RandomForestRegressor regression_metrics = { "mse": mean_squared_error, "rmse": root_mean_squared_error } evolvable_hyperparams = HyperparameterSpace.get_default_hyperparameter_space(RandomForestRegressor) mlopt = Optimizer(estimator_class=RandomForestRegressor, hyperparam_space=evolvable_hyperparams, fitness_score='rmse', metrics=regression_metrics, features=X, labels=y) Score Functions --------------- Not only the metric should be defined, how the data is used or split to calculate the metric is also important. The `model_evaluation.py` module provides score functions that can be used to evaluate the performance of machine learning algorithms. These score functions take a estimator, features, labels, and a score metric as input, and return a score that quantifies how well the classifier performed on the given data. The `model_evaluation.py` module provides the following score functions: - `train_score`: This function trains a classifier with the provided features and labels, and then calculates the score over the train data. - `train_test_score`: This function splits the provided features and labels into a training set and a test set. Then, it trains an estimator on the training set and calculates a score on the test set using the provided score function. - `kfold_score`: This function evaluates an estimator using K-Fold cross-validation. It splits the provided features and labels into K folds, trains an estimator on K-1 folds, and calculates a score on the remaining fold. This process is repeated K times, and the function returns the average score across all folds. - `kfold_stratified_score`: This function is similar to `kfold_score`, but it uses stratified K-Fold cross-validation. This means that it preserves the percentage of samples for each class in each fold. For classification problems, this can help ensure that each fold has a representative sample of each class. - `temporal_kfold_score`: This function is similar to `kfold_score`, but it uses temporal K-Fold cross-validation. This means that it respects the order of the data, making it suitable for time series data in order to avoid look-ahead bias. Each of these score functions takes a classifier, features, and labels as input. They also take a score metric as input, which is used to calculate the score. The score function could be any function that takes the true labels\values and the predicted labels\values as input and returns a score. Examples of score functions include accuracy, precision, recall, F1 score, etc. Examples -------- Here's an example of how to use the `train_score` function: .. code-block:: python from mloptimizer.domain.evaluation import model_evaluation from sklearn.ensemble import RandomForestClassifier from sklearn.metrics import accuracy_score # Define features, labels, and classifier from sklearn.datasets import load_iris features, labels = load_iris(return_X_y=True) clf = RandomForestClassifier() # Use the train_score function score = model_evaluation.train_score(features, labels, clf, metrics={"accuracy": accuracy_score}) In this example, we first define the features, labels, and classifier. We then use the `train_score` function to train the classifier and calculate the score. The `accuracy_score` function from `sklearn.metrics` is used as the score function.