Hyperparameter tuning#
Previous notebooks showed how model parameters impact statistical performance. We want to optimize these parameters to achieve the best possible model performance. This optimization process is called hyperparameter tuning.
This notebook demonstrates several methods to tune model hyperparameters.
Introductory example#
We revisit an example from the linear models notebook about the impact of the \(\alpha\) parameter in a `Ridge` model. The \(\alpha\) parameter controls the regularization strength of the model. No general rule exists for selecting a good \(\alpha\) value: it depends on the specific dataset.
Let’s load a dataset for regression:
# When using JupyterLite, uncomment and install the `skrub` and `pyodide-http` packages.
%pip install skrub
%pip install pyodide-http
import matplotlib.pyplot as plt
import skrub
# import pyodide_http
# pyodide_http.patch_all()
skrub.patch_display() # makes nice display for pandas tables
from sklearn.datasets import fetch_california_housing
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
X
| | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude |
|---|---|---|---|---|---|---|---|---|
| 0 | 8.3252 | 41.0 | 6.9841 | 1.0238 | 322.0 | 2.5556 | 37.88 | -122.23 |
| 1 | 8.3014 | 21.0 | 6.2381 | 0.9719 | 2401.0 | 2.1098 | 37.86 | -122.22 |
| 2 | 7.2574 | 52.0 | 8.2881 | 1.0734 | 496.0 | 2.8023 | 37.85 | -122.24 |
| 3 | 5.6431 | 52.0 | 5.8174 | 1.0731 | 558.0 | 2.5479 | 37.85 | -122.25 |
| 4 | 3.8462 | 52.0 | 6.2819 | 1.0811 | 565.0 | 2.1815 | 37.85 | -122.25 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 20635 | 1.5603 | 25.0 | 5.0455 | 1.1333 | 845.0 | 2.5606 | 39.48 | -121.09 |
| 20636 | 2.5568 | 18.0 | 6.1140 | 1.3158 | 356.0 | 3.1228 | 39.49 | -121.21 |
| 20637 | 1.7000 | 17.0 | 5.2055 | 1.1201 | 1007.0 | 2.3256 | 39.43 | -121.22 |
| 20638 | 1.8672 | 18.0 | 5.3295 | 1.1719 | 741.0 | 2.1232 | 39.43 | -121.32 |
| 20639 | 2.3886 | 16.0 | 5.2547 | 1.1623 | 1387.0 | 2.6170 | 39.37 | -121.24 |
| Column | Column name | dtype | Null values | Unique values | Mean | Std | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|---|
| 0 | MedInc | Float64DType | 0 (0.0%) | 12928 (62.6%) | 3.87 | 1.90 | 0.500 | 3.53 | 15.0 |
| 1 | HouseAge | Float64DType | 0 (0.0%) | 52 (0.3%) | 28.6 | 12.6 | 1.00 | 29.0 | 52.0 |
| 2 | AveRooms | Float64DType | 0 (0.0%) | 19392 (94.0%) | 5.43 | 2.47 | 0.846 | 5.23 | 142. |
| 3 | AveBedrms | Float64DType | 0 (0.0%) | 14233 (69.0%) | 1.10 | 0.474 | 0.333 | 1.05 | 34.1 |
| 4 | Population | Float64DType | 0 (0.0%) | 3888 (18.8%) | 1.43e+03 | 1.13e+03 | 3.00 | 1.17e+03 | 3.57e+04 |
| 5 | AveOccup | Float64DType | 0 (0.0%) | 18841 (91.3%) | 3.07 | 10.4 | 0.692 | 2.82 | 1.24e+03 |
| 6 | Latitude | Float64DType | 0 (0.0%) | 862 (4.2%) | 35.6 | 2.14 | 32.5 | 34.3 | 42.0 |
| 7 | Longitude | Float64DType | 0 (0.0%) | 844 (4.1%) | -120. | 2.00 | -124. | -118. | -114. |
y
0 4.526
1 3.585
2 3.521
3 3.413
4 3.422
...
20635 0.781
20636 0.771
20637 0.923
20638 0.847
20639 0.894
Name: MedHouseVal, Length: 20640, dtype: float64
Now we define a `Ridge` model that processes data by adding feature interactions using a `PolynomialFeatures` transformer.
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
model = Pipeline(
[
("poly", PolynomialFeatures()),
("scaler", StandardScaler()),
("ridge", Ridge()),
]
)
model
Pipeline(steps=[('poly', PolynomialFeatures()), ('scaler', StandardScaler()), ('ridge', Ridge())])
We start with scikit-learn’s default parameters. Let’s evaluate this basic model:
import pandas as pd
from sklearn.model_selection import KFold, cross_validate
cv = KFold(n_splits=10, shuffle=True, random_state=42)
cv_results = cross_validate(model, X, y, cv=cv)
cv_results = pd.DataFrame(cv_results)
cv_results
| | fit_time | score_time | test_score |
|---|---|---|---|
| 0 | 0.0187 | 0.0030 | 0.6399 |
| 1 | 0.0199 | 0.0028 | 0.6187 |
| 2 | 0.0195 | 0.0027 | 0.6759 |
| 3 | 0.0195 | 0.0027 | 0.6193 |
| 4 | 0.0193 | 0.0027 | 0.6694 |
| 5 | 0.0245 | 0.0027 | 0.6482 |
| 6 | 0.0197 | 0.0027 | -6.6944 |
| 7 | 0.0198 | 0.0028 | 0.6978 |
| 8 | 0.0197 | 0.0027 | 0.6641 |
| 9 | 0.0194 | 0.0026 | 0.3300 |
cv_results.aggregate(["mean", "std"])
| | fit_time | score_time | test_score |
|---|---|---|---|
| mean | 0.0200 | 0.0028 | -0.1131 |
| std | 0.0016 | 0.0001 | 2.3148 |
Nothing indicates our pipeline achieves optimal performance. The `PolynomialFeatures` degree might need adjustment, or the `Ridge` regressor might need different regularization. Let's examine which parameters we could tune:
for params in model.get_params():
    print(params)
memory
steps
verbose
poly
scaler
ridge
poly__degree
poly__include_bias
poly__interaction_only
poly__order
scaler__copy
scaler__with_mean
scaler__with_std
ridge__alpha
ridge__copy_X
ridge__fit_intercept
ridge__max_iter
ridge__positive
ridge__random_state
ridge__solver
ridge__tol
Two key parameters are `poly__degree` and `ridge__alpha`. We will find their optimal values for this dataset.
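As an illustration of this double-underscore convention, nested parameters can be set directly on the pipeline; the values below are arbitrary and only demonstrate the syntax:

# Arbitrary values: set nested hyperparameters with the "<step>__<parameter>" syntax.
model.set_params(poly__degree=2, ridge__alpha=0.5)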
Manual hyperparameter search#
Before exploring scikit-learn’s automated tuning tools, we implement a simplified manual version.
EXERCISE:

- Create nested `for` loops to try all parameter combinations defined in `parameter_grid`
- In the inner loop, use cross-validation on the training set to get an array of scores
- Compute the mean and standard deviation of the cross-validation scores to find the best hyperparameters
- Train a model with the best hyperparameters and evaluate it on the test set
# Write your code here.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
parameter_grid = {
"poly__degree": [1, 2, 3],
"ridge__alpha": [0.01, 0.1, 1, 10],
}
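One possible sketch of this manual search, reusing the `cv` splitter and the pipeline defined above (many variants are valid):

best_params, best_mean_score = None, -float("inf")
for degree in parameter_grid["poly__degree"]:
    for alpha in parameter_grid["ridge__alpha"]:
        # Inner loop: cross-validate this combination on the training set only.
        model.set_params(poly__degree=degree, ridge__alpha=alpha)
        scores = cross_validate(model, X_train, y_train, cv=cv)["test_score"]
        print(f"degree={degree}, alpha={alpha}: {scores.mean():.3f} +/- {scores.std():.3f}")
        if scores.mean() > best_mean_score:
            best_params = {"poly__degree": degree, "ridge__alpha": alpha}
            best_mean_score = scores.mean()

# Refit with the best combination and evaluate once on the left-out test set.
model.set_params(**best_params)
model.fit(X_train, y_train)
print(best_params, model.score(X_test, y_test))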
Hyperparameter search using a grid#
Our manual search implements a grid search: trying every possible parameter combination. Scikit-learn provides `GridSearchCV` to automate this process. During fitting, it performs cross-validation and selects the optimal hyperparameters.
from sklearn.model_selection import GridSearchCV
search_cv = GridSearchCV(model, param_grid=parameter_grid)
search_cv.fit(X_train, y_train)
GridSearchCV(estimator=Pipeline(steps=[('poly', PolynomialFeatures()), ('scaler', StandardScaler()), ('ridge', Ridge())]), param_grid={'poly__degree': [1, 2, 3], 'ridge__alpha': [0.01, 0.1, 1, 10]})
The `best_params_` attribute shows the optimal parameters found:
search_cv.best_params_
{'poly__degree': 1, 'ridge__alpha': 0.01}
The `cv_results_` attribute provides details about all hyperparameter combinations tried during fitting:
cv_results = pd.DataFrame(search_cv.cv_results_)
cv_results
mean_fit_time | std_fit_time | mean_score_time | std_score_time | param_poly__degree | param_ridge__alpha | params | split0_test_score | split1_test_score | split2_test_score | split3_test_score | split4_test_score | mean_test_score | std_test_score | rank_test_score | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.00489354133605957 | 0.0003960187286927179 | 0.0012660503387451171 | 2.889782678243082e-05 | 1 | 0.01 | {'poly__degree': 1, 'ridge__alpha': 0.01} | 0.6006669698467644 | 0.6195832769358425 | 0.6043652050440378 | 0.610342428232162 | 0.6008419080639655 | 0.6071599576245544 | 0.007132380589920894 | 1 |
1 | 0.004678249359130859 | 3.1259212565534815e-05 | 0.0012323379516601563 | 8.839382887357707e-06 | 1 | 0.1 | {'poly__degree': 1, 'ridge__alpha': 0.1} | 0.6006674922391301 | 0.6195818885036616 | 0.6043640943149728 | 0.6103430627707596 | 0.6008427468003833 | 0.6071598569257814 | 0.007131796913965441 | 2 |
2 | 0.004703521728515625 | 4.9519716803387654e-05 | 0.0012369155883789062 | 1.6005683994190612e-05 | 1 | 1.0 | {'poly__degree': 1, 'ridge__alpha': 1} | 0.6006726008885412 | 0.6195679077514673 | 0.6043528934661087 | 0.6103492989236327 | 0.6008510147942704 | 0.607158743164804 | 0.007125970574526951 | 3 |
3 | 0.00465850830078125 | 2.9205799098890593e-05 | 0.0012303829193115235 | 1.1889989757925948e-05 | 1 | 10.0 | {'poly__degree': 1, 'ridge__alpha': 10} | 0.6007124027499688 | 0.619418684865192 | 0.6042317421971115 | 0.6104009819271281 | 0.6009220083375948 | 0.6071371640153991 | 0.0070687404126070314 | 4 |
4 | 0.01208944320678711 | 0.00023756706767123664 | 0.0024726390838623047 | 1.8272882706031533e-05 | 2 | 0.01 | {'poly__degree': 2, 'ridge__alpha': 0.01} | 0.34439818963562785 | 0.6795898872720656 | -83.25638601643489 | 0.661788106621743 | -0.5015587337774408 | -16.414433713336578 | 33.42372695873222 | 9 |
7 | 0.012063169479370117 | 5.216315190294471e-05 | 0.002458047866821289 | 1.852891274464732e-05 | 2 | 10.0 | {'poly__degree': 2, 'ridge__alpha': 10} | 0.6360666320573516 | 0.6614254980059 | -3.755639939454153 | 0.6202921233200607 | 0.5358859436497205 | -0.26039394848422404 | 1.7481308203604538 | 6 |
8 | 0.04103531837463379 | 0.00039803279086202755 | 0.006585311889648437 | 4.4458359373772325e-05 | 3 | 0.01 | {'poly__degree': 3, 'ridge__alpha': 0.01} | -0.7791345695282557 | 0.635723083061451 | -310146.52431198815 | -12.837606958033792 | 0.5574330392879037 | -62031.789579478675 | 124057.36746906178 | 12 |
9 | 0.04113626480102539 | 0.0003634544657675747 | 0.0065863609313964845 | 1.9255839308954993e-05 | 3 | 0.1 | {'poly__degree': 3, 'ridge__alpha': 0.1} | 0.188162821072569 | 0.640981127131061 | -21166.11794008665 | -8.557584801002648 | -0.7411898172164526 | -4234.917514151333 | 8465.600877943767 | 11 |
10 | 0.04114651679992676 | 0.0004601394079351644 | 0.00659632682800293 | 3.426200244554189e-05 | 3 | 1.0 | {'poly__degree': 3, 'ridge__alpha': 1} | 0.6160114071632994 | 0.673883757075249 | -262.322392563248 | -3.070713254080597 | 0.3521583244401585 | -52.75021046572998 | 104.79551612923026 | 10 |
11 | 0.04080071449279785 | 7.681750421851132e-05 | 0.0065539836883544925 | 4.682934194856267e-05 | 3 | 10.0 | {'poly__degree': 3, 'ridge__alpha': 10} | 0.6705097321118163 | 0.6831188832171505 | -14.335144143838047 | -0.2471894629014415 | 0.48894584733730617 | -2.5479518288146434 | 5.903430704200351 | 8 |
When `refit=True` (the default), the search trains a final model using the best parameters. Access this model through `best_estimator_`:
search_cv.best_estimator_
Pipeline(steps=[('poly', PolynomialFeatures(degree=1)), ('scaler', StandardScaler()), ('ridge', Ridge(alpha=0.01))])
The `best_estimator_` handles the `predict` and `score` calls made on the `GridSearchCV` object:
search_cv.score(X_test, y_test)
0.5910512173880501
EXERCISE:

`GridSearchCV` behaves like any classifier or regressor. Use `cross_validate` to evaluate the grid-search model we created.
# Write your code here.
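A minimal sketch of such an evaluation, reusing the `search_cv` and `cv` objects defined above (this is a form of nested cross-validation):

# Outer loop: evaluation cross-validation; inner loop: the grid-search itself.
outer_results = cross_validate(search_cv, X, y, cv=cv)
scores = outer_results["test_score"]
print(f"R2: {scores.mean():.3f} +/- {scores.std():.3f}")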
QUESTION:
What limitations does the grid-search approach have?
Randomized hyperparameter search#
Grid-search has two main limitations:

- It explores only predefined parameter combinations
- Adding parameters or values exponentially increases the search cost
`RandomizedSearchCV` draws parameter values from specified distributions. This allows non-grid exploration of the hyperparameter space with a fixed computational budget.
import numpy as np
from scipy.stats import loguniform
parameter_distributions = {
"poly__degree": np.arange(1, 5),
"ridge__alpha": loguniform(1, 3),
}
from sklearn.model_selection import RandomizedSearchCV
search_cv = RandomizedSearchCV(
model,
param_distributions=parameter_distributions,
n_iter=10,
)
cv_results = cross_validate(search_cv, X, y, cv=cv, return_estimator=True)
cv_results = pd.DataFrame(cv_results)
cv_results
| | fit_time | score_time | estimator | test_score |
|---|---|---|---|---|
| 0 | 4.7333 | 0.0015 | RandomizedSearchCV(...) | 0.5808 |
| 1 | 4.8937 | 0.0015 | RandomizedSearchCV(...) | 0.5703 |
| 2 | 3.4717 | 0.0014 | RandomizedSearchCV(...) | 0.6343 |
| 3 | 2.3472 | 0.0014 | RandomizedSearchCV(...) | 0.5945 |
| 4 | 1.8519 | 0.0015 | RandomizedSearchCV(...) | 0.6156 |
| 5 | 4.1214 | 0.0015 | RandomizedSearchCV(...) | 0.6027 |
| 6 | 1.6395 | 0.0015 | RandomizedSearchCV(...) | 0.5907 |
| 7 | 2.5603 | 0.0014 | RandomizedSearchCV(...) | 0.6399 |
| 8 | 2.6856 | 0.0014 | RandomizedSearchCV(...) | 0.5778 |
| 9 | 2.6311 | 0.0014 | RandomizedSearchCV(...) | 0.5941 |
for est in cv_results["estimator"]:
print(est.best_params_)
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(1.032937010481195)}
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(2.798247993394346)}
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(1.6864144008291795)}
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(1.6094884877789617)}
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(2.413313312127482)}
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(1.5080367811202886)}
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(2.992045601403162)}
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(2.120544440284744)}
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(1.9016419082277325)}
{'poly__degree': np.int64(1), 'ridge__alpha': np.float64(2.5722744412709257)}
Model with internal hyperparameter tuning#
Some estimators embed hyperparameter selection that is more efficient than a grid search. Their names typically end with `CV` (e.g. `RidgeCV`).
EXERCISE:

- Create a pipeline with `PolynomialFeatures`, `StandardScaler`, and `Ridge`
- Create a grid-search with this pipeline and tune `alpha` using `np.logspace(-2, 2, num=50)`
- Fit the grid-search on the training set and time it
- Repeat using `RidgeCV` instead of `Ridge` and remove `GridSearchCV`
- Compare the computational performance of the two approaches
# Write your code here.
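A possible sketch of this comparison (exact timings depend on the machine; `RidgeCV` tunes `alpha` internally with an efficient leave-one-out scheme by default):

import time

import numpy as np
from sklearn.linear_model import RidgeCV

alphas = np.logspace(-2, 2, num=50)

# Approach 1: grid-search over alpha with a plain Ridge inside the pipeline.
grid = GridSearchCV(
    Pipeline([
        ("poly", PolynomialFeatures()),
        ("scaler", StandardScaler()),
        ("ridge", Ridge()),
    ]),
    param_grid={"ridge__alpha": alphas},
)
start = time.perf_counter()
grid.fit(X_train, y_train)
print(f"GridSearchCV: {time.perf_counter() - start:.2f} s")

# Approach 2: RidgeCV selects alpha internally during a single fit.
ridge_cv_model = Pipeline([
    ("poly", PolynomialFeatures()),
    ("scaler", StandardScaler()),
    ("ridge", RidgeCV(alphas=alphas)),
])
start = time.perf_counter()
ridge_cv_model.fit(X_train, y_train)
print(f"RidgeCV: {time.perf_counter() - start:.2f} s")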
Inspection of hyperparameters in cross-validation#
When performing search cross-validation inside evaluation cross-validation, different hyperparameter values may emerge for each split. Let's examine this with `GridSearchCV`:
from sklearn.linear_model import RidgeCV
inner_model = Pipeline(
[
("poly", PolynomialFeatures()),
("scaler", StandardScaler()),
("ridge", Ridge()),
]
)
param_grid = {"poly__degree": [1, 2], "ridge__alpha": np.logspace(-2, 2, num=10)}
model = GridSearchCV(inner_model, param_grid=param_grid, n_jobs=-1)
model
GridSearchCV(estimator=Pipeline(steps=[('poly', PolynomialFeatures()), ('scaler', StandardScaler()), ('ridge', Ridge())]), n_jobs=-1, param_grid={'poly__degree': [1, 2], 'ridge__alpha': array([1.00000000e-02, 2.78255940e-02, 7.74263683e-02, 2.15443469e-01, 5.99484250e-01, 1.66810054e+00, 4.64158883e+00, 1.29154967e+01, 3.59381366e+01, 1.00000000e+02])})
We run cross-validation and store the models from each split by setting `return_estimator=True`:
cv_results = cross_validate(model, X, y, cv=cv, return_estimator=True)
cv_results = pd.DataFrame(cv_results)
cv_results
| | fit_time | score_time | estimator | test_score |
|---|---|---|---|---|
| 0 | 1.9369 | 0.0013 | GridSearchCV(...) | 0.5808 |
| 1 | 0.6602 | 0.0013 | GridSearchCV(...) | 0.5702 |
| 2 | 0.6726 | 0.0022 | GridSearchCV(...) | 0.6896 |
| 3 | 0.7213 | 0.0013 | GridSearchCV(...) | 0.5946 |
| 4 | 0.6606 | 0.0013 | GridSearchCV(...) | 0.6155 |
| 5 | 0.6605 | 0.0013 | GridSearchCV(...) | 0.6025 |
| 6 | 0.6734 | 0.0022 | GridSearchCV(...) | -323.5157 |
| 7 | 0.7168 | 0.0012 | GridSearchCV(...) | 0.6402 |
| 8 | 0.6575 | 0.0012 | GridSearchCV(...) | 0.5779 |
| 9 | 0.6718 | 0.0012 | GridSearchCV(...) | 0.5940 |
The `estimator` column contains the fitted search models. We examine the `best_params_` of each `GridSearchCV`:
for estimator_cv_fold in cv_results["estimator"]:
print(estimator_cv_fold.best_params_)
{'poly__degree': 1, 'ridge__alpha': np.float64(12.915496650148826)}
{'poly__degree': 1, 'ridge__alpha': np.float64(0.01)}
{'poly__degree': 2, 'ridge__alpha': np.float64(0.027825594022071243)}
{'poly__degree': 1, 'ridge__alpha': np.float64(35.93813663804626)}
{'poly__degree': 1, 'ridge__alpha': np.float64(12.915496650148826)}
{'poly__degree': 1, 'ridge__alpha': np.float64(12.915496650148826)}
{'poly__degree': 2, 'ridge__alpha': np.float64(0.01)}
{'poly__degree': 1, 'ridge__alpha': np.float64(35.93813663804626)}
{'poly__degree': 1, 'ridge__alpha': np.float64(12.915496650148826)}
{'poly__degree': 1, 'ridge__alpha': np.float64(12.915496650148826)}
This inspection reveals the stability of hyperparameter values across folds.
Note regarding the scoring metric to optimize during tuning#
The `GridSearchCV` and `RandomizedSearchCV` classes use the `scoring` parameter to define the metric to optimize during tuning. If not specified, the scoring metric defaults to `accuracy` for classification and the `r2_score` for regression.
These defaults are actually not optimal for hyperparameter tuning: it is better to use proper scoring rules, which lead to well-calibrated models. We therefore recommend using `brier_score_loss` or `log_loss` for classification and `mean_squared_error` for regression.