imbalanced-learn API
This is the full API documentation of the imbalanced-learn toolbox.
Under-sampling methods
The imblearn.under_sampling module provides methods to under-sample
a dataset.
Prototype generation
The imblearn.under_sampling.prototype_generation submodule contains
methods that generate new samples in order to balance the dataset.
under_sampling.ClusterCentroids([ratio, ...]) | Perform under-sampling by generating centroids based on clustering methods.
Prototype selection
The imblearn.under_sampling.prototype_selection submodule contains
methods that select samples in order to balance the dataset.
under_sampling.CondensedNearestNeighbour([...]) | Class to perform under-sampling based on the condensed nearest neighbour method.
under_sampling.EditedNearestNeighbours([...]) | Class to perform under-sampling based on the edited nearest neighbour method.
under_sampling.RepeatedEditedNearestNeighbours([...]) | Class to perform under-sampling based on the repeated edited nearest neighbour method.
under_sampling.AllKNN([ratio, ...]) | Class to perform under-sampling based on the AllKNN method.
under_sampling.InstanceHardnessThreshold([...]) | Class to perform under-sampling based on the instance hardness threshold.
under_sampling.NearMiss([ratio, ...]) | Class to perform under-sampling based on NearMiss methods.
under_sampling.NeighbourhoodCleaningRule([...]) | Class to perform under-sampling based on the neighbourhood cleaning rule.
under_sampling.OneSidedSelection([ratio, ...]) | Class to perform under-sampling based on the one-sided selection method.
under_sampling.RandomUnderSampler([ratio, ...]) | Class to perform random under-sampling.
under_sampling.TomekLinks([ratio, ...]) | Class to perform under-sampling by removing Tomek’s links.
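To make the prototype-selection idea concrete, here is a minimal pure-Python sketch of Tomek link removal. It is an illustration of the technique only, not the code behind under_sampling.TomekLinks: the helper name tomek_links, the toy data, and the choice to drop only the majority-class member of each link are assumptions made for this example. Two samples form a Tomek link when they belong to different classes and are each other's nearest neighbour.

```python
import math
from collections import Counter


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links.

    Hypothetical helper for illustration; brute-force nearest neighbours.
    """
    def nearest(i):
        # Index of the closest other sample (Euclidean distance).
        return min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )

    nn = [nearest(i) for i in range(len(X))]
    # A link needs mutual nearest neighbours with different labels.
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and y[i] != y[j]
    ]


# Toy 1-D data: class 0 is the majority class.
X = [[0.0], [1.1], [2.0], [2.4], [5.0]]
y = [0, 0, 0, 1, 1]

links = tomek_links(X, y)

# Assumption for this sketch: drop only the majority-class member of each link.
majority = Counter(y).most_common(1)[0][0]
drop = {i for pair in links for i in pair if y[i] == majority}
X_clean = [x for i, x in enumerate(X) if i not in drop]
y_clean = [t for i, t in enumerate(y) if i not in drop]
```

In this toy data the samples at 2.0 and 2.4 sit on the class boundary and are mutual nearest neighbours, so they form the only link, and the class-0 sample is the one removed.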
Over-sampling methods
The imblearn.over_sampling module provides a set of methods to
perform over-sampling.
over_sampling.ADASYN([ratio, random_state, ...]) | Perform over-sampling using ADASYN.
over_sampling.RandomOverSampler([ratio, ...]) | Class to perform random over-sampling.
over_sampling.SMOTE([ratio, random_state, ...]) | Class to perform over-sampling using SMOTE.
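The core step of SMOTE-style over-sampling can be sketched in a few lines: pick a minority sample, pick one of its nearest minority neighbours, and create a new point somewhere on the segment between them. The function below is a simplified illustration under stated assumptions, not the over_sampling.SMOTE implementation; the name smote_like, its parameters, and the toy data are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between neighbours.

    Hypothetical helper for illustration; needs at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            [b + gap * (n - b) for b, n in zip(base, neighbour)]
        )
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(minority, n_new=5)
```

Because each synthetic point is a convex combination of two existing minority samples, every generated point stays inside the region spanned by the minority class.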
Combination of over- and under-sampling methods
The imblearn.combine module provides methods that combine
over-sampling and under-sampling.
combine.SMOTEENN([ratio, random_state, ...]) | Class to perform over-sampling using SMOTE and cleaning using ENN.
combine.SMOTETomek([ratio, random_state, ...]) | Class to perform over-sampling using SMOTE and cleaning using Tomek links.
Ensemble methods
The imblearn.ensemble module includes methods that generate
under-sampled subsets combined inside an ensemble.
ensemble.BalanceCascade([ratio, ...]) | Create an ensemble of balanced sets by iteratively under-sampling the imbalanced dataset using an estimator.
ensemble.EasyEnsemble([ratio, ...]) | Create an ensemble of sets by iteratively applying random under-sampling.
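The idea of building an ensemble of sets by repeated random under-sampling can be illustrated as follows. This is a rough pure-Python sketch, not the ensemble.EasyEnsemble code: the function name easy_ensemble_subsets, the n_subsets and seed parameters, and the rule of shrinking every class to the minority class size are assumptions for the example.

```python
import random
from collections import Counter


def easy_ensemble_subsets(X, y, n_subsets=3, seed=0):
    """Return n_subsets balanced (X, y) subsets drawn by random under-sampling.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    n_target = min(counts.values())  # size of the minority class
    subsets = []
    for _ in range(n_subsets):
        kept = []
        for label in counts:
            idxs = [i for i, t in enumerate(y) if t == label]
            # Draw without replacement so no sample repeats within a subset.
            kept.extend(rng.sample(idxs, n_target))
        kept.sort()  # keep the original sample order
        subsets.append(([X[i] for i in kept], [y[i] for i in kept]))
    return subsets


# Toy data: six samples of class 0, two of class 1.
X = [[i] for i in range(8)]
y = [0] * 6 + [1] * 2
subsets = easy_ensemble_subsets(X, y)
subset_counts = [Counter(y_sub) for _, y_sub in subsets]
```

Each subset keeps all minority samples and a different random draw of majority samples, so a classifier trained per subset sees a different part of the majority class.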
Pipeline
The imblearn.pipeline module implements utilities to build a
composite estimator, as a chain of transforms, samplers and estimators.
pipeline.Pipeline(steps[, memory]) | Pipeline of transforms and resamples with a final estimator.
pipeline.make_pipeline(*steps) | Construct a Pipeline from the given estimators.
Metrics
The imblearn.metrics module includes score functions, performance
metrics, pairwise metrics, and distance computations.
metrics.classification_report_imbalanced(...) | Build a classification report based on metrics used with imbalanced
metrics.sensitivity_specificity_support(...) | Compute sensitivity, specificity, and support for each class
metrics.sensitivity_score(y_true, y_pred[, ...]) | Compute the sensitivity
metrics.specificity_score(y_true, y_pred[, ...]) | Compute the specificity
metrics.geometric_mean_score(y_true, y_pred) | Compute the geometric mean
metrics.make_index_balanced_accuracy([...]) | Balance any scoring function using the index balanced accuracy
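For a binary problem, the quantities named above reduce to simple ratios of confusion-matrix counts: sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and the geometric mean is the square root of their product. The sketch below computes them by hand to show the definitions; binary_scores is a hypothetical helper, not a function of imblearn.metrics, and returning 0.0 when a class is absent is an assumption of this example.

```python
import math


def binary_scores(y_true, y_pred, pos_label=1):
    """Return (sensitivity, specificity, geometric mean) for a binary problem.

    Hypothetical helper for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == pos_label and p == pos_label for t, p in pairs)
    fn = sum(t == pos_label and p != pos_label for t, p in pairs)
    tn = sum(t != pos_label and p != pos_label for t, p in pairs)
    fp = sum(t != pos_label and p == pos_label for t, p in pairs)
    # Assumption: score 0.0 when a class has no true samples.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity, math.sqrt(sensitivity * specificity)


# Toy labels: 4 positives (3 found), 6 negatives (4 correctly rejected).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sensitivity, specificity, g_mean = binary_scores(y_true, y_pred)
```

Here the sensitivity is 3/4 and the specificity is 4/6, giving a geometric mean of about 0.707. Unlike plain accuracy, the geometric mean drops to zero if either class is entirely misclassified.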
Datasets
The imblearn.datasets module provides methods to generate
imbalanced data.
datasets.make_imbalance(X, y, ratio[, ...]) | Turn a dataset into an imbalanced dataset at a specific ratio.
datasets.fetch_datasets([data_home, ...]) | Load the benchmark datasets from Zenodo, downloading them if necessary.
Utilities
The imblearn.utils module includes various utilities.
utils.estimator_checks.check_estimator(Estimator) | Check if estimator adheres to scikit-learn conventions and
utils.check_neighbors_object(nn_name, nn_object) | Check that the object is consistent with a nearest-neighbours (NN) object.
utils.check_ratio(ratio, y, sampling_type) | Ratio validation for samplers.
utils.hash_X_y(X, y[, n_samples]) | Compute hash of the input arrays.