imbalanced-learn API

This is the full API documentation of the imbalanced-learn toolbox.

Under-sampling methods

The imblearn.under_sampling module provides methods to under-sample a dataset.

The imblearn.under_sampling.prototype_generation submodule contains methods that generate new samples in order to balance the dataset.

`under_sampling.ClusterCentroids`([ratio, ...])	Perform under-sampling by generating centroids based on clustering methods.

The imblearn.under_sampling.prototype_selection submodule contains methods that select samples in order to balance the dataset.

`under_sampling.CondensedNearestNeighbour`([...])	Class to perform under-sampling based on the condensed nearest neighbour method.
`under_sampling.EditedNearestNeighbours`([...])	Class to perform under-sampling based on the edited nearest neighbour method.
`under_sampling.RepeatedEditedNearestNeighbours`([...])	Class to perform under-sampling based on the repeated edited nearest neighbour method.
`under_sampling.AllKNN`([ratio, ...])	Class to perform under-sampling based on the AllKNN method.
`under_sampling.InstanceHardnessThreshold`([...])	Class to perform under-sampling based on the instance hardness threshold.
`under_sampling.NearMiss`([ratio, ...])	Class to perform under-sampling based on NearMiss methods.
`under_sampling.NeighbourhoodCleaningRule`([...])	Class performing under-sampling based on the neighbourhood cleaning rule.
`under_sampling.OneSidedSelection`([ratio, ...])	Class to perform under-sampling based on the one-sided selection method.
`under_sampling.RandomUnderSampler`([ratio, ...])	Class to perform random under-sampling.
`under_sampling.TomekLinks`([ratio, ...])	Class to perform under-sampling by removing Tomek’s links.
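
To illustrate the idea behind the simplest selection method listed above, random under-sampling, here is a minimal pure-Python sketch. It does not use imbalanced-learn and is not the library's implementation; the function name `random_under_sample` and the toy data are assumptions made for illustration only.

```python
import random
from collections import Counter


def random_under_sample(X, y, seed=0):
    """Randomly drop samples so every class matches the smallest class size.

    Hypothetical helper for illustration; not part of imbalanced-learn.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    n_min = min(counts.values())  # size of the smallest class
    kept = []
    for label in counts:
        # Indices of all samples belonging to this class.
        idx = [i for i, lab in enumerate(y) if lab == label]
        # Keep a random subset of n_min of them, without replacement.
        kept.extend(rng.sample(idx, n_min))
    kept.sort()  # preserve the original sample order
    return [X[i] for i in kept], [y[i] for i in kept]


# Toy dataset: 8 samples of class 0 and 2 samples of class 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
X_res, y_res = random_under_sample(X, y)
print(Counter(y_res))
```

Every retained sample is an original one; this is what distinguishes prototype selection from prototype generation.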

The imblearn.over_sampling module provides a set of methods to perform over-sampling.

`over_sampling.ADASYN`([ratio, random_state, ...])	Perform over-sampling using ADASYN.
`over_sampling.RandomOverSampler`([ratio, ...])	Class to perform random over-sampling.
`over_sampling.SMOTE`([ratio, random_state, ...])	Class to perform over-sampling using SMOTE.
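
SMOTE creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below shows that interpolation step in plain Python, under simplifying assumptions: Euclidean distance, a single synthetic point, and a hypothetical function name `smote_like_sample`. It is not the library's implementation.

```python
import math
import random


def smote_like_sample(minority, k=2, seed=0):
    """Return one synthetic point between a minority sample and a near neighbour.

    Hypothetical helper for illustration; not part of imbalanced-learn.
    """
    rng = random.Random(seed)
    i = rng.randrange(len(minority))
    base = minority[i]
    # Other minority samples, sorted by Euclidean distance to the base sample.
    others = [p for j, p in enumerate(minority) if j != i]
    others.sort(key=lambda p: math.dist(base, p))
    # Pick one of the k nearest neighbours at random.
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    # The new point lies on the segment joining base and neighbour.
    return [b + gap * (n - b) for b, n in zip(base, neighbour)]


# Toy minority class with four 2-D samples.
minority = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [3.0, 3.0]]
synthetic = smote_like_sample(minority)
print(synthetic)
```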

The imblearn.combine module provides methods which combine over-sampling and under-sampling.

`combine.SMOTEENN`([ratio, random_state, ...])	Class to perform over-sampling using SMOTE and cleaning using ENN.
`combine.SMOTETomek`([ratio, random_state, ...])	Class to perform over-sampling using SMOTE and cleaning using Tomek links.

The imblearn.ensemble module includes methods generating under-sampled subsets combined inside an ensemble.

`ensemble.BalanceCascade`([ratio, ...])	Create an ensemble of balanced sets by iteratively under-sampling the imbalanced dataset using an estimator.
`ensemble.EasyEnsemble`([ratio, ...])	Create an ensemble of sets by iteratively applying random under-sampling.
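
The idea of building an ensemble of balanced sets by repeated random under-sampling can be sketched in plain Python as follows. This is an illustrative sketch only, not the library's code; the function name `balanced_subsets` and its parameters are assumptions.

```python
import random
from collections import Counter


def balanced_subsets(X, y, n_subsets=3, seed=0):
    """Draw several balanced subsets by repeated random under-sampling.

    Hypothetical helper for illustration; not part of imbalanced-learn.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    n_min = min(counts.values())  # every class is reduced to this size
    subsets = []
    for _ in range(n_subsets):
        idx = []
        for label in counts:
            members = [i for i, lab in enumerate(y) if lab == label]
            # A fresh random draw per subset, so subsets differ from each other.
            idx.extend(rng.sample(members, n_min))
        subsets.append(([X[i] for i in idx], [y[i] for i in idx]))
    return subsets


# Toy dataset: 8 samples of class 0 and 2 samples of class 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
subsets = balanced_subsets(X, y)
for X_sub, y_sub in subsets:
    print(Counter(y_sub))
```

Each subset could then be used to train one member of an ensemble, so that the majority-class samples discarded in one subset may still be seen in another.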

The imblearn.pipeline module implements utilities to build a composite estimator, as a chain of transforms, samplers and estimators.

`pipeline.Pipeline`(steps[, memory])	Pipeline of transforms and resamples with a final estimator.
`pipeline.make_pipeline`(*steps)	Construct a Pipeline from the given estimators.
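
A pipeline that contains a sampler differs from a plain chain of transforms in one respect: resampling is applied while fitting, and skipped when predicting. The following pure-Python sketch illustrates that behaviour with made-up classes. `TinyPipeline`, `DropMajorityExcess`, `MajorityClassifier` and the `resample` method name are all hypothetical and do not reflect the library's actual classes or signatures.

```python
from collections import Counter


class DropMajorityExcess:
    """Hypothetical sampler: keep the first n_min samples of each class."""

    def resample(self, X, y):
        n_min = min(Counter(y).values())
        seen = Counter()
        X_res, y_res = [], []
        for x, label in zip(X, y):
            if seen[label] < n_min:
                seen[label] += 1
                X_res.append(x)
                y_res.append(label)
        return X_res, y_res


class MajorityClassifier:
    """Hypothetical estimator: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        self.n_seen_ = len(y)  # number of training samples actually received
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class TinyPipeline:
    """Hypothetical two-step pipeline: a sampler followed by an estimator."""

    def __init__(self, sampler, estimator):
        self.sampler = sampler
        self.estimator = estimator

    def fit(self, X, y):
        # Resampling happens only here, on the training data.
        X_res, y_res = self.sampler.resample(X, y)
        self.estimator.fit(X_res, y_res)
        return self

    def predict(self, X):
        # The sampler is skipped: every input sample gets a prediction.
        return self.estimator.predict(X)


X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
pipe = TinyPipeline(DropMajorityExcess(), MajorityClassifier()).fit(X, y)
predictions = pipe.predict(X)
print(pipe.estimator.n_seen_, len(predictions))
```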

The imblearn.metrics module includes score functions, performance metrics, pairwise metrics, and distance computations.

`metrics.classification_report_imbalanced`(...)	Build a classification report based on metrics used with imbalanced datasets
`metrics.sensitivity_specificity_support`(...)	Compute sensitivity, specificity, and support for each class
`metrics.sensitivity_score`(y_true, y_pred[, ...])	Compute the sensitivity
`metrics.specificity_score`(y_true, y_pred[, ...])	Compute the specificity
`metrics.geometric_mean_score`(y_true, y_pred)	Compute the geometric mean
`metrics.make_index_balanced_accuracy`([...])	Balance any scoring function using the index balanced accuracy
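
For a binary problem, sensitivity is the true positive rate, specificity is the true negative rate, and the geometric mean is commonly taken as the square root of their product. The sketch below computes these by hand in plain Python to make the definitions concrete. It assumes both classes are present in `y_true`, the helper name `binary_sensitivity_specificity` is hypothetical, and the library functions listed above may offer more options, such as multiclass averaging.

```python
import math


def binary_sensitivity_specificity(y_true, y_pred, pos_label=1):
    """Return (sensitivity, specificity) for a binary problem.

    Hypothetical helper for illustration; assumes both classes occur in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == pos_label and p == pos_label for t, p in pairs)
    fn = sum(t == pos_label and p != pos_label for t, p in pairs)
    tn = sum(t != pos_label and p != pos_label for t, p in pairs)
    fp = sum(t != pos_label and p == pos_label for t, p in pairs)
    sensitivity = tp / (tp + fn)  # fraction of positives correctly found
    specificity = tn / (tn + fp)  # fraction of negatives correctly found
    return sensitivity, specificity


# Toy predictions: 3 positives (2 found), 7 negatives (6 found).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
sens, spec = binary_sensitivity_specificity(y_true, y_pred)
g_mean = math.sqrt(sens * spec)
print(sens, spec, g_mean)
```

Unlike plain accuracy, which is 0.8 on this toy data, the geometric mean drops when either class is poorly recognised.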

The imblearn.datasets module provides methods to generate imbalanced data.

`datasets.make_imbalance`(X, y, ratio[, ...])	Turns a dataset into an imbalanced dataset at a specific ratio.
`datasets.fetch_datasets`([data_home, ...])	Load the benchmark datasets from Zenodo, downloading them if necessary.

The imblearn.utils module includes various utilities.

`utils.estimator_checks.check_estimator`(Estimator)	Check if estimator adheres to scikit-learn conventions and
`utils.check_neighbors_object`(nn_name, nn_object)	Check that the object is consistent to be a NN.
`utils.check_ratio`(ratio, y, sampling_type)	Ratio validation for samplers.
`utils.hash_X_y`(X, y[, n_samples])	Compute hash of the input arrays.