Release history

Changelog

Bug fixes

New features

  • Turn off steps in pipeline.Pipeline using the None object. By Christos Aridas.
  • Add a fetching function datasets.fetch_datasets in order to get some imbalanced datasets useful for benchmarking. By Guillaume Lemaitre.
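
To illustrate what turning off a step with None means, here is a minimal sketch using a toy pipeline written for this note. ToyPipeline, set_step, run, and the two step functions are hypothetical names, not part of the library; the sketch only assumes that a step set to None is skipped and the data passes through unchanged.

```python
class ToyPipeline:
    """Toy stand-in for a pipeline (hypothetical, not the library's class)."""

    def __init__(self, steps):
        # steps: list of (name, callable or None) pairs
        self.steps = list(steps)

    def set_step(self, name, step):
        # Replace a named step; passing None turns the step off.
        for i, (step_name, _) in enumerate(self.steps):
            if step_name == name:
                self.steps[i] = (name, step)
                return self
        raise KeyError(name)

    def run(self, data):
        for _, step in self.steps:
            if step is None:
                # Disabled step: the data passes through unchanged.
                continue
            data = step(data)
        return data


def drop_negatives(values):
    return [v for v in values if v >= 0]


def double(values):
    return [2 * v for v in values]


pipe = ToyPipeline([("filter", drop_negatives), ("scale", double)])
full_result = pipe.run([-1, 2, 3])

# Turn off the "filter" step without rebuilding the pipeline.
pipe.set_step("filter", None)
skipped_result = pipe.run([-1, 2, 3])
```

Keeping the step name in place while setting its value to None makes it easy to compare runs with and without a given step, for example in a parameter search.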

Enhancement

API changes summary

  • __init__ has been removed from the base.SamplerMixin to create a real mixin class. By Guillaume Lemaitre.
  • Created an exceptions module to raise errors consistently. By Guillaume Lemaitre.
  • Created a utils.validation module to factor out checks for recurring patterns. By Guillaume Lemaitre.
  • Moved the under-sampling methods into the prototype_selection and prototype_generation submodules to make a clearer distinction. By Guillaume Lemaitre.
  • Changed ratio so that it can adapt to multi-class problems. By Guillaume Lemaitre.

Deprecation

  • Deprecated the use of a float as ratio in favor of a dictionary, string, or callable. By Guillaume Lemaitre.
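
A single float cannot describe more than two classes, which is why a dictionary, string, or callable is a better fit for multi-class problems. The sketch below shows one way such a specification could be resolved into per-class target counts for over-sampling. The helper resolve_ratio, the string values "minority" and "all", and their meanings are assumptions made for illustration, not the library's implementation.

```python
from collections import Counter


def resolve_ratio(ratio, y):
    """Hypothetical helper: map a ratio spec to {class_label: target_count}."""
    counts = Counter(y)
    largest = max(counts.values())

    if callable(ratio):
        # A callable receives the labels and returns the target counts.
        return dict(ratio(y))

    if isinstance(ratio, dict):
        # A dictionary names the targeted classes explicitly.
        unknown = set(ratio) - set(counts)
        if unknown:
            raise ValueError(f"unknown classes in ratio: {sorted(unknown)}")
        return dict(ratio)

    if ratio == "minority":
        # Assumed meaning: bring only the smallest class up to the largest.
        minority = min(counts, key=counts.get)
        return {minority: largest}

    if ratio == "all":
        # Assumed meaning: bring every class up to the largest.
        return {label: largest for label in counts}

    raise ValueError(f"unsupported ratio: {ratio!r}")


labels = ["a"] * 5 + ["b"] * 2 + ["c"] * 1

from_string = resolve_ratio("minority", labels)
from_dict = resolve_ratio({"b": 4}, labels)
from_callable = resolve_ratio(lambda y: {"c": 3}, labels)
```

With three classes, each form can name a target per class, which a single float has no way to express.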

Version 0.2

Changelog

Bug fixes

  • Fixed a bug in under_sampling.NearMiss, which was not picking the right samples during under-sampling with method 3. By Guillaume Lemaitre.
  • Fixed a bug in ensemble.EasyEnsemble: corrected the random_state generation. By Guillaume Lemaitre and Christos Aridas.
  • Fixed a bug in under_sampling.RepeatedEditedNearestNeighbours: added a stopping criterion to prevent the minority class from becoming the majority class or a class from disappearing. By Guillaume Lemaitre.
  • Fixed a bug in under_sampling.AllKNN: added stopping criteria to prevent the minority class from becoming the majority class or a class from disappearing. By Guillaume Lemaitre.
  • Fixed a bug in under_sampling.CondensedNearestNeighbour: corrected the list of indices returned. By Guillaume Lemaitre.
  • Fixed a bug in ensemble.BalanceCascade: a single array can now be obtained if desired. By Guillaume Lemaitre.
  • Fixed a bug in pipeline.Pipeline: a Pipeline can now be embedded in another Pipeline. By Christos Aridas.
  • Fixed a bug in pipeline.Pipeline: two samplers can now be put in the same Pipeline. By Christos Aridas.
  • Fixed a bug in under_sampling.CondensedNearestNeighbour: corrected the shape of sel_x when only one sample is selected. By Aliaksei Halachkin.
  • Fixed a bug in under_sampling.NeighbourhoodCleaningRule, selecting neighbours instead of misclassified minority-class samples. By Aleksandr Loskutov.
  • Fixed a bug in over_sampling.ADASYN: corrected the creation of new samples so that each new sample lies between the minority sample and its nearest neighbour. By Rafael Wampfler.
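
The ADASYN entry above describes a new sample lying between a minority sample and its nearest neighbour. A minimal sketch of that kind of interpolation follows; the function name interpolate and the example points are hypothetical, and this is a generic illustration rather than the library's code.

```python
import random


def interpolate(sample, neighbour, rng):
    """Return a point on the segment from `sample` towards `neighbour`."""
    # gap is drawn from [0, 1), so the result never overshoots the neighbour.
    gap = rng.random()
    return [s + gap * (n - s) for s, n in zip(sample, neighbour)]


rng = random.Random(0)  # seeded only to make the sketch reproducible
minority_sample = [1.0, 2.0]
nearest_neighbour = [3.0, 6.0]

synthetic = interpolate(minority_sample, nearest_neighbour, rng)
```

Using the difference (neighbour - sample) scaled by a factor in [0, 1) is what keeps each coordinate of the synthetic point between the two originals.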

New features

Enhancement

API changes summary

Documentation changes

Version 0.1

Changelog

API

New methods

  • Under-sampling
    1. Random majority under-sampling with replacement
    2. Extraction of majority-minority Tomek links
    3. Under-sampling with Cluster Centroids
    4. NearMiss-(1 & 2 & 3)
    5. Condensed Nearest Neighbour
    6. One-Sided Selection
    7. Neighbourhood Cleaning Rule
    8. Edited Nearest Neighbours
    9. Instance Hardness Threshold
    10. Repeated Edited Nearest Neighbours
  • Over-sampling
    1. Random minority over-sampling with replacement
    2. SMOTE - Synthetic Minority Over-sampling Technique
    3. bSMOTE(1 & 2) - Borderline SMOTE of types 1 and 2
    4. SVM SMOTE - Support Vectors SMOTE
    5. ADASYN - Adaptive synthetic sampling approach for imbalanced learning
  • Over-sampling followed by under-sampling
    1. SMOTE + Tomek links
    2. SMOTE + ENN
  • Ensemble sampling
    1. EasyEnsemble
    2. BalanceCascade