Releases: NannyML/nannyml

v0.10.7

07 Jun 12:28
a24ab81

Changed

  • Optimized summary stats and overall performance by avoiding unnecessary copy operations and index resets during chunking
    (#390)
  • Optimized performance of nannyml.base.PerMetricPerColumnResult filter operations by adding a short-circuit path
    when only filtering on period. (#391)
  • Optimized performance of all data quality calculators by avoiding unnecessary evaluations and avoiding copy and index reset operations
    (#392)

Fixed

  • Fixed an issue in the Wasserstein "big data heuristic" where outliers caused the binning to trigger out-of-memory errors. Thanks, @nikml!
    (#393)
  • Fixed a typo in the salary_range values of the synthetic car loan example dataset. 20K - 20K € is now 20K - 40K €.
    (#395)

v0.10.6

16 May 14:48

Changed

  • Make predictions optional for performance calculation. When not provided, only AUROC and average precision will be calculated. (#380)
  • Small DLE docs updates
  • Combed through and optimized the reconstruction error calculation with PCA resulting in a nice speedup. Cheers @nikml! (#385)
  • Updated summary stats value limits to be in line with the rest of the library. Changed from np.nan to None. (#387)

Fixed

  • Fixed a breaking issue in the sampling error calculation for the median summary statistic when there is only a single value for a column. (#377)
  • Drop identifier column from the documentation example for reconstruction error calculation with PCA. (#382)
  • Fix an issue where default threshold configurations would get changed when setting custom thresholds. Bad mutables! (#386)
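
The "bad mutables" class of bug mentioned in the last item can be sketched in plain Python. This is an illustration only, not the library's actual code: `DEFAULT_THRESHOLDS`, `configure_buggy` and `configure_fixed` are hypothetical names, and the metric keys are placeholders.

```python
import copy

# Hypothetical shared defaults, standing in for a library's default thresholds.
DEFAULT_THRESHOLDS = {"metric_a": 0.1, "metric_b": 3.0}


def configure_buggy(custom=None):
    # Bug: this binds the shared dict itself, so update() mutates the defaults.
    thresholds = DEFAULT_THRESHOLDS
    thresholds.update(custom or {})
    return thresholds


def configure_fixed(custom=None):
    # Fix: work on a deep copy so the shared defaults stay untouched.
    thresholds = copy.deepcopy(DEFAULT_THRESHOLDS)
    thresholds.update(custom or {})
    return thresholds
```

With the fixed variant, a second caller that passes no custom thresholds still sees the original defaults; with the buggy variant it inherits whatever the first caller set.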

v0.10.5

08 Mar 13:17
7da83e7

Changed

  • Updated dependencies for Python 3.8 and up. (#375)

Added

  • Support for the average precision metric for binary classification in realized and estimated performance. (#374)

v0.10.4

04 Mar 15:44
17430fc

Changed

  • We've changed the default for the incomplete parameter in the SizeBasedChunker and CountBasedChunker
    from the previous append to keep. This means that from now on, by default, you might have an additional
    "incomplete" final chunk. Previously these records would have been appended to the last "complete" chunk.
    This change was required for some internal developments, and we also felt it made more sense when looking at
    continuous monitoring (as the incomplete chunk will be filled up later as more data is appended). (#367)
  • We've renamed the Classifier for Drift Detection (CDD) to the more appropriate Domain Classifier. (#368)
  • Bumped the version of the pyarrow dependency to ^14.0.0 if you're running on Python 3.8 or up.
    Congrats on your first contribution here @amrit110, much appreciated!
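
To make the difference between the two incomplete settings concrete, here is a minimal plain-Python sketch. It does not use the NannyML chunker classes: `chunk_by_size` is a hypothetical helper that only mimics the behaviour described above for size-based chunking.

```python
def chunk_by_size(records, chunk_size, incomplete="keep"):
    """Split records into chunks of chunk_size (hypothetical helper).

    incomplete="keep":   leftover records form an extra, smaller final chunk.
    incomplete="append": leftover records are added to the last full chunk.
    """
    if incomplete not in ("keep", "append"):
        raise ValueError(f"unsupported incomplete option: {incomplete!r}")

    n_full = len(records) // chunk_size
    chunks = [records[i * chunk_size:(i + 1) * chunk_size] for i in range(n_full)]
    leftover = records[n_full * chunk_size:]

    if leftover:
        if incomplete == "append" and chunks:
            chunks[-1] = chunks[-1] + leftover
        else:
            # Also covers "append" when there is no full chunk to append to.
            chunks.append(leftover)
    return chunks


records = list(range(10))
print(chunk_by_size(records, 4, incomplete="keep"))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(chunk_by_size(records, 4, incomplete="append"))
# [[0, 1, 2, 3], [4, 5, 6, 7, 8, 9]]
```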

Fixed

  • Continuous distribution plots will now be scaled per chunk, as opposed to globally. (#369)

v0.10.3

17 Feb 00:46

Fixed

  • Handle median summary stat calculation failing due to NaN values
  • Fix standard deviation summary stat sampling error calculation occasionally returning infinity (#363)
  • Fix plotting confidence bands when value gaps occur (#364)

Added

  • New multivariate drift detection method using a classifier and density ratio estimation.

v0.10.2

13 Feb 00:35
926b0a5

Changed

  • Removed p-value based thresholds for Chi2 univariate drift detection (#349)
  • Change default thresholds for univariate drift methods to standard deviation based thresholds.
  • Add summary stats support to the Runner and CLI (#353)
  • Add unique identifier columns to included datasets for better joining (#348)
  • Remove unused confidence_deviation properties in CBPE metrics (#357)
  • Improved error handling: failing metric calculation for a single chunk will no longer stop an entire calculator.
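
As a rough illustration of what a standard deviation based threshold means, the sketch below derives a lower and upper bound from reference-period values. The helper `std_thresholds` is hypothetical, and the multiplier of 3 is an assumption for the example, not a value stated in these notes.

```python
import statistics


def std_thresholds(reference_values, multiplier=3.0):
    """Return (lower, upper) bounds as mean -/+ multiplier * std.

    Hypothetical helper; the multiplier default is an assumption.
    """
    mean = statistics.fmean(reference_values)
    std = statistics.pstdev(reference_values)  # population standard deviation
    return mean - multiplier * std, mean + multiplier * std


# Drift-metric values computed per chunk on a reference period (made-up numbers).
lower, upper = std_thresholds([2, 4, 4, 4, 5, 5, 7, 9])
```

A chunk whose metric value falls outside the `(lower, upper)` range would then be flagged, instead of relying on a fixed cut-off or a p-value.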

Added

  • Add feature distribution calculators (#352)

Fixed

  • Fix join column settings for CLI (#356)
  • Fix crashes in UnseenValuesCalculator

v0.10.1

28 Nov 18:03
fa38d24

  • Various small fixes to the docs, thanks once again ghostwriter @NeoKish! (#345)
  • Fixed an issue with estimated accuracy for multiclass classification in CBPE. (#346)

v0.10.0

21 Nov 13:44
2e99128

Changed

  • Telemetry now detects AKS, EKS and NannyML Cloud runtimes. (#325)
  • Runner was refactored, so it can be extended with premium NannyML calculators and estimators. (#325)
  • Sped up telemetry reporting to ensure it doesn't hinder performance.
  • Some love for the docs as @santiviquez tediously standardized variable names. (#338)
  • Optimize calculations for L-infinity method. (#340)
  • Refactored the CalibratorFactory to align with our other factory implementations. (#341)
  • Updated the Calibrator interface with *args and **kwargs for easier extension.
  • Small refactor to the ResultComparisonMixin to allow easier extension.

Added

  • Added support for directly estimating the confusion matrix of multiclass classification models using CBPE.
    Big thanks to our appreciated alumnus @cartgr for the effort (and sorry it took soooo long). (#287)
  • Added DatabaseWriter support for results from MissingValuesCalculator and UnseenValuesCalculator. Some
    excellent work by @bgalvao, thanks for being a long-time user and supporter!

Fixed

  • Fix issues with calculation and filtering in performance calculation and estimation. (#321)
  • Fix multivariate reconstruction error plot labels. (#323)
  • Log a warning when performance metrics for a chunk return a NaN value. (#326)
  • Fix issues with ReadTheDocs build failing
  • Fix erroneous specificity calculation, both realized and estimated. Well spotted @nikml! (#334)
  • Fix threshold computation when dealing with NaN values. Major thanks to the eagle-eyed @giodavoli. (#333)
  • Fix exports for confusion matrix metrics using the DatabaseWriter. An inspiring commit that led to some other changes.
    Great job @shezadkhan137! (#335)
  • Fix incorrect normalization for the business value metric in realized and estimated performance. (#337)
  • Fix handling NaN values when fitting univariate drift. (#340)

v0.9.1

13 Jul 16:32
461058f

Changed

  • Updated Mendable client library version to deal with styling overrides in the RTD documentation theme
  • Removed superfluous limits for confidence bands in the CBPE class (these are present in the metric classes instead)
  • Threshold value limiting behaviour (e.g. overriding a value and emitting a warning) will be triggered not only when
    the value crosses the threshold but also when it is equal to the threshold value. This is because we interpret the
    threshold as a theoretical maximum.
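
The inclusive comparison described in the last item can be sketched as follows. This is a simplified stand-in, not the library's implementation: `limit_upper_threshold` is a hypothetical function, and replacing the value with the limit is an assumption about what "overriding" does.

```python
import warnings


def limit_upper_threshold(value, upper_limit):
    """Clamp an upper threshold to a theoretical maximum (hypothetical helper).

    The check is inclusive (>=): a threshold equal to the limit is also
    treated as limited and triggers the warning.
    """
    if upper_limit is not None and value >= upper_limit:
        warnings.warn(
            f"upper threshold {value} reaches or exceeds the limit {upper_limit}; "
            f"overriding it with {upper_limit}"
        )
        return upper_limit
    return value
```

For a metric bounded at 1.0, a computed threshold of exactly 1.0 would now trigger the warning as well, where a strict `>` comparison would have let it pass silently.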

Added

  • Added a new example notebook walking through a full use case using the NYC Green Taxi dataset, based on the blog of @santiviquez

Fixed

  • Fixed broken Docker container build due to changes in public Poetry installation procedure
  • Fixed broken image source link in the README, thanks @NeoKish!

v0.9.0

26 Jun 11:05

Changed

  • Updated API docs for the nannyml.io package, thanks @maciejbalawejder (#286)
  • Restricted versions of numpy to be <1.25, since there seems to be a change in the roc_auc calculation somehow (#301)

Added

  • Support for Data Quality calculators in the CLI runner
  • Support for Data Quality results in Ranker implementations (#297)
  • Support mendable in the docs (#295)
  • Documentation landing page (#303)
  • Support for calculations with delayed targets (#306)

Fixed

  • Small changes to quickstart, thanks @NeoKish (#291)
  • Fix an issue passing *args and **kwargs in Result.filter() and subclasses (#298)
  • Fix double listing of the binary dataset documentation page
  • Add missing thresholds to roc_auc in CBPE (#294)
  • Fix plotting issue due to introduction of additional values in the 'display names tuple' (#305)
  • Fix broken exception handling due to inheriting from BaseException and not Exception (#307)
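
The last item reflects standard Python behaviour: an `except Exception` handler does not catch classes that derive directly from `BaseException`. The sketch below demonstrates this with hypothetical exception classes, not the library's own.

```python
class BadLibraryError(BaseException):
    """Hypothetical error deriving from BaseException (the problematic case)."""


class GoodLibraryError(Exception):
    """Hypothetical error deriving from Exception (the fixed case)."""


def fail_with(exc):
    raise exc


def run_safely(func):
    # Typical user-side handling: catch "any ordinary error".
    try:
        func()
    except Exception as exc:
        return f"handled {type(exc).__name__}"
    return "ok"


print(run_safely(lambda: fail_with(GoodLibraryError())))
# handled GoodLibraryError

try:
    run_safely(lambda: fail_with(BadLibraryError()))
except BaseException as exc:
    # The handler inside run_safely was bypassed entirely.
    print(f"escaped: {type(exc).__name__}")
# escaped: BadLibraryError
```

Deriving from `Exception` keeps library errors catchable by generic handlers, while `BaseException` is normally reserved for things like `KeyboardInterrupt` and `SystemExit`.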