1.7.0 Release

@mauvais2 mauvais2 released this 20 May 18:27
· 20 commits to master since this release
b89515e

Highlights

  • Adds seed (random state) and sampling features to AMPL (#344). The features are:
    • Imbalance-learn sampling
    • Seed for Reproducibility
  • Changes to control model sparsity and improvements to MultitaskScaffoldSplitter (#331):
    • Sped up MultitaskScaffoldSplitter and changed its implementation to better optimize how far the validation and test sets differ from the training set.
    • Added split_diagnostic_plots module for visualizing aspects of split quality.
    • Added L1 and L2 penalty parameters to XGBoost models to control model sparsity.
    • Added hyperopt search domain parameters for NN and XGBoost model sparsity parameters.
  • Resolved a bug in Transformers fitting where the Transformers for normalizing inputs, outputs, and weights were trained on the entire dataset instead of only the training dataset, potentially causing data leakage. (#385)
  • Added MODAC API Client + Example Docs (#361)
  • Incorporates CodeCov into the CI/CD pipeline to generate code coverage reports for enhancing code quality. (#372, #373)
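
The seed feature and the transformer fix listed above can be illustrated with a minimal, generic sketch. This is not AMPL's implementation: the function names `seeded_split`, `fit_normalizer`, and `apply_normalizer` and the data are hypothetical, chosen only to show the idea. A fixed seed makes the split repeatable, and the normalization statistics are computed from the training subset alone, then reused for the held-out data.

```python
import random
import statistics


def seeded_split(values, train_frac, seed):
    """Shuffle a copy of the data with a fixed seed so the split is reproducible."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    return shuffled[:n_train], shuffled[n_train:]


def fit_normalizer(train_values):
    """Compute mean and standard deviation from the training subset only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    if std == 0:
        std = 1.0  # avoid division by zero for constant data
    return mean, std


def apply_normalizer(values, params):
    """Scale any subset with statistics that were fitted on the training data."""
    mean, std = params
    return [(v - mean) / std for v in values]


# Made-up response values, for illustration only.
data = [float(x) for x in range(1, 21)]

train, test = seeded_split(data, train_frac=0.8, seed=42)
params = fit_normalizer(train)           # statistics never see the test rows
train_norm = apply_normalizer(train, params)
test_norm = apply_normalizer(test, params)
```

Fitting on the full dataset instead would let information about the held-out values leak into the scaling parameters, which is the kind of leakage the fix in #385 is described as removing.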

Enhancements

  • Fixed a bug where running predictions on classification models with balancing weight transformers required MinimalDataset weights for the prediction data. In addition, get_multitask_perf_from_files_new previously returned NaN metrics for single-task models mixed with multitask models; now both return correct metrics. (#387)
  • Allows users to combine calculated features with embedded features from pre-trained models (#395)
  • Logs exceptions generated during a HyperOpt search. Previously they were silently swallowed. (#392)
  • Added compute_drug_likeness function to the RDkit_easy module to compute various drug-likeness criteria for compounds in a data frame (Lipinski rule of 5, Ghose and Veber filters, QED), along with the descriptors used to derive them. (#384)
  • Applicability domain (AD) calculation improvements. Fixed an error in the calculation. Added the ability to query for the nearest training set neighbors of each compound for which predictions are run. (#378)
  • Integrates the MODAC unit tests and automates their execution on GitHub CI/CD Actions (#371)
  • Implemented unit tests for plotting packages using matplotcheck and the PlotTester API to perform plot validation. (#394)
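
As a rough illustration of two of the drug-likeness criteria named above, the sketch below applies the commonly cited Lipinski rule-of-5 and Veber thresholds to precomputed descriptors. It is not the compute_drug_likeness function and does not reproduce its signature or output; the `Descriptors` class, the function names, and the descriptor values are hypothetical, and the Ghose filter and QED are left out.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_wt: float          # molecular weight, Da
    logp: float            # octanol-water partition coefficient
    h_donors: int          # hydrogen-bond donors
    h_acceptors: int       # hydrogen-bond acceptors
    rotatable_bonds: int
    tpsa: float            # topological polar surface area, square angstroms


def lipinski_violations(d: Descriptors) -> int:
    """Count violations of the commonly cited rule-of-5 thresholds."""
    checks = [
        d.mol_wt > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)


def passes_veber(d: Descriptors) -> bool:
    """Veber filter: at most 10 rotatable bonds and TPSA of at most 140."""
    return d.rotatable_bonds <= 10 and d.tpsa <= 140


# Made-up descriptor values, for illustration only.
small = Descriptors(mol_wt=180.2, logp=1.3, h_donors=1, h_acceptors=3,
                    rotatable_bonds=3, tpsa=63.6)
large = Descriptors(mol_wt=812.0, logp=6.4, h_donors=7, h_acceptors=14,
                    rotatable_bonds=15, tpsa=210.0)

for name, d in [("small", small), ("large", large)]:
    print(name, lipinski_violations(d), passes_veber(d))
```

In practice the descriptors would come from a cheminformatics toolkit such as RDKit rather than being typed in by hand.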

Maintenance

  • System clean up:
    • Improved the CI test pipeline to eliminate duplicate job executions.
    • Added markers to indicate the resources used in certain tests.
    • Expanded the range of tests executed in the CI pipeline. Only tests that require LLNL resources are excluded. (#393)

Bug Fixes

  • Corrected the AD index calculation for Mordred features containing NaN values (#390)