Smile

This extension wraps functionality from the Smile library (http://haifengl.github.io/smile/) and provides it as operators.

[Unsupported Extension Notice] - The Smile extension is not officially supported by Altair RapidMiner. While we're always improving and updating our offerings, we can't guarantee any help or fixes for this extension.

Smile is a fast and comprehensive machine learning engine. Its stated focus areas are speed, ease of use, comprehensiveness, natural language processing, and mathematics and statistics.

Currently the extension provides the following Operators:

  • Anomaly:
    • Gaussian Mixture
  • Blending:
    • t-SNE
  • Cleansing:
    • Probabilistic Principal Component Analysis (PPCA) 
  • Clustering:
    • G-Means
  • Models:
    • Parametric Probability Estimator
  • Learner:
    • Lasso Regression
    • Random Forest (Smile) (now with classification in 0.4.0)
    • Gradient Boosted Tree (Smile)  (now with classification in 0.4.0)
  • Statistics:
    • Compare Distributions (enhanced in 0.4.0)

Version 0.6.0 (2021-09-10)

  • GMM throws a proper exception if the data contains missing values.
  • GMM now normalizes the data before fitting, which reduces numerical issues.
  • GMM now offers several ways of calculating its anomaly score. The default is the negative log likelihood instead of 1/likelihood.
  • Changed how confidences are calculated in GMM to avoid numerical instability.
  • GMM now reports its BIC as a performance measure.
  • Fixed a bug where applying the GMM model to a data set with a different schema produced missing values. A proper error message is now thrown.
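
The move from 1/likelihood to the negative log likelihood, and the more stable confidence calculation, can be illustrated with a small sketch. This is not the extension's code: it assumes a one-dimensional mixture with made-up parameters, and the function names (log_gauss, log_sum_exp, score_and_confidences) are hypothetical. It only shows why working in log space keeps scores and confidences finite for points far from every component.

```python
import math

def log_gauss(x, mean, std):
    """Log density of a 1-D normal distribution."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)

def log_sum_exp(values):
    """Stable log(sum(exp(v))): shift by the maximum before exponentiating."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def score_and_confidences(x, weights, means, stds):
    """Return (negative log likelihood, per-component confidences) for x."""
    log_terms = [math.log(w) + log_gauss(x, m, s)
                 for w, m, s in zip(weights, means, stds)]
    log_p = log_sum_exp(log_terms)
    # Confidence of component k is its share of the total likelihood,
    # computed as a difference of logs so nothing underflows to 0/0.
    confidences = [math.exp(t - log_p) for t in log_terms]
    return -log_p, confidences

# Hypothetical two-component mixture (illustrative values only).
weights, means, stds = [0.5, 0.5], [0.0, 5.0], [1.0, 1.0]

typical_score, typical_conf = score_and_confidences(0.2, weights, means, stds)
# For x = 60 the plain likelihood underflows to 0.0 in double precision,
# so 1/likelihood would divide by zero; the log-space score stays finite.
outlier_score, outlier_conf = score_and_confidences(60.0, weights, means, stds)
```

A larger score means a more anomalous point, so the far-away value gets a much higher score than the typical one while its confidences still sum to 1.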

 

Version 0.5 (2021-08-11)

  • Reworked the GMM operator. It now:
    • provides a cluster model instead of a custom model
    • provides cluster assignments and confidences
    • provides a score, which is 1/p by default, with a setting to change the inversion
    • provides scores for each component of the mixture
    • includes all information about the model in the model's text output

Version 0.4.1 (2021-04-08)

  • Fixed a bug where GMM rejected one-class or unlabeled data even though it is able to handle it.

Version 0.4.0 (2019-12-18)

  • Random Forest (Smile) and Gradient Boosted Tree (Smile) now support classification.
    • Random Forest Regression (Smile) renamed to Random Forest (Smile)
  • Compare Distributions: 
    • Added Kullback-Leibler and Jensen-Shannon divergence as options to compare distributions. They run on a normalized, binned version of the distribution.
    • Binning for Chi-Square, KL and JS is done on the superset of the data (i.e. min/max are determined on the superset).
    • A proper error message is thrown if you use Compare Distributions on data with missing values, which is not supported.
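
To make the binning note concrete, here is a minimal sketch of comparing two samples with KL and JS on shared bins. It is an illustration under assumptions, not the operator's implementation: it assumes equal-width bins, counts normalized to sum to 1, and natural logarithms, and the function names (shared_histograms, kl_divergence, js_divergence) are hypothetical.

```python
import math

def shared_histograms(a, b, bins):
    """Bin both samples on edges taken from min/max of the combined data."""
    lo, hi = min(a + b), max(a + b)
    width = (hi - lo) / bins
    if width == 0.0:
        width = 1.0  # all values identical: everything lands in bin 0

    def hist(sample):
        counts = [0] * bins
        for v in sample:
            idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum
            counts[idx] += 1
        return [c / len(sample) for c in counts]  # normalize to sum to 1

    return hist(a), hist(b)

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); infinite if q lacks mass where p has it."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by log(2)."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Illustrative data: identical samples, then two samples with no overlap.
same_p, same_q = shared_histograms([0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4], bins=4)
far_p, far_q = shared_histograms([0.1, 0.2, 0.3, 0.4], [10.0, 11.0, 12.0, 13.0], bins=4)
```

Because both histograms share the same bin edges, their entries line up one to one. With non-overlapping samples KL becomes infinite, while JS stays finite and reaches its maximum of log(2).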

Version 0.3.0 (2019-09-11)

  • New operator: Compare Distributions
    • Tests the compatibility of two ExampleSets.
  • New operator: Gradient Boosted Tree (Smile)
    • Trains a gradient boosted tree for regression (classification is currently not supported).
  • Renamed Regression operator folder to Learner
  • Major internal code refactoring. As a result, previously trained models may no longer be applicable.

 

Version 0.2.0 (2019-07-30)

  • Added new operator Random Forest Regression (Smile)
  • Added the corresponding Random Forest Model

Version 0.1.0 (2019-02-08)

  • Extension release
  • New operator Gaussian Mixture
  • New operator G-Means 
  • New operator Probabilistic Principal Component Analysis 
  • New operator Lasso Regression
  • Operator t-SNE copied from Operator Toolbox Extension
  • Operator Parametric Probability Estimator copied from Operator Toolbox Extension

Product Details

Version 0.6.0
File size 1.3 MB
Downloads 12056 (1 today)
Vendor RapidMiner Labs
Category Machine Learning
Released 9/10/21
Last Update 9/10/21 11:34 AM
License AGPL