Operator Toolbox

This extension bundles a set of useful additional operators.

This extension adds a range of new operators to RapidMiner: utility operators that improve the flexibility and usability of process design, additional outlier detection algorithms and performance criteria, and advanced analysis methods such as Local Interpretation and the SMOTE algorithm.

Currently the extension provides the following operators:

  • Blending:

    • Calculate Overlaps

    • Collect and Persist

    • Extract Statistics

    • Filter Attributes with Missing Values

    • Filter Examples with Missing Values

    • Generate Levenshtein Distance

    • Generate Phonetic Encoding

    • Generate Session ID

    • Group Into Collection

    • Merge Attributes

    • Replace Rare Values

    • SMOTE Upsampling

    • t-SNE (bugfix in 1.7.0)

    • Weight of Evidence

  • Data Access:

    • Read Excel Sheet Names

    • SFTP Download File

    • SFTP Upload File

  • Data Generator:

    • Create ExampleSet

  • Macros:

    • Extract Last Modifying Operator

    • Set Macros from ExampleSet

    • Set Macro (Real)

  • Models:

    • Check Model Conformance (bugfix in 1.7.0)

    • Get Decision Tree Path

    • Local Interpretation (LIME)

    • Parametric Probability Estimator

    • Random Forest Encoder

  • Outliers:

    • Tukey Test

  • Parameters:

    • Get Parameters

    • Set Parameters from ExampleSet

  • Performance:

    • Performance (AUPRC)

  • Text Processing:

    • Apply Model (Documents)

    • Dictionary-Based Sentiment

    • Extract Topics from Data (LDA) (new in 1.7.0)

    • Extract Topics from Documents (LDA) (improved in 1.7.0)

    • Filter Tokens Using ExampleSet

    • Split Document into Collection

    • Stem Tokens Using ExampleSet

Version 1.7.0 (2018-11-26)

  • LDA changes:
    • New operator Extract Topics from Data (LDA), able to run LDA on ExampleSets
    • Renamed Extract Topics from Document (LDA) to Extract Topics from Documents (LDA)
    • Updated default optimization interval of the hyperparameters in both LDA operators from 50 to 10
    • Added AlphaSum, Beta and BetaSum to the performance vector output.
    • Bugfix for the case where a numerical meta data element is missing
    • Fixed a bug that caused LDA to ignore preprocessing steps such as Filter Stopwords
  • Bugfix for the t-SNE operator: fixed the handling of special attributes
  • Bugfix in Check Model Conformance: nominal attributes were checked when fail on error was true, even if check nominals was false

Version 1.6.0 (2018-11-12)

  • New operator Check Model Conformance
  • New operator Filter Attributes with Missing Values
  • New operator Filter Examples with Missing Values

Version 1.5.0 (2018-09-14)

  • Improvement of Dictionary-Based Sentiment
    • Added a symmetric negation window option. If selected, the negation also looks backwards
    • Added the negation token to the result string
    • Fixed a bug that occurred with double (or more) negations
  • Improvement of SMOTE Upsampling
    • SMOTE now throws a UserError if the wrong label types (Numeric or None) are used.
    • Bugfix for label attributes that contain comparison characters.
  • Local Interpretation (LIME)
    • Renamed operator to Local Interpretation (LIME)
    • Improved Meta Data Handling
  • General:
    • Improved parameter descriptions for several operators
    • Improved expert and mandatory parameters settings and removed unused encoding parameters.
    • Added a progress bar for several operators
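
The symmetric negation window added to Dictionary-Based Sentiment above can be illustrated with a small sketch. This is not the operator's implementation: the function name `score_tokens`, the `NEGATIONS` set, the `window` size, and the rule that an even number of negations cancels out are all assumptions made for illustration.

```python
# Illustrative sketch only; names and rules here are hypothetical,
# not taken from the Operator Toolbox source.
NEGATIONS = {"not", "no", "never"}  # assumed negation tokens


def score_tokens(tokens, dictionary, window=2, symmetric=False):
    """Sum dictionary weights, flipping the sign of words near a negation."""
    total = 0.0
    for i, token in enumerate(tokens):
        weight = dictionary.get(token)
        if weight is None:
            continue  # token is not in the sentiment dictionary
        # Default: a negation affects the words that follow it, so look
        # at the `window` tokens before the scored word.
        nearby = tokens[max(0, i - window):i]
        if symmetric:
            # Symmetric: a negation also affects the words before it, so
            # look at the `window` tokens after the scored word as well.
            nearby = nearby + tokens[i + 1:i + 1 + window]
        negation_count = sum(1 for t in nearby if t in NEGATIONS)
        # Assumption: an odd number of negations flips the sign,
        # an even number (for example a double negation) cancels out.
        if negation_count % 2 == 1:
            weight = -weight
        total += weight
    return total
```

With the dictionary `{"good": 1.0}`, the tokens `["good", "not"]` keep their positive score in the default mode and are only negated when `symmetric=True`, because the negation follows the scored word.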

Version 1.4.0 (2018-08-27)

  • New operator Calculate Overlaps

Version 1.3.0 (2018-07-27)

  • New operator Set Macro (Real)
  • Additional parameter 'use absolutes' for Generate Session ID
  • Enhancement for meta data of document models
  • Bugfix for Performance (AUPRC)
  • Enhancements of the Extract Topics from Document (LDA) operator
    • LDA model now has an overview of various topic diagnostics measures.
      • Accessible via the LDA model table renderer in the result view.
    • Parameters seed, thinning, burnin and iterations for the LDA model can now be set by the application parameters of the Apply Model operator
    • Added perplexity as a performance measure of LDA.
    • Added an option to get the mallet logging into the RapidMiner log panel
    • Bugfix for meta data when storing an LDA model
    • LDA is now correctly using the token text of a document, not the display text
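
For readers unfamiliar with the perplexity measure mentioned above, the textbook definition is the exponential of the negative mean log-likelihood per token; lower values indicate a better fit. The sketch below shows that general definition only. Whether the operator computes it exactly this way is an assumption, and the function name `perplexity` and its input format are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Textbook perplexity: exp of the negative mean natural-log probability.

    `token_log_probs` is a hypothetical input: one natural-log probability
    per held-out token, as assigned by some topic model.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

For example, a model that assigns every token a probability of 0.25 has a perplexity of about 4, that is, it is as uncertain as a uniform choice among four tokens.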

Version 1.2.0 (2018-06-08)

  • New operator Generate Session ID
  • Bugfix for LDA Model
    • It's now possible to use Generate Prediction Ranking together with LDA models
  • Bugfixes for some original output ports to prevent backward propagation

Version 1.1.0 (2018-05-16)

  • Dictionary-Based Sentiment:
    • Moved operator to text processing folder
    • Reworked operator, so that it now uses the Apply Model (Documents) operator
    • Hence Apply Dictionary-Based Sentiment is now deprecated
  • LDA
    • Bugfix for the calculation of the alpha sum parameter
    • Corrected the names of the confidence roles, so that RapidMiner can automatically identify the attributes as confidence attributes.
    • Added the possibility to disable the optimization of the hyperparameters
      • Added a parameter to set the optimization interval for the hyperparameters
  • Refurbishing of the Parametric Probability Estimator operator (omitted in 1.0.0)
    • Adaptation of parameter names (automatic replacement for older processes)
    • Improved documentation/help text

Product Details

Version 1.7.0
File size 11 MB
Downloads 42545 (168 Today)
Vendor RapidMiner Labs
Category Operators
Released 11/26/18
Last Update 11/26/18 8:21 AM
License AGPL
Rating 0.0 stars(0)