Data Search for Data Mining
This extension provides various data search and integration methods for enriching (extending) a data table using a heterogeneous tabular corpus. These include Correspondence Search (Search-Join for a single attribute) with human-in-the-loop refinements, Unconstrained Search, and Correlation Search. Some operators of this extension require a Data Search server developed by the University of Mannheim, which maintains the public endpoints.
[Unsupported Extension Notice] - This extension is not officially supported by Altair RapidMiner. While we're always improving and updating our offerings, we can't guarantee any help or fixes for this extension.
This extension provides automated and semi-automated methods for data augmentation, which includes data search, attribute discovery and integration of new attributes to a data set.
The extension provides i) single-attribute data augmentation, also called constrained augmentation or governed data discovery, which discovers a specific attribute requested by the user in a given corpus, and ii) multi-attribute data augmentation, also called unconstrained augmentation, which discovers relevant attributes in the corpus and adds them to the given data set.
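The two augmentation modes described above can be illustrated with a minimal Python sketch. This is not the extension's implementation: the function names, the list-of-dicts table representation, the exact-match join on a key attribute, and the majority-vote fusion are all assumptions chosen to keep the example short.

```python
from collections import Counter


def augment_single_attribute(query_rows, key, target, corpus):
    """Constrained augmentation (hypothetical sketch): add one requested
    attribute `target` to every query row by joining on `key`."""
    candidates = {}  # key value -> Counter of candidate target values
    for table in corpus:
        for row in table:
            # Only rows carrying both the join key and the requested
            # attribute can contribute a candidate value.
            if key in row and target in row:
                candidates.setdefault(row[key], Counter())[row[target]] += 1

    augmented = []
    for row in query_rows:
        votes = candidates.get(row[key])
        # Fuse conflicting values by majority vote (an assumed strategy);
        # leave the cell empty when no corpus row matched.
        value = votes.most_common(1)[0][0] if votes else None
        augmented.append({**row, target: value})
    return augmented


def augment_all_attributes(query_rows, key, corpus):
    """Unconstrained augmentation (hypothetical sketch): discover every
    attribute that is joinable on `key` and not yet in the query table."""
    known = set().union(*(row.keys() for row in query_rows))
    discovered = sorted(
        {
            attr
            for table in corpus
            for row in table
            if key in row
            for attr in row
            if attr not in known
        }
    )
    result = query_rows
    for attr in discovered:
        result = augment_single_attribute(result, key, attr, corpus)
    return result


# Illustrative data only; the values are made up for the example.
query = [{"country": "Germany"}, {"country": "France"}, {"country": "Atlantis"}]
corpus = [
    [{"country": "Germany", "population": 83}, {"country": "France", "population": 68}],
    [{"country": "Germany", "population": 83, "capital": "Berlin"}],
    [{"country": "Germany", "population": 80}],
    [{"city": "Paris", "population": 2}],  # no join key, so it is ignored
]

constrained = augment_single_attribute(query, "country", "population", corpus)
unconstrained = augment_all_attributes(query, "country", corpus)
```

In the constrained call the user names the attribute to look for, while the unconstrained call discovers the candidate attributes itself. A real corpus would also need fuzzy matching of keys and attribute names, which this sketch leaves out.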
Currently, the extension provides the following operators:
- Legacy: Some operators are now considered legacy and have been replaced by other operators to make the extension more independent of any backend search server. These include:
  - Data Search
  - Fuse
  - Correlation-Based Search
  - Unconstrained Search
- Single Attribute Augmentation: This group includes operators that work together in a single operator chain.
  - Create Correspondences
  - Translate
  - Advanced Fuse
- Multi Attribute Augmentation
  - Enrich Table by Data Fusion
- Repository Management: This group contains operators for creating repositories and uploading data to them. Currently, you may set up an instance of the data search server (developed by the University of Mannheim) on your premises.
  - Create Repository
  - Data Table Upload
  - Data Tables Upload
- Data Table Search: This group provides access to search engines.
  - Google Table Search
Version 2.1.0 (26-04-2019)
- Two bug fixes in the Enrich Table by Data Fusion operator.
- Added a parameter to Enrich Table by Data Fusion to balance coverage against precision.
- Added a new dataset, a tutorial process, and an application template for fully automated augmentation.
Version 2.0.0 (16-11-2018)
- New operator Create Correspondences. This operator implements the Constrained Data Augmentation algorithm without depending on the Mannheim data search server.
- Rearranged operators into new operator groups (Legacy, Single Attribute Augmentation and Multi Attribute Augmentation).
Version 1.0.1 (30-07-2018)
- New operator Enrich Table by Data Fusion. This operator implements the Unconstrained Data Augmentation algorithm without depending on the Mannheim data search server.
- New operator Unconstrained Search. This operator depends on the Mannheim data search server.
- New operator Correlation-Based Search. This operator depends on the Mannheim data search server.
Version 0.2.0 (30-01-2018)
- New component Connection Manager added to easily maintain connections with multiple instances or endpoints of the data search server.
- Repository Management operator group added with the following new operators to create repositories and upload data:
  - Create Repository
  - Data Table Upload
  - Data Tables Upload
Version 0.1.3 (19-10-2017)
- New operator Google Table Search
Product Details
Version | 2.1.0 |
File size | 15 MB |
Downloads | 14069 (1 Today) |
Vendor | RapidMiner Labs |
Category | Operators |
Released | 4/26/19 |
Last Update | 4/26/19 4:18 PM |
License | AGPL |
Product web site | http://ds4dm.de |