In-Database Processing (Supported)

Visually define data prep or ETL workflows and execute them directly in the database. Reduce data transfer by loading only the data you need after preparation.

With the new In-Database Processing extension, you can design a subprocess using new but familiar preprocessing operators. The computation of these operators is pushed down into the database: they are automatically translated into SQL code, which is submitted to the database. You can then process the result with other operators, just as in a normal RapidMiner process.
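To make the idea of pushdown more concrete, here is a minimal Python sketch of how a chain of preprocessing steps could be translated into a single SQL statement, so that only the prepared rows leave the database. It is an illustration only, not the extension's actual implementation: the `Pipeline` class, its `select`, `filter`, `to_sql` and `fetch` methods, and the `orders` table are all hypothetical, and SQLite stands in for a real database engine.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Hypothetical stand-in for a chain of preprocessing operators."""
    table: str
    columns: list = field(default_factory=lambda: ["*"])
    filters: list = field(default_factory=list)
    params: list = field(default_factory=list)

    def select(self, *columns):
        # Record the projection; nothing is executed yet.
        self.columns = list(columns)
        return self

    def filter(self, condition, *params):
        # Record a row filter together with its bound parameter values.
        self.filters.append(condition)
        self.params.extend(params)
        return self

    def to_sql(self):
        # Translate all recorded steps into one SQL statement.
        # Note: identifiers are not sanitized in this sketch.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.filters)
        return sql

    def fetch(self, conn):
        # Submit the generated SQL; only the matching rows are transferred.
        return conn.execute(self.to_sql(), self.params).fetchall()


# In-memory SQLite database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 45.5), (4, "APAC", 300.0)],
)

pipeline = (
    Pipeline("orders")
    .select("id", "amount")
    .filter("region = ?", "EU")
    .filter("amount > ?", 100)
)
sql = pipeline.to_sql()
rows = pipeline.fetch(conn)

print(sql)   # SELECT id, amount FROM orders WHERE (region = ?) AND (amount > ?)
print(rows)  # [(1, 120.0)]
```

The steps are only recorded until `fetch` is called, so the database does the filtering and projection, and a single row is read into memory instead of the whole table.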

The main goal of this extension is to limit the data that you read from a database into the memory of RapidMiner Studio or Server. This is especially important when you are using cloud engines like Google BigQuery, where you pay for the amount of data you retrieve. Another goal is to leverage your database's computing power, which is also important when using distributed, scalable databases or cloud engines. All of this is done without the need to write SQL code.

The extension currently supports Google BigQuery (via OAuth 2), PostgreSQL, MySQL, MSSQL, Oracle, Snowflake and Databricks. Support for further databases and cloud engines is planned, depending on user demand.


Product Details

Version 10.4.0
File size 843 kB
Downloads 27269 (18 Today)
Vendor RapidMiner
Category Operators
Released 5/27/24
Last Update 5/27/24 10:37 AM
License RM_EULA
Product web site www.rapidminer.com
Rating 0.0 stars (0)