This extension provides a simple mechanism to extract part-of-speech and named-entity-recognition tags from text.

Besides tokenization, part-of-speech tagging and named-entity recognition, this extension supports dependency parsing and lemmatization (currently English and Spanish only). These mechanisms use pre-trained language models and are based on the Stanford CoreNLP library (version 4.2.2).

Currently supported languages are:
English, German, French and Spanish.

In addition to the pre-trained language models, custom named-entity tags can be defined and used either exclusively or alongside the existing model.
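
The difference between the two modes can be illustrated with a minimal Python sketch. It is not the extension's implementation: the function `tag_tokens`, the dictionary format, the `VENDOR` tag and the hand-written model tags are all hypothetical, and the rule that a custom tag overrides the model tag is an assumption made for the example.

```python
def tag_tokens(tokens, model_tags, custom_entities, exclusive=False):
    """Combine custom entity tags with tags from a pre-trained model.

    tokens          : list of token strings
    model_tags      : tags from the pre-trained model, "O" meaning "no entity"
    custom_entities : dict mapping a lowercase token to a custom tag (hypothetical format)
    exclusive       : if True, ignore the model tags and keep only custom tags
    """
    merged = []
    for token, model_tag in zip(tokens, model_tags):
        custom_tag = custom_entities.get(token.lower())
        if custom_tag is not None:
            # Assumption: a custom definition takes precedence over the model.
            merged.append(custom_tag)
        elif exclusive:
            # Exclusive mode: anything without a custom tag is not an entity.
            merged.append("O")
        else:
            # Alongside mode: fall back to the pre-trained model's tag.
            merged.append(model_tag)
    return merged


# Placeholder data: the model tags are written by hand, not produced by CoreNLP.
tokens = ["RapidMiner", "opened", "an", "office", "in", "Boston"]
model_tags = ["O", "O", "O", "O", "O", "LOCATION"]
custom = {"rapidminer": "VENDOR"}

alongside = tag_tokens(tokens, model_tags, custom)
exclusive = tag_tokens(tokens, model_tags, custom, exclusive=True)
```

In this sketch, `alongside` keeps the model's `LOCATION` tag for "Boston" and adds `VENDOR` for "RapidMiner", while `exclusive` keeps only the `VENDOR` tag.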

Product Details

Version 0.3.4
File size 1.1 GB
Downloads 83 (0 Today)
Vendor RapidMiner Labs
Category Domain specific operators
Released 8/11/21
Last Update 8/11/21 8:42 AM
License AGPL
Product web site
Rating 0.0 stars (0)