The HanMiner Extension provides a fast, easy-to-use toolset for text processing and mining in Mandarin Chinese (Han language).

Understand unstructured text data without writing code. HanMiner lets researchers and data analysts extract valuable information from Chinese text with no programming knowledge required. Use it to build your own workflows for public opinion monitoring, sentiment analysis, keyword extraction for word clouds, and more.

Supported operators:

  • Document Reader/Writer
  • Word Segmentation (Tokenization)
  • Filtering (stopwords, tokens, documents)
  • Word Count
  • Keyword Extraction
  • Vectorizer (count, TFIDF, Doc2Vec)
  • Part-of-Speech (POS) Tagging
  • Named Entity Recognition (NER)
  • Translation (between Simplified Chinese and Traditional Chinese)
  • Document Classification
  • Sentiment Analysis

Product Details

Version 1.0.3
File size 33 MB
Downloads 2514 (4 Today)
Vendor joeyhaohao78
Category Domain specific operators
Released 3/15/21
Last Update 3/15/21 4:29 AM
License AGPL
Product web site https://github.com/joeyhaohao/rapidminer-HanMiner
Rating 0.0 stars (0)