Understand unstructured text data without writing code! The HanMiner Extension provides a fast, easy-to-use toolset for text processing and mining in Mandarin Chinese (Han language). It lets researchers and data analysts extract valuable information from text with no programming knowledge required. Use it to build your own workflows for public opinion monitoring, sentiment analysis, keyword extraction for word clouds, and more.

Supported operators:

  • Document Reader/Writer
  • Word Segmentation (Tokenization)
  • Filtering (stopwords, tokens, documents)
  • Word Count
  • Keyword Extraction
  • Vectorizer (count, TF-IDF, Doc2Vec)
  • Part-of-Speech (POS) Tagging
  • Named Entity Recognition (NER)
  • Translation (between Simplified Chinese and Traditional Chinese)
  • Document Classification
  • Sentiment Analysis

