NoSQL: Mavuno: A Hadoop-Based Text Mining Toolkit

Mavuno is an open-source, modular, scalable text mining toolkit built on Hadoop. It supports basic natural language processing tasks (e.g., part-of-speech tagging, chunking, parsing, and named entity recognition), can run large-scale distributional similarity computations (e.g., synonym, paraphrase, and lexical variant mining), and offers information extraction capabilities (e.g., instance and semantic relation mining). It can also be adapted to new input formats and text mining tasks.
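To give a rough idea of what a distributional similarity computation involves, here is a minimal single-machine sketch in Python. It is not Mavuno's API and does not use Hadoop; the function names `context_vectors` and `cosine`, the window size, and the toy sentences are all assumptions made for illustration. Each word is represented by counts of its neighboring words, and two words are compared by the cosine of those count vectors.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=1):
    """Map each word to counts of the words seen within `window` tokens of it.

    Hypothetical helper for illustration only; not part of Mavuno.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus, invented for this example.
sentences = [
    "the car drove down the road",
    "the automobile drove down the road",
    "the banana fell from the tree",
]

vectors = context_vectors(sentences)
car_automobile = cosine(vectors["car"], vectors["automobile"])
car_banana = cosine(vectors["car"], vectors["banana"])

# Words that occur in the same contexts score higher.
print(car_automobile > car_banana)
```

In this toy corpus "car" and "automobile" share identical neighbors, so they score higher than "car" and "banana". A toolkit built on Hadoop would presumably distribute the counting and comparison steps across a cluster rather than hold everything in memory as this sketch does.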

tags: mavuno, hadoop, mahout, text mining

via NoSQL databases