SAS Text Miner 5.1 includes the following new features and enhancements:
Two new nodes have been added in SAS Text Miner:
The Text Cluster node replaces the clustering functionality and the creation of the singular value decomposition in the original Text Miner node. The new node enables you to both cluster documents and experiment with different cluster settings without having to reparse the collection to see the updates.
The Text Import node enables you to create data sets from your own document collections or from a Web crawl, all from within the context of a SAS Enterprise Miner diagram.
The Text Miner node that was available in previous releases of SAS Text Miner has been replaced by the functionality in other SAS Text Miner nodes. See the SAS Text Miner documentation for a review of how controls and functionality in the original Text Miner node have been replaced in the new SAS Text Miner nodes. This release allows you to import diagrams from a previous release of SAS Text Miner that had a Text Miner node in the process flow diagram; however, new Text Miner nodes can no longer be created, and property values cannot be changed in imported Text Miner nodes.
In addition to the languages supported in previous releases (Arabic, Chinese, Dutch, English, French, German, Italian, Japanese, Korean, Polish, Portuguese, Spanish, and Swedish), SAS Text Miner 5.1 also supports these languages: Czech, Danish, Finnish, Greek, Hebrew, Hungarian, Indonesian, Norwegian, Romanian, Russian, Slovak, Thai, Turkish, and Vietnamese.
Note: Although custom entities are supported for the new languages, these languages do not come prepackaged with default entities. You can use SAS Concept Creation for SAS Text Miner to define, extract, and manage custom entities for inclusion in text mining projects and analysis.
You can create synonym data sets as you specify synonyms in the Interactive Filter Viewer.
You can import synonyms into the Text Filter node using the Import Synonyms property.
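To illustrate the general idea behind a synonym data set, here is a minimal Python sketch in which each row maps a term to the parent term that should represent it. This is not SAS Text Miner code, and it does not reflect the actual layout of a SAS synonym data set: the column names (`term`, `parent`), the sample rows, and the helper functions `build_lookup` and `apply_synonyms` are all hypothetical.

```python
# Hypothetical synonym table: each row maps a term to the parent term
# that should stand in for it. Column names are illustrative only and
# are not taken from SAS Text Miner.
SYNONYMS = [
    {"term": "automobile", "parent": "car"},
    {"term": "auto", "parent": "car"},
    {"term": "physician", "parent": "doctor"},
]


def build_lookup(rows):
    """Return a dict from lowercase term to its lowercase parent term."""
    return {row["term"].lower(): row["parent"].lower() for row in rows}


def apply_synonyms(tokens, lookup):
    """Lowercase each token and replace it with its parent term if one exists."""
    return [lookup.get(tok.lower(), tok.lower()) for tok in tokens]


lookup = build_lookup(SYNONYMS)
tokens = "The physician drove her auto".split()
print(apply_synonyms(tokens, lookup))
# prints ['the', 'doctor', 'drove', 'her', 'car']
```

Treating synonyms as a lookup table means that terms with the same parent are counted as a single term in later analysis, which is the effect that importing a synonym list is meant to achieve.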
Improvements include the ability to:
When a new row is added for user topics, a default weight is used.
You can now edit any existing subset documents filter in the Text Filter node.
The viewers for both the Text Filter node and the Text Topic node let you search for text and then step to the next match to cycle through all occurrences.
The Text Topic Viewer includes the following improvements:
The DOCPARSE procedure has been replaced by the TGPARSE procedure. If you currently use the DOCPARSE procedure, you will need to modify your code to use the TGPARSE procedure.