IBM has announced new natural language processing features for Watson Discovery, a text search and analysis platform that extracts important information hidden in corporate data. The additions include automatic recognition of document structure and automatic detection of text templates.
The Watson Discovery update includes the following improvements:
- A pre-trained model for understanding document structure. Watson Discovery’s Intelligent Document Understanding feature now includes a pre-trained model that automatically recognizes the visual structure and layout of a document, with no additional training required from a developer or data scientist.
- Automatic detection of text templates. The new template creation feature lets users quickly identify business-specific text templates in documents. It can learn a basic template from as few as two examples and then refine it based on user feedback.
- Advanced customization options for natural language processing. A new feature-extraction capability simplifies training NLP models to reliably identify business-specific keywords: it reduces data preparation effort, simplifies annotation through active learning and bulk annotation, and makes models easier to deploy, which shortens training time.
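
To make the idea of learning a text template from two examples more concrete, here is a minimal Python sketch. It is a toy illustration only, not IBM's algorithm and not the Watson Discovery API: the function name `infer_template`, the tokenization rule, and the sample invoice strings are all assumptions made for this example. It splits each example into runs of digits, letters, and single punctuation characters, keeps the parts that are identical across examples as literals, and generalizes the parts that differ into character classes.

```python
import re

# Hypothetical tokenizer: runs of digits, runs of ASCII letters,
# or any other single character.
TOKEN = re.compile(r"\d+|[A-Za-z]+|[^\dA-Za-z]")


def infer_template(examples):
    """Build a regex that generalizes the given example strings.

    Toy heuristic (an assumption, not a product algorithm): positions where
    all examples agree stay literal; positions that differ become a digit
    or letter class.
    """
    rows = [TOKEN.findall(example) for example in examples]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("examples do not share a common shape")

    parts = []
    for column in zip(*rows):
        if len(set(column)) == 1:
            # Same text in every example: keep it as a literal.
            parts.append(re.escape(column[0]))
        elif all(re.fullmatch(r"\d+", tok) for tok in column):
            parts.append(r"\d+")
        elif all(re.fullmatch(r"[A-Za-z]+", tok) for tok in column):
            parts.append(r"[A-Za-z]+")
        else:
            raise ValueError("examples differ in an unsupported way")
    return re.compile("".join(parts))


# Two examples are enough for a basic template.
template = infer_template(["INV-2021-0042", "INV-2022-1187"])

# "User feedback" is modeled here as simply adding another accepted example,
# which widens the fixed "INV" prefix to any run of letters.
refined = infer_template(["INV-2021-0042", "INV-2022-1187", "ORD-2023-0007"])
```

In this sketch the first template matches only identifiers that start with `INV`, while the refined one also accepts other letter prefixes. A production system would need to handle examples of differing shapes and negative feedback, which this sketch does not attempt.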