Decide Which Named Entity Recognition Product to Use

OpenText provides three main ways for you to use Named Entity Recognition: Named Entity Recognition in ingest, the Named Entity Recognition SDK, and Named Entity Recognition Server. Which one you use depends on your use case and your preference.

Named Entity Recognition as Part of an Ingestion Process

The ingestion components, NiFi Ingest and Connector Framework Server (CFS), allow you to incorporate Named Entity Recognition as part of a document retrieval process. You can use connectors to retrieve documents from your repositories, and perform Named Entity Recognition alongside any other document processing.

This method is useful if you want to automate the process of retrieving and tagging documents. In particular, if you use NiFi Ingest or CFS to index into the Content component, you can use Named Entity Recognition to add extra fields to your documents, which makes it easier to search for the entity values that you extract.

However, this method is not appropriate if you want to provide text directly to Named Entity Recognition as part of an application.

Named Entity Recognition SDK or Named Entity Recognition Server

The Named Entity Recognition SDK provides APIs to allow you to run Named Entity Recognition directly. This option is most suitable for OEM environments, where you want to embed Named Entity Recognition into an application that you distribute to your users.

The Named Entity Recognition ACI Server also allows you to use Named Entity Recognition as part of an application. In this case, you must host the server (and a license server), which makes it less suitable for OEM environments. It might be the most suitable option if you want to use Named Entity Recognition in a web application, particularly if you also want to use other Knowledge Discovery services.

In other cases, you can use either option, depending on your personal preference. When you choose, you might want to consider the following points:

  • The Named Entity Recognition SDK has a steeper initial learning curve.

    Named Entity Recognition Server accepts HTTP requests and returns XML or JSON, so it might be quicker to get started with. In particular, if you already use the ACI API in other applications, you do not need to learn how to use additional APIs to run Named Entity Recognition.

  • The Named Entity Recognition SDK is available only for C, .NET, Python, and Java. If you want to create an application in a different language, you might not be able to use the SDK without using additional methods to call out to shared libraries.

    NOTE: To use Named Entity Recognition Server, you can use any method for making HTTP requests and parsing XML or JSON. Named Entity Recognition Server supports XSLT, and there are also Knowledge Discovery SDKs available in C, .NET, and Java.

  • To run Named Entity Recognition, both Named Entity Recognition Server and the Named Entity Recognition SDK must have access to the required grammar files.

    In the case of the Named Entity Recognition SDK, the easiest way to include the grammars is to install them with your application. You can also embed the grammars in your application. Either approach increases the size of your application.

    For Named Entity Recognition Server, you include the grammars with the web server, so this does not add any overhead for your end users.
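To illustrate the point above that any HTTP client can call Named Entity Recognition Server, the following is a minimal Python sketch that uses only the standard library. It assumes the usual ACI request form of http://host:port/action=Name&Parameter=Value. The action name EduceFromText, the Text and ResponseFormat parameters, and port 13000 are assumptions made for illustration and are not taken from this page, so check the server reference for the values that apply to your installation. The helper names build_aci_url and send_aci_request are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_aci_url(host, port, action, **params):
    """Build an ACI-style URL: http://host:port/action=NAME&Key=Value...

    The action and parameter names are supplied by the caller; this helper
    only URL-encodes them (spaces become %20).
    """
    query = urlencode({"action": action, **params}, quote_via=quote)
    return f"http://{host}:{port}/{query}"


def send_aci_request(url, timeout=10):
    """Send the request and parse the body as JSON.

    Assumes the request asked the server for a JSON response.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, build_aci_url("localhost", 13000, "EduceFromText", Text="Call Jane Doe", ResponseFormat="json") produces a URL that you could pass to send_aci_request, provided a server is running at that address and accepts those (assumed) parameters.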

For more information, see Named Entity Recognition SDK and Named Entity Recognition ACI Server.