The importance of data richness in an NL API


A concrete example of the value for developers

In the following use case in the financial services industry, we’re going to look at how NL API compares in practical terms to other popular NLP models.

Language powers many strategic and operational activities within banks and financial services organizations. For example, processing and interpreting documents, enterprise search and customer interaction all require domain expertise and an accurate understanding of language. With the volume of this content growing exponentially, accurate, scalable NLU technology that can be implemented across the organization is a critical success factor.


Rethinking Investment Research

Targeting, collecting and creating quality research continue to be a moving target for financial services organizations of all sizes. The industry has become commoditized, and it is difficult to determine the value of one investment research platform over another. The opportunity to capture valuable, timely information hidden in plain sight is what continues to drive organizations to dig deeper into the ever-growing pool of available content sources.

Let’s imagine we are tasked with developing an application that identifies mentions of assets in news, social media and customer communications and automatically, in real time, sends the relevant document to the investment portfolio manager or asset broker, giving them insight to make better investment decisions.

For this example, we assume that what constitutes a valuable insight for an investment decision is the occurrence of a certain event and/or a visible trend in the sentiment of the market towards the asset (stock/bond etc.). However, the list of factors to monitor could be richer and more detailed.

This is how this practical activity would look using the most common NLP APIs:

The API would simply return a list of keywords, at best labelled as verbs, nouns and adjectives. As it won’t provide the meaning of these keywords, you would have to create a list of “financial assets”, maybe leveraging an external source like Wikipedia, plus a list of verbs that could be linked to certain events and a list of adjectives that could be linked to sentiment.

But wait, there’s more. At this stage you have two options. You could assume that any keyword matching one of your lists refers to the asset (or to the event, or to the sentiment), which creates many false positives. Or you could take extra steps and write code that acquires additional information to resolve the ambiguity. In a nutshell, that is a lot of additional work you could avoid.
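To make the false-positive problem concrete, here is a minimal sketch of the list-matching approach described above. The word list and the two sample sentences are made up for illustration; they are not taken from any real API or dataset.

```python
import re

# Hypothetical hand-maintained list of "financial asset" keywords
# (an assumption for illustration, not output from any real API).
ASSET_TERMS = {"stock", "stocks", "bond", "bonds", "share", "shares"}


def naive_asset_matches(text):
    """Return every token that appears in ASSET_TERMS, in order of appearance.

    This is plain keyword lookup: it knows nothing about the meaning
    of the word in context.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t in ASSET_TERMS]


financial = "Apple stock rose after the quarterly release."
cooking = "Simmer the chicken stock, then share the soup."

print(naive_asset_matches(financial))  # a genuine asset mention
print(naive_asset_matches(cooking))    # matches too, but neither is an asset
```

Both sentences produce matches, so the second one would be forwarded to the portfolio manager unless you write extra disambiguation code on top.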

An additional note for developers with some knowledge of NLP: you might be tempted to assume that the ambiguity issue above could easily be addressed with larger language models like BERT. However, while BERT could infer that different instances of a word (e.g., stock) have different meanings, it still cannot assign the actual meaning to each of these instances, so extra code will be required in this case as well*.

Let’s now consider using NL API instead:

  • Identify sources that provide real-time information. This list should combine institutional sources (global and local news), social media feeds, and relevant blogs or specialized digital sources
  • Extract plain text from these sources
  • Write code or use an SDK to invoke NL API to analyze the text
  • Promptly receive in the output each noun that is also an asset, with any ambiguity already resolved in the analysis phase, together with the “event” identified and the sentiment associated with the asset or the event, directly from one of the API extensions available out of the box
  • All text and language processing is integrated and ready at your fingertips (hassle free, no need to code it from scratch).
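The steps above can be sketched as a small routing function. This is only an illustration under stated assumptions: the `Mention` structure, its field names, the `"financial_asset"` label and the threshold are all hypothetical and do not reflect the actual NL API response schema, so the mapping from a real response into this shape is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    """Hypothetical, simplified view of one disambiguated mention.

    Field names are assumptions for this sketch, not the real NL API schema.
    """
    text: str
    concept: str                 # disambiguated meaning, e.g. "financial_asset"
    sentiment: float = 0.0       # negative < 0 < positive
    events: list = field(default_factory=list)


def should_forward(mentions, sentiment_threshold=0.5):
    """Return True if any asset mention carries an event or strong sentiment."""
    for m in mentions:
        if m.concept != "financial_asset":
            continue  # ambiguity was already resolved upstream
        if m.events or abs(m.sentiment) >= sentiment_threshold:
            return True
    return False


news = [Mention("stock", "financial_asset", 0.7, ["quarterly release"])]
recipe = [Mention("stock", "food", 0.1)]

print(should_forward(news))
print(should_forward(recipe))
```

Because the meaning of each mention arrives with the analysis, the application code reduces to a business rule; no word lists or disambiguation logic are needed on the caller's side.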

Adding extensions that further enhance the specificity of the data returned by the API is part of our strategy to make life easier for developers facing NLP needs. Consider, for example, the Media Topic taxonomy, which can be used to categorize news content based on a 4-level taxonomy with hundreds of nodes designed for this kind of content. Using this advanced feature, a developer can, as described above, also attach additional metadata to the specific paragraph where the asset is mentioned, identifying, for example, the event or the general topic associated with the asset (e.g., Shares, Apple, quarterly release vs. new hire etc.). Similarly, the sentiment extension provides immediate value by estimating the sentiment associated with the asset and the event.

Now, it is true, as many developers tell me, that the additional code needed to compensate for the limited values returned by other NLP APIs is not very extensive. Nevertheless, it can add significant time, it forces you to write less elegant code that is harder to audit, and the more complex the task, the more extra steps are required.

One last thing. While we are aware that these use case specific extensions are very valuable to developers, it would be impossible to develop and maintain an unlimited inventory of extensions to address every real-world use case. To enable developers to fully customize and enhance the power of our NLP, we offer Studio. While the tools within Studio do require the developer to learn a specific language, in return they provide the ability to create highly customized, use case specific sets of metadata that can be wrapped up, directly from Studio with a simple click, in a server that can run locally (Edge). I will write more about this in the future, but it is an important piece of information for understanding our commitment to the developer community.



If you are a developer new to NLP, you can easily use any NLP API as a microservice, but keep in mind that not all NLP APIs are the same, and the richness of the data they return can significantly affect your work. Analyzing and understanding the differences in the output each NLP API provides can save hours of work and significantly improve the effectiveness and elegance of your code. Investing some additional time in learning the tools that let you create a custom API returning the best possible data for your application will further simplify your project and increase its effectiveness.

With language technologies becoming critical for many applications, expanding your knowledge and understanding of them will be a profitable investment for any developer. If you are interested in learning more, we are here to help. We are experts in these technologies and we are committed to continuing to serve the developer community so that you can be effective and efficient in your work.
