With a market size in the billions and double-digit projected yearly growth (some say 20% CAGR for the foreseeable future), NLP is not the niche it once was. We moved from looking for single words in a file to analyzing communication as a whole: at the document level, at the library level, at the Internet level. Our scope went from one sentence on a page to Twitter trends and global patterns. Today many companies base their entire marketing strategies on what people say on social media, and they're able to track all this digital chatter thanks to Text Analytics technology. In fact, new companies and an entire industry were born just to satisfy this need for knowledge.
We now face increasing complexity and strenuous demand for new tools from every imaginable industry; this landscape, naturally, requires technologies that grow more sophisticated by the minute. We started with a straightforward approach revolving around keywords; that evolved into something better when statistics entered the mix as a way to weigh the relevance of each element in the discourse; then Machine Learning was applied to the analysis of text, in the hope of minimizing the need for human contribution in selecting the words and stats that drove the process. In the meantime, keyword technology had become more refined, introducing some forms of shallow linguistics (and syntactic analysis) as the proverbial cherry on top. As the technology matured, adoption took off, the number of players grew exponentially, and so did the competition, and with it the need for differentiating factors as a way to win the race. This new gold rush had several effects: some vendors decided to focus on specialized vocabularies in an effort to become the best in one specific field (e.g., the Life Sciences industry); others moved horizontally (on the linguistic side of the problem), producing software that could understand text much better; and, finally, some worked on the extensibility of their solutions, providing out-of-the-box programmability through linguistic rules. If that wasn't enough, in recent years Machine Learning made a (strong) comeback, reshaping its ambition from human-replacement tool to the more politically correct role of human supporter.
As one would expect, each of these methodologies occupies a different position on Gartner's Hype Cycle curve. Some rest easy on the Plateau of Productivity, having reached maturity and enjoying broad distribution in applications that almost everyone on the planet uses every day (like Search), while others are still confined to hard-core early adopters, rapidly sliding down from the Peak of Inflated Expectations into the Trough of Disillusionment. Whatever the technology, higher demand means more money on the table, more companies playing this game (with frequent market adjustments and consolidation), and more research and development constantly bringing newer, more powerful solutions that can address almost any problem and scale as expected. The planet produces so much unstructured data every day that no service could keep up without the help of Artificial Intelligence to make sense of it all. But the history of this industry is proof of how a sudden explosion of demand can be the root cause of extreme fragmentation.
Here’s a selection of what I’m observing:
- Companies that go for the platform play: a solid engine accompanied by powerful tools; clients get a high degree of control over the development of AI applications for their own company, but they're also left to deal with the complexities of this kind of endeavor on their own. It's not uncommon to see small consulting shops born around a platform exactly for that reason (supporting users, delivering professional services). This is the portion of the market that's growing the fastest
- Companies that sell ML-based AI products that only deal with numbers and structured information (no documents or freeform text). These vendors work primarily with banks, often around compliance and AML (anti-money laundering) use cases
- Vertical play: some technology vendors in the NLP space specialize in specific industries, projecting stronger reliability because of their experience, and also speeding up time-to-market thanks to vocabularies and standard taxonomies already in place. These companies find it a little harder than others to participate in market consolidation, but they are sometimes acquired by non-AI organizations that happen to be leaders in the industry the company specializes in. They also struggle to grow at the same pace as the market, because organizations today want a holistic approach to AI while dealing with a single technology partner (as opposed to a different NLP provider for every need)
- In-house development (client side): large organizations have embraced how important it is to onboard AI technologies fully, so they acted decisively; they started by hiring experts, putting together dedicated AI teams of data scientists and knowledge engineers (sometimes with representation at the executive level, a Chief Data Officer), and decided to build what they need autonomously. Typically, these teams rely on open-source platforms, and sometimes they work with a technology partner. All in all this way of operating is positive because it offers total control, but it has shown its weakness in terms of speed and actual achievements. Many such projects fail to reach the finish line, and the cause is often found in the fact that, quite simply, NLP is hard and these organizations are not designed to be successful at it (a bank doesn't build its own ATMs, so why should it ever create an AI solution?)
- In-house development (vendor side, OEM): there's probably no better place to look if one wants to appreciate how fragmented (and, in a way, still very young) the NLP industry is than the OEM situation. So many applications could easily make great use of an integration with NLP because they happen to deal with a lot of content and communication, but they just don't. Think of DBs, Enterprise Search, Analytics, social trends, automated assistants, note taking, …I could go on for days. The reason is that they never did: their business has always worked without NLP; so, even those leaders who see the potential of powerful new functions that could be added to their users' experience will still put it aside, since their days are already chock-full of other priorities. They don't feel the pressure to innovate, and the few that do try to build some minimal form of NLP in-house. There are exceptions, of course, but the ratio of what could be done to what is actually done is mind-boggling. The OEM play in NLP is leaving a lot of money on the table, at this time.
- Business intelligence: every major organization uses powerful BI tools to keep an eye on the performance of multiple parts of the business. This initially generated some confusion, when BI tool providers felt that providing intelligence around the content of a company's documents was also their natural place. A business intelligence application does something very different from NLP software, but heavy BI tool users were, at first, misled by the above-mentioned assumption, so they didn't pull the trigger on any NLP solution. Later on, many understood the difference and tried to keep their BI tools front and center while onboarding third-party NLP technology, which led to some awkward integrations. My impression is that today we're finally past that: BI tools are great tools that respond to one set of needs; NLP is concerned with completely different ones
- Automation/RPA: in recent years, the automation market (now known as RPA) has boomed. Even more interesting, this category of software platforms doesn't really come with its own features; it's more like a vehicle for other applications. It simplifies integration and, at the same time, takes care of scaling automation efforts across an organization, while also encapsulating key pieces like security/privacy and orchestration. It's a great advancement, and it has sped up the introduction of many technologies in different industries. It seems like the perfect place for NLP to thrive, enhancing the possibilities of every process that automates anything around documents…and still, probably due to the same OEM struggles described above, it has yet to deliver on its promise. Having been exposed directly to these conversations multiple times, I can offer my personal opinion on why this is the case at the moment: since RPA is the container of all the functions that need to be delivered, the RPA provider acts as the project manager who has to frame and respond to the client's needs; and since RPA professionals are not NLP experts, they often don't see the value in this kind of expansion of their automation pipeline. On the other hand, if you ask me, I think Cognitive Automation is the next big thing
- Consulting outfits and tier-2 system integrators: the army of people who work on the professional services side of things, around a product's platform ecosystem, are martyrs who save the day and fix all the problems, confusion, and fragmentation described above. After all the misunderstandings and bad choices (both in the technologies and in the teams that should lead a project), the ones who ultimately make sense of it all and deliver a functioning solution are the consultants who had originally been hired just to help out. That's not to say this group doesn't suffer the struggles of the current market, since it gets quite difficult to streamline your operations when the products are all so different, not integrated, and not very well understood (with clients sometimes asking to address a use case with a product that simply wasn't built for it). Knowledge Management consulting companies often have a product of choice, one they know well and use often, but they don't have complete control over which technology a client will ask them to adopt
More on the industry segmentation in this interesting article.
The industry is moving really fast, though surely faster on some fronts and almost not at all on others. I wouldn't be surprised to revisit the landscape in 5 years and find a completely different situation, but the kind of all-over-the-place offering I'm recording at the moment could already have improved a few years ago. It didn't; in fact, I feel it got worse, mostly because more capital entered the market, which allowed many more players to start competing, and everyone went their own direction. My feeling is that the absence of standards, and of proper leaders showing which direction deserves the most focus, is the clearest sign of a market maturity that's still not quite there.