Deep Learning: The Next Step in Natural Language Processing Technologies
Natural Language Processing (NLP) has been applied to text with varying degrees of success. Automatic translation, for example, attracted a lot of attention in the early stages of NLP. Nowadays, with the advent of social networks, users generate a large volume of information that is valuable to companies seeking either user feedback on their products or personalised information with which to sell new ones. As a result, interesting new NLP applications have appeared, such as sentiment analysis (extracting the opinions expressed in a user comment about a product), detection of user wants and needs, and user profiling. Humans cannot process this information in a timely manner without great effort and expense, so computers stand out as the only alternative, as they are much faster than humans.
Understanding the language of user comments in social networks is inherently difficult for computers. The ways in which humans express themselves in natural language are nearly unlimited, and informal text is riddled with typos, misspellings, badly formed syntactic constructions and platform-specific artefacts (e.g. hashtags on Twitter), all of which greatly complicate the task. Furthermore, humans can learn new words from the context in which they appear, but for computers extracting this information and using it appropriately is not that easy.
Deep Learning is the current trend in many applications that perform complex operations once reserved for humans, such as voice recognition (e.g. Siri, Cortana or Google Talk) or computer vision (e.g. automatic face and object recognition). Two factors are mainly responsible for the progress made in NLP in recent years:
Word Embeddings: A translation of words into a mathematical domain, in which the semantic meaning of each word is represented by a vector of numbers. These representations are usually learned automatically from very large amounts of publicly available data (e.g. social networks). Thus, computers learn word representations from corpora of billions of words without human intervention.
High-level abstraction of texts: Deep Learning technologies combine the aforementioned word representations to obtain a semantic view of more complex texts such as sentences and documents. With this information, computers can grasp more of the real meaning of a text and obtain better results than traditional approaches when complex analyses are involved (sentiment analysis, automatic translation, detection of entities in texts, question-answering systems, etc.).
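To make these two ideas more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any production system works: the three-dimensional vectors in `EMBEDDINGS` are invented by hand (real embeddings are learned from data and typically have hundreds of dimensions), and the helper names `cosine` and `sentence_vector` are hypothetical. Averaging word vectors is only a simple baseline for representing a sentence; Deep Learning models learn how to combine the word vectors instead.

```python
import math

# Toy 3-dimensional embeddings, invented by hand for illustration only.
# Real embeddings are learned automatically from large text corpora.
EMBEDDINGS = {
    "good":     [0.9, 0.1, 0.0],
    "great":    [0.8, 0.2, 0.1],
    "bad":      [-0.9, 0.1, 0.0],
    "terrible": [-0.8, 0.2, 0.1],
    "phone":    [0.0, 0.9, 0.3],
    "camera":   [0.1, 0.8, 0.4],
}


def cosine(u, v):
    """Cosine similarity: close to 1 for similar directions, negative for opposite ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sentence_vector(text):
    """Average the vectors of the known words; unknown words are skipped."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        raise ValueError("no known words in text")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


if __name__ == "__main__":
    # Word level: similar words get similar vectors.
    print("good vs great:", cosine(EMBEDDINGS["good"], EMBEDDINGS["great"]))
    print("good vs bad:  ", cosine(EMBEDDINGS["good"], EMBEDDINGS["bad"]))

    # Sentence level: compare short comments through their averaged vectors.
    positive_a = sentence_vector("great camera")
    positive_b = sentence_vector("good phone")
    negative = sentence_vector("terrible phone")
    print("positive vs positive:", cosine(positive_a, positive_b))
    print("positive vs negative:", cosine(positive_a, negative))
```

With these toy vectors, the two positive comments end up closer to each other than a positive and a negative one, which is the kind of signal a sentiment analysis system builds on.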
All in all, whilst human experts still beat computers in most NLP tasks, Deep Learning and Word Embeddings have narrowed the gap dramatically over the last five years. With strong support from an enthusiastic academic community and with major IT companies (e.g. Google, Apple or Microsoft, to name just a few) investing time and money in deep learning technologies, the next big jump in NLP tools is expected to be just around the corner.
Gradiant has embraced deep learning to become more competitive in NLP technologies worldwide. To that end, Gradiant develops its own Deep Learning algorithms to build the intelligent tools of the future.
Interested readers can find a comprehensive technical introduction to Deep Learning and NLP at the following link:
http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/