Named Entity Recognition Using Character LSTM

Human beings, when given repetitive tasks, are prone to errors caused by muscle memory and loss of concentration. This loss of concentration is often called brain fatigue: the brain operates on autopilot, without consciously thinking about actions and reactions. Hence, there is a pressing need to improve conventional user interfaces, fundamentally changing the way we interact with machines, so that they can answer questions without loss of information or errors. Such user interfaces are also an important area of research, owing to their impact on a multitude of applications in customer service, search interfaces, and human-computer interaction.

To develop such interfaces, one of the fundamental tasks is to understand and interpret a sentence provided as input by a user. The interface should be able to recognize the words in a sentence, along with what they convey to someone reading the sentence. This process is called Named Entity Recognition (NER), wherein the objective is to find (and classify) named entities in a text. NER falls under the broader area of information extraction, and it is also known as entity identification, entity chunking, and entity extraction.

In NER, entities belong to predefined categories, such as people's names, organization names, location names, times, and so on. NER allows a computer program to interpret the sentence "I will meet you at Burj Khalifa for a cup of coffee at 7:30 PM tomorrow" as "I will meet you at (Burj Khalifa)Location for a cup of (coffee)Food at (7:30 PM)Time (tomorrow)Date". In this example, the algorithm detects and classifies a two-token location, a single-token food item, a time expression, and a date.
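One common way to represent such annotations in code is to attach a tag to every token. The sketch below uses the BIO scheme (B = beginning of an entity, I = inside an entity, O = outside any entity). The BIO scheme, the label names LOC, FOOD, TIME, and DATE, the whitespace tokenization, and the helper name extract_entities are all assumptions made for illustration; they are not taken from this chapter.

```python
from typing import List, Tuple


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, label) pairs.

    Hypothetical helper for illustration only.
    """
    entities = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that was still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" tag (or an inconsistent "I-" tag): close the open entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    # Close an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_label))
    return entities


# The example sentence, split on whitespace (an assumed tokenization).
tokens = "I will meet you at Burj Khalifa for a cup of coffee at 7:30 PM tomorrow".split()
tags = [
    "O", "O", "O", "O", "O",      # I will meet you at
    "B-LOC", "I-LOC",             # Burj Khalifa
    "O", "O", "O", "O",           # for a cup of
    "B-FOOD",                     # coffee
    "O",                          # at
    "B-TIME", "I-TIME",           # 7:30 PM
    "B-DATE",                     # tomorrow
]

entities = extract_entities(tokens, tags)
for text, label in entities:
    print(f"{text} -> {label}")
```

With per-token tags like these, finding entities reduces to predicting one tag for each token, which is what allows NER to be framed as a sequence labeling problem.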

NER is often treated as a sequence labeling problem and has traditionally been tackled with approaches such as Hidden Markov Models (HMMs), decision trees, maximum entropy (ME) models, Support Vector Machines (SVMs), and Conditional Random Fields (CRFs). In recent literature, however, deep learning has been extensively leveraged for recognizing named entities. With large amounts of data available for training, deep learning has been shown to outperform conventional methods, while also generalizing robustly.

Notably, state-of-the-art NER systems for the English language produce near-human performance. For example, the best system at the MUC-7 evaluation scored an F-measure of 93.39%, while human annotators scored approximately 97%.
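The F-measure quoted here is the harmonic mean of precision and recall. A minimal sketch of how it could be computed for NER follows, assuming exact-match scoring in which a predicted entity counts as correct only if its span and label both match a gold annotation. The function name f_measure and the (start, end, label) tuple format are hypothetical, and the toy data is invented for illustration; it is not a result from any evaluation.

```python
from typing import Iterable, Tuple

Entity = Tuple[int, int, str]  # (start token index, end token index, label)


def f_measure(gold: Iterable[Entity], predicted: Iterable[Entity]) -> float:
    """Entity-level F-measure under exact span-and-label matching."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    # Precision: fraction of predicted entities that are correct.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    # Recall: fraction of gold entities that were found.
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy data: four gold entities; the system finds three of them and
# mislabels the fourth (DATE predicted as TIME).
gold = [(5, 6, "LOC"), (11, 11, "FOOD"), (13, 14, "TIME"), (15, 15, "DATE")]
predicted = [(5, 6, "LOC"), (11, 11, "FOOD"), (13, 14, "TIME"), (15, 15, "TIME")]

score = f_measure(gold, predicted)
print(f"F-measure: {score:.2%}")
```

In this toy case precision and recall are both 3/4, so the F-measure is also 0.75.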
