Deep Neural Language Model for Text Classification Based on Convolutional and Recurrent Neural Networks

Authors

Hassan, Abdalraouf

Issue Date

2018-08-04

Type

Thesis

Language

en_US

Keywords

Convolutional neural network, Deep learning, Machine learning, Natural language processing, Recurrent neural network, Sentiment analysis

Abstract

The growth of social media and e-commerce sites produces a massive amount of unstructured text data on the internet. Thus, there is high demand for intelligent models that can process this data and extract useful information from it. Text classification plays an important role in many Natural Language Processing (NLP) applications, such as sentiment analysis, web search, spam filtering, and information retrieval, in which we need to assign one or more predefined categories to a sequence of text. In neural network language models, learning long-term dependencies with gradient descent is difficult because of the vanishing gradient problem. Recently, researchers have started to increase the depth of the network to overcome the limitations of existing techniques. However, increasing the depth of the network means increasing the number of parameters, which makes the network computationally expensive and more prone to overfitting. Furthermore, NLP systems traditionally treat words as discrete atomic symbols, so the model can leverage only a small amount of information about the relationships between individual symbols. In recent years, deep learning models such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been applied to language modeling with remarkable results. CNNs are a novel approach to extracting higher-level features that are invariant to local translation. However, because of the locality of the convolutional and pooling layers, this method requires stacking multiple convolutional layers to capture long-term dependencies. In this dissertation, we introduce a joint CNN-RNN framework to overcome the problems in existing deep learning models. Briefly, we apply an unsupervised neural language model to train initial word embeddings, which are further tuned by our deep learning network; the pre-trained parameters of the network are then used to initialize the model.
In the final stage, the proposed framework combines the former information with a set of feature maps learned by a convolutional layer and long-term dependencies learned via Long Short-Term Memory (LSTM). Empirically, we show that our approach, with slight hyperparameter tuning and static vectors, achieves outstanding results on multiple sentiment analysis benchmarks. Our approach outperforms several existing approaches in terms of accuracy; our results are also competitive with the state-of-the-art results on the Stanford Large Movie Review (IMDB) dataset and the Stanford Sentiment Treebank (SSTb) dataset. Our approach plays a significant role in reducing the number of parameters: it constructs a convolutional layer followed directly by a recurrent layer, with no pooling layers. Our results show that we were able to reduce the loss of detailed, local information and capture long-term dependencies with an efficient framework that has fewer parameters and a high level of performance.
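
The pipeline described in the abstract (word embeddings, one convolutional layer, then an LSTM, with no pooling in between) can be illustrated with a minimal forward-pass sketch in plain Python. This is not the dissertation's implementation: the layer sizes, the ReLU activation, the sigmoid output unit, the random untrained weights, and all function names (`build_model`, `conv1d_no_pool`, `lstm_last_hidden`, `predict`) are assumptions made for illustration, and random vectors stand in for the pre-trained embeddings.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def conv1d_no_pool(embedded, filters, bias):
    """Slide each filter over windows of k consecutive word vectors (stride 1).
    No pooling follows, so one feature vector per window is kept in order."""
    k = len(filters[0])  # window size in words
    out = []
    for t in range(len(embedded) - k + 1):
        window = embedded[t:t + k]
        feats = []
        for f, b in zip(filters, bias):
            s = b + sum(w * x for w_row, x_row in zip(f, window)
                        for w, x in zip(w_row, x_row))
            feats.append(max(0.0, s))  # ReLU (assumed activation)
        out.append(feats)
    return out

def lstm_last_hidden(seq, W, U, b):
    """Single-layer LSTM over seq; returns the final hidden state.
    Gates: i = input, f = forget, o = output, g = candidate cell update."""
    hidden = len(b["i"])
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        pre = {g: [b[g][j] + dot(W[g][j], x) + dot(U[g][j], h)
                   for j in range(hidden)] for g in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        c = [f[j] * c[j] + i[j] * cand[j] for j in range(hidden)]
        h = [o[j] * math.tanh(c[j]) for j in range(hidden)]
    return h

def build_model(vocab_size, embed_dim=8, n_filters=6, k=3, hidden=5, seed=0):
    # All sizes are illustrative assumptions; weights are random, not trained.
    rng = random.Random(seed)
    return {
        "embed": rand_matrix(vocab_size, embed_dim, rng),
        "filters": [rand_matrix(k, embed_dim, rng) for _ in range(n_filters)],
        "conv_bias": [0.0] * n_filters,
        "W": {g: rand_matrix(hidden, n_filters, rng) for g in "ifog"},
        "U": {g: rand_matrix(hidden, hidden, rng) for g in "ifog"},
        "b": {g: [0.0] * hidden for g in "ifog"},
        "out": [rng.uniform(-0.1, 0.1) for _ in range(hidden)],
    }

def predict(model, token_ids):
    """Embedding lookup -> convolution (no pooling) -> LSTM -> sigmoid score."""
    embedded = [model["embed"][t] for t in token_ids]
    feature_seq = conv1d_no_pool(embedded, model["filters"], model["conv_bias"])
    h = lstm_last_hidden(feature_seq, model["W"], model["U"], model["b"])
    return sigmoid(dot(model["out"], h))

model = build_model(vocab_size=50)
tokens = [3, 17, 42, 8, 8, 21, 5]  # hypothetical token ids for one review
score = predict(model, tokens)
```

Because the weights are untrained, `score` carries no meaning beyond being a value between 0 and 1. The point of the sketch is the data flow: the convolution turns 7 tokens into 5 ordered feature vectors (one per 3-word window), and the LSTM consumes that sequence directly, which is what skipping the pooling layer preserves.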

Citation

A. Hassan, "Deep Neural Language Model For Text Classification Based On Convolutional And Recurrent Neural Networks", Ph.D. dissertation, Dept. of Computer Science and Engineering, Univ. of Bridgeport, Bridgeport, CT, 2018.
