RMDL

Referenced paper: RMDL: Random Multimodel Deep Learning for Classification

RMDL: Random Multimodel Deep Learning for Classification

Global.py

Creates the weights folder and downloads GloVe for text classification (if you have already downloaded GloVe, set the GloVe directory in Global.py).

Text Feature Extraction:

We use two feature extraction techniques:

Term Frequency-Inverse Document Frequency

GloVe: Global Vectors for Word Representation
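To make the first technique concrete, here is a minimal TF-IDF sketch in plain Python. It is not the code in text_feature_extraction.py: the function name `tf_idf`, the whitespace tokenization, and the unsmoothed `log(N / df)` weighting are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document (whitespace tokenization)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency (count / document length) times inverse document frequency.
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = ["deep learning for text", "deep learning for images", "random forests"]
scores = tf_idf(docs)
```

Terms that occur in fewer documents (such as "text") receive a higher weight than terms shared across documents (such as "deep").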

BuildModel.py:

This file builds random deep learning models for image and text classification, covering DNN, CNN, and RNN architectures.
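As a rough illustration of what "random model" means here, the sketch below draws a DNN layout from the bounds documented under Parameters. The helper `sample_dnn_config` is hypothetical, and drawing a separate width for every layer is an assumption; BuildModel.py may sample differently.

```python
import random


def sample_dnn_config(min_hidden_layer_dnn=1, max_hidden_layer_dnn=8,
                      min_nodes_dnn=128, max_nodes_dnn=1024, rng=None):
    """Return a list of hidden-layer widths drawn uniformly within the bounds."""
    rng = rng or random.Random()
    # Number of hidden layers, inclusive of both bounds.
    n_layers = rng.randint(min_hidden_layer_dnn, max_hidden_layer_dnn)
    # One width per layer (assumption: widths are sampled independently).
    return [rng.randint(min_nodes_dnn, max_nodes_dnn) for _ in range(n_layers)]


layers = sample_dnn_config(rng=random.Random(42))
```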

General requirements:

Python 3.5 or later (see Instruction Documents)
TensorFlow (see Instruction Documents)
scikit-learn (see Instruction Documents)
Keras (see Instruction Documents)
scipy (see Instruction Documents)
GPU (if you want to run on GPU):
- CUDA® Toolkit 8.0. For details, see NVIDIA's documentation.
- The NVIDIA drivers associated with CUDA Toolkit 8.0.
- cuDNN v6. For details, see NVIDIA's documentation.
- GPU card with CUDA Compute Capability 3.0 or higher.
- The libcupti-dev library.

Parameters:

Text Classification

from RMDL import RMDL_Text

Text_Classification(x_train, y_train, x_test, y_test, batch_size=128,
                    EMBEDDING_DIM=50, MAX_SEQUENCE_LENGTH=500,
                    MAX_NB_WORDS=75000, GloVe_dir="",
                    GloVe_file="glove.6B.50d.txt",
                    sparse_categorical=True, random_deep=[3, 3, 3],
                    epochs=[500, 500, 500], plot=True,
                    min_hidden_layer_dnn=1, max_hidden_layer_dnn=8,
                    min_nodes_dnn=128, max_nodes_dnn=1024,
                    min_hidden_layer_rnn=1, max_hidden_layer_rnn=5,
                    min_nodes_rnn=32, max_nodes_rnn=128,
                    min_hidden_layer_cnn=3, max_hidden_layer_cnn=10,
                    min_nodes_cnn=128, max_nodes_cnn=512,
                    random_state=42, random_optimizor=True, dropout=0.05)
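A possible way to call this, sketched under assumptions: the function is taken to be reachable as `RMDL_Text.Text_Classification`, which the import line suggests but does not state, and the wrapper `run_text_classification` is hypothetical. The reduced epoch counts are illustrative values for a quick trial, not recommendations.

```python
# Keyword arguments named as in the signature above; values are illustrative.
text_params = dict(
    batch_size=128,
    EMBEDDING_DIM=50,                 # must match the GloVe file below
    MAX_SEQUENCE_LENGTH=500,
    MAX_NB_WORDS=75000,
    GloVe_dir="",
    GloVe_file="glove.6B.50d.txt",
    sparse_categorical=True,
    random_deep=[3, 3, 3],
    epochs=[10, 10, 10],              # far fewer than the default 500
    plot=False,
    random_state=42,
)


def run_text_classification(x_train, y_train, x_test, y_test, params=None):
    """Hypothetical wrapper; assumes RMDL is installed."""
    # Imported lazily so this sketch can be loaded without RMDL present.
    from RMDL import RMDL_Text
    return RMDL_Text.Text_Classification(
        x_train, y_train, x_test, y_test, **(params or text_params))
```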

Input

x_train
y_train
x_test
y_test

batch_size

batch_size: Integer. Number of samples per gradient update. If unspecified, it will default to 128.

EMBEDDING_DIM

EMBEDDING_DIM: Integer. Dimension of the word embedding (this number must match the dimension of the GloVe file or other pre-trained embedding in use). Defaults to 50, which pairs with the glove.6B.50d.txt file.

MAX_SEQUENCE_LENGTH

MAX_SEQUENCE_LENGTH: Integer. Maximum length of a sequence or document in the dataset. Defaults to 500.

MAX_NB_WORDS

MAX_NB_WORDS: Integer. Maximum number of unique words in the dataset. Defaults to 75000.
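The two limits above can be pictured with a small plain-Python sketch: one caps the vocabulary at the most frequent words, the other pads or truncates every document to a fixed length. The helpers `build_vocab` and `encode` are hypothetical; padding and truncating at the front mirrors the defaults of Keras `pad_sequences`, and whether RMDL does exactly this is an assumption.

```python
from collections import Counter


def build_vocab(texts, max_nb_words):
    """Map the max_nb_words most frequent words to integer ids starting at 1."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Id 0 is reserved for padding.
    return {word: i + 1
            for i, (word, _) in enumerate(counts.most_common(max_nb_words))}


def encode(text, vocab, max_sequence_length):
    """Convert text to ids, dropping unknown words, then fix the length."""
    ids = [vocab[w] for w in text.lower().split() if w in vocab]
    ids = ids[-max_sequence_length:]                       # keep the last tokens
    return [0] * (max_sequence_length - len(ids)) + ids    # left-pad with zeros


texts = ["the cat sat on the mat", "the dog barked"]
vocab = build_vocab(texts, max_nb_words=3)
encoded = [encode(t, vocab, max_sequence_length=4) for t in texts]
```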

GloVe_dir

GloVe_dir: String. Path to the directory containing GloVe or another pre-trained embedding. Defaults to an empty string, in which case glove.6B.zip is downloaded.

GloVe_file

GloVe_file: String. Name of the GloVe or other pre-trained word embedding file to use. Defaults to glove.6B.50d.txt.
NOTE: if you use another version of GloVe, EMBEDDING_DIM must match its dimension.
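The note can be checked mechanically. The sketch below parses GloVe's plain-text format (a word followed by its vector components, space-separated) and fails early on a dimension mismatch. `load_glove` is a hypothetical helper, not part of RMDL, and the two sample rows are made up.

```python
def load_glove(lines, embedding_dim):
    """Parse GloVe text lines ("word v1 v2 ... vN") into {word: [floats]}."""
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], [float(v) for v in parts[1:]]
        if len(values) != embedding_dim:
            raise ValueError(
                f"{word!r} has {len(values)} values, "
                f"expected EMBEDDING_DIM={embedding_dim}")
        embeddings[word] = values
    return embeddings


# Two made-up 3-dimensional rows stand in for a real GloVe file.
sample = ["the 0.1 0.2 0.3", "cat -0.4 0.5 0.6"]
vectors = load_glove(sample, embedding_dim=3)
```

With a real file, the same function accepts an open file handle, for example `load_glove(open(path, encoding="utf-8"), 50)` for a 50-dimensional file.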

sparse_categorical

sparse_categorical: bool. Set to True when the target array has shape (n, 1), i.e. one integer class label per sample. Defaults to True.
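The difference between the two label layouts is easiest to see side by side. This sketch assumes that sparse_categorical=False corresponds to one-hot targets of shape (n, n_classes), as in the usual Keras sparse versus categorical cross-entropy distinction; `to_one_hot` is a hypothetical helper.

```python
def to_one_hot(labels, n_classes):
    """Expand integer class labels into one-hot rows."""
    return [[1 if k == label else 0 for k in range(n_classes)]
            for label in labels]


# One integer per sample: the layout described for sparse_categorical=True.
sparse_labels = [0, 2, 1]

# One row of n_classes entries per sample (assumed layout for False).
one_hot_labels = to_one_hot(sparse_labels, n_classes=3)
```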

random_deep

random_deep: Integer [3]. Number of ensembled models used in RMDL: random_deep[0] is the number of DNNs, random_deep[1] is the number of RNNs, and random_deep[2] is the number of CNNs. Defaults to [3, 3, 3].

epochs

epochs: Integer [3]. Number of epochs for each ensembled model used in RMDL: epochs[0] is the number of epochs for each DNN, epochs[1] for each RNN, and epochs[2] for each CNN. Defaults to [500, 500, 500].
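The way random_deep and epochs line up index by index can be sketched as a simple training plan. `training_plan` is a hypothetical helper, and the model names it produces are illustrative only.

```python
def training_plan(random_deep=(3, 3, 3), epochs=(500, 500, 500)):
    """List (model_name, n_epochs) pairs for every model in the ensemble."""
    kinds = ("DNN", "RNN", "CNN")   # assumed index order: 0 = DNN, 1 = RNN, 2 = CNN
    plan = []
    for kind, count, n_epochs in zip(kinds, random_deep, epochs):
        plan.extend((f"{kind}_{i}", n_epochs) for i in range(count))
    return plan


# Two DNNs for 100 epochs each, no RNNs, one CNN for 300 epochs.
plan = training_plan(random_deep=[2, 0, 1], epochs=[100, 200, 300])
```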

plot

plot: bool. If True, shows the confusion matrix and the accuracy and loss plots. Defaults to True.
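For readers unfamiliar with the confusion matrix that gets plotted, here is a minimal plain-Python sketch of how one is computed. It is not the code in Plot.py; the function name and the row/column convention (rows are true classes, columns are predicted classes) are assumptions.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count samples per (true class, predicted class) pair."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1   # row: true class, column: predicted class
    return matrix


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
# Accuracy is the diagonal (correct predictions) over the total.
accuracy = sum(cm[k][k] for k in range(3)) / len(y_true)
```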

min_hidden_layer_dnn

min_hidden_layer_dnn: Integer. Lower bound on the number of hidden layers in each DNN used in RMDL. Defaults to 1.

max_hidden_layer_dnn

max_hidden_layer_dnn: Integer. Upper bound on the number of hidden layers in each DNN used in RMDL. Defaults to 8.

min_nodes_dnn

min_nodes_dnn: Integer. Lower bound on the number of nodes in each layer of a DNN used in RMDL. Defaults to 128.

max_nodes_dnn

max_nodes_dnn: Integer. Upper bound on the number of nodes in each layer of a DNN used in RMDL. Defaults to 1024.

min_hidden_layer_rnn

min_hidden_layer_rnn: Integer. Lower bound on the number of hidden layers in each RNN used in RMDL. Defaults to 1.

max_hidden_layer_rnn

max_hidden_layer_rnn: Integer. Upper bound on the number of hidden layers in each RNN used in RMDL. Defaults to 5.

min_nodes_rnn

min_nodes_rnn: Integer. Lower bound on the number of nodes (LSTM or GRU) in each layer of an RNN used in RMDL. Defaults to 32.

max_nodes_rnn

max_nodes_rnn: Integer. Upper bound on the number of nodes (LSTM or GRU) in each layer of an RNN used in RMDL. Defaults to 128.

min_hidden_layer_cnn

min_hidden_layer_cnn: Integer. Lower bound on the number of hidden layers in each CNN used in RMDL. Defaults to 3.

max_hidden_layer_cnn

max_hidden_layer_cnn: Integer. Upper bound on the number of hidden layers in each CNN used in RMDL. Defaults to 10.

min_nodes_cnn

min_nodes_cnn: Integer. Lower bound on the number of nodes (2D convolution layer) in each layer of a CNN used in RMDL. Defaults to 128.

max_nodes_cnn

max_nodes_cnn: Integer. Upper bound on the number of nodes (2D convolution layer) in each layer of a CNN used in RMDL. Defaults to 512.

random_state

random_state: Integer, RandomState instance, or None. Optional; the signature above uses a default of 42.
- If Integer, random_state is the seed used by the random number generator.
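The practical effect of an integer seed is that the random choices repeat from run to run. The sketch below shows this with Python's standard generator; `sample_widths` is a hypothetical stand-in for any random draw, and RMDL's own seeding covers more than this one generator.

```python
import random


def sample_widths(seed, n=3, low=128, high=1024):
    """Draw n layer widths from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(n)]


first_run = sample_widths(42)
second_run = sample_widths(42)   # same seed, same draws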

random_optimizor

random_optimizor: bool. If False, all models use the Adam optimizer. If True, all models use randomly chosen optimizers. Defaults to True.
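The switch described above can be sketched as follows. The candidate pool in `OPTIMIZERS` is hypothetical (the pool RMDL actually draws from is not listed here), as is the helper `pick_optimizer`.

```python
import random

# Hypothetical candidate pool; not taken from the RMDL source.
OPTIMIZERS = ["adam", "rmsprop", "adagrad", "sgd"]


def pick_optimizer(random_optimizor=True, rng=None):
    """Return "adam" when the flag is off, otherwise a random candidate."""
    if not random_optimizor:
        return "adam"
    return (rng or random).choice(OPTIMIZERS)


fixed = pick_optimizer(random_optimizor=False)
drawn = pick_optimizer(random_optimizor=True, rng=random.Random(0))
```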

dropout

dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Defaults to 0.05.
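What "fraction of the units to drop" means during training can be shown with a plain-Python sketch of inverted dropout, the variant Keras applies: dropped units become zero and the survivors are scaled by 1 / (1 - rate) so the expected sum is unchanged. The function below is illustrative and is not RMDL code.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; rescale the rest."""
    if not 0 <= rate < 1:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


out = dropout([1.0] * 1000, rate=0.05, rng=random.Random(0))
dropped = out.count(0.0)   # roughly 5% of the 1000 units
```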

Image Classification

from RMDL import RMDL_Image

Image_Classification(x_train, y_train, x_test, y_test, shape, batch_size=128,
                     sparse_categorical=True, random_deep=[3, 3, 3],
                     epochs=[500, 500, 500], plot=True,
                     min_hidden_layer_dnn=1, max_hidden_layer_dnn=8,
                     min_nodes_dnn=128, max_nodes_dnn=1024,
                     min_hidden_layer_rnn=1, max_hidden_layer_rnn=5,
                     min_nodes_rnn=32, max_nodes_rnn=128,
                     min_hidden_layer_cnn=3, max_hidden_layer_cnn=10,
                     min_nodes_cnn=128, max_nodes_cnn=512,
                     random_state=42, random_optimizor=True, dropout=0.05)

Input

x_train
y_train
x_test
y_test

shape

shape: np.shape. Shape of the input image. The most common situation would be a 2D input with shape (batch_size, input_dim).
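One way to obtain a value for shape is to read it off the data. The sketch below does this for a nested-list batch and skips the leading batch axis; that RMDL expects the per-sample shape in this form is an assumption, and `sample_shape` is a hypothetical helper.

```python
def sample_shape(batch):
    """Shape of one sample in a nested-list batch, without the batch axis."""
    dims = []
    item = batch[0]
    while isinstance(item, list):
        dims.append(len(item))
        item = item[0]
    return tuple(dims)


# A made-up batch of two 4x4 single-channel images.
images = [[[[0.0] for _ in range(4)] for _ in range(4)] for _ in range(2)]
shape = sample_shape(images)
```

With NumPy arrays the equivalent expression is `x_train.shape[1:]`.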

batch_size

batch_size: Integer. Number of samples per gradient update. If unspecified, it will default to 128.

sparse_categorical

sparse_categorical: bool. Set to True when the target array has shape (n, 1), i.e. one integer class label per sample. Defaults to True.

random_deep

random_deep: Integer [3]. Number of ensembled models used in RMDL: random_deep[0] is the number of DNNs, random_deep[1] is the number of RNNs, and random_deep[2] is the number of CNNs. Defaults to [3, 3, 3].

epochs

epochs: Integer [3]. Number of epochs for each ensembled model used in RMDL: epochs[0] is the number of epochs for each DNN, epochs[1] for each RNN, and epochs[2] for each CNN. Defaults to [500, 500, 500].

plot

plot: bool. If True, shows the confusion matrix and the accuracy and loss plots. Defaults to True.

min_hidden_layer_dnn

min_hidden_layer_dnn: Integer. Lower bound on the number of hidden layers in each DNN used in RMDL. Defaults to 1.

max_hidden_layer_dnn

max_hidden_layer_dnn: Integer. Upper bound on the number of hidden layers in each DNN used in RMDL. Defaults to 8.

min_nodes_dnn

min_nodes_dnn: Integer. Lower bound on the number of nodes in each layer of a DNN used in RMDL. Defaults to 128.

max_nodes_dnn

max_nodes_dnn: Integer. Upper bound on the number of nodes in each layer of a DNN used in RMDL. Defaults to 1024.

min_nodes_rnn

min_nodes_rnn: Integer. Lower bound on the number of nodes (LSTM or GRU) in each layer of an RNN used in RMDL. Defaults to 32.

max_nodes_rnn

max_nodes_rnn: Integer. Upper bound on the number of nodes (LSTM or GRU) in each layer of an RNN used in RMDL. Defaults to 128.

min_hidden_layer_cnn

min_hidden_layer_cnn: Integer. Lower bound on the number of hidden layers in each CNN used in RMDL. Defaults to 3.

max_hidden_layer_cnn

max_hidden_layer_cnn: Integer. Upper bound on the number of hidden layers in each CNN used in RMDL. Defaults to 10.

min_nodes_cnn

min_nodes_cnn: Integer. Lower bound on the number of nodes (2D convolution layer) in each layer of a CNN used in RMDL. Defaults to 128.

max_nodes_cnn

max_nodes_cnn: Integer. Upper bound on the number of nodes (2D convolution layer) in each layer of a CNN used in RMDL. Defaults to 512.

random_state

random_state: Integer, RandomState instance, or None. Optional; the signature above uses a default of 42.
- If Integer, random_state is the seed used by the random number generator.

random_optimizor

random_optimizor: bool. If False, all models use the Adam optimizer. If True, all models use randomly chosen optimizers. Defaults to True.

dropout

dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Defaults to 0.05.

Error and Comments:

Send an email to kk7nc@virginia.edu

Citation

@inproceedings{Kowsari2018RMDL,
title={RMDL: Random Multimodel Deep Learning for Classification},
author={Kowsari, Kamran and Heidarysafa, Mojtaba and Brown, Donald E. and Jafari Meimandi, Kiana and Barnes, Laura E.},
booktitle={Proceedings of the 2018 International Conference on Information System and Data Mining},
year={2018},
organization={ACM}
}
