Referenced paper: RMDL: Random Multimodel Deep Learning for Classification
Create a weights folder and download GloVe for text classification (if you have already downloaded GloVe, set the GloVe directory in Global.py).
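As a rough illustration of that setup step, the sketch below creates a folder and fetches the GloVe archive into it. The folder name `weights`, the helper names, and the download URL are assumptions made for this example; they are not taken from RMDL's own code, so check Global.py for the values the library expects.

```python
from pathlib import Path
import urllib.request
import zipfile

# Assumed values for illustration only; adjust them to match Global.py.
GLOVE_URL = "http://nlp.stanford.edu/data/glove.6B.zip"
WEIGHTS_DIR = Path("weights")


def ensure_dir(path):
    """Create the directory (and any missing parents) and return it as a Path."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    return path


def download_glove(target_dir, url=GLOVE_URL):
    """Download the GloVe archive if it is missing, then unpack it."""
    target_dir = ensure_dir(target_dir)
    archive = target_dir / "glove.6B.zip"
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))
    with zipfile.ZipFile(str(archive)) as zf:
        zf.extractall(str(target_dir))
    return target_dir
```

Calling `ensure_dir(WEIGHTS_DIR)` followed by `download_glove("glove")` would prepare both folders; the download is large, so it only runs when you call it.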
We used two different feature extraction methods:
This file builds random deep learning models for image and text classification, including DNN, CNN, and RNN architectures.
- Python 3.5 or later (see Instruction Documents)
- TensorFlow (see Instruction Documents)
- scikit-learn (see Instruction Documents)
- Keras (see Instruction Documents)
- scipy (see Instruction Documents)
- GPU (if you want to run on GPU):
  - CUDA® Toolkit 8.0. For details, see NVIDIA's documentation.
  - cuDNN v6. For details, see NVIDIA's documentation.
  - GPU card with CUDA Compute Capability 3.0 or higher.
  - The libcupti-dev library.
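A quick way to see which of the packages listed above are present is to look them up without importing them. The following is a minimal sketch; the function name `check_environment` is made up for this example, and it only checks that a module can be found, not which version is installed.

```python
import sys
from importlib.util import find_spec

# Import names for the packages listed above (scikit-learn imports as "sklearn").
REQUIRED_MODULES = ["tensorflow", "sklearn", "keras", "scipy"]


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 5)):
    """Return (python_ok, {module_name: is_installed})."""
    python_ok = sys.version_info >= min_python
    # find_spec locates a top-level module without importing it.
    installed = {name: find_spec(name) is not None for name in modules}
    return python_ok, installed


if __name__ == "__main__":
    python_ok, installed = check_environment()
    print("Python >= 3.5:", python_ok)
    for name, ok in installed.items():
        print("{:<12} {}".format(name, "found" if ok else "missing"))
```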
from RMDL import RMDL_Text
RMDL_Text.Text_Classification(x_train, y_train, x_test, y_test, batch_size=128,
                              EMBEDDING_DIM=50, MAX_SEQUENCE_LENGTH=500,
                              MAX_NB_WORDS=75000, GloVe_dir="",
                              GloVe_file="glove.6B.50d.txt",
                              sparse_categorical=True, random_deep=[3, 3, 3],
                              epochs=[500, 500, 500], plot=True,
                              min_hidden_layer_dnn=1, max_hidden_layer_dnn=8,
                              min_nodes_dnn=128, max_nodes_dnn=1024,
                              min_hidden_layer_rnn=1, max_hidden_layer_rnn=5,
                              min_nodes_rnn=32, max_nodes_rnn=128,
                              min_hidden_layer_cnn=3, max_hidden_layer_cnn=10,
                              min_nodes_cnn=128, max_nodes_cnn=512,
                              random_state=42, random_optimizor=True, dropout=0.05)
- x_train
- y_train
- x_test
- y_test
- batch_size: Integer. Number of samples per gradient update. Defaults to 128.
- EMBEDDING_DIM: Integer. Dimension of the word embeddings; it must match the GloVe file or other pre-trained embedding in use. Defaults to 50, which pairs with the glove.6B.50d.txt file.
- MAX_SEQUENCE_LENGTH: Integer. Maximum length of a sequence or document in the dataset. Defaults to 500.
- MAX_NB_WORDS: Integer. Maximum number of unique words in the dataset. Defaults to 75000.
- GloVe_dir: String. Path to the directory holding GloVe or other pre-trained embeddings. Defaults to an empty string, in which case glove.6B.zip is downloaded.
- GloVe_file: String. Which GloVe or other pre-trained word embedding file to use. Defaults to glove.6B.50d.txt.
- NOTE: if you use another version of GloVe, EMBEDDING_DIM must match its dimensions.
- sparse_categorical: bool. Should be True when the target array has shape (n, 1). Defaults to True.
- random_deep: Integer [3]. Number of models of each type in the RMDL ensemble: random_deep[0] is the number of DNNs, random_deep[1] is the number of RNNs, and random_deep[2] is the number of CNNs. Defaults to [3, 3, 3].
- epochs: Integer [3]. Number of epochs for each model type in the RMDL ensemble: epochs[0] is the number of epochs for DNNs, epochs[1] for RNNs, and epochs[2] for CNNs. Defaults to [500, 500, 500].
- plot: bool. If True, shows the confusion matrix, accuracy, and loss. Defaults to True.
- min_hidden_layer_dnn: Integer. Lower bound on the number of hidden layers in each DNN used in RMDL. Defaults to 1.
- max_hidden_layer_dnn: Integer. Upper bound on the number of hidden layers in each DNN used in RMDL. Defaults to 8.
- min_nodes_dnn: Integer. Lower bound on the number of nodes in each DNN layer. Defaults to 128.
- max_nodes_dnn: Integer. Upper bound on the number of nodes in each DNN layer. Defaults to 1024.
- min_hidden_layer_rnn: Integer. Lower bound on the number of hidden layers in each RNN used in RMDL. Defaults to 1.
- max_hidden_layer_rnn: Integer. Upper bound on the number of hidden layers in each RNN used in RMDL. Defaults to 5.
- min_nodes_rnn: Integer. Lower bound on the number of nodes (LSTM or GRU) in each RNN layer. Defaults to 32.
- max_nodes_rnn: Integer. Upper bound on the number of nodes (LSTM or GRU) in each RNN layer. Defaults to 128.
- min_hidden_layer_cnn: Integer. Lower bound on the number of hidden layers in each CNN used in RMDL. Defaults to 3.
- max_hidden_layer_cnn: Integer. Upper bound on the number of hidden layers in each CNN used in RMDL. Defaults to 10.
- min_nodes_cnn: Integer. Lower bound on the number of nodes (2D convolution layer) in each CNN layer. Defaults to 128.
- max_nodes_cnn: Integer. Upper bound on the number of nodes (2D convolution layer) in each CNN layer. Defaults to 512.
- random_state: Integer, RandomState instance, or None. If an integer, random_state is the seed used by the random number generator. Defaults to 42.
- random_optimizor: bool. If False, all models use the Adam optimizer. If True, models use randomly selected optimizers. Defaults to True.
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Defaults to 0.05.
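The requirement that EMBEDDING_DIM match the chosen GloVe file can be checked before training. The sketch below is not RMDL's loader; `load_glove` and `glove_file_for` are hypothetical helpers that read the plain-text GloVe format (one word followed by its vector per line) and fail early on a dimension mismatch.

```python
def load_glove(path, embedding_dim):
    """Read a GloVe text file into {word: [float, ...]} and check the dimension."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line_no, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if len(values) != embedding_dim:
                raise ValueError(
                    "line {}: expected {} values, found {}; "
                    "EMBEDDING_DIM must match the GloVe file".format(
                        line_no, embedding_dim, len(values)))
            vectors[word] = [float(v) for v in values]
    return vectors


def glove_file_for(embedding_dim):
    """Return the file name inside glove.6B.zip for a given dimension."""
    if embedding_dim not in (50, 100, 200, 300):
        raise ValueError("glove.6B ships 50, 100, 200 and 300 dimensions")
    return "glove.6B.{}d.txt".format(embedding_dim)
```

Deriving GloVe_file from EMBEDDING_DIM with something like `glove_file_for(100)` keeps the two arguments consistent when you move away from the 50-dimensional default.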
from RMDL import RMDL_Image
RMDL_Image.Image_Classification(x_train, y_train, x_test, y_test, shape, batch_size=128,
                                sparse_categorical=True, random_deep=[3, 3, 3],
                                epochs=[500, 500, 500], plot=True,
                                min_hidden_layer_dnn=1, max_hidden_layer_dnn=8,
                                min_nodes_dnn=128, max_nodes_dnn=1024,
                                min_hidden_layer_rnn=1, max_hidden_layer_rnn=5,
                                min_nodes_rnn=32, max_nodes_rnn=128,
                                min_hidden_layer_cnn=3, max_hidden_layer_cnn=10,
                                min_nodes_cnn=128, max_nodes_cnn=512,
                                random_state=42, random_optimizor=True, dropout=0.05)
- x_train
- y_train
- x_test
- y_test
- shape: np.shape. Shape of the input images. The most common situation would be a 2D input with shape (batch_size, input_dim).
- batch_size: Integer. Number of samples per gradient update. Defaults to 128.
- sparse_categorical: bool. Should be True when the target array has shape (n, 1). Defaults to True.
- random_deep: Integer [3]. Number of models of each type in the RMDL ensemble: random_deep[0] is the number of DNNs, random_deep[1] is the number of RNNs, and random_deep[2] is the number of CNNs. Defaults to [3, 3, 3].
- epochs: Integer [3]. Number of epochs for each model type in the RMDL ensemble: epochs[0] is the number of epochs for DNNs, epochs[1] for RNNs, and epochs[2] for CNNs. Defaults to [500, 500, 500].
- plot: bool. If True, shows the confusion matrix, accuracy, and loss. Defaults to True.
- min_hidden_layer_dnn: Integer. Lower bound on the number of hidden layers in each DNN used in RMDL. Defaults to 1.
- max_hidden_layer_dnn: Integer. Upper bound on the number of hidden layers in each DNN used in RMDL. Defaults to 8.
- min_nodes_dnn: Integer. Lower bound on the number of nodes in each DNN layer. Defaults to 128.
- max_nodes_dnn: Integer. Upper bound on the number of nodes in each DNN layer. Defaults to 1024.
- min_hidden_layer_rnn: Integer. Lower bound on the number of hidden layers in each RNN used in RMDL. Defaults to 1.
- max_hidden_layer_rnn: Integer. Upper bound on the number of hidden layers in each RNN used in RMDL. Defaults to 5.
- min_nodes_rnn: Integer. Lower bound on the number of nodes (LSTM or GRU) in each RNN layer. Defaults to 32.
- max_nodes_rnn: Integer. Upper bound on the number of nodes (LSTM or GRU) in each RNN layer. Defaults to 128.
- min_hidden_layer_cnn: Integer. Lower bound on the number of hidden layers in each CNN used in RMDL. Defaults to 3.
- max_hidden_layer_cnn: Integer. Upper bound on the number of hidden layers in each CNN used in RMDL. Defaults to 10.
- min_nodes_cnn: Integer. Lower bound on the number of nodes (2D convolution layer) in each CNN layer. Defaults to 128.
- max_nodes_cnn: Integer. Upper bound on the number of nodes (2D convolution layer) in each CNN layer. Defaults to 512.
- random_state: Integer, RandomState instance, or None. If an integer, random_state is the seed used by the random number generator. Defaults to 42.
- random_optimizor: bool. If False, all models use the Adam optimizer. If True, models use randomly selected optimizers. Defaults to True.
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Defaults to 0.05.
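To make the role of the min/max bounds, random_deep, and random_state more concrete, here is one way such bounds could drive random architecture selection. This is only an illustrative sketch under the assumption that layer and node counts are drawn uniformly within inclusive bounds; RMDL's actual sampling code may differ, and the names `sample_layers`, `sample_ensemble`, and `DEFAULT_BOUNDS` are invented for this example.

```python
import random


def sample_layers(rng, min_layers, max_layers, min_nodes, max_nodes):
    """Draw a layer count, then a node count per layer, within inclusive bounds."""
    n_layers = rng.randint(min_layers, max_layers)
    return [rng.randint(min_nodes, max_nodes) for _ in range(n_layers)]


def sample_ensemble(random_deep, bounds, random_state=42):
    """Sample one layer layout per model for each family (DNN, RNN, CNN)."""
    rng = random.Random(random_state)  # fixed seed gives repeatable layouts
    families = ("dnn", "rnn", "cnn")
    return {
        family: [sample_layers(rng, *bounds[family]) for _ in range(count)]
        for family, count in zip(families, random_deep)
    }


# Bounds copied from the default arguments documented above:
# (min_hidden_layer, max_hidden_layer, min_nodes, max_nodes)
DEFAULT_BOUNDS = {
    "dnn": (1, 8, 128, 1024),
    "rnn": (1, 5, 32, 128),
    "cnn": (3, 10, 128, 512),
}

layouts = sample_ensemble([3, 3, 3], DEFAULT_BOUNDS, random_state=42)
for family, models in layouts.items():
    print(family, models)
```

Each inner list holds the node count per hidden layer for one model, so widening a bound widens the range of architectures the ensemble can contain, while the seed keeps a given run reproducible.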
Send an email to kk7nc@virginia.edu
@inproceedings{Kowsari2018RMDL,
title={RMDL: Random Multimodel Deep Learning for Classification},
author={Kowsari, Kamran and Heidarysafa, Mojtaba and Brown, Donald E. and Jafari Meimandi, Kiana and Barnes, Laura E.},
booktitle={Proceedings of the 2018 International Conference on Information System and Data Mining},
year={2018},
organization={ACM}
}