RMDL

Referenced paper: RMDL: Random Multimodel Deep Learning for Classification

Global.py

Create a weights folder and download GloVe for text classification (if you have already downloaded GloVe, set the GloVe directory in Global.py).

Text Feature Extraction:

We used two different feature extraction methods:

BuildModel.py:

This file builds random deep learning architectures for image and text classification, including DNN, CNN, and RNN models.
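
As a rough illustration of what building a "random model" can mean, the sketch below draws a depth and a width for each layer from the min/max bounds documented under Parameters. The helper name `sample_dnn_architecture` is hypothetical and is not part of RMDL; the sampling in BuildModel.py may work differently.

```python
import random


def sample_dnn_architecture(min_hidden_layer, max_hidden_layer,
                            min_nodes, max_nodes, rng):
    """Return a list of layer widths drawn uniformly within the bounds.

    Hypothetical helper for illustration only; not RMDL's implementation.
    """
    # randint includes both endpoints, so the bounds are inclusive.
    n_layers = rng.randint(min_hidden_layer, max_hidden_layer)
    return [rng.randint(min_nodes, max_nodes) for _ in range(n_layers)]


rng = random.Random(42)
# Bounds taken from the documented DNN defaults (1..8 layers, 128..1024 nodes).
layers = sample_dnn_architecture(1, 8, 128, 1024, rng)
print(len(layers), layers)
```

Each call with a fresh draw gives a different architecture, which is what lets several DNNs in the same ensemble differ from one another.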

General requirements:

Parameters:

Text Classification

from RMDL import RMDL_Text
RMDL_Text.Text_Classification(x_train, y_train, x_test, y_test, batch_size=128,
                 EMBEDDING_DIM=50, MAX_SEQUENCE_LENGTH=500,
                 MAX_NB_WORDS=75000, GloVe_dir="",
                 GloVe_file="glove.6B.50d.txt",
                 sparse_categorical=True, random_deep=[3, 3, 3],
                 epochs=[500, 500, 500], plot=True,
                 min_hidden_layer_dnn=1, max_hidden_layer_dnn=8,
                 min_nodes_dnn=128, max_nodes_dnn=1024,
                 min_hidden_layer_rnn=1, max_hidden_layer_rnn=5,
                 min_nodes_rnn=32, max_nodes_rnn=128,
                 min_hidden_layer_cnn=3, max_hidden_layer_cnn=10,
                 min_nodes_cnn=128, max_nodes_cnn=512,
                 random_state=42, random_optimizor=True, dropout=0.05)

Input

  • x_train
  • y_train
  • x_test
  • y_test

batch_size

  • batch_size: Integer. Number of samples per gradient update. If unspecified, it will default to 128.

EMBEDDING_DIM

  • EMBEDDING_DIM: Integer. Dimension of the word embedding (this number must match the GloVe file or other pre-trained embedding being used). It defaults to 50, which pairs with the glove.6B.50d.txt file.

MAX_SEQUENCE_LENGTH

  • MAX_SEQUENCE_LENGTH: Integer. Maximum length of a sequence or document in the dataset. It defaults to 500.

MAX_NB_WORDS

  • MAX_NB_WORDS: Integer. Maximum number of unique words in the dataset. It defaults to 75000.
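
To make the roles of MAX_NB_WORDS and MAX_SEQUENCE_LENGTH concrete, here is a minimal plain-Python sketch of capping a vocabulary and padding sequences to a fixed length. The helpers `build_vocab` and `encode_and_pad` are hypothetical names introduced for illustration; RMDL's own preprocessing may tokenize, truncate, or pad differently.

```python
from collections import Counter


def build_vocab(texts, max_nb_words):
    """Map the most frequent words to integer ids starting at 1 (0 is padding)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    most_common = counts.most_common(max_nb_words)
    return {word: idx for idx, (word, _) in enumerate(most_common, start=1)}


def encode_and_pad(text, vocab, max_sequence_length):
    """Convert text to ids, drop unknown words, truncate, then left-pad with 0."""
    ids = [vocab[w] for w in text.lower().split() if w in vocab]
    ids = ids[-max_sequence_length:]  # keep the last tokens when too long
    return [0] * (max_sequence_length - len(ids)) + ids


texts = ["the cat sat on the mat", "the dog barked"]
# Tiny limits so the effect is visible; the documented defaults are 75000 and 500.
vocab = build_vocab(texts, max_nb_words=3)
encoded = [encode_and_pad(t, vocab, max_sequence_length=4) for t in texts]
print(vocab)
print(encoded)
```

Words outside the capped vocabulary are simply dropped in this sketch, and every encoded document ends up with exactly `max_sequence_length` entries.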

GloVe_dir

  • GloVe_dir: String. Path to the directory containing GloVe or another pre-trained embedding. It defaults to an empty string, in which case glove.6B.zip will be downloaded.

GloVe_file

  • GloVe_file: String. Which version of GloVe or pre-trained word embedding will be used. It defaults to glove.6B.50d.txt.
  • NOTE: if you use another version of GloVe, EMBEDDING_DIM must match its dimension.
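
The note about matching dimensions can be checked before training. The sketch below parses a GloVe-style text file (one word followed by its space-separated vector per line) and raises an error when a vector's length differs from the expected EMBEDDING_DIM. The function `load_glove` and the tiny demo file are assumptions made for illustration; they are not RMDL's loader.

```python
import os
import tempfile


def load_glove(path, embedding_dim):
    """Parse a GloVe-style text file into {word: [float, ...]}.

    Raises ValueError if a vector's length differs from embedding_dim.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            word, vector = parts[0], [float(v) for v in parts[1:]]
            if len(vector) != embedding_dim:
                raise ValueError(
                    f"{word!r} has {len(vector)} dims, expected {embedding_dim}")
            embeddings[word] = vector
    return embeddings


# Tiny stand-in file so the sketch runs without downloading glove.6B.zip.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.3d.txt")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("cat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")
    vectors = load_glove(demo_path, embedding_dim=3)
    try:
        load_glove(demo_path, embedding_dim=50)  # wrong dimension on purpose
        mismatch_detected = False
    except ValueError:
        mismatch_detected = True

print(sorted(vectors), mismatch_detected)
```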

sparse_categorical

  • sparse_categorical: bool. Should be True when the target array has shape (n, 1), that is, one integer class label per sample. It defaults to True.
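
The difference between the two label layouts is easiest to see side by side. In Keras, integer labels pair with `sparse_categorical_crossentropy` and one-hot labels with `categorical_crossentropy`; that RMDL switches between exactly these two losses is an assumption here. The helper `to_one_hot` is a hypothetical name used only for this sketch.

```python
def to_one_hot(labels, num_classes):
    """Expand integer class ids into one-hot rows."""
    return [[1 if c == label else 0 for c in range(num_classes)]
            for label in labels]


# Layout expected with sparse_categorical=True: one integer id per sample.
sparse_labels = [0, 2, 1]

# Layout expected with sparse_categorical=False: one row of length num_classes.
one_hot_labels = to_one_hot(sparse_labels, num_classes=3)

print(sparse_labels)
print(one_hot_labels)
```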

random_deep

  • random_deep: Integer [3]. Number of ensembled models used in RMDL: random_deep[0] is the number of DNNs, random_deep[1] is the number of RNNs, and random_deep[2] is the number of CNNs. It defaults to [3, 3, 3].

epochs

  • epochs: Integer [3]. Number of epochs for each ensembled model used in RMDL: epochs[0] is the number of epochs used for DNNs, epochs[1] for RNNs, and epochs[2] for CNNs. It defaults to [500, 500, 500].
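
Since random_deep and epochs are both indexed in DNN, RNN, CNN order, they can be read together as a training plan. The sketch below only expands the two lists into one entry per model; the names `MODEL_TYPES` and `plan` are introduced for illustration, and no training is performed.

```python
MODEL_TYPES = ("DNN", "RNN", "CNN")  # index order used by random_deep and epochs

random_deep = [3, 3, 3]     # how many models of each type
epochs = [500, 500, 500]    # epochs per model of each type

# One (type, index within type, epochs) entry per model in the ensemble.
plan = [
    (kind, i, n_epochs)
    for kind, count, n_epochs in zip(MODEL_TYPES, random_deep, epochs)
    for i in range(count)
]

for entry in plan:
    print(entry)
```

With the defaults this yields nine models in total, so lowering either list is the most direct way to shorten a run.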

plot

  • plot: bool. If True, shows the confusion matrix along with accuracy and loss plots.

min_hidden_layer_dnn

  • min_hidden_layer_dnn: Integer. Lower Bounds of hidden layers of DNN used in RMDL, it will default to 1.

max_hidden_layer_dnn

  • max_hidden_layer_dnn: Integer. Upper bounds of hidden layers of DNN used in RMDL, it will default to 8.

min_nodes_dnn

  • min_nodes_dnn: Integer. Lower bounds of nodes in each layer of DNN used in RMDL, it will default to 128.

max_nodes_dnn

  • max_nodes_dnn: Integer. Upper bounds of nodes in each layer of DNN used in RMDL, it will default to 1024.

min_hidden_layer_rnn

  • min_hidden_layer_rnn: Integer. Lower Bounds of hidden layers of RNN used in RMDL, it will default to 1.

max_hidden_layer_rnn

  • max_hidden_layer_rnn: Integer. Upper bounds of hidden layers of RNN used in RMDL, it will default to 5.

min_nodes_rnn

  • min_nodes_rnn: Integer. Lower bounds of nodes (LSTM or GRU) in each layer of RNN used in RMDL, it will default to 32.

max_nodes_rnn

  • max_nodes_rnn: Integer. Upper bounds of nodes (LSTM or GRU) in each layer of RNN used in RMDL, it will default to 128.

min_hidden_layer_cnn

  • min_hidden_layer_cnn: Integer. Lower Bounds of hidden layers of CNN used in RMDL, it will default to 3.

max_hidden_layer_cnn

  • max_hidden_layer_cnn: Integer. Upper Bounds of hidden layers of CNN used in RMDL, it will default to 10.

min_nodes_cnn

  • min_nodes_cnn: Integer. Lower bounds of nodes (2D convolution layer) in each layer of CNN used in RMDL, it will default to 128.

max_nodes_cnn

  • max_nodes_cnn: Integer. Upper bounds of nodes (2D convolution layer) in each layer of CNN used in RMDL, it will default to 512.

random_state

  • random_state: Integer, RandomState instance, or None. Optional; the signature above uses 42.

    • If Integer, random_state is the seed used by the random number generator.
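
The effect of a fixed seed can be shown with Python's standard generator: the same seed reproduces the same sequence of random draws. This is only an analogy under the assumption that RMDL passes random_state to its random number generator in the usual way; `draw_layer_counts` is a hypothetical helper.

```python
import random


def draw_layer_counts(seed, n_models=3):
    """Draw one hidden-layer count per model from a generator seeded with seed."""
    rng = random.Random(seed)
    return [rng.randint(1, 8) for _ in range(n_models)]


first = draw_layer_counts(42)
second = draw_layer_counts(42)   # same seed, so the same draws
other = draw_layer_counts(7)     # a different seed generally gives other draws

print(first, second, other)
print(first == second)
```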

random_optimizor

  • random_optimizor: bool. If False, all models use the adam optimizer. If True, all models use random optimizers. It defaults to True.
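
A minimal sketch of the switch described above follows. The candidate list is an assumption: it names optimizers that exist in Keras, but which ones RMDL actually samples from is not stated here. `choose_optimizer` is a hypothetical helper.

```python
import random

# Assumed candidate pool; RMDL's actual list may differ.
CANDIDATE_OPTIMIZERS = ["adam", "rmsprop", "sgd", "adagrad", "adamax"]


def choose_optimizer(random_optimizor, rng):
    """Return "adam" when the flag is off, otherwise a random candidate."""
    if not random_optimizor:
        return "adam"
    return rng.choice(CANDIDATE_OPTIMIZERS)


rng = random.Random(42)
fixed_choice = choose_optimizer(False, rng)
random_choices = [choose_optimizer(True, rng) for _ in range(5)]
print(fixed_choice, random_choices)
```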

dropout

  • dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
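
For intuition about the rate, the sketch below applies "inverted" dropout to a list of activations: each unit is zeroed with probability `rate`, and the survivors are scaled by 1 / (1 - rate) so the expected sum is unchanged. This is the standard training-time formulation and is shown in plain Python as an illustration, not as RMDL's code; `apply_dropout` is a hypothetical name.

```python
import random


def apply_dropout(values, rate, rng):
    """Zero each value with probability rate and rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)
activations = [1.0] * 1000
dropped = apply_dropout(activations, rate=0.05, rng=rng)

# With rate=0.05 roughly 5% of the units should be zeroed.
zero_fraction = dropped.count(0.0) / len(dropped)
print(zero_fraction)
```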

Image Classification

from RMDL import RMDL_Image
RMDL_Image.Image_Classification(x_train, y_train, x_test, y_test, shape, batch_size=128,
                     sparse_categorical=True, random_deep=[3, 3, 3],
                     epochs=[500, 500, 500], plot=True,
                     min_hidden_layer_dnn=1, max_hidden_layer_dnn=8,
                     min_nodes_dnn=128, max_nodes_dnn=1024,
                     min_hidden_layer_rnn=1, max_hidden_layer_rnn=5,
                     min_nodes_rnn=32, max_nodes_rnn=128,
                     min_hidden_layer_cnn=3, max_hidden_layer_cnn=10,
                     min_nodes_cnn=128, max_nodes_cnn=512,
                     random_state=42, random_optimizor=True, dropout=0.05)

Input

  • x_train
  • y_train
  • x_test
  • y_test

shape

  • shape: np.shape. Shape of the image. The most common situation would be a 2D input with shape (batch_size, input_dim).

batch_size

  • batch_size: Integer. Number of samples per gradient update. If unspecified, it will default to 128.

sparse_categorical

  • sparse_categorical: bool. When target's dataset is (n,1) should be True, it will default to True.

random_deep

  • random_deep: Integer [3]. Number of ensembled models used in RMDL: random_deep[0] is the number of DNNs, random_deep[1] is the number of RNNs, and random_deep[2] is the number of CNNs. It defaults to [3, 3, 3].

epochs

  • epochs: Integer [3]. Number of epochs for each ensembled model used in RMDL: epochs[0] is the number of epochs used for DNNs, epochs[1] for RNNs, and epochs[2] for CNNs. It defaults to [500, 500, 500].

plot

  • plot: bool. If True, shows the confusion matrix along with accuracy and loss plots.

min_hidden_layer_dnn

  • min_hidden_layer_dnn: Integer. Lower Bounds of hidden layers of DNN used in RMDL, it will default to 1.

max_hidden_layer_dnn

  • max_hidden_layer_dnn: Integer. Upper bounds of hidden layers of DNN used in RMDL, it will default to 8.

min_nodes_dnn

  • min_nodes_dnn: Integer. Lower bounds of nodes in each layer of DNN used in RMDL, it will default to 128.

max_nodes_dnn

  • max_nodes_dnn: Integer. Upper bounds of nodes in each layer of DNN used in RMDL, it will default to 1024.

min_nodes_rnn

  • min_nodes_rnn: Integer. Lower bounds of nodes (LSTM or GRU) in each layer of RNN used in RMDL, it will default to 32.

max_nodes_rnn

  • max_nodes_rnn: Integer. Upper bounds of nodes (LSTM or GRU) in each layer of RNN used in RMDL, it will default to 128.

min_hidden_layer_cnn

  • min_hidden_layer_cnn: Integer. Lower Bounds of hidden layers of CNN used in RMDL, it will default to 3.

max_hidden_layer_cnn

  • max_hidden_layer_cnn: Integer. Upper Bounds of hidden layers of CNN used in RMDL, it will default to 10.

min_nodes_cnn

  • min_nodes_cnn: Integer. Lower bounds of nodes (2D convolution layer) in each layer of CNN used in RMDL, it will default to 128.

max_nodes_cnn

  • max_nodes_cnn: Integer. Upper bounds of nodes (2D convolution layer) in each layer of CNN used in RMDL, it will default to 512.

random_state

  • random_state: Integer, RandomState instance, or None. Optional; the signature above uses 42.

    • If Integer, random_state is the seed used by the random number generator.

random_optimizor

  • random_optimizor: bool. If False, all models use the adam optimizer. If True, all models use random optimizers. It defaults to True.

dropout

  • dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.

Error and Comments:

Send an email to kk7nc@virginia.edu

Citation

@inproceedings{Kowsari2018RMDL,
title={RMDL: Random Multimodel Deep Learning for Classification},
author={Kowsari, Kamran and Heidarysafa, Mojtaba and Brown, Donald E. and Jafari Meimandi, Kiana and Barnes, Laura E.},
booktitle={Proceedings of the 2018 International Conference on Information System and Data Mining},
year={2018},
organization={ACM}
}