Deep Learning - CNN
Image Processing:
OpenCV (cv2):
cv2.resize(): Resize an image.
● Syntax: cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]])
● Parameters:
● src: Input image.
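● dsize: Output image size as (width, height); pass None to derive it from fx and fy.
● fx: Scale factor along the horizontal axis (used when dsize is None).
● fy: Scale factor along the vertical axis (used when dsize is None).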
● interpolation: Interpolation method (e.g., cv2.INTER_LINEAR).
cv2.flip(): Flip an image.
● Syntax: cv2.flip(src, flipCode)
● Parameters:
● src: Input image.
● flipCode: Flip direction (0: vertical, 1: horizontal, -1: both).
cv2.rotate(): Rotate an image.
● Syntax: cv2.rotate(src, rotateCode)
● Parameters:
● src: Input image.
● rotateCode: Rotation direction (e.g., cv2.ROTATE_90_CLOCKWISE).
cv2.cvtColor(): Convert color spaces.
● Syntax: cv2.cvtColor(src, code[, dst[, dstCn]])
● Parameters:
● src: Input image.
● code: Color space conversion code (e.g., cv2.COLOR_BGR2GRAY).
cv2.split(): Split channels.
● Syntax: cv2.split(src[, mv])
● Parameters:
● src: Input image.
cv2.merge(): Merge channels.
● Syntax: cv2.merge(mv[, dst])
● Parameters:
● mv: Input vector of matrices.
cv2.addWeighted(): Blend two images.
● Syntax: cv2.addWeighted(src1, alpha, src2, beta, gamma[, dst[, dtype]])
● Parameters:
● src1, src2: Input images.
● alpha, beta: Weights for the images.
● gamma: Scalar added to each sum.
Pillow (PIL):
Image.resize(): Resize an image.
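● Syntax: Image.resize(size, resample=None, box=None, reducing_gap=None)
● Parameters:
● size: Output size (width, height).
● resample: Resampling filter (0: nearest, 2: bilinear, 3: bicubic). Recent Pillow releases default to bicubic for most image modes.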
Image.rotate(): Rotate an image.
● Syntax: Image.rotate(angle, resample=0, expand=0, center=None)
● Parameters:
● angle: Rotation angle in degrees, counter-clockwise.
● resample: Resampling filter (0: nearest, 2: bilinear).
● expand: Boolean indicating whether to expand the output image.
● center: Center of rotation (x, y).
Image.transpose(): Transpose an image.
● Syntax: Image.transpose(method)
● Parameters:
● method: Transpose method (e.g., Image.FLIP_LEFT_RIGHT).
Image.convert(): Convert image modes.
● Syntax: Image.convert(mode, matrix=None, dither=None, palette=0, colors=256)
● Parameters:
● mode: Output mode (e.g., "L" for grayscale, "RGB" for RGB).
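A short sketch of the Pillow calls above, using an in-memory image instead of a file. The image size and colour are assumptions chosen only to make the shape changes visible.

```python
from PIL import Image

# Synthetic RGB image, width=60 and height=40 (illustrative values only).
img = Image.new("RGB", (60, 40), color=(200, 30, 30))

resized = img.resize((30, 20), resample=Image.BILINEAR)   # size is (width, height)
rotated = img.rotate(90, expand=True)    # counter-clockwise; expand grows the canvas to fit
mirrored = img.transpose(Image.FLIP_LEFT_RIGHT)
gray = img.convert("L")                  # single-channel grayscale
```

With expand=True a 90 degree rotation swaps width and height; without it the output keeps the original size and the corners are cropped.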
Image Arithmetic:
● cv2.addWeighted(): Blend two images represented by NumPy arrays.
● Syntax: cv2.addWeighted(src1, alpha, src2, beta, gamma[, dst[, dtype]])
● Parameters:
● src1: First input array (image).
● alpha: Weight of the first image.
● src2: Second input array (image).
● beta: Weight of the second image.
● gamma: Scalar added to each sum.
● dst: Output array (optional).
● dtype: Data type of the output array (optional).
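To make the arithmetic concrete, here is a NumPy-only sketch of the formula dst = saturate(src1*alpha + src2*beta + gamma) for uint8 images. The helper name add_weighted is hypothetical, and OpenCV's exact rounding behaviour may differ slightly from np.rint.

```python
import numpy as np

def add_weighted(src1, alpha, src2, beta, gamma):
    """Hypothetical NumPy stand-in for the weighted-sum formula on uint8 images."""
    if src1.shape != src2.shape:
        raise ValueError("inputs must have the same shape")
    acc = src1.astype(np.float64) * alpha + src2.astype(np.float64) * beta + gamma
    # Round, then clip to [0, 255] so values saturate instead of wrapping around.
    return np.clip(np.rint(acc), 0, 255).astype(np.uint8)

a = np.full((2, 2, 3), 100, dtype=np.uint8)
b = np.full((2, 2, 3), 200, dtype=np.uint8)

blend = add_weighted(a, 0.5, b, 0.5, 0)     # 50 + 100 = 150 everywhere
bright = add_weighted(a, 1.0, b, 1.0, 10)   # 310 saturates to 255
```

Plain uint8 addition in NumPy would wrap around on overflow, which is why the sum is computed in floating point first.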
Geometric Transformation:
● cv2.getRotationMatrix2D(): Compute an affine transformation matrix for rotating
an image around a specified point.
● Syntax: cv2.getRotationMatrix2D(center, angle, scale)
● Parameters:
● center: Center of rotation.
● angle: Rotation angle in degrees.
● scale: Scale factor.
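The 2x3 matrix can be written out by hand, which helps when checking where a point ends up. This sketch follows the formula given in the OpenCV documentation; the helper name rotation_matrix_2d and the sample coordinates are assumptions for illustration.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """Hypothetical re-implementation of the documented 2x3 rotation matrix."""
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    a = scale * np.cos(theta)
    b = scale * np.sin(theta)
    return np.array([
        [a,  b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ])

M = rotation_matrix_2d((50.0, 40.0), 90.0, 1.0)

# Apply M to homogeneous (x, y, 1) coordinates.
mapped_center = M @ np.array([50.0, 40.0, 1.0])   # the centre is a fixed point
mapped_point = M @ np.array([60.0, 40.0, 1.0])    # 10 px right of centre -> 10 px above it
```

In practice the matrix is usually handed to cv2.warpAffine(src, M, dsize) to produce the rotated image. A positive angle rotates counter-clockwise when the origin is the top-left corner.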
Convolution and Pooling
Convolution Operation:
● TensorFlow:
● tf.nn.conv2d(input, filters, strides, padding)
● PyTorch:
● torch.nn.functional.conv2d(input, weight, bias=None, stride=1,
padding=0)
● Keras:
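● keras.layers.Conv2D(filters, kernel_size, strides=(1, 1), padding="valid",
activation=None, kernel_initializer="glorot_uniform",
bias_initializer="zeros", kernel_regularizer=None)
● The input size is declared separately, e.g. with keras.Input(shape=input_shape).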
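What these layers compute for a single channel and a single kernel can be sketched in plain NumPy. The function conv2d_single is a hypothetical teaching helper, not any framework's implementation; like the framework layers it performs cross-correlation (the kernel is not flipped).

```python
import numpy as np

def conv2d_single(image, kernel, stride=1, padding=0):
    """Cross-correlate one 2-D channel with one 2-D kernel (illustrative only)."""
    if padding > 0:
        image = np.pad(image, padding, mode="constant")
    kh, kw = kernel.shape
    # Output size: floor((padded_size - kernel_size) / stride) + 1
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(25, dtype=np.float64).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0    # 3x3 mean filter

valid = conv2d_single(image, kernel)                          # no padding: 5x5 -> 3x3
same = conv2d_single(image, kernel, padding=1)                # padding 1: 5x5 -> 5x5
strided = conv2d_single(image, kernel, stride=2, padding=1)   # stride 2: 5x5 -> 3x3
```

The real layers repeat this over every input channel, sum the results, add a bias, and do so once per output filter.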
Filter/Kernel Initialization:
● TensorFlow:
● tf.Variable(initial_value, trainable=True)
● PyTorch:
● torch.nn.init.xavier_uniform_(tensor)
● Keras:
● keras.initializers.GlorotNormal()
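For intuition, Glorot (Xavier) uniform initialization draws weights from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)). The sketch below assumes a Keras-style kernel layout (kh, kw, in_channels, out_channels); the helper name glorot_uniform and the shape used are illustrative.

```python
import numpy as np

def glorot_uniform(kernel_shape, rng):
    """Sample a conv kernel of shape (kh, kw, in_channels, out_channels)."""
    kh, kw, c_in, c_out = kernel_shape
    fan_in = kh * kw * c_in      # inputs feeding each output unit
    fan_out = kh * kw * c_out    # outputs fed by each input unit
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=kernel_shape), limit

rng = np.random.default_rng(0)
weights, limit = glorot_uniform((3, 3, 16, 32), rng)
```

The normal variant listed for Keras uses the same fan_in and fan_out but draws from a truncated normal with standard deviation sqrt(2 / (fan_in + fan_out)).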
Pooling Layers:
● TensorFlow:
● Max Pooling: tf.nn.max_pool(input, ksize, strides, padding)
● Average Pooling: tf.nn.avg_pool(input, ksize, strides, padding)
● PyTorch:
● Max Pooling: torch.nn.functional.max_pool2d(input, kernel_size,
stride=None, padding=0, dilation=1, ceil_mode=False)
● Average Pooling: torch.nn.functional.avg_pool2d(input, kernel_size,
stride=None, padding=0, ceil_mode=False,
count_include_pad=True)
● Keras:
● Max Pooling: keras.layers.MaxPooling2D(pool_size=(2, 2),
strides=None, padding="valid", data_format=None)
● Average Pooling: keras.layers.AveragePooling2D(pool_size,
strides=None, padding="valid", data_format=None)
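A small PyTorch sketch of the two pooling functions on a 4x4 input, where the results are easy to check by hand. The tensor contents are arbitrary.

```python
import torch
import torch.nn.functional as F

# One single-channel 4x4 image in (N, C, H, W) layout, values 0..15.
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# stride defaults to kernel_size, so the 2x2 windows do not overlap.
max_out = F.max_pool2d(x, kernel_size=2)   # largest value in each window
avg_out = F.avg_pool2d(x, kernel_size=2)   # mean of each window
```

Both outputs have shape (1, 1, 2, 2): pooling with a 2x2 window and stride 2 halves the height and width and leaves the channel count unchanged.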
Normalization:
● TensorFlow:
● Batch Normalization: tf.keras.layers.BatchNormalization()
● PyTorch:
● Batch Normalization: torch.nn.BatchNorm2d(num_features,
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
● Keras:
● Batch Normalization: keras.layers.BatchNormalization( axis=-1,
momentum=0.99, epsilon=0.001, center=True, scale=True,
beta_initializer="zeros", gamma_initializer="ones")
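As a rough illustration of what batch normalization does during training, the sketch below normalizes each channel over the batch and spatial axes, then applies the learnable scale (gamma) and shift (beta). It assumes a channels-first (N, C, H, W) layout as in PyTorch; the helper name batch_norm_2d is hypothetical, and the running statistics used at inference time are left out.

```python
import numpy as np

def batch_norm_2d(x, gamma, beta, eps=1e-5):
    """Training-mode batch norm for x of shape (N, C, H, W); illustrative only."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)   # one mean per channel
    var = x.var(axis=(0, 2, 3), keepdims=True)     # one (biased) variance per channel
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(8, 4, 6, 6))
y = batch_norm_2d(x, gamma=np.ones(4), beta=np.zeros(4))
```

With gamma = 1 and beta = 0 each output channel has approximately zero mean and unit variance. The Keras default axis=-1 corresponds to a channels-last layout instead.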
Regularization:
● TensorFlow:
● Dropout: tf.keras.layers.Dropout(rate)
● PyTorch:
● Dropout: torch.nn.Dropout(p=0.5, inplace=False)
● Keras:
● Weight penalty: keras.regularizers.L1L2(l1=0.0, l2=0.0)
● keras.regularizers.OrthogonalRegularizer(factor=0.01,
mode="rows")
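The two kinds of regularizer listed above work differently: dropout randomly zeroes activations during training, while L1/L2 adds a weight penalty to the loss. A NumPy sketch of both follows; the helper names and the sample numbers are hypothetical.

```python
import numpy as np

def dropout(x, rate, rng):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)   # rescaling keeps the expected activation unchanged

def l1_l2_penalty(w, l1=0.0, l2=0.0):
    """Penalty added to the loss: l1 * sum(|w|) + l2 * sum(w^2)."""
    return l1 * np.sum(np.abs(w)) + l2 * np.sum(np.square(w))

rng = np.random.default_rng(0)
dropped = dropout(np.ones(1000), rate=0.5, rng=rng)   # entries are either 0.0 or 2.0

w = np.array([1.0, -2.0, 3.0])
penalty = l1_l2_penalty(w, l1=0.01, l2=0.1)   # 0.01 * 6 + 0.1 * 14 = 1.46
```

Dropout is only active in training mode; at inference the layer passes its input through unchanged.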
Padding:
● TensorFlow:
● tf.pad(tensor, paddings, mode='CONSTANT', constant_values=0)
● PyTorch:
● Zero Padding: torch.nn.ZeroPad2d(padding)
● Keras:
● keras.layers.ZeroPadding2D(padding=(2, 2))
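Zero padding is easy to inspect with np.pad, which takes a per-axis (before, after) list much like the paddings argument of tf.pad. The tensor below assumes a channels-last (batch, height, width, channels) layout and is only an illustration.

```python
import numpy as np

x = np.ones((1, 3, 3, 1))   # one 3x3 single-channel image of ones

# Pad height and width by 2 on each side; leave batch and channel axes alone.
paddings = [(0, 0), (2, 2), (2, 2), (0, 0)]
padded = np.pad(x, paddings, mode="constant", constant_values=0)

# For stride 1 and an odd kernel size k, "same" output needs (k - 1) // 2 per side.
same_pad_for_5x5 = (5 - 1) // 2
```

Padding a 3x3 image by 2 gives 7x7, and the sum of the values is unchanged because only zeros are added.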
Image Classification
Model Architecture:
● TensorFlow:
● tf.keras.Sequential()
● tf.keras.layers.Conv2D(filters, kernel_size, activation='relu',
input_shape=input_shape)
● tf.keras.layers.MaxPooling2D(pool_size)
● tf.keras.layers.Flatten()
● tf.keras.layers.Dense(units, activation='relu')
● tf.keras.layers.Dense(units, activation='softmax')
● PyTorch:
● class CustomCNN(nn.Module):
● nn.Conv2d(in_channels, out_channels, kernel_size, stride=1,
padding=0)
● nn.MaxPool2d(kernel_size, stride=None, padding=0)
● nn.Linear(in_features, out_features, bias=True)
● Keras:
● inputs = keras.Input(shape=input_shape)
● x = layers.Rescaling(1.0 / 255)(inputs)
● x = layers.Conv2D(128, 3, strides=2, padding="same")(x)
● x = layers.BatchNormalization()(x)
● x = layers.Activation("relu")(x)
● x = layers.SeparableConv2D(1024, 3, padding="same")(x)
● x = layers.GlobalAveragePooling2D()(x)
● x = layers.Dropout(0.25)(x)
● outputs = layers.Dense(units, activation=None)(x)
● model = keras.Model(inputs, outputs)
● keras.utils.plot_model(model, show_shapes=True)
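The PyTorch building blocks listed above can be assembled as follows. This is a minimal sketch, not a recommended architecture: the channel widths, the 32x32 input size, and the 10 classes are assumptions chosen so the shapes are easy to follow.

```python
import torch
from torch import nn

class CustomCNN(nn.Module):
    def __init__(self, in_channels=3, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),    # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),    # 16x16 -> 8x8
        )
        # 32 channels * 8 * 8 spatial positions after two poolings of a 32x32 input.
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)   # keep the batch axis, flatten the rest
        return self.classifier(x)           # raw logits, no softmax

model = CustomCNN()
logits = model(torch.randn(4, 3, 32, 32))   # batch of 4 random RGB images
```

The model returns raw logits because nn.CrossEntropyLoss applies log-softmax internally.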
Loss Function:
● TensorFlow:
● tf.keras.losses.CategoricalCrossentropy()
● PyTorch:
● nn.CrossEntropyLoss()
● Keras:
● keras.losses.CategoricalCrossentropy(from_logits=False)
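The loss functions above differ in what they expect: nn.CrossEntropyLoss takes raw logits and integer class indices, while CategoricalCrossentropy(from_logits=False) takes probabilities and one-hot labels. The underlying computation can be sketched in NumPy; the helper names and sample values are illustrative.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true_onehot, probs, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(p))."""
    probs = np.clip(probs, eps, 1.0)   # avoid log(0)
    return float(np.mean(-np.sum(y_true_onehot * np.log(probs), axis=1)))

logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
```

For the second sample the logits are all equal, so the predicted distribution is uniform and its loss term is log(3).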
Optimizer:
● TensorFlow:
● tf.keras.optimizers.Adam(learning_rate=0.001)
● PyTorch:
● torch.optim.Adam(params, lr=0.001)
● Keras:
● keras.optimizers.Adam(learning_rate=0.001)
Learning Rate Scheduler:
● TensorFlow:
● tf.keras.callbacks.LearningRateScheduler(schedule, verbose=0)
● PyTorch:
● torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
● Keras:
● keras.callbacks.LearningRateScheduler(schedule, verbose=0)
● keras.callbacks.ModelCheckpoint("save_at_{epoch}.keras") (saves the model after each epoch; not a scheduler, but usually passed in the same callbacks list)
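The schedule argument is just a Python function, and the two frameworks expect slightly different shapes for it. The step-decay sketch below is library-free; the initial rate of 0.001 matches the Optimizer entries, while the drop interval and factor are assumptions.

```python
INITIAL_LR = 0.001   # starting learning rate
DROP_EVERY = 10      # assumed: decay every 10 epochs
DROP_FACTOR = 0.5    # assumed: halve the rate at each drop

def lr_multiplier(epoch):
    """Multiplier on the initial rate: the form LambdaLR expects for lr_lambda."""
    return DROP_FACTOR ** (epoch // DROP_EVERY)

def schedule(epoch, lr):
    """(epoch, current_lr) -> new_lr: the form LearningRateScheduler expects."""
    return INITIAL_LR * lr_multiplier(epoch)

# Simulate 30 epochs; the current-lr argument is unused by this schedule.
lrs = [schedule(epoch, None) for epoch in range(30)]
```

These would typically be passed as keras.callbacks.LearningRateScheduler(schedule) or torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_multiplier).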
Model Training:
● TensorFlow:
● model.compile(optimizer, loss, metrics)
● model.fit(train_dataset, epochs=epochs, validation_data=val_dataset)
● PyTorch:
● criterion = nn.CrossEntropyLoss()
● optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
● Loop over the data loader; for each batch call optimizer.zero_grad(), compute the loss, call loss.backward(), then optimizer.step().
● Keras:
● model.compile(loss='categorical_crossentropy', optimizer='adagrad',
metrics=['accuracy'])
● model.fit(train_ds, epochs=epochs, callbacks=callbacks,
validation_data=val_ds)
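Unlike Keras, PyTorch has no built-in fit(), so the training loop is written by hand. The sketch below runs on random stand-in data; the tiny model, the data shapes, and the epoch count are all assumptions made to keep it fast and self-contained.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random stand-in data: 64 RGB 8x8 images with labels from 3 classes.
images = torch.randn(64, 3, 8, 8)
labels = torch.randint(0, 3, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16, shuffle=True)

model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),   # global average pool to (N, 8, 1, 1)
    nn.Flatten(),
    nn.Linear(8, 3),           # raw logits for 3 classes
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

epoch_losses = []
for epoch in range(3):
    model.train()
    running = 0.0
    for batch_images, batch_labels in loader:
        optimizer.zero_grad()                                 # clear old gradients
        loss = criterion(model(batch_images), batch_labels)   # forward pass
        loss.backward()                                       # backpropagate
        optimizer.step()                                      # update weights
        running += loss.item() * batch_images.size(0)
    epoch_losses.append(running / len(loader.dataset))        # mean loss per sample
```

A validation pass would normally follow each epoch, with model.eval() and torch.no_grad() so that dropout and batch norm behave as at inference and no gradients are stored.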