Personality Prediction Equations

The document outlines a project on personality prediction using video and audio data, detailing mathematical equations and explanations for various processes such as frame extraction, feature extraction, and normalization. It describes the use of neural networks to analyze multimodal inputs and predict personality traits based on the OCEAN model. Key components include feature fusion, activation functions, loss functions, and evaluation metrics to assess model performance.

Uploaded by Sumanth

Mathematical Expressions and Explanations for Personality Prediction Project

(1) Video Frame Extraction
Equation: V_{\text{frames}} = \text{Duration} \times F

Explanation:

To capture meaningful visual cues from the video, frames are extracted at a fixed rate F,
measured in frames per second (fps). Sampling at a regular rate reduces the chance that
brief expressions and facial movements are missed. The total number of frames extracted
from a video is its duration (in seconds) multiplied by the chosen frame rate. These frames
are then resized to a standardized resolution (224×224) and passed through a face detection
model such as MTCNN so that the facial area is the focal point and background noise is
reduced.
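
As a rough sketch of the frame-count equation, the plain-Python helpers below compute how many frames a clip yields and the timestamps at which they would be sampled. The function names and the example clip (15 seconds at 2 fps) are illustrative assumptions, not values from the project; video decoding, resizing, and MTCNN face detection are left out.

```python
import math

def frame_count(duration_s: float, fps: float) -> int:
    """V_frames = Duration * F, rounded down to whole frames."""
    if duration_s < 0 or fps <= 0:
        raise ValueError("duration must be >= 0 and fps must be > 0")
    return math.floor(duration_s * fps)

def frame_timestamps(duration_s: float, fps: float) -> list:
    """Timestamps (in seconds) of the sampled frames, starting at t = 0."""
    return [i / fps for i in range(frame_count(duration_s, fps))]

# Hypothetical example: a 15-second clip sampled at 2 fps.
n_frames = frame_count(15.0, 2.0)
first_timestamps = frame_timestamps(15.0, 2.0)[:4]
```

Rounding down is one possible convention; a real extractor may include or drop the final partial interval depending on the decoder.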

(2) Audio Feature Extraction


Equation: \mathbf{a} = \frac{1}{T} \sum_{t=1}^{T} \text{MFCC}_t(x)

Explanation:

The audio signal x from each video is processed to extract Mel-Frequency Cepstral
Coefficients (MFCCs), which capture the timbral and phonetic characteristics of speech.
MFCCs are computed over T short time windows, and the resulting vectors MFCC_t(x) are
averaged across the entire audio clip to create a fixed-length feature vector a. This
representation summarizes the vocal traits of the speaker, which carry cues for inferring
personality.
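
The averaging step can be sketched as follows. The MFCC vectors are assumed to have been computed already by an audio library; the toy 3-coefficient vectors and the name mean_pool are illustrative assumptions. The point of the example is that clips with different numbers of windows map to vectors of the same length.

```python
def mean_pool(frames):
    """Average T per-window coefficient vectors into one fixed-length vector a."""
    if not frames:
        raise ValueError("need at least one MFCC window")
    T = len(frames)
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all windows must have the same number of coefficients")
    return [sum(f[k] for f in frames) / T for k in range(dim)]

# Toy stand-ins for MFCC_t(x): 3 coefficients per window.
short_clip = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]                     # T = 2
long_clip = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0],
             [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]                      # T = 4

a_short = mean_pool(short_clip)
a_long = mean_pool(long_clip)
```

Both results have three entries regardless of clip length, which is what lets a fixed-size network consume them.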

(3) Video Feature Extraction


Equation: \mathbf{v} = \frac{1}{n} \sum_{i=1}^{n} \text{VGG16}(I_i)

Explanation:

Each of the n extracted facial frames I_i is fed into a pretrained VGG16 convolutional
neural network to obtain deep visual features that represent facial expressions, pose, and
other identity-related information. These features are averaged across all selected frames
from the video to produce a single, compact feature vector v that characterizes the person’s
visual behavior.
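
Because per-frame deep features can be large, the same average can be accumulated one frame at a time instead of storing every feature vector. The sketch below is one possible way to do this and is not described in the original; the 2-dimensional vectors are stand-ins for VGG16(I_i), and VGG16 itself is not called.

```python
def running_mean(feature_stream, dim):
    """Incrementally compute v = (1/n) * sum_i f_i without keeping all f_i."""
    v = [0.0] * dim
    n = 0
    for f in feature_stream:
        n += 1
        # Update rule: v_n = v_{n-1} + (f_n - v_{n-1}) / n
        v = [vk + (fk - vk) / n for vk, fk in zip(v, f)]
    if n == 0:
        raise ValueError("no frames were provided")
    return v

# Toy stand-ins for per-frame VGG16 features (real features are far larger).
frame_features = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
v = running_mean(iter(frame_features), dim=2)
```

The result matches the batch average of the three vectors, so the equation is unchanged; only the memory use differs.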

(4) Feature Fusion
Equation: \mathbf{x} = [\mathbf{v}; \mathbf{a}]

Explanation:

The video and audio feature vectors are concatenated (denoted by the semicolon) to form a
single multimodal input. This combined vector captures both vocal (audio) and visual
(video) aspects of the subject’s behavior, allowing the model to learn richer personality
cues than either modality alone.
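
Written out with explicit dimensions, the concatenation looks as follows. The symbols d_v and d_a (the lengths of the video and audio vectors) are introduced here for illustration and do not appear in the original.

```latex
\mathbf{v} \in \mathbb{R}^{d_v}, \quad \mathbf{a} \in \mathbb{R}^{d_a}
\quad\Longrightarrow\quad
\mathbf{x} = [\mathbf{v}; \mathbf{a}]
= (v_1, \dots, v_{d_v}, a_1, \dots, a_{d_a})^{\top} \in \mathbb{R}^{d_v + d_a}
```

The fused vector therefore has as many entries as both modalities combined, and the order of the blocks must stay fixed across all samples.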

(5) Feature Normalization


Equation: \hat{x}_i = \frac{x_i - \mu_i}{\sigma_i}

Explanation:

Before the features are fed into the neural network, normalization is applied to bring all
input values to a common scale, so that features with large numeric ranges do not dominate
others. Z-score normalization is used here: each feature x_i is centered by subtracting its
mean \mu_i and scaled by dividing by its standard deviation \sigma_i.
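
A minimal sketch of z-score normalization using only the standard library is shown below. It assumes the statistics are estimated on training rows and then reused for any other input, and it adds a small eps guard for constant features; both choices are assumptions, since the original does not specify them.

```python
from statistics import fmean, pstdev

def fit_zscore(train_rows, eps=1e-8):
    """Per-feature mean and standard deviation, estimated on training rows."""
    columns = list(zip(*train_rows))
    mu = [fmean(col) for col in columns]
    # eps avoids division by zero for constant features (an added assumption).
    sigma = [max(pstdev(col), eps) for col in columns]
    return mu, sigma

def apply_zscore(row, mu, sigma):
    """x_hat_i = (x_i - mu_i) / sigma_i for every feature i."""
    return [(x - m) / s for x, m, s in zip(row, mu, sigma)]

# Two toy features on very different scales.
train = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
mu, sigma = fit_zscore(train)
normalized = [apply_zscore(row, mu, sigma) for row in train]
```

After normalization the two columns are numerically the same, even though the second started out 100 times larger.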

(6) Linear Transformation in Hidden Layers


Equation: \mathbf{z}^{(l)} = \mathbf{W}^{(l)} \mathbf{x}^{(l-1)} + \mathbf{b}^{(l)}

Explanation:

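In each layer l of the neural network, a linear transformation is applied to the previous
layer's output \mathbf{x}^{(l-1)} using learned weights \mathbf{W}^{(l)} and biases
\mathbf{b}^{(l)}; for the first layer, the input \mathbf{x}^{(0)} is the normalized feature
vector. This operation projects the input into a new space where patterns related to
personality can be more easily detected.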
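
A small worked example may help make the transformation concrete. The 2×2 weights, bias, and input below are made-up numbers chosen only for easy arithmetic.

```latex
\mathbf{W} = \begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix},\quad
\mathbf{x} = \begin{pmatrix} 3 \\ 1 \end{pmatrix},\quad
\mathbf{b} = \begin{pmatrix} 0 \\ -5 \end{pmatrix}
\;\Longrightarrow\;
\mathbf{z} = \mathbf{W}\mathbf{x} + \mathbf{b}
= \begin{pmatrix} 1\cdot 3 + (-1)\cdot 1 + 0 \\ 0\cdot 3 + 2\cdot 1 - 5 \end{pmatrix}
= \begin{pmatrix} 2 \\ -3 \end{pmatrix}
```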

(7) Activation Function (ReLU)


Equation: \mathbf{x}^{(l)} = \text{ReLU}(\mathbf{z}^{(l)}) = \max(0, \mathbf{z}^{(l)})

Explanation:

After the linear transformation, a ReLU (Rectified Linear Unit) activation is applied
element-wise: negative entries become 0 and non-negative entries pass through unchanged.
This introduces non-linearity into the model, enabling it to learn complex patterns in the
data. ReLU is commonly used for its simplicity and effectiveness in deep learning.
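
The sketch below chains one linear transformation with ReLU to form a single hidden layer. The weights, bias, and input are toy values, and the helper names dense and relu are illustrative; a real model would use a deep learning framework rather than Python lists.

```python
def dense(W, x, b):
    """z = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(z):
    """Element-wise max(0, z)."""
    return [max(0.0, zi) for zi in z]

# Toy layer with 2 inputs and 2 units (made-up numbers).
W = [[1.0, -1.0], [0.0, 2.0]]
b = [0.0, -5.0]
x_prev = [3.0, 1.0]

z = dense(W, x_prev, b)   # pre-activation
x_next = relu(z)          # activation passed to the next layer
```

The negative pre-activation is clipped to zero while the positive one is kept, which is the only non-linear operation in the layer.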

(8) Output Layer (Regression for OCEAN)


Equation: \hat{\mathbf{y}} = \mathbf{W}^{(L+1)} \mathbf{x}^{(L)} + \mathbf{b}^{(L+1)}

Explanation:

The final layer of the neural network takes the output \mathbf{x}^{(L)} of the last of the
L hidden layers and produces five continuous values corresponding to the predicted Big Five
(OCEAN) personality traits. These values are not passed through a softmax or sigmoid
because this is a regression task, not classification.
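
As an illustration of the output layer, the sketch below maps a last-hidden-layer vector to five named trait scores with no activation. The weights are made up, the hidden width of 2 is arbitrary, and the ordering of the traits in TRAITS is an assumption; the original does not state which output corresponds to which trait.

```python
TRAITS = ("openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism")

def output_layer(W_out, x_last, b_out):
    """y_hat = W x + b with no activation; one score per OCEAN trait."""
    if len(W_out) != len(TRAITS) or len(b_out) != len(TRAITS):
        raise ValueError("the output layer must have exactly five units")
    scores = [sum(w * xi for w, xi in zip(row, x_last)) + bi
              for row, bi in zip(W_out, b_out)]
    return dict(zip(TRAITS, scores))

# Hypothetical weights for a last hidden layer of width 2.
W_out = [[0.5, 0.0], [0.0, 0.5], [0.25, 0.25], [1.0, -1.0], [-1.0, 0.0]]
b_out = [0.0, 0.0, 0.5, 0.0, 0.0]
y_hat = output_layer(W_out, [2.0, 0.0], b_out)
```

Because nothing squashes the outputs, a score can fall outside the range of the labels (here one trait comes out negative); training on the regression loss is what pulls predictions toward the label range.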

(9) Loss Function (Mean Squared Error)


Equation: \mathcal{L}_{\text{MSE}} = \frac{1}{5} \sum_{i=1}^{5} (y_i - \hat{y}_i)^2

Explanation:

To measure the difference between the predicted and actual trait scores, the Mean Squared
Error (MSE) is used as the loss function. For a single sample it averages the squared error
over the five traits. Because errors are squared, larger errors are penalized more heavily
than smaller ones; MSE is a common choice for guiding model training in regression problems.
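
The loss for one sample can be sketched in a few lines. The trait scores below are invented for illustration and are not results from the project.

```python
def mse(y_true, y_pred):
    """Mean of squared differences between true and predicted scores."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

# Invented scores for one sample, in a fixed trait order.
y = [0.6, 0.5, 0.4, 0.7, 0.3]
y_hat = [0.5, 0.5, 0.5, 0.5, 0.5]
loss = mse(y, y_hat)

# Doubling an error quadruples its contribution to the loss.
small_error = mse([1.0], [0.0])
large_error = mse([2.0], [0.0])
```

Here the two traits that are off by 0.2 contribute four times as much to the loss as the two that are off by 0.1.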

(10) Gradient Descent Update Rule


Equation: \theta \leftarrow \theta - \eta \cdot \nabla_{\theta} \mathcal{L}

Explanation:

During training, the model parameters \theta (weights and biases) are updated to minimize
the loss using gradient descent or a variant such as Adam. The rule shown is plain gradient
descent: it subtracts the gradient of the loss, scaled by the learning rate \eta, from the
current parameters.
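
To show the update rule in action, the sketch below applies plain gradient descent to a toy quadratic loss whose minimum is known in advance. The loss, target values, learning rate, and step count are all illustrative assumptions; the real model would obtain its gradients by backpropagation, and Adam would use a different, adaptive step.

```python
def gd_step(theta, grad, lr):
    """theta <- theta - lr * grad, applied element-wise."""
    return [t - lr * g for t, g in zip(theta, grad)]

def loss_grad(theta, target):
    """Gradient of the toy loss L(theta) = sum_k (theta_k - target_k)^2."""
    return [2.0 * (t - y) for t, y in zip(theta, target)]

theta = [0.0, 0.0]      # initial parameters
target = [3.0, -1.0]    # minimizer of the toy loss
for _ in range(100):
    theta = gd_step(theta, loss_grad(theta, target), lr=0.1)
```

With this learning rate each step shrinks the distance to the minimum by a constant factor, so theta ends up very close to target.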

(11) Evaluation Metric (R² Score)


Equation: R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}

Explanation:

The R² score, or coefficient of determination, evaluates how well the model explains the
variability in the true personality scores. Here i indexes the evaluated samples and
\bar{y} is the mean of the true scores. An R² value close to 1 indicates that the model
makes accurate predictions, a value near 0 means it performs no better than simply
predicting the mean, and a negative value means it performs worse than that.
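
The three regimes of R² can be checked with a short sketch. The scores below are invented, and the function is assumed to be applied to one trait at a time across a set of samples.

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot for one trait across several samples."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all true scores are identical")
    return 1.0 - ss_res / ss_tot

# Invented true scores for one trait over four samples.
y_true = [0.2, 0.4, 0.6, 0.8]
perfect = r2_score(y_true, y_true)                 # exact predictions
baseline = r2_score(y_true, [0.5] * 4)             # always predict the mean
worse = r2_score(y_true, [0.8, 0.6, 0.4, 0.2])     # reversed predictions
```

Exact predictions give 1, the constant mean predictor gives approximately 0, and the reversed predictions give a negative score.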
