Case Study - VAE Application
Code:
z = Lambda(sampling)([z_mean, z_log_var])
# Decoder network: maps a latent vector back to the input space
decoder_inputs = Input(shape=(latent_dim,))
x = Dense(128, activation='relu')(decoder_inputs)
outputs = Dense(input_dim, activation='linear')(x)
decoder = Model(decoder_inputs, outputs)  # reused as decoder(z) below
Explanation:
The code draws the latent vector z with the reparameterization trick (via the
Lambda layer) and builds the decoder network. The custom Keras layer
VAELossLayer, used in the next block, calculates the loss for the Variational
Autoencoder (VAE) by combining the reconstruction loss and the KL
divergence. This loss ensures the VAE learns meaningful encodings in the
latent space. The VAE model then uses this custom layer to compute its
overall loss and is compiled with the Adam optimizer for training.
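The definition of VAELossLayer itself is not reproduced here. A minimal sketch of such a layer, assuming a squared-error reconstruction term and the standard closed-form KL divergence against a unit Gaussian prior (the input ordering mirrors the call in the next block), could look like:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class VAELossLayer(Layer):
    def call(self, inputs):
        x, x_decoded, z_log_var, z_mean = inputs
        # Reconstruction term: squared error summed over input features
        reconstruction_loss = tf.reduce_sum(tf.square(x - x_decoded), axis=-1)
        # KL divergence between N(z_mean, exp(z_log_var)) and N(0, I)
        kl_loss = -0.5 * tf.reduce_sum(
            1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)
        # Register the combined loss; Keras picks it up at compile/fit time
        self.add_loss(tf.reduce_mean(reconstruction_loss + kl_loss))
        return x_decoded  # pass the reconstruction through unchanged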
Code:
outputs = decoder(z)
vae_outputs = VAELossLayer()([encoder_inputs, outputs, z_log_var, z_mean])
vae = Model(encoder_inputs, vae_outputs)
vae.compile(optimizer=Adam())
Explanation:
The code decodes the latent sample z into reconstructed outputs, attaches the
model's loss through the custom VAELossLayer, builds the full model from
encoder input to reconstruction, and compiles it with the Adam optimizer.
Because the layer registers the loss internally, compile() is called without a
separate loss argument.
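In equation form, the per-sample objective the loss layer implements (under the assumptions sketched above) is the negative evidence lower bound:

\mathcal{L}(x) = \lVert x - \hat{x} \rVert^2 \;-\; \frac{1}{2} \sum_{j=1}^{d} \left( 1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2 \right)

where \mu and \log \sigma^2 are the encoder outputs z_mean and z_log_var, \hat{x} is the reconstruction, and d is latent_dim.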
Code:
# ---- TRAIN VAE ----
train_size = int(0.8 * len(sensor_data))
x_train = sensor_data[:train_size]
x_valid = sensor_data[train_size:]
epochs = 50
batch_size = 32
vae.fit(x_train, x_train,
shuffle=True,
epochs=epochs,
batch_size=batch_size,
validation_data=(x_valid, x_valid))
Output:
Epoch 1/50
108/108 [==============================] - 1s 5ms/step
- loss: 253493.5781 - val_loss: 84043.8594
Epoch 2/50
108/108 [==============================] - 0s 4ms/step
- loss: 44589.9336 - val_loss: 27646.7246
...
108/108 [==============================] - 1s 5ms/step
- loss: 257.5038 - val_loss: 362.1933
108/108 [==============================] - 0s 2ms/step
136/136 [==============================] - 0s 2ms/step
Explanation:
In this code, the data is divided into training and validation sets: the model
learns from the first 80% of the samples while its performance is measured on
the held-out 20%, which makes overfitting visible. Note that x_train appears
as both input and target, because an autoencoder learns to reconstruct its
own input. The model passes over the entire training set multiple times, as
specified by epochs, and processes the data in groups determined by
batch_size, which affects both the speed and the stability of training. During
vae.fit(), the model adjusts its weights to better reconstruct the training
data while reporting its loss on the validation data after each epoch.
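Keras' fit() also returns a History object that records the per-epoch losses printed above; a small sketch, assuming the same training call as in the code above, of plotting the learning curves:

import matplotlib.pyplot as plt

history = vae.fit(x_train, x_train, shuffle=True, epochs=epochs,
                  batch_size=batch_size, validation_data=(x_valid, x_valid))
# Learning curves: per-epoch training vs. validation loss
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['val_loss'], label='val_loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()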
Code:
# ---- ANOMALY DETECTION ----
import numpy as np

reconstructed_train = vae.predict(x_train)
train_error = np.mean(np.square(x_train - reconstructed_train), axis=-1)
# Threshold: 99th percentile of the training reconstruction error
threshold = np.percentile(train_error, 99)
reconstructed_data = vae.predict(sensor_data)
reconstruction_error = np.mean(np.square(sensor_data - reconstructed_data), axis=-1)
anomalies = reconstruction_error > threshold
print("Number of anomalies detected:", np.sum(anomalies))
Explanation:
The code uses the trained model to rebuild the training data and calculate
the error between original and rebuilt values. It sets a threshold based on
the 99th percentile of this error. Then, it rebuilds the entire dataset and
identifies data points with errors exceeding the threshold as anomalies,
finally printing the number of anomalies detected.
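Because the plotting code below lines the reconstruction error up with data['Timestamp'], the flagged points can also be mapped back to their timestamps. A hypothetical sketch, assuming data is the pandas DataFrame the sensor readings were taken from:

# Hypothetical: data is assumed to be the source DataFrame with a 'Timestamp' column
anomalous_times = data.loc[reconstruction_error > threshold, 'Timestamp']
print(anomalous_times.head())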
Code:
import matplotlib.pyplot as plt
plt.figure(figsize=(15, 6))
plt.plot(data['Timestamp'], reconstruction_error, label='Reconstruction Error')
plt.axhline(y=threshold, color='r', linestyle='--', label='Threshold')
plt.title("Reconstruction Error Over Time")
plt.legend()
plt.xlabel("Timestamp")
plt.ylabel("Reconstruction Error")
# Rotate the x-axis labels for better readability and set an interval for them
plt.xticks(rotation=45)
plt.gca().xaxis.set_major_locator(plt.MaxNLocator(nbins=40))
plt.tight_layout() # Adjust layout to ensure labels fit
plt.show()
Output:
[Figure: "Reconstruction Error Over Time" - reconstruction error plotted against Timestamp, with the threshold drawn as a red dashed line]
Explanation:
The chart visualizes the error between the original and rebuilt data over time.
A red dashed line indicates a set threshold. Any spike in the error above this
line suggests a potential anomaly at that time.
Challenges in Implementation