Parallel VM creation fix #524
Force-pushed from 575f829 to 6299c72
Tightened the lock by moving it closer to the loop device and device mapper setup code. This keeps other processes from waiting on the lock when they could already start importing images. Tested it multiple times with and without a sleep in the lock retry and didn't see any failures without the sleep, so I decided not to include a sleep and to retry as soon as possible. Releasing the lock right after creating the loop devices only causes a race condition in the lock file creation, which fails with an error.
This creates an ignite lock file at /tmp/ignite-snapshot.lock when an overlay snapshot is created. Locking is handled via a PID file using the github.com/nightlyone/lockfile package. This avoids the race condition that occurs when multiple ignite processes try to create loop devices and use the device mapper for overlay snapshots at the same time. While one process holds the lock, the other processes keep retrying until they obtain it. Once the snapshot is activated, the lock is released.
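The acquire, retry, and release flow described above can be illustrated with a minimal shell sketch. This is not the code from this PR and does not use the lockfile package; it only mimics a PID-style lock file by relying on bash's `noclobber` option to create the file exclusively. The function names, the `WORK_DIR` temp directory, and the worker IDs are all hypothetical and exist only for this demo.

```shell
#!/usr/bin/env bash
# Illustrative stand-in for a PID lock file with immediate retry.
# Assumption: paths and names below are made up for the demo; the PR
# itself locks /tmp/ignite-snapshot.lock through the Go lockfile package.

WORK_DIR="$(mktemp -d)"
LOCK_FILE="${WORK_DIR}/snapshot.lock"
LOG_FILE="${WORK_DIR}/critical.log"

# Try once to take the lock. With noclobber set, '>' refuses to overwrite
# an existing file, so only one process can create the lock file.
try_lock() {
  ( set -o noclobber; echo "$1" > "$LOCK_FILE" ) 2>/dev/null
}

# Retry right away, without sleeping, until the lock is obtained.
acquire_lock() {
  until try_lock "$1"; do :; done
}

# Only the current holder should call this.
release_lock() {
  rm -f "$LOCK_FILE"
}

# Stand-in for the loop device and device mapper setup: the two log lines
# written by one worker must never interleave with another worker's lines.
worker() {
  acquire_lock "$1"
  echo "start $1" >> "$LOG_FILE"
  echo "end $1" >> "$LOG_FILE"
  release_lock
}

# Launch three workers concurrently and wait for all of them.
for id in 1 2 3; do
  worker "$id" &
done
wait

cat "$LOG_FILE"
```

The order in which the workers get the lock varies between runs, but each `start` line should be followed directly by the matching `end` line, and the lock file should be gone once all workers finish.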
`make tidy-in-docker`
Force-pushed from 6299c72 to 884c28e
Thanks so much for the thorough testing of this bug-fix
Concurrent VM creation is much faster than serial with this patch 🏎️
5 VMs, parallel vs. serial:
num_vms=5
# Parallel: start every VM in the background, then wait for all of them.
# seq is used because bash does not expand variables inside {1..N}.
time (
for i in $(seq 1 "${num_vms}"); do
  sudo bin/ignite run weaveworks/ignite-ubuntu \
    --name concurrent-${RANDOM} --ssh 1>/dev/null &
done
wait )
# Serial: start the VMs one after another.
time (
for i in $(seq 1 "${num_vms}"); do
  sudo bin/ignite run weaveworks/ignite-ubuntu \
    --name serial-${RANDOM} --ssh 1>/dev/null
done
)
results on my laptop:
and with num_vms=10:
For these cases, it's over a 3x improvement.
There is no lock for the image pull, so we run into a race when the image does not exist, as we expected on last week's call:
num_vms=5
# Remove the image first so that every concurrent run has to pull it.
sudo bin/ignite image rm weaveworks/ignite-ubuntu:latest
echo
time (
for i in $(seq 1 "${num_vms}"); do
  sudo bin/ignite run weaveworks/ignite-ubuntu \
    --name concurrent-${RANDOM} --ssh &
done
wait )
echo
# Clean up the VMs and network rules, then remove the image again.
(../ignite-scratch/ignite-clean.sh 2>&1; ../ignite-scratch/iptables-clean-cni-ignite.sh 2>&1) >/dev/null
sudo bin/ignite image rm weaveworks/ignite-ubuntu:latest
This can be worked around by the user performing parallel pull operations. We can fix that issue at a future time with a separate issue/PR.
Add lockfile at snapshot activation to avoid race condition
This creates an ignite lock file at
/tmp/ignite-snapshot.lock
when an overlay snapshot is created. The locking is handled via a
PID file using the github.com/nightlyone/lockfile package. This
helps avoid the race condition when multiple ignite processes try
to create loop devices and use the device mapper for overlay
snapshots at the same time. When a process obtains the lock, other
processes retry until they obtain it. Once
the snapshot device is created, the lock is released.
lockfile godoc: https://pkg.go.dev/github.com/nightlyone/lockfile?tab=doc
Fixes #510