Thor's Study Guide - CISSP Domain 7
Introduction to Domain 7
In this domain we cover:
• Investigations support and requirements, Logging and monitoring activities.
• Provisioning of resources, Foundational security operations concepts, Resource
protection techniques.
• Preventative measures, Patch and vulnerability management, Change management
processes.
• Incident management, Recovery strategies, Disaster recovery processes and plans,
Business continuity planning and exercises.
• Personnel safety concerns.
This chapter is how we secure our day-to-day operations, how we continue to function in a disaster event
and how we recover after an event. The domain also has some areas that just didn’t fit in elsewhere.
Domain 7 makes up 13% of the exam questions.
2|Page
https://thorteaches.com/
Thor’s Study Guide – CISSP® Domain 7
▪ Need to Know:
⬧ Even if you have access, if you do not need to know, then you should
not access the data.
(Kaiser employees).
▪ Separation of Duties:
⬧ Requiring more than one individual to complete a single task is an internal control
intended to prevent fraud and error.
⬧ We do not allow the same person to enter the purchase order and issue
the check.
⬧ For the exam assume the organization is large enough to use separation
of duties, in smaller organizations where that is not practical,
compensating controls should be in place.
▪ Job Rotation:
⬧ For the exam, think of it as a way to detect errors and fraud. It is easier to detect
fraud, and there is less chance of collusion between individuals, if they
rotate jobs.
⬧ It also helps with employee burnout and it helps employees
understand the entire business.
⬧ This can be cost prohibitive, both on the exam and in real life; on the
exam, make sure the benefit justifies the cost.
▪ Mandatory Vacations:
⬧ Done to ensure one person is not always performing the same task,
someone else has to cover and it can keep fraud from happening or help
us detect it.
⬧ Their accounts are locked and an audit is performed on the accounts.
⬧ If the employee has been conducting fraud and covering it up, the audit
will discover it.
⬧ The best way to do this is to not give too much advance notice of
vacations.
• With the combination of all 5 we minimize some of the insider threats we may have.
• Background Checks:
▪ References, Degrees, Employment, Criminal, Credit history (less common, more
costly).
▪ For sensitive positions the background check is an ongoing process.
• Privilege Monitoring:
▪ The more access and privilege an employee has the more we keep an eye on
their activity.
▪ They are already screened more in depth and more consistently, but they also have
access to many business critical systems, so we need to audit their use of that
access.
▪ With more access comes more responsibility and scrutiny.
Administrative Security
• Digital (Computer) Forensics:
▪ Focuses on the recovery and
investigation of material found in
digital devices, often in relation to
computer crime.
▪ Closely related to incident response: forensics is based on gathering and
protecting the evidence, whereas incident response is how we
react to an event or breach.
▪ We preserve the crime scene and the evidence so that we can prove
its integrity at a later time, often in court.
▪ The Forensic Process:
⬧ Identify the potential evidence,
acquire the evidence, analyze the
evidence, make a report.
⬧ We need to be more aware of how we gather our forensic evidence; attackers are
covering their tracks, deleting the evidence and logs.
⬧ One way they do this is with malware that resides only in volatile memory; if power
is shut off (to preserve the crime scene), the malware is gone and the evidence is lost.
⬧ Rather than shutting the system down, we can, if it is considered safe, disconnect it
from the network and take bit-by-bit copies of the memory, drives, running processes and
network connection data.
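One common way to show later that an acquired image has not changed is to record a cryptographic hash at acquisition time and recompute it before the evidence is presented. The guide does not name a specific method, so the sketch below is an assumption: it uses SHA-256 from Python's standard library, and the in-memory `evidence_image` is a hypothetical stand-in for a real bit-by-bit disk or memory image.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a file-like object in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Hypothetical stand-in for a bit-by-bit copy of a suspect drive.
evidence_image = io.BytesIO(b"bit-by-bit copy of the suspect drive")

# Hash recorded when the copy is made.
hash_at_acquisition = sha256_of_stream(evidence_image)

# Hash recomputed later from the same image.
evidence_image.seek(0)
hash_before_court = sha256_of_stream(evidence_image)

# If even one bit had changed, the two hashes would differ.
evidence_unchanged = hash_at_acquisition == hash_before_court
```

In practice the acquisition hash would be written into the chain-of-custody documentation so that anyone can repeat the comparison.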
• Digital Forensics:
▪ Here are the four basic types of disk-based forensic data:
⬧ Allocated Space:
▫ The portions of the disk that are marked as actively containing
data.
⬧ Unallocated Space:
▫ The portions of the disk that do not contain active data.
▫ This includes parts that have never been allocated and previously allocated parts
that have been marked unallocated.
▫ When a file is deleted, the parts of the disk that held the deleted file are marked as
unallocated and made available for use. (This is also why deleting a file does not
remove the data; it is still there until overwritten.)
⬧ Slack Space:
▫ Data is stored in specific size chunks known as
clusters (clusters = sectors or blocks).
▫ A cluster is the minimum size that can be allocated by a file
system.
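Because a file always occupies whole clusters, the last cluster of a file is usually only partly used; the unused remainder is the slack space. A minimal sketch of the arithmetic follows, assuming a 4096-byte cluster size (the guide does not specify one) and a hypothetical helper name `slack_bytes`.

```python
def slack_bytes(file_size, cluster_size=4096):
    """Bytes left unused in the last cluster allocated to a file."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    # Round up to whole clusters using integer arithmetic.
    clusters_allocated = -(-file_size // cluster_size)
    return clusters_allocated * cluster_size - file_size


# A 10,000-byte file needs 3 clusters (12,288 bytes), leaving 2,288 bytes of slack.
example_slack = slack_bytes(10_000)
```

Whatever data previously occupied those leftover bytes may still be readable there, which is why slack space is of forensic interest.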
▪ Network Forensics:
⬧ A sub-branch of digital forensics where we look at the monitoring and
analysis of computer network traffic for the purposes of information
gathering, legal evidence, or intrusion detection.
⬧ Network investigations deal with volatile and dynamic information.
⬧ Network traffic is transmitted and then lost, so network forensics is
often a proactive investigation.
⬧ Network Forensics generally has two uses.
▫ The first type is monitoring a network for anomalous traffic and
identifying intrusions (IDS/IPS).
→ An attacker might be able to erase all log files on a
compromised host, so network-based evidence might be
the only evidence available for forensic analysis.
▫ The second type relates to law enforcement.
→ In this case analysis of captured network traffic can
include tasks such as reassembling transferred files,
searching for keywords and parsing human
communication such as emails or chat sessions.
⬧ Systems used to collect network data for forensic use usually come in
two forms:
▫ Catch-it-as-you-can:
→ All packets passing through a certain traffic point are
captured and written to storage with analysis being
done subsequently in batch mode.
→ This approach requires large amounts of storage.
▫ Stop, look and listen:
→ Each packet is analyzed in a basic way in memory and
only certain information is saved for future analysis.
→ This approach requires a faster processor to keep up
with incoming traffic.
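The trade-off between the two collection approaches can be sketched with a few made-up packets. Everything here is illustrative: the packet dictionaries, the addresses (taken from documentation ranges) and both function names are assumptions, not a real capture API.

```python
from collections import Counter

# Hypothetical captured packets; a real system would read them from a network tap.
packets = [
    {"src": "10.0.0.5", "dst": "203.0.113.9", "size": 1500, "payload": b"GET /a"},
    {"src": "10.0.0.5", "dst": "203.0.113.9", "size": 400, "payload": b"GET /b"},
    {"src": "10.0.0.7", "dst": "198.51.100.2", "size": 60, "payload": b"ping"},
]


def catch_it_as_you_can(stream):
    """Store every packet in full; analysis happens later in batch mode."""
    return list(stream)


def stop_look_and_listen(stream):
    """Inspect each packet in memory and keep only a per-flow byte count."""
    bytes_per_flow = Counter()
    for packet in stream:
        bytes_per_flow[(packet["src"], packet["dst"])] += packet["size"]
    return bytes_per_flow


stored_packets = catch_it_as_you_can(packets)   # storage grows with traffic
flow_summary = stop_look_and_listen(packets)    # storage grows with flows only
```

The first function keeps the payloads and therefore needs the storage; the second does its work per packet as it arrives and therefore needs the processing speed.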
▪ Egress Monitoring:
⬧ Done to prevent data exfiltration both logically and physically.
⬧ For logical egress monitoring, we can use DLP systems.
▫ This can be both network-based and endpoint DLP systems.
▫ Even if the data is encrypted and we can’t decrypt it, we can still
prevent the egress from our network.
⬧ For physical egress monitoring, we could use guards and make sure the
trash and any other way things can be physically removed from our
organization are monitored and secured.
⬧ Can be very costly and take a lot of time with the amounts of data we
store. Proper retention for backups can reduce this as well as what we
back up.
⬧ The Electronic Discovery Reference Model (EDRM):
▫ Information governance, identification, preservation,
collection, processing, review, analysis, production, and
presentation.
Incident Management
• Involves the monitoring and detection of security events on our systems, and how we
react in those events.
• It is an administrative function of managing and protecting computer assets, networks
and information systems.
• The primary purpose is to have a well understood and predictable response to events
and computer intrusions.
• We have very clear processes and responses, and our teams are trained in them and
know what to do when an event occurs.
• Incidents are very stressful situations; it is important that staff know exactly what to do,
that they have received ongoing training and that they understand the procedures.
• Incidents and events can generally be categorized in 3 classes:
▪ Natural: Hurricanes, floods, earthquakes, blizzards, anything that is caused by
nature.
▪ Human: Done intentionally or unintentionally by humans, these are by far the
most common.
▪ Environmental: This is not nature, but the environments we work in, the power
grid, the internet connections, hardware failures, software flaws,…
• Event:
▪ An observable change in state, this is neither negative nor positive, it is just
something has changed.
▪ A system powered on, traffic from one segment to another, an application
started.
• Alert:
▪ Triggers a warning when a certain event happens.
▪ This can be traffic utilization above 75% or memory usage at 90% or more for
more than 2 minutes.
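The two example thresholds can be expressed as simple rules. This is only a sketch: the function names and the `(timestamp_seconds, usage_percent)` sample format are assumptions, and a real monitoring tool would have its own rule syntax.

```python
def traffic_alert(utilization_percent, threshold=75.0):
    """Alert when traffic utilization is above the threshold."""
    return utilization_percent > threshold


def memory_alert(samples, threshold=90.0, min_duration_s=120):
    """Alert when usage stays at or above the threshold for more than min_duration_s.

    samples is a list of (timestamp_seconds, usage_percent), sorted by time.
    """
    breach_started = None
    for timestamp, usage in samples:
        if usage >= threshold:
            if breach_started is None:
                breach_started = timestamp
            if timestamp - breach_started > min_duration_s:
                return True
        else:
            # Usage dropped below the threshold, so the clock restarts.
            breach_started = None
    return False


sustained = [(0, 95), (30, 96), (60, 92), (90, 91), (120, 93), (150, 94)]
interrupted = [(0, 95), (60, 95), (90, 50), (120, 95), (180, 95), (240, 95)]
```

With these samples, `memory_alert(sustained)` fires because usage stays high for 150 seconds, while `memory_alert(interrupted)` does not because the dip at 90 seconds resets the timer.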
• Incident:
▪ Multiple adverse events happening on our systems or network, often caused by
people.
• Problem:
▪ An incident with an unknown cause; we would follow steps similar to incident
response.
▪ More time would be spent on root cause analysis: we need to know what
happened so we can prevent it from happening again. This could be a total
internet outage or server crash.
• Inconvenience (Non-disasters):
▪ Non-disruptive failures, hard disk failure, 1 server in a cluster is down,…
• Emergency (Crisis):
▪ An urgent event with the potential for loss of life or property.
• Disaster:
▪ Our entire facility is unusable for 24 hours or longer.
▪ If we are geographically diverse and redundant we can mitigate this a lot.
▪ Yes, a snowstorm can be a disaster.
• Catastrophe:
▪ Our facility is destroyed.
▪ Preparation:
⬧ This is all the steps we take to prepare for incidents.
⬧ We write the policies and procedures, we train our staff, we procure the
detection software and hardware, and we give our incident response team the tools
they need to respond to an incident.
⬧ The more we train our team, the better they will handle the response,
the faster we recover, the better we preserve the crime scene (if there
is one), the less impactful an incident will be.
▪ Detection:
⬧ Events are analyzed to determine if they might be a security incident.
⬧ If we do not have strong detective capabilities in and around our
systems, we will most likely not realize we have a problem until long
after it has happened.
⬧ The earlier we detect the events, the earlier we can respond. IDSs can
help us detect, whereas IPSs can help us detect and prevent further
compromise.
⬧ IDSs and IPSs can help us detect and prevent on a single network
segment; we also need something that can correlate all the information
from the entire network.
▪ Response:
⬧ The response phase is when the incident response team begins
interacting with affected systems and attempts to keep further damage
from occurring as a result of the incident.
⬧ This can be taking a system off the network, isolating traffic, powering
off the system, or however our plan dictates to isolate the system to
minimize both the scope and severity of the incident.
⬧ Knowing how to respond, when to follow the policies and procedures to
the letter and when not to, is why we have senior staff handle the
responses.
⬧ We make bit-level copies of the systems, as close as possible to the time
of the incident, to ensure they are a true representation of the incident.
⬧ IT Security is there to help the business. Senior management may choose not
to disrupt business in order to contain or analyze; it is
ultimately their decision.
⬧ We stop it from spreading, but that is it; we contain the event.
▪ Mitigation:
⬧ We understand the cause of the incident so that the system can be
reliably cleaned and restored to operational status later in the recovery
phase.
⬧ Organizations often remove the most obvious sign of intrusion on a
system or systems, but miss backdoors and other malware installed in
the attack.
⬧ The obvious sign is often left to be found, while the actual payload is
hidden. If that is detected or assumed, we often just rebuild the system
from scratch and restore application files from a known good backup,
but not system files.
⬧ To ensure the backup is good, we need to do root cause analysis and
build a timeline for the intrusion: when did it start?
⬧ If it is from a known vulnerability we patch. If it's a newly discovered
vulnerability we mitigate it before exposing the newly built system to
the outside again.
⬧ If anything else can be learned about the attack, we can add that to our
posture.
⬧ Once eradication is complete, we start the recovery phase.
▪ Reporting:
⬧ We report throughout the process beginning with the detection, and we
start reporting immediately when we detect malicious activity.
⬧ The reporting has 2 focus areas: technical and non-technical.
⬧ The incident handling teams report the technical details of the incident
as they start the incident handling process, but they also notify
management of serious incidents.
⬧ The procedures and policies outline when each level of
management needs to be informed and involved. This is commonly
forgotten until later and can be an RPE (Resume Producing Event).
⬧ If we do nothing and just fix the problem, the root of the issue still
persists, that is what we need to fix.
• Hybrid-based systems, which combine both, are now more widely used and check for both
signatures and abnormalities.
⬧ True Positive: An attack is happening and the system detects it and acts.
⬧ True Negative: Normal traffic on the network; the system recognizes it as normal
and does nothing.
⬧ False Positive: Normal traffic, but the system flags it as an attack and acts.
⬧ False Negative: An attack is happening, but the system does not detect it
and does nothing.
▪ We rarely talk about the “true” states, since things are happening as they are
supposed to. We are interested in the cases where they are not: we block authorized
traffic or allow malicious traffic.
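The four outcomes are just the combinations of two yes/no questions: was an attack actually happening, and did the system act? A small sketch, with a hypothetical function name `classify`:

```python
def classify(attack_happening: bool, system_acted: bool) -> str:
    """Map what really happened and what the IDS/IPS did to one of the four outcomes."""
    if attack_happening and system_acted:
        return "True Positive"
    if not attack_happening and not system_acted:
        return "True Negative"
    if not attack_happening and system_acted:
        return "False Positive"   # authorized traffic was flagged or blocked
    return "False Negative"       # malicious traffic was allowed through


outcomes = {
    (attack, acted): classify(attack, acted)
    for attack in (True, False)
    for acted in (True, False)
}
```

"Positive" and "negative" describe what the system decided; "true" and "false" describe whether that decision matched reality.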
• Application Positive-listing:
▪ We can positive-list the applications we want to allow to run on our
environments, but it can also be compromised.
▪ We would positive-list against a trusted digital certificate, a known hash, or a path
and name. The latter is the least secure: an attacker can replace the file at the
path with a malicious copy.
▪ Building the trusted application positive-list takes a good deal of time, but is far
superior to negative-listing: there are tens of thousands of applications and we can never
keep up with them.
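A hash-based positive-list check can be sketched as follows. The byte strings and the names `ALLOWED_HASHES` and `may_run` are hypothetical, and SHA-256 is an assumed choice of hash; real products enforce this in the operating system rather than in a script.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical contents of an approved executable.
trusted_binary = b"trusted build of the payroll tool"

# The positive-list stores hashes of approved files, not their paths or names.
ALLOWED_HASHES = {sha256_hex(trusted_binary)}


def may_run(file_bytes: bytes) -> bool:
    """Allow execution only if the file's hash is on the positive-list."""
    return sha256_hex(file_bytes) in ALLOWED_HASHES


# A malicious copy placed at the same path with the same name has different
# contents, so its hash does not match and it is refused.
malicious_copy = b"trusted build of the payroll tool + backdoor"
```

This is why a hash is stronger than a path and name: the check follows the file's contents, so swapping the file changes the answer.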
Configuration Management
• When we receive or build new systems they are often completely open; before we
introduce them to our environment we harden them.
• We develop a long list of ports to close, services to disable, accounts to delete, missing
patches to apply and many other things.
• Often it is easier to keep OS images that are completely hardened and use the image for
the new system; we then update the image when new vulnerabilities are found or
patches need to be applied. Often, though, we use a standard image and just apply the
missing patches.
• We do this for any device on our network: servers, workstations, phones, routers,
switches,...
• Before introducing a system into our production environment we run vulnerability scans
against it to ensure we didn't miss anything (rarely done on workstations, should be done
on servers/network equipment).
• Having a standard hardening baseline for each OS ensures all servers are similarly
hardened and there should be no weak links; standardized hardening also
makes troubleshooting much easier.
• Once a system is introduced to our production environment we monitor for changes away
from our security baseline. Most changes are administrators troubleshooting or making
workarounds, which may or may not be allowed, but a change could also be an attacker
punching a path out of our network.
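Monitoring for changes away from the baseline amounts to comparing the current state of a system against the hardened state we recorded. A minimal sketch, where the baseline contents, the port numbers and the `drift` helper are all made-up examples:

```python
# Hypothetical hardened baseline for one server role.
baseline = {
    "open_ports": {22, 443},
    "enabled_services": {"sshd", "nginx"},
}

# Hypothetical state observed on a production server.
current = {
    "open_ports": {22, 443, 4444},
    "enabled_services": {"sshd"},
}


def drift(baseline, current):
    """Report what was added to or removed from each baseline category."""
    report = {}
    for key, expected in baseline.items():
        observed = current.get(key, set())
        added = observed - expected
        removed = expected - observed
        if added or removed:
            report[key] = {"added": added, "removed": removed}
    return report


findings = drift(baseline, current)
```

The report only says that something changed; someone still has to decide whether the new open port is an approved workaround or an attacker's way out.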
Asset Management
• Patch Management:
▪ In order to keep our network secure we need to apply patches on a regular
basis.
▪ Whenever a vulnerability is discovered the software producer should release a
patch to fix it.
▪ Microsoft, for instance, has “Patch Tuesday” (2nd Tuesday of the month).
⬧ They release all their patches for that month.
⬧ If critical vulnerabilities are discovered they push those patches outside
of Patch Tuesday.
⬧ Most organizations give the patches a few weeks to be reviewed and
then implement them in their environment.
▪ We normally remember the OS patches, but can often forget about network
equipment updates, array updates, IoT updates and so on. If they are not
patched we are not fully using defense in depth and we expose ourselves to
risk.
▪ I have seen places where full rack disk arrays were not encrypted and had not
been patched since installation over 10 years prior, the reasoning was poorly
designed data storage and updating would take the disks offline for up to an
hour, which for the organization was unacceptable.
▪ We use software to push our patches to all appropriate systems. This is easier,
it ensures all systems get patched and that they all get the same parts of the patch;
we may exclude some parts that have an adverse effect on our network.
▪ Common tools could be SCCM or WSUS; a tool such as SCCM pushes not only patches, but any
software we want to distribute to our organization.
▪ We do the pushes after hours to not impact the availability during working
hours, normally done Friday or Saturday night somewhere between 01:00 am
and 04:00 am.
▪ Most places avoid midnight as a lot of backups and jobs run at that time, and
end no later than 04:00 am or 05:00 am to ensure systems are online by the
start of business the following day.
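For planning patch windows it can help to compute when Patch Tuesday (the 2nd Tuesday of the month, as noted above) falls. A short sketch using Python's standard `calendar` module; the function name `patch_tuesday` is an assumption for illustration.

```python
import calendar
from datetime import date


def patch_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    tuesdays = [
        day
        for day in calendar.Calendar().itermonthdates(year, month)
        # itermonthdates also yields days from adjacent months, so filter them out.
        if day.month == month and day.weekday() == calendar.TUESDAY
    ]
    return tuesdays[1]


january_2024 = patch_tuesday(2024, 1)
```

The review period and the after-hours push described above would then be scheduled relative to this date.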
• Change Management:
▪ Our formalized process on how we handle changes to our environments.
▪ If done right we will have full documentation, understanding and we
communicate changes to appropriate parties.
▪ The change review board should be composed of both IT and other operational
units from the organization. We may consider impacts on IT, but we are there to
serve the organization; the other units need to understand how the change will impact
them and raise concerns if they have any.
▪ A change is proposed to the change board, which researches it in order to understand
the full impact of the change.
▪ The person or group submitting the change should clearly explain the reasons
for the change, the pros and cons of implementing and not implementing, any
changes to systems and processes they know about, and in general aid and
support the board with as much information as needed.
▪ The board can have senior leadership on it, or it can have a predefined range
of changes it can approve; for anything above that threshold it would
make recommendations, but the changes require senior leadership approval.
▪ There are many different models and process flows for change management,
some are dependent on organization structure, maturity, field of business and
many other factors.
⬧ A generalized flow would look like this:
1. Identifying the change.
2. Proposing the change.
3. Assessing risks, impacts and benefits of implementing and
not implementing.
4. Provisional change approval; if testing goes as we expect,
this is the final approval.
5. Testing the change; if the results are what we expected we proceed, if not
we go back.
6. Scheduling the change.
7. Change notification for impacted parties.
8. Implementing the change.
9. Post implementation reporting of the actual change impact.
▪ We closely monitor and audit changes, remember changes can hold residual risk
which we would then have to mitigate.
▪ Everything in the change control process should be documented and kept, often
auditors want to see that we have implemented proper change controls, and
that we actually follow the paper process we have presented them with.
• 0-day Vulnerabilities:
▪ Vulnerabilities not generally known or discovered, the first time an attack is
seen is considered day 0, hence the name.
▪ Once a vulnerability is discovered, it is now only a short time before
patches or signatures are released for major software.
▪ With millions of lines of code in a lot of software and the 1% errors we talked
about, there will always be new attack surfaces and vulnerabilities to discover.
The only real defense against 0-day exploits is defense in depth and, once a
vulnerability is discovered, patching as soon as the patch is available and we have
tested it in our test environments. Most IDS/IPS and antivirus signatures update
automatically as soon as new signatures are available.
▪ 0-day Vulnerability: A vulnerability that has not been widely discovered and
published.
▪ 0-day Exploit: Code that uses the 0-day vulnerability.
▪ 0-day Attack: The actual attack using the code.
▪ The Stuxnet worm that targeted Iran's nuclear centrifuges used 4 unique 0-day
exploits (previously unheard of).
▪ It was developed over 5+ years and estimated to have cost 100's of millions of
dollars.
▪ Stuxnet has three modules:
⬧ A worm that executes all routines related to the main payload of the
attack.
⬧ A link file that automatically executes the propagated copies of the
worm.
⬧ A rootkit responsible for hiding all malicious files and processes,
preventing detection of Stuxnet.
▪ It is introduced to the target environment by an infected USB flash drive.
▪ The worm then propagates across the network, scanning for Siemens Step7
software on computers controlling a PLC. If either is missing, Stuxnet
becomes dormant inside the computer, though it will still replicate the worm.
▪ If both are present, Stuxnet introduces the infected rootkit onto the PLC and
Step7 software, modifying the code and giving unexpected commands to the
PLC while returning a loop of normal operating values to the
users.
Continuity of Operations
• Fault Tolerance:
▪ To meet our internal SLAs and provide as high availability as possible, we use as
high a degree of redundancy and resiliency as makes sense for that particular
system and data set.
▪ Backups:
⬧ One of the first things that comes to mind when talking about fault
tolerance is backups of our data. While backups are very important, they are,
like log reviews, often an afterthought and treated with a "set it and forget it"
mentality.
⬧ For backups we use Full, Incremental, Differential and Copy backups;
how we use them is determined by what we need from our
backups: how much data we can stand to lose and how fast we want the backup
and restore process to be.
⬧ In our backup solution we make backup policies of what to back up,
what to exclude, how long to keep the data of the Full, Incremental and
Differential backups.
⬧ All these values are assigned depending on what we back up; most
organizations have different backup policies and apply
those to the appropriate data.
⬧ Retention for Full backups could be 3, 6, 12, 36 or 84 months, or indefinite; the
retention is often mandated by our policies and the regulations in our field of
business.
⬧ It is preferable to run backups outside of business hours, but if the
backup solution is a little older it can be required to run around the
clock, in that case we put the smaller and less important backups in the
daytime and the important larger ones after hours.
⬧ We often want to exclude parts of the system we are backing up; this
could be the OS, the trashcan, certain program folders, ... We just
back up what is important and rarely everything.
⬧ If a system is compromised and the issue is a rootkit, the rootkit would
persist on the backup if we did a full mirror restore. By excluding some
of the system data we not only back up a lot less data, we may also
avoid the infection we are trying to remedy.
⬧ Incremental Backup:
▫ Backs up everything that has changed since the last backup.
▫ Clears the archive bits.
▫ Incremental backups are often fast to do; they only back up what
has changed since the last incremental or full.
▫ The downside is that if we do a monthly full backup and daily incrementals, we
have to do a full restore and could have to use up to 30 tapes; this would
take a lot longer than with 1 Full and 1 Differential.
▫ If we need to restore on Thursday:
→ Restore with the full Sunday backup and Monday,
Tuesday, and Wednesday’s incremental tapes.
→ 4 tapes.
⬧ Differential Backup:
▫ Backs up everything that has changed since the last Full backup.
▫ Does not clear the archive bit.
▫ Faster to restore since we just need 2 tapes for a full restore,
the full and the differential.
▫ Backups take longer than incrementals, since we are backing up
everything that has changed since the last full.
▫ Never use both incremental and differential on the same data. Using both within
the same backup solution is fine, since different data has different
needs.
▫ If we need to restore on Thursday:
→ Restore with the Sunday full backup and Wednesday’s
differential tape.
→ 2 tapes.
⬧ Copy Backup:
▫ This is a full backup with one important difference, it does not
clear the archive bit.
▫ Often used before we do system updates, patches and similar
upgrades.
▫ We do not want to mess up the backup cycle, but we want to be
able to revert to a previous good copy if something goes wrong.
⬧ Archive Bit:
▫ On Windows, NTFS keeps an archive bit on each file; it is a flag that
indicates whether the file has changed since the last Full or
Incremental backup.
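How each backup type reads and clears the archive bit can be simulated in a few lines. This is a simplified model, not how any backup product is implemented: `archive_bits` is a hypothetical dictionary of file name to archive bit, and the file names are made up.

```python
def run_backup(archive_bits, kind):
    """Return the files this backup copies and update the archive bits in place."""
    if kind in ("full", "copy"):
        selected = sorted(archive_bits)                       # everything
    elif kind in ("incremental", "differential"):
        selected = sorted(name for name, changed in archive_bits.items() if changed)
    else:
        raise ValueError(f"unknown backup type: {kind}")
    if kind in ("full", "incremental"):                       # only these clear the bit
        for name in selected:
            archive_bits[name] = False
    return selected


def simulate_week(kind):
    """Sunday full backup, then two daily backups of the given kind."""
    bits = {"a.doc": True, "b.xls": True}
    log = [run_backup(bits, "full")]        # Sunday
    bits["a.doc"] = True                    # Monday: a.doc is edited
    log.append(run_backup(bits, kind))
    bits["c.txt"] = True                    # Tuesday: c.txt is created
    log.append(run_backup(bits, kind))
    return log


incremental_week = simulate_week("incremental")
differential_week = simulate_week("differential")
```

In the incremental week, Tuesday's backup holds only `c.txt`, so a restore needs every tape since the full. In the differential week, Tuesday's backup holds both `a.doc` and `c.txt`, so the full plus the latest differential is enough.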
▪ There are many different types of RAID; for the exam I would know the above
terms and how RAID 0, 1 and 5 work.
▪ RAID 0:
⬧ Striping with no mirroring or parity, no fault tolerance, only provides
faster read/write speed, requires at least 2 disks.
▪ RAID 1:
⬧ Mirror set: 2 disks with identical data; every write is written to both disks
simultaneously.
▪ RAID 5:
⬧ Block level striping with distributed parity, requires at least 3 disks.
⬧ Combines speed with redundancy.
• RAID will help with data loss when we have a single disk failure if we use a
fault tolerant RAID type; if more than one disk fails before the first is
replaced and rebuilt, we would need to restore from our tapes.
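RAID 5 parity is computed with XOR, which is what makes it possible to rebuild a single failed disk from the surviving ones. The sketch below shows one stripe on a 3-disk set only; real RAID 5 rotates the parity block across the disks, and the block contents and the helper name `xor_blocks` are illustrative assumptions.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: two data blocks plus the parity block derived from them.
data_disk_1 = b"CISS"
data_disk_2 = b"P-D7"
parity = xor_blocks(data_disk_1, data_disk_2)

# Disk 2 fails: its block is rebuilt from the surviving data block and the parity.
rebuilt_disk_2 = xor_blocks(data_disk_1, parity)
```

If a second disk failed before the rebuild finished, there would be two unknowns and only one parity equation, which is why we would then have to go back to the backups.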
• Most servers have the same disks with the same manufacturer date, so they will
hit their MTBF (Mean time between failures) around the same time.
• Larger data centers often have SLAs with the hard disk/server vendor,
which also include MTTR (Mean time to repair).
• This could mean the vendor has to be onsite with a replacement disk within 4 or 8 hours.
• System Redundancy:
▪ On top of the RAID and the backups we also try to provide system redundancy
as well as redundant parts on the systems.
▪ The most common system failures are from pieces with moving parts; this could
be disks, fans or PSUs (power supply units).
▪ Most servers have redundant power supplies, extra fans and redundant NICs.
▪ The NIC and PSU serve a dual purpose, providing both internal and
external redundancy. If a UPS fails, the server is still operational with just the 1 PSU
getting power.
▪ Redundant disk controllers are also reasonably common, we design and buy the
system to match the redundancy we need for that application.
▪ Often we have spare hardware on hand in the event of a failure; this could
include hard disks, PSUs, fans, memory and NICs.
▪ Many systems are built for some hardware to be hot-swappable, most
commonly HDDs, PSUs and fans.
▪ If the application or system is important we often also have multiple systems in
a cluster.
▪ Multiple servers often with a virtual IP, seen as a single server to users.
▪ Clustering is designed for fault tolerance, often combined with load balancing,
but not innately.
▪ Clustering can be active/active; this is load balancing: with 2 servers, both
servers would actively process traffic.
▪ Active/passive: There is a designated primary active server and a secondary
passive server. They are connected and the passive sends a keep-alive or
heartbeat every 1-3 seconds (are you alive, are you alive,...). As long as the active
server responds the passive does nothing; if the active does not respond for
(normally) 3 keepalives, the passive assumes the primary is dead and takes over
the primary role.
▪ In well designed environments the servers are geographically dispersed.
▪ We can also use other complementary backup strategies to give ourselves more
real time resilience, and faster recovery.
▪ Database Shadowing:
⬧ Exact real time copy of the database or files to another location.
⬧ It can be another disk in the same server, but best practice dictates
another geographical location, often on a different media.
▪ Electronic Vaulting (E-vaulting):
⬧ Using a remote backup service, backups are sent off-site electronically
at a certain interval or when files change.
▪ Remote Journaling:
⬧ Sends transaction log files to a remote location, not the files
themselves. The transactions can be rebuilt from the logs if we lose the
original files.
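Returning to the active/passive cluster described earlier in this section, the failover rule (take over after, normally, 3 unanswered keepalives in a row) can be sketched as below. The function name and the list-of-booleans input are assumptions for illustration; real cluster software handles timing, networking and split-brain protection as well.

```python
def passive_should_take_over(heartbeat_replies, max_missed=3):
    """Decide whether the passive node should assume the primary role.

    heartbeat_replies is a sequence of booleans, one per keepalive sent:
    True if the active node answered, False if it did not.
    """
    missed_in_a_row = 0
    for replied in heartbeat_replies:
        missed_in_a_row = 0 if replied else missed_in_a_row + 1
        if missed_in_a_row >= max_missed:
            return True
    return False


# Occasional dropped replies do not trigger a failover...
flaky_link = [True, False, False, True, False, False]
# ...but three missed keepalives in a row do.
dead_primary = [True, False, False, False]
```

Requiring several consecutive misses keeps a single lost packet from causing an unnecessary failover.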
• Heat (Environmental):
▪ Many data centers are kept too cold; research over the last decades has shown it is
not needed.
▪ Common temperature levels range from 68–77 °F (20–25 °C) - with an allowable
range 59–90 °F (15–32 °C).
▪ Keeping a Data Center too cold wastes money and raises humidity.
• Personnel Shortages (Human/Nature/Environmental):
▪ In our BCP, we also have to ensure that we have redundancy for our personnel
and how we handle cases where we have staff shortages.
▪ If we have 10% of our staff, how impacted is our organization?
▪ This can be caused by natural events (snow, hurricane) but is more commonly
caused by the flu or other viruses.
▪ Pandemics:
⬧ Organizations should identify critical staff by position not by name, and
have it on hand for potential epidemics. <Insert your own COVID-19
work experiences here.>
▪ Strikes:
⬧ A work stoppage caused by the mass refusal of employees to work.
⬧ Usually takes place in response to employee grievances.
⬧ How diminished a workforce can we have and still continue to function?
▪ Travel:
⬧ When our employees travel, we need to ensure both they and our data
are safe.
⬧ That may mean avoiding certain locations and limiting the hardware they
bring and what they can access from the remote location.
⬧ If they need laptops/smartphones, we use encryption, device
monitoring, VPNs, and all other appropriate measures.
• Our DRP (Disaster Recovery Plan) should answer at least three basic questions:
▪ What is the objective and purpose?
▪ Which people or teams will be responsible if a disruption
happens?
▪ What will these people do (our procedures) when the disaster hits?
• Normal plans are a lot more in depth and outline many different scenarios. They have a
clear definition of what a disaster is, who can declare it, who should be informed, how
often we send updates and to whom, who does what,…
• It is easy to just focus on getting back up and running when we are in the middle of a
disaster; staff often forget about communication, preserving the crime scene (if any)
and our written procedures in general.
• We have looked at the first 2 before; for now we will focus on Response and Recovery.
▪ Response: How we react in a disaster, following the procedures.
⬧ How we respond and how quickly we respond is essential in Disaster
Recovery.
⬧ We assess if the incident we were alerted to or discovered is serious and
could be a disaster; the assessment is an iterative process.
▫ The more we learn, and the more of the team gets involved, the
better we can assess the disaster.
⬧ We notify appropriate staff to help with the incident (often a call tree or
automated calls), inform the senior management identified in our plans
and, if indicated by the plan, communicate with any other appropriate
staff.
▪ Recovery: Reestablish basic functionality and get back to full production.
⬧ We act on our assessment using the plan.
⬧ At this point all key stakeholders should be involved; we have a clearer
picture of the disaster and take the appropriate steps to recover. This
could be a DR site, system rebuilds, traffic redirects,…
• BCPs/DRPs are often built using the waterfall project management methodology; we will
cover it in the next domain.
• The BCP team has sub-teams responsible for rescue, recovery and salvage in the event
of a disaster or disruption.
Recovery Strategies
• In our recovery process we have to consider the many factors that can impact us; we
need to look at our options if our suppliers, contractors or the infrastructure are impacted
as well.
• We may be able to get our data center up and running in 12 hours, but if we have no
outside connectivity that may not matter.
• Supply chain:
▪ If an earthquake hits, do our local suppliers function, can we get supplies from
farther away, is the infrastructure intact?
▪ Mobile Site:
⬧ Basically a data center on wheels, often a container or trailer that can
be moved wherever it is needed by truck.
⬧ Has HVAC, fire suppression, physical security, (generator),… everything
you need in a full data center.
⬧ Some are independent with generator and satellite internet, others
need power and internet hookups.
▪ Subscription/Cloud Site:
⬧ We pay someone else to have a minimal or full replica of our production
environment up and running within a certain number of hours (SLA).
⬧ They have fully built systems with our applications and receive backups
of our data. If we are completely down, we contact them and they spin
the systems up and apply the latest backups.
⬧ How fast and how much is determined by our plans and how much we
want to pay for this type of insurance.
⬧ Automated call trees are often a better idea than manual ones;
notifying people of the disaster is one of those things that tends to get
forgotten.
⬧ They are hosted at a remote location, often on SaaS, and key personnel
that are allowed to declare a disaster can activate them.
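To make the idea of an automated call tree concrete, here is a minimal sketch. Everything in it is hypothetical: the role names, the `CALL_TREE` layout and the `reachable` callback stand in for whatever a real notification service would provide. One design choice worth noting is that the system keeps working down a branch even when the person at the top of that branch cannot be reached, which a manual call tree cannot do.

```python
from collections import deque

# Hypothetical call tree: each person is responsible for notifying the
# people listed under them.
CALL_TREE = {
    "incident_manager": ["network_lead", "server_lead"],
    "network_lead": ["network_engineer_1", "network_engineer_2"],
    "server_lead": ["sysadmin_1"],
}


def run_call_tree(tree, root, reachable):
    """Walk the tree breadth-first from root.

    reachable(person) -> bool stands in for the actual call or message.
    Returns (notified, missed). Contacts of an unreachable person are
    still called, so one missed call does not cut off a whole branch.
    """
    notified, missed = [], []
    seen = {root}
    queue = deque([root])
    while queue:
        person = queue.popleft()
        if reachable(person):
            notified.append(person)
        else:
            missed.append(person)
        for contact in tree.get(person, []):
            if contact not in seen:  # guard against duplicate entries
                seen.add(contact)
                queue.append(contact)
    return notified, missed


if __name__ == "__main__":
    notified, missed = run_call_tree(
        CALL_TREE, "incident_manager", lambda p: p != "server_lead"
    )
    print("Notified:", notified)
    print("Missed:", missed)
```

The `missed` list is what the person declaring the disaster needs to see, so that someone can follow up by other means.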
• Off Site Copies and Plans:
▪ We keep both digital and physical copies of all our plans at offsite locations;
assume we can’t access our data or our facilities. Relying on memory is a bad
idea.
▪ We also keep critical business records in the same manner.
• EOC (Emergency Operations Center):
▪ A central temporary command and control facility responsible for our
emergency management, or disaster management functions at a strategic level
during an emergency.
▪ It ensures the continuity of operation of our organization.
▪ We place the EOC in a secure location if the disaster is impacting a larger area.
• MOU/MOA (Memorandum of Understanding/Agreement):
▪ Staff signs a legal document acknowledging they are responsible for a certain
activity.
▪ If the test asks "A critical staff member didn't show, and they were supposed to
be there. What could have fixed that problem?" it would be the MOU/MOA.
While slightly different, they are used interchangeably on the test.
• Executive Succession Planning:
▪ Senior leadership often are the only ones who can declare a disaster.
▪ We need to plan for when they are not available to do so.
▪ Their unavailability may be from the disaster or they may just be somewhere
without phone coverage.
▪ Organizations must ensure that there is always an executive available to make
decisions.
▪ Our plans should clearly outline who should declare a disaster and, if they are not
available, who is next in line; the list should be relatively long.
▪ Organizations often have the entire executive team at remote sessions or
conferences (it is not very smart).
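The succession list described above is, in effect, an ordered list that we walk until we find someone who is available. A minimal sketch, with hypothetical role names and a hypothetical `is_available` callback:

```python
def who_declares(succession, is_available):
    """Return the first available person in the succession list, or None.

    succession is ordered from first choice to last; is_available(person)
    is a stand-in for however availability is actually confirmed.
    """
    for person in succession:
        if is_available(person):
            return person
    return None  # nobody on the list is reachable: the list was too short


if __name__ == "__main__":
    # Hypothetical order of succession for declaring a disaster.
    succession = ["CEO", "COO", "CIO", "CISO", "IT Director"]
    # Example: the top three are at the same off-site conference.
    unreachable = {"CEO", "COO", "CIO"}
    print(who_declares(succession, lambda p: p not in unreachable))
```

A `None` result is exactly the failure the text warns about, which is why the list should be relatively long.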
• Employee Redundancy:
▪ We should have a high degree of skilled employee redundancy, just like we have
on our critical hardware.
▪ It is natural for key employees to move on, find a new job, retire or win the
lottery.
▪ If we do not prepare for it we can cripple our organization.
▪ Can be mitigated with training and job rotation.
• Physical Tests:
▪ Parallel Processing:
⬧ We bring critical components up at a secondary site using backups,
while the same systems are up at the primary site. After the last daily
backup is loaded, we compare the two systems.
▪ Partial Interruption:
⬧ We interrupt a single application and fail it over to our secondary
facilities, often done off hours.
▪ Full Interruption:
⬧ We interrupt all applications and fail them over to our secondary facilities,
always done off hours.
▪ Both partial and full interruption tests are mostly done by fully redundant
organizations; build your plans for your environment.
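The comparison step of a parallel processing test can be sketched as a record-by-record check between the two sites. This is an illustration only: it assumes each system can export its data as a mapping of record ID to a flat dictionary of fields, and the function names are hypothetical.

```python
import hashlib


def record_digest(record: dict) -> str:
    """Hash a flat record in a field-order-independent way."""
    canonical = repr(sorted(record.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_sites(primary: dict, secondary: dict) -> dict:
    """Compare two exports that map record ID -> record.

    Returns the IDs missing from the secondary site, the IDs that exist
    only at the secondary site, and the IDs whose contents differ.
    """
    primary_ids, secondary_ids = set(primary), set(secondary)
    return {
        "missing": sorted(primary_ids - secondary_ids),
        "extra": sorted(secondary_ids - primary_ids),
        "mismatched": sorted(
            rid
            for rid in primary_ids & secondary_ids
            if record_digest(primary[rid]) != record_digest(secondary[rid])
        ),
    }


if __name__ == "__main__":
    primary = {1: {"name": "A", "qty": 5}, 2: {"name": "B", "qty": 7}, 3: {"name": "C", "qty": 1}}
    restored = {1: {"qty": 5, "name": "A"}, 2: {"name": "B", "qty": 8}}
    print(compare_sites(primary, restored))
```

If all three lists are empty, the restore at the secondary site matches production as of the last backup.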
• Testing:
▪ Ensures the plan is accurate, complete and effective; happens before we
implement the plan.
• Drills (Exercises):
▪ Walkthroughs of the plan; the main focus is to train staff and improve employee
response (think fire drills).
• Auditing:
▪ A 3rd party ensures that the plan is being followed, understood and the
measures in the plan are effective.
▪ What happened and didn’t happen is less important than how we improve for
next time.
▪ We do not place blame; the purpose is improving.
▪ How can we as an organization grow and become better next time we have
another incident? While we may have fixed this one vulnerability, there are
potentially hundreds of new ones we know nothing about yet.
▪ The outcome and changes of the Lessons Learned will then feed into our
preparation and improvement of our BCP and DRP.
After a Disruption
• We only use our BCP/DRP's when our other countermeasures have failed.
• This makes the plans even more important. (Remember, 2/3 of businesses with major data
loss close.)
• When we make and maintain the plans there are some common pitfalls we want to
avoid:
▪ Lack of senior leadership support
▪ Lack of involvement from the business units
▪ Lack of critical staff prioritization
▪ Too narrow scope
▪ Inadequate telecommunications and supply chain management
▪ Lack of testing
▪ Lack of training and awareness
▪ Not keeping the BCP/DRP up to date, or no proper version control
BCP/DRP Frameworks
• When building or updating our BCP/DRP, we can get a lot of guidance from these
frameworks. Just like the other standards and frameworks we use, we often tailor
and tweak them to fit the needs of our organization.
• NIST 800-34:
▪ Provides instructions, recommendations, and considerations for federal
information system contingency planning. Contingency planning refers to
interim measures to recover information system services after a disruption.
• ISO 22301:
▪ Societal security, Business continuity management systems: specifies a
management system to manage an organization's business continuity plans,
supported by ISO/IEC 27031.
• ISO/IEC 27031:
▪ Guidelines for information and communication technology (ICT) readiness for
business continuity, which provides more pragmatic advice concerning business
continuity management.
• BCI (Business Continuity Institute):
▪ 6 step process of "Good Practice Guidelines (GPG)", the independent body of
knowledge for Business Continuity.