HA User Guide
ArcSight ESM
Software Version: 7.3
Legal Notices
Copyright Notice
© Copyright 2001-2020 Micro Focus or one of its affiliates
Confidential computer software. Valid license from Micro Focus required for possession, use or copying. The
information contained herein is subject to change without notice.
The only warranties for Micro Focus products and services are set forth in the express warranty statements
accompanying such products and services. Nothing herein should be construed as constituting an
additional warranty. Micro Focus shall not be liable for technical or editorial errors or omissions contained
herein.
No portion of this product's documentation may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, recording, or information storage and retrieval systems,
for any purpose other than the purchaser's internal use, without the express written permission of Micro
Focus.
Notwithstanding anything to the contrary in your license agreement for Micro Focus ArcSight software, you
may reverse engineer and modify certain open source components of the software in accordance with the
license terms for those particular components. See below for the applicable terms.
U.S. Governmental Rights. For purposes of your license to Micro Focus ArcSight software, “commercial
computer software” is defined at FAR 2.101. If acquired by or on behalf of a civilian agency, the U.S.
Government acquires this commercial computer software and/or commercial computer software
documentation and other technical data subject to the terms of the Agreement as specified in 48 C.F.R.
12.212 (Computer Software) and 12.211 (Technical Data) of the Federal Acquisition Regulation (“FAR”) and
its successors. If acquired by or on behalf of any agency within the Department of Defense (“DOD”), the U.S.
Government acquires this commercial computer software and/or commercial computer software
documentation subject to the terms of the Agreement as specified in 48 C.F.R. 227.7202-3 of the DOD FAR
Supplement (“DFARS”) and its successors. This U.S. Government Rights Section 18.11 is in lieu of, and
supersedes, any other FAR, DFARS, or other clause or provision that addresses government rights in
computer software or technical data.
Trademark Notices
Adobe™ is a trademark of Adobe Systems Incorporated.
Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation.
UNIX® is a registered trademark of The Open Group.
Support
Contact Information
Phone: A list of phone numbers is available on the Technical Support Page: https://softwaresupport.softwaregrp.com/support-contact-information
You must complete configuration setup tasks on both the primary and secondary
systems before you install the APHA module. Some tasks are the same regardless of
your installation scenario. For more information about these tasks, see Configuring the
Active-Passive High Availability System - All Scenarios. Other tasks are specific to your
installation scenario. This guide covers the following scenarios:
l Both systems are new and do not have ESM installed.
For more information, see Installing the Active-Passive High Availability Module and
ESM.
l One of the systems has ESM installed.
For more information, see Installing the Active-Passive High Availability Module with
an Existing ESM Installation.
l You are upgrading both ESM and the APHA module.
For more information, see Upgrading ESM and the Active-Passive High Availability
Module.
l You are upgrading both ESM and the APHA module on an appliance.
For more information, see Upgrading ESM and the Active-Passive High Availability
Module on an Appliance.
The configuration steps ensure that both systems are configured properly and that the
configuration is aligned across the two systems.
You install or upgrade ESM and the APHA module on the primary system only. After
installation is complete, the APHA module requires time to synchronize the secondary
system with the primary system. In general, new ESM installations require less time than
upgrading existing ESM systems because of the amount of data to be synchronized.
Important: If you already have ESM and a license for a High Availability solution that
was implemented before the APHA Module 1.0 release, you will need a new
ESM license that supports this product. The new Active-Passive High Availability
module uses software to manage failovers and requires a different hardware
configuration.
When planning your installation or upgrade, keep the following points in mind:
l The APHA module is not supported with SELinux in enforcing mode. If SELinux is
installed, it should be in disabled or permissive mode. Micro Focus recommends
disabled mode.
l If you are planning to install ESM in distributed correlation mode, note that only the
persistor node in the distributed correlation cluster supports the APHA module. Non-
persistor nodes do not support the APHA module.
For an example APHA implementation, see An Example APHA Implementation.
l Use application management software to notify you of any issues with the primary or
secondary systems.
The following CIS benchmark and DISA STIG items affect APHA module installation and operation:

Issue: /tmp should be a separate partition with the noexec mount option. (CIS benchmark section 1.1.2)
Notes: This means that you cannot run a program underneath /tmp, which impacts APHA installation and upgrade. As a workaround, create the directory <tmpdir> as user arcsight and add the following lines to /home/arcsight/.bashrc:
export IATEMPDIR=<tmpdir>
export _JAVA_OPTIONS=-Djava.io.tmpdir=<tmpdir>
export TMP=<tmpdir>

Issue: Ensure the SELinux state is enforcing. (CIS benchmark section 1.6.1.2, DISA STIG V-71989)
Notes: The APHA module is not supported with SELinux in enforcing mode. If SELinux is installed, it should be in disabled or permissive mode. Micro Focus recommends disabled mode.

Issue: Use operating system firewall software. (CIS benchmark section 3.6, DISA STIG V-72273)
Notes: The firewall should not block ports that are used by the APHA module. For more information about firewall considerations, see Understanding Network Requirements.

Issue: Ensure that SSH root login is disabled. (CIS benchmark section 5.2.8)
Notes: APHA uses SSH root logins to synchronize configuration between servers.

Issue: Default user umask. (CIS benchmark section 5.4.4, DISA STIG V-71995)
Notes: CIS specifies that the default user umask should be 027 or more restrictive. The DISA STIG specifies that the default user umask should be 077 or more restrictive. For APHA module installation, the default user umask must be 022 or less restrictive at installation time.
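For example, before starting the installation you can check the current mask and relax it for that shell session only (a sketch; adjust to your site's policy and restore the stricter value afterward):
umask          # display the current mask, for example 027 or 077
umask 022      # temporarily use a less restrictive mask in this shell before running the installer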
l Ensure that the static host name has the same value as the dynamic host name. APHA
software starts very early in the boot process, and if the static and dynamic host
names are different, you might get the wrong host name.
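For example, on systems that use systemd you can compare and align the names with hostnamectl (the host name shown is a placeholder):
hostnamectl status                                  # compare the Static hostname and Transient hostname values
hostnamectl set-hostname esm-primary.example.com    # set the static host name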
l The primary and secondary machines must be close enough that the cable
connection between them requires no intervening routers or switches.
l Ensure that Port 1 (bottom left port) is connected to the network and complete the OS
configuration. Then use the second port at the bottom to connect the crossover link.
l Obtain at least five IP addresses for the two systems:
o Two IP addresses (one per system) are the static host IP addresses used to receive
network communication.
o Two IP addresses (one per system) are used for direct communication between the
two systems in the cluster using crossover cables. These can be IPv4 or IPv6
addresses.
Note: You can use private IP addresses if you are certain that ESM will not
route communication to these addresses.
o One IP address is the service IP address that is assigned to the ESM cluster. You
will specify the host IP addresses and the service IP address when you run the
First Boot Wizard. The service IP address is dynamically reassigned when a
failover occurs and when the primary is brought back online. The service IP
address must be on the same subnet as the host IP addresses. For more
information, see Using the Service IP Address to Identify the Cluster.
Note: Micro Focus recommends that all IP addresses be either IPv4 or IPv6. The
crossover cable IP addresses can differ in protocol from the other IP addresses,
but if they do the cluster management communication between the two hosts will
only use network communication ports (not the crossover cable ports) and will not
be redundant.
l If you are converting from a single system deployment to a cluster deployment using
the APHA module, you can save time by using the original ESM IP address as the
new service IP address, and then giving the original ESM system a new IP address.
This enables you to reuse the ArcSight Manager SSL certificate, rather than having to
generate a new certificate and import it to all connectors and clients.
l Micro Focus recommends using DNS to manage IP addresses and host names for all
components in the cluster. Using DNS enables you to manage the service IP address
in relation to the numerous connectors, consoles, and command centers associated
with a specific ArcSight Manager. Also, using DNS enables you to keep the IP
addresses or host names consistent for the primary and secondary systems in your
cluster. However, you would not want to use DNS to track the IP addresses for the
primary and secondary cables; there is no benefit from using DNS in this case.
l The APHA module uses ports 5404, 5405, and 7789 on each IP address in the cluster
environment. These ports must be dedicated to APHA module communication. Do not
configure other applications to use these ports.
l Both systems use the ports and protocols that are listed below. Ensure that firewalld
and iptables do not block these ports. Set up your network firewalls to allow access
to the connected hosts. A connected host is any other device on the network that the
APHA module can ping to verify that it is still on the network.
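For example, with firewalld you might open the module ports with commands like the following (a sketch; the protocol assignments reflect typical cluster-messaging and disk-replication defaults and should be verified for your deployment):
firewall-cmd --permanent --add-port=5404/udp --add-port=5405/udp   # cluster messaging
firewall-cmd --permanent --add-port=7789/tcp                       # disk replication
firewall-cmd --reload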
Note: Micro Focus recommends that you configure a host name during ESM
installation. Host name changes are easier to manage using DNS and are required
for IPv6 systems.
Note: Do not install ESM until the APHA disk synchronization is complete.
Attempting to install ESM while APHA disk synchronization is in process can cause
the ESM installation to fail.
l Running ESM with the APHA module requires significant disk space. Because of the
synchronization process, the cluster systems must meet minimum storage
requirements. The ESM and archival storage must be on the same shared disk.
For information about hard disk requirements for running ESM, see the ESM
Installation Guide on the ESM documentation page. In addition to the
ESM requirements, the APHA module has additional storage requirements:
Purpose: ESM and APHA module installation binaries
Minimum storage: 3 GB
Note: Ensure that there is enough space for the downloaded installation binaries.

Purpose: Temporary installation files
Minimum storage: 6 GB
Note: Space required to run the installation wizard and the First Boot Wizard.

Purpose: Shared disk partition
Minimum storage: Varies
Note: The APHA module mirrors this partition between the two systems. The volume size depends on the specific implementation needs. ESM requires approximately 10 TB (mid-range) to 12 TB (high performance) of disk space for event storage, plus at least 1 GB (with no upper limit) of event archive space.

Purpose: APHA synchronization metadata
Minimum storage: Varies
Note: The volume size depends on the size of the ESM online storage.
l Set up the following disk partitions on the primary and secondary systems.
Because the installation erases the contents of the shared disk on the secondary
system, ensure that it does not contain data of value.
Ensure that processes on the primary and secondary systems do not use the shared
disk file system.
The shared disk partition does not support bind mounts. The installation wizard flags
them as errors. Use symbolic links instead.
Partitions   Space required   Location   Notes
l If the shared disks have write caches enabled, the write caches must be battery
backed write caches (BBWC). If the shared disks do not have battery backup, there is
a chance that the disks will be out-of-sync if a power failure occurs.
l The network interface cards should be at 1 Gigabit (Gb) or higher and use a cable that
supports this bandwidth.
l The network interface that is used for the interconnection of the two servers should
run at 1 or 10 Gigabits (Gb)/sec. The benefit of the higher bandwidth is seen during
the initial synchronization between the primary and secondary systems. This is useful
when you upgrade ESM on the primary system and there is a significant amount of
data to synchronize.
l If your servers have high-speed disk subsystems, you might see improved
performance with a 10 Gb network interface. The mirrored disk performance is limited
by the slower of either the disk write throughput or the throughput on the crossover
link.
l If your operating system version is not supported, the module might not work properly.
Do not upgrade to a newer version of your operating system until there is a version of
the APHA module that supports it.
l If avahi software is running on your system, the APHA module might not work
properly. Run the following command as user root to detect whether avahi is
present:
systemctl status avahi-daemon
If the response indicates that the avahi-daemon is enabled or running, see the
documentation for your operating system for instructions to stop it and to disable it
upon reboot.
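On systemd-based systems, commands similar to the following typically stop the service and keep it from starting at boot (verify against your operating system documentation):
systemctl stop avahi-daemon
systemctl disable avahi-daemon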
l The file system for the mirrored disk partitions can be EXT4 or XFS. You cannot
change the file system type while installing the APHA module or during an ESM
upgrade. Both systems must use the same file system type.
l Configure both systems to access a yum (for RHEL or CentOS) or yast (for SLES)
repository in order to install dependencies that the APHA module requires. The
repository can be a remote yum repository that the operating system vendor provides,
a repository that you create from the operating system ISO or CD, or a directory
location on the local system. See the vendor-specific documentation for information
about configuring yum or yast and connecting to yum or yast repositories.
l Micro Focus recommends that you use the operating system’s Logical Volume
Management (LVM) tools to manage volumes and partitions on the APHA cluster
systems. These tools make the process of configuring and managing disk space
much simpler than using physical disk management.
An LVM partition must be a multiple of the LVM chunk size. If you use 32 MiB for the
chunk size and need a 33 MiB partition, create a 64 MiB partition (because you would
need two chunks). For an example, see Disk Partition Setup.
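For example, you might check the volume group's extent (chunk) size and round the logical volume size up to a multiple of it (the volume group and volume names are placeholders):
vgdisplay vg00 | grep 'PE Size'     # shows the extent size, for example 32.00 MiB
lvcreate -L 64M -n meta vg00        # a 33 MiB requirement rounds up to two 32 MiB extents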
Note: After installation, this partition is only mounted on the primary system.
Only the primary system can make changes to it.
7. If the mirrored disks are SSD drives, such as Fusion, configure TRIM support on the
primary and secondary systems.
8. Ensure that all file system options are set up as desired on the primary system. The
APHA module mounts the file system on the secondary system exactly as you
mounted it on the primary system.
9. On the primary and secondary systems, create a metadata partition. This is a small
partition on each system that is used for disk-synchronization metadata. The size to
allocate for each partition is calculated in mebibytes:
size (in mebibytes) = (P/32)+1
where P is the size of the shared disk partition in gibibytes. For example, if the
shared disk partition size is 1 TiB (that is, 1,024 GiB), the metadata partition size
would be 33 MiB.
For an example, see Disk Partition Setup. If you increase the size of the shared disk
partition, also increase the size of the metadata partition accordingly. Decreasing
the size of the mounted partition is not supported.
If the metadata partition will be a physical volume (for example, /dev/sda8), create it
now. If the metadata partition will be a logical volume (for example,
/dev/mapper/vg00-meta), then you only need to ensure that enough free disk space
is available in a volume group. The prepareHA.sh script will create the metadata
volume.
10. Ensure that the password for the root user is the same on both systems. You can
change the password after installation.
11. As user root, run the following commands:
cp -r Tools /tmp
cd /tmp/Tools/highavail
cp template.properties highavail.properties
chmod 644 highavail.properties
12. Specify the following information in /tmp/Tools/highavail/highavail.properties:
l service_hostname= host name of ESM in the APHA cluster
13. As user root, run the following command on the primary system:
/tmp/Tools/highavail/prepareHA.sh
a. Confirm the names of the primary and secondary systems.
b. If the metadata partition does not exist and it will be a logical volume, allow the
script to create it.
c. Provide the password for user arcsight.
If there are errors, correct them and run prepareHA.sh again.
14. As user root, run the following command:
scp -r /tmp/Tools <secondary hostname>:/tmp
15. Reboot the primary system.
16. On the secondary system, as user root, run the following command:
/tmp/Tools/highavail/prepareHA.sh
If there are errors, correct them and run prepareHA.sh again.
17. Reboot the secondary system.
When the configuration tasks for the primary and secondary systems are complete,
continue to New Installation: Running the Active-Passive High Availability Module
Installation Wizard.
Note: Run the APHA installation wizard on the primary system only.
Note: Run the First Boot Wizard on the primary system only.
Shared Disk: Mount point of the disk that is shared between the primary and secondary systems. The options provided include all relevant mount points. In the highavail.properties file, this value is identified as shared_disk.
The installation does not support bind mounts and flags them as errors. Use symbolic links instead.
Because the installation completely erases the contents of the shared disk on the secondary system, ensure that it does not contain data of value.
Ensure that no processes on the primary or secondary systems are using this file system; otherwise, the installation will exit with errors.
You cannot change this value on subsequent runs of the First Boot Wizard.

Primary Cable IP: IP address of the interface that is connected to the interconnect cable on the primary system. In the highavail.properties file, this value is identified as primary_cable_ip.

Secondary Cable IP: IP address of the interface that is connected to the interconnect cable on the secondary system. In the highavail.properties file, this value is identified as secondary_cable_ip.

Connected Hosts: These hosts are other machines in the network that the APHA module can ping to verify that it is connected to the network. Enter a space-separated list of host names or IP addresses that the module can ping. Do not specify a host name or IP address for the primary and secondary systems. This field is not required.

Ping Attempts: Number of pings to attempt before reporting that the pings failed. The default is 2 pings.
The wizard generates a summary of the host names and other configuration
parameters. The wizard resolves IP addresses to host names and resolves host
names to IP addresses. The wizard determines whether to use IPv4 or IPv6 for the
service IP address and provides an explanation. If you do not agree with the choice,
you might be able to force the wizard to choose IPv4 or IPv6 by specifying an IPv4
or IPv6 address instead of a host name.
6. Provide the password for user root.
The password enables the APHA configuration script to complete actions that must
be performed as the root user. The password must be the same on the primary and
secondary systems. The wizard does not permanently store the password. You can
change the password after the installation is complete.
After you select to continue, the wizard displays the status of each operation. It might
take approximately one hour to complete.
7. When the installation is complete, check the log files on both servers. For
information about resolving errors, see Installation Issues and Solutions.
After you resolve errors, as user arcsight, run the First Boot Wizard again:
/usr/lib/arcsight/highavail/bin/arcsight firstBootWizard
When the First Boot Wizard is complete, continue to Installing ESM.
Installing ESM
After you complete the First Boot Wizard, you can install ESM.
Note: Before you install ESM, ensure that the APHA module is running.
To install ESM:
1. As user root, create the folder /opt/arcsight and set ownership to user arcsight:
mkdir /opt/arcsight
chown arcsight:arcsight /opt/arcsight
If the mount point for mirroring is /opt or /opt/arcsight, the APHA module mirrors
the change to the secondary system.
2. On the primary system, install ESM. For more information, see the ESM Installation
Guide on the ESM documentation page.
Note: When the ESM Configuration Wizard prompts you for the ArcSight
Manager host name or IP address, specify the cluster service host name or
service IP address and not the host name of a single server.
Because you already installed the APHA module, you do not need to run
prepare_system.sh during ESM installation.
3. During installation, ensure that you include the ArcSight ESM APHA Monitoring
Foundation Package.
The package is required in order to receive up-to-date APHA module status
information.
When the ESM installation is complete, perform post-installation tasks as described in
the ESM Installation Guide on the ESM documentation page, and then continue to New
Installation: Completing Module Post-Installation Tasks.
Purpose: ESM and APHA module installation binaries
Minimum storage: 3 GB
Note: Ensure that there is enough space for the downloaded installation binaries.

Purpose: Temporary installation files
Minimum storage: 6 GB
Note: Space required to run the installation wizard and the First Boot Wizard.

Purpose: Shared disk partition
Minimum storage: Varies
Note: The APHA module mirrors this partition between the two systems. The volume size depends on the specific implementation needs. ESM requires approximately 10 TB (mid-range) to 12 TB (high performance) of disk space for event storage, plus at least 1 GB (with no upper limit) of event archive space.

Purpose: APHA synchronization metadata
Minimum storage: Varies
Note: The volume size depends on the size of the ESM online storage.
l Set up the following disk partitions on the primary and secondary systems.
Because the installation erases the contents of the shared disk on the secondary
system, ensure that it does not contain data of value.
Ensure that processes on the primary and secondary systems do not use the shared
disk file system.
The shared disk partition does not support bind mounts. The installation wizard flags
them as errors. Use symbolic links instead.
Partitions   Space required   Location   Notes
l If the shared disks have write caches enabled, the write caches must be battery
backed write caches (BBWC). If the shared disks do not have battery backup, there is
a chance that the disks will be out-of-sync if a power failure occurs.
l The network interface cards should be at 1 Gigabit (Gb) or higher and use a cable that
supports this bandwidth.
l The network interface that is used for the interconnection of the two servers should
run at 1 or 10 Gigabits (Gb)/sec. The benefit of the higher bandwidth is seen during
the initial synchronization between the primary and secondary systems. This is useful
when you upgrade ESM on the primary system and there is a significant amount of
data to synchronize. For more information, see Planning for the Initial Disk
Synchronization.
l If your servers have high-speed disk subsystems, you might see improved
performance with a 10 Gb network interface. The mirrored disk performance is limited
by the slower of either the disk write throughput or the throughput on the crossover
link.
l If you are adding the APHA module to an existing ESM installation that is earlier than
ESM 7.0, upgrade to ESM 7.0 before you install the module.
l If you plan to convert the system from IPv4 to IPv6, convert after you upgrade to ESM
7.0 and before you install the module.
l The mirrored disk mount point (for example, /opt) must be the same on the primary
and secondary systems. The mounted volume name (for example, /dev/sda5 or
/dev/mapper/vg00-opt) must also be the same on both systems. If the primary system
uses a physical volume for the mirrored disk, then the secondary system must also
use a physical volume.
Note: If ESM was running in distributed correlation mode, you must run sshSetup
again, because running remove_services removes the sshSetup configuration. Run
sshSetup after you install the APHA module and run setupServices. For more information,
see the ESM Administrator's Guide on the ESM documentation page.
4. If you want to re-use the ArcSight Manager SSL certificate rather than generate a
new certificate, complete the following steps:
a. Add a new IP address to the interface that has the current host IP address. The
new IP address will become the new host IP address. The original IP address
will become the service IP address that identifies the cluster.
b. Set up /etc/hosts or DNS to resolve the new host IP address to the new host
name.
c. Configure the host to use the new host name.
Now that you have removed the original IP address from the network interface, you
can re-use it as the cluster service IP Address when you run the First Boot Wizard.
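As an illustration only (the interface name, addresses, and host name are placeholders), the sequence might look like the following:
ip addr add 192.0.2.21/24 dev eth0                            # add the new host IP address to the interface
echo '192.0.2.21  esm-primary.example.com' >> /etc/hosts      # or update DNS instead of /etc/hosts
hostnamectl set-hostname esm-primary.example.com              # configure the host to use the new host name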
Note: If you change the system host name during installation, ensure that the
change persists across reboots. Reboot the system, and then use the hostname
command to verify the system host name.
5. Ensure that both systems are using the correct version of the operating system
timezone package. This is a requirement for ESM. For more information, see the
ESM Installation Guide on the ESM documentation page.
6. To synchronize the time between the primary and secondary systems, configure
them to run the Network Time Protocol (NTP).
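For example, on systems that use chrony for NTP, commands like the following enable time synchronization (package and service names depend on your distribution):
yum install -y chrony            # or use your distribution's package manager
systemctl enable --now chronyd
chronyc tracking                 # verify that the clock is synchronized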
7. Connect the systems with crossover cables and configure the interfaces with the
appropriate IPv4 or IPv6 addresses. Both systems must use the same IP version.
Ping from one system to the other over the configured interfaces to ensure proper
configuration.
8. On the primary and secondary systems, select the partitions to be mirrored:
a. On the primary system, run df /opt/arcsight.
The mount point for /opt/arcsight is displayed in the Mounted on column.
b. On the secondary system, create and mount a volume with the same volume
name, size, and file system type.
If /opt/arcsight is a symbolic link on the primary system, the installation wizard
will create the same symbolic link on the secondary system.
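For example, if the primary reports /dev/mapper/vg00-lv_opt mounted on /opt as an XFS file system, the matching volume on the secondary might be created as follows (the names and size are assumptions borrowed from the example later in this guide):
lvcreate -L 2.2T -n lv_opt vg00        # same volume name and size as the primary
mkfs.xfs /dev/mapper/vg00-lv_opt       # same file system type as the primary
mount /dev/mapper/vg00-lv_opt /opt     # and add the matching entry to /etc/fstab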
9. If the mirrored disks are SSD drives, such as Fusion, configure TRIM support on the
primary and secondary systems.
10. Ensure that all file system options are set up as desired on the primary system. The
APHA module mounts the file system on the secondary system exactly as you
mounted it on the primary system.
11. On the primary and secondary systems, create a metadata partition. This is a small
partition on each system that is used for disk-synchronization metadata. The size to
allocate for each partition is calculated in mebibytes:
size (in mebibytes) = (P/32)+1
where P is the size of the shared disk partition in gibibytes. For example, if the
shared disk partition size is 1 TiB (that is, 1,024 GiB), the metadata partition size
would be 33 MiB.
For an example, see Disk Partition Setup. If you increase the size of the shared disk
partition, also increase the size of the metadata partition accordingly. Decreasing
the size of the mounted partition is not supported.
If the metadata partition will be a physical volume (for example, /dev/sda8), create it
now. If the metadata partition will be a logical volume (for example,
/dev/mapper/vg00-meta), then you only need to ensure that enough free disk space
is available in a volume group. The prepareHA.sh script will create the metadata
volume.
12. Ensure that the password for the root user is the same on both systems. You can
change the password after installation.
13. As user root, run the following commands:
cp -r Tools /tmp
cd /tmp/Tools/highavail
cp template.properties highavail.properties
chmod 644 highavail.properties
14. Specify the following information in /tmp/Tools/highavail/highavail.properties:
l service_hostname = host name of ESM in the APHA cluster
15. As user root, run the following command on the primary system:
/tmp/Tools/highavail/prepareHA.sh
a. Confirm the names of the primary and secondary systems.
b. If the metadata partition does not exist and it will be a logical volume, allow the
script to create it.
c. Provide the password for user arcsight.
If there are errors, correct them and run prepareHA.sh again.
16. As user root, run the following command:
scp -r /tmp/Tools <secondary hostname>:/tmp
17. Reboot the primary system.
18. On the secondary system, as user root, run the following command:
/tmp/Tools/highavail/prepareHA.sh
If there are errors, correct them and run prepareHA.sh again.
19. Reboot the secondary system.
When the configuration tasks for the primary and secondary systems are complete,
continue to Existing ESM Installation: Running the Active-Passive High Availability
Module Installation Wizard.
Note: Run the APHA installation wizard on the primary system only.
For information about running the First Boot Wizard, see Existing ESM Installation:
Running the First Boot Wizard.
Note: Run the First Boot Wizard on the primary system only.
Shared Disk: Mount point of the disk that is shared between the primary and secondary systems. The options provided include all relevant mount points. In the highavail.properties file, this value is identified as shared_disk.
The installation does not support bind mounts and flags them as errors. Use symbolic links instead.
Because the installation completely erases the contents of the shared disk on the secondary system, ensure that it does not contain data of value.
Ensure that no processes on the primary or secondary systems are using this file system; otherwise, the installation will exit with errors.
You cannot change this value on subsequent runs of the First Boot Wizard.

Primary Cable IP: IP address of the interface that is connected to the interconnect cable on the primary system. In the highavail.properties file, this value is identified as primary_cable_ip.

Secondary Cable IP: IP address of the interface that is connected to the interconnect cable on the secondary system. In the highavail.properties file, this value is identified as secondary_cable_ip.

Connected Hosts: These hosts are other machines in the network that the APHA module can ping to verify that it is connected to the network. Enter a space-separated list of host names or IP addresses that the module can ping. Do not specify a host name or IP address for the primary and secondary systems. This field is not required.

Ping Attempts: Number of pings to attempt before reporting that the pings failed. The default is 2 pings.
The wizard generates a summary of the host names and other configuration
parameters. The wizard resolves IP addresses to host names and resolves host
names to IP addresses. The wizard determines whether to use IPv4 or IPv6 for the
service IP address and provides an explanation. If you do not agree with the choice,
you might be able to force the wizard to choose IPv4 or IPv6 by specifying an IPv4
or IPv6 address instead of a host name.
6. Provide the password for user root.
The password enables the APHA configuration script to complete actions that must
be performed as the root user. The password must be the same on the primary and
secondary systems. The wizard does not permanently store the password. You can
change the password after the installation is complete.
After you select to continue, the wizard displays the status of each operation. It might
take approximately one hour to complete.
7. When the installation is complete, check the log files on both servers. For
information about resolving errors, see Installation Issues and Solutions.
After you resolve errors, as user arcsight, run the First Boot Wizard again:
/usr/lib/arcsight/highavail/bin/arcsight firstBootWizard
When the installation is complete, perform post-installation tasks as described in the
ESM Installation Guide on the ESM documentation page, and then continue to Existing
ESM Installation: Completing Module Post-Installation Tasks.
The script automatically detects the APHA module and makes appropriate changes
to the primary and the secondary systems.
2. If the shared disk is a solid state drive (SSD), run the following command:
fstrim <shared disk>
On the primary system, if the drive has a large amount of free disk space, the
command shortens the time required to synchronize the secondary disk.
Note: You can skip steps 3-10 if you changed the original single system
hostname and are now using the original IP as the Service IP for the cluster. You
can also skip steps 3-10 if your ESM installation uses the hostname for the SSL
certificate.
3. As user arcsight, run the following command to stop the ArcSight Manager:
/etc/init.d/arcsight_services stop manager
5. When prompted for the ArcSight Manager host name, and in every field where the
previous host name or IP address is displayed, specify the cluster service host
name or cluster service IP address (specify the same value that you set in the First
Boot Wizard).
6. When prompted, select the self-signed keypair option and enter the required
information to generate the self-signed certificate with the cluster service IP address.
Note: If ESM is configured for FIPS mode, you must complete this step manually
from the command line. For more information, see the ESM Administrator's
Guide on the ESM documentation page.
7. As user arcsight, run the following command to start the ArcSight Manager:
/etc/init.d/arcsight_services start manager
Run this command about once per minute until you receive a notification that the
Manager is available.
9. Start the ArcSight Command Center:
https://<Service Hostname>:8443/
where <Service Hostname> is the host name that is defined for the cluster
If you are running Internet Explorer, host names with underscores do not work, so
use the service IP address.
If you are not using DNS to resolve host names, use the service IP address instead.
10. Change the ArcSight Manager IP address to the cluster service IP address for every
connector and console that connects to this Manager.
11. Update any URLs (for example, bookmarks) to ArcSight Command Center.
12. Import the newly-generated certificate for the ArcSight Manager to all clients,
consoles, and connectors that access the Manager.
You can use keytoolgui to import the certificate. For more information, see the ESM
Administrator’s Guide on the ESM documentation page.
If ESM is configured to use FIPS, use the arcsight keytool utility. For more
information, see the ESM Administrator’s Guide.
13. Ensure that clients can connect to the ArcSight Manager using the service
IP address or service host name, and ensure that peer configuration works as
expected.
The ESM installation is only mounted and visible on the primary system. To run
ESM utilities and commands (for example, /opt/arcsight/manager/bin/arcsight),
do so from the server that is currently the primary system.
14. If you have not already activated the ArcSight ESM APHA Monitoring Foundation
Package, activate it from the ArcSight Console.
For more information about activating standard content, see the ArcSight
Administration and ArcSight System Standard Content Guide on the ESM
documentation page.
15. Ensure that the primary and secondary systems are using the correct version of the
operating system timezone package.
For more information, see the ESM Installation Guide on the ESM documentation
page. If you need to install the timezone package, you must install it on the primary
and secondary systems because it is not installed in the shared directory.
Upgrade ESM and the APHA module on the primary system only. After the upgrade is
complete, the APHA module synchronizes the secondary system with the primary
system.
Before you begin the upgrade, review Configuring the Active-Passive High Availability
System - All Scenarios.
The upgrade.sh script performs upgrade tasks on both the primary and secondary
systems. The primary system is the server on which you run the script. When performing
upgrade tasks on the secondary system, the script uses passwordless SSH.
upgrade.sh performs the following steps:
Note: upgrade.sh takes the secondary system offline during the upgrade
process to ensure that it remains the secondary system and that ESM runs on
the primary system.
5. Rebuilds the Pacemaker configuration based on the information that you specified
during installation.
6. Places the secondary system in online mode.
The state of the disks is stored in DRBD metadata that DRBD uses to determine which
disk is more up-to-date and which parts of the disks are synchronized. Typically, the
server on which you run upgrade.sh is the primary system and the server where you first
run preUpgrade.sh becomes the secondary system. However, if the other server is more
up-to-date than the server on which you run upgrade.sh, DRBD forces the more up-to-
date server to be the primary system.
If you upgrade the operating system, download the APHA support packages for that
operating system and install them.
6. If you upgraded the operating system on software ESM, reboot the secondary
system.
11. If you upgraded the operating system on software ESM, reboot the primary system.
Note: This is not necessary on an appliance. The appliance will automatically
reboot.
12. If you have not already done so, disable SELinux and then reboot the primary and
secondary systems.
13. On the primary system, as user arcsight, run ArcSight-
ActivePassiveHighAvailability-7.3.0.xxxx.x.bin to start the APHA Module
Installation Wizard.
14. On the primary system, as user root, run the following command:
/usr/lib/arcsight/highavail/install/upgrade.sh
The log file for the APHA module upgrade is located
at: /usr/lib/arcsight/highavail/logs/upgrade.log.
15. On the primary system, upgrade to the supported ESM version.
For detailed instructions, see the ESM Upgrade Guide on the ESM documentation
page. Because you have already stopped the ArcSight services, you do not need to
run Tools/stop_services.sh.
IMPORTANT: The APHA module must be running before you begin upgrading
ESM.
After the ESM upgrade is complete, the APHA module synchronizes the primary
system and the secondary system.
16. As user root, start the ArcSight services:
/opt/arcsight/manager/bin/setup_services.sh
17. Ensure that the ArcSight services are running:
/etc/init.d/arcsight_services status
18. If you have not already done so, use the ArcSight Console to activate the ArcSight
ESM APHA Monitoring Foundation Package.
For more information, see the ArcSight Administration and ArcSight System
Standard Content Guide on the ESM documentation page.
When the upgrade is complete, perform post-upgrade tasks as described in the ESM
Upgrade Guide on the ESM documentation page, and then continue to Verifying the
Active-Passive High Availability Module and ESM Upgrade.
Note: For greater flexibility in configuration, Micro Focus recommends using a host
name (rather than an IP Address).
After you confirm that you want to uninstall the APHA module, the script uninstalls it
on both systems.
/usr/lib/arcsight/highavail/install/uninstall.sh
After the module uninstallation is complete, all of the files required to run ESM
remain on both systems. Choose which server you want to convert to a single ESM
installation.
3. If you are not reusing the service IP address, change the IP address. For information
about changing the IP address of an ESM server, see the ESM Installation Guide on
the ESM documentation page.
4. If you are reusing the service IP address, complete the following steps:
a. As user root, run the following command to update the IP address configuration
on the selected server:
ip addr add <service_ip> dev <primary interface>
Where <service_ip> is the IP address and <primary interface> is the interface
on which the IP address of the host name is configured (for example, eth0).
b. Update the ARP cache:
arping -U -I <primary interface> -s <service_ip> <default_gateway_ip>
c. Complete the uninstallation on the secondary system that you removed from the
APHA cluster.
d. If you uninstalled the APHA module in a distributed correlation environment
where the persistor was part of the APHA cluster, run the following command:
/etc/init.d/arcsight_services sshSetup
At this point ESM is running on the server. However, if you reboot this server, the
service IP address will not be available on the primary interface and ESM will
not be accessible.
f. To ensure that the service IP address is available after reboot, modify the
appropriate scripts in /etc/sysconfig/network-scripts/.
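For example, with the traditional network-scripts configuration, the service IP address can be added as a second address in the interface file (file name, interface, and address are placeholders; adjust accordingly if your system uses NetworkManager):
# appended to /etc/sysconfig/network-scripts/ifcfg-eth0
IPADDR1=192.0.2.20
PREFIX1=24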
Server Configuration
Each server in this example cluster meets the recommended hardware requirements
specified in the ESM Installation Guide on the ESM documentation page.
l 2 TiB of RAID 10 storage is provided via 15K RPM disks.
l The network interface runs at 1 Gb/s.
l One 1 Gb interface on each server will be interconnected by a cable.
l Red Hat 7.7 is used with ESM 7.3 software with the APHA Module.
l The company’s internal DNS server is used for name-to-address translation for the
cluster. This is generally the best choice, because there can be thousands of
connectors, and dozens of ESM clients. Changing the ESM hostnames on this many
machines would be difficult.
l Linux configuration files are used to define the hostname, the IP addresses for each
interface, DNS server addresses, and the default route. In a corporate environment, a
more common choice would be to set these values via DHCP. For the purposes of
this example it is convenient to configure these on the machine directly, so what is
going on can be seen. In any case, it is likely that the interconnect ports would be
statically defined, since they connect to each other, and do not have access to a
DHCP server.
l The shared disk partition and the metadata partition are allocated space via the
Logical Volume Manager (LVM). It is strongly recommended that you use LVM tools to
manage disk space, because it will be much easier to increase the disk space later.
DNS Setup
We will assume that the company puts its intranet on Net 10 – in the private IP space.
Many companies would use public IPs for their intranet – this is a company decision.
Here are some example values that we will use:
Type Hostname IP
for boot. The remaining disk space can be put into a single LVM volume group (vg00) for
later allocation to support ESM.
Give the primary and secondary machines the hostnames specified in the previous
section, and configure the IP address of the primary and secondary on the eth0 interface
of the respective servers.
To make the mount persist across reboots, add the following line to /etc/fstab:
/dev/mapper/vg00-tmp /tmp ext4 defaults 1 2
Next, set up a partition for /opt that is as large as possible. However, it is necessary to
save space for the metadata partition required for APHA installation. Assuming that the
disk will be 2.2 TiB (2,306,867 MiB), then the metadata partition must be at least 72 MiB,
where:
size = (2,306,867 MiB/32768) + 1
Assuming the chunk size of the volume group is 32 MiB, allocate 96 MiB.
To create the partition, run the following command:
lvcreate -L 96M -n metadata vg00
Then, as with /tmp, add an entry to /etc/fstab and mount /opt with the command mount
/opt. The fstab entry is as follows:
/dev/mapper/vg00-lv_opt /opt xfs defaults,inode64 1 2
Note that the inode64 option is used in this example, which is a good idea for very large
file systems. If you have special mount options that you want to remain in effect after
the APHA installation, mount your filesystem with those options now.
Note: The APHA module installation program will comment out the mount line for
/opt during installation. Pacemaker will automatically control when the /opt partition
is mounted.
The first three lines come from the original file that was created when the operating
system was installed. Delete any other lines from the original file. The next line, defining
the IP address, is unique to each machine. On the secondary, we will use the IP
Address 192.168.10.3. The remaining lines are the same for all such files – you may
copy them in.
To bring up the connection, run ifup eth1 as root on both the primary and the
secondary. At this point pings to 192.168.10.3 on the primary and pings to 192.168.10.2
on the secondary should succeed.
Parameter Value
Install ESM
ESM is installed as described in the ESM Installation Guide on the ESM documentation
page. The only special step is when you are prompted for Manager Information. One
value will be entered differently than if you are setting up a single ESM system.
Manager host name (or IP): The correct value to enter for Manager host name (or IP) is
esm.internal.<mycompany>.com.
Administrator user name: There is no change to this variable.
Administrator password: There is no change to this variable.
Password confirmation: There is no change to this variable.
This change requires an increase to the size of the metadata volume. The metadata
volume on each server must be at least 177 MiB, using the equation:
size = (5767168 MiB/32768) + 1
Rounding up to the nearest multiple of 32 gives 192 MiB for the new metadata partition
size. The following command is run as root on each server to increase the size of the
metadata partition:
lvresize -L 192M vg00/metadata
Increase the size of the shared disk partition (not the filesystem) on both the primary and
the secondary to its maximum size. Do that with the following command (as root):
lvresize -l +100%FREE vg00/lv_opt
Inform the APHA software that the partition has increased in size by running the
following command as root on the primary:
./arcsight_cluster increaseDisk
Increase the size of the filesystem on the primary. As the command below uses
/dev/drbd1, the filesystem increases will be mirrored on the secondary. xfs_growfs is
used since this is an XFS filesystem. For an ext4 filesystem resize2fs would be used.
Run the following command as root on the primary only:
xfs_growfs /dev/drbd1
After you run this command, the /opt filesystem will be about 5.5 TiB in size.
Finally, go to the ArcSight Command Center, navigate to Administration > Storage and
Archive, to the Storage tab, and configure the Default Storage Group to take advantage
of this additional disk space. For more information, see the ArcSight Command Center
Users Guide on the ESM documentation page.
Command Syntax
The arcsight_cluster command syntax and options are described below. The actions
(except help) have more detailed explanations in the topics that follow.
Description: A tool for managing the APHA Module. Run this as user root.

Syntax:
/usr/lib/arcsight/highavail/bin/arcsight_cluster <action> [options]

Actions:

clusterParameters [--console]
Update the cluster parameters using the Cluster Parameters Wizard. Only run this on the primary. The --console option displays in console mode. GUI mode is the default.

diagnose
Checks the system health. If any problems are found, it corrects them or suggests how the user can correct them. After correcting a problem, run it again to see if there are any other problems.

help (or -h)
Provides command usage and APHA version.

increaseDisk
Increase the size of the shared partition to fill the volume that backs it. Only run this on the primary. There is no option; it increases the size to the maximum possible size.

offline [hostname]
Makes hostname ineligible to be the primary. If hostname is not specified, the secondary is taken offline. Once offline, a server stays in that state, even if it is or becomes operational, until the online action is issued.

online [hostname]
Makes the server [hostname] a candidate to be the primary. If there is already a primary, the other server is brought online as the secondary and specifying [hostname] is optional. If both servers are offline (but ready to be brought online), you must specify the server to bring online. If online is not successful, it will suggest how the user may bring the server online.

status
Print the status of the cluster.

tuneDiskSync
Update the configuration to improve disk sync speed. Do this whenever the speed of the interconnect cable is changed.
clusterParameters
This command option starts the Cluster Parameters Wizard. Whether you run it in
console or GUI mode, it asks you to provide the following parameters:
l connected hosts
l ping attempts
l ping timeout
diagnose
The command arcsight_cluster diagnose runs a set of tests on your cluster, finds
problems, and recommends actions to clear them. The diagnose action deals with the
following problems:
l Checks for communication problems between the nodes.
l Suggests ways to bring nodes that are offline to online mode.
a. Detects if arcsight_cluster offline has been used to take a node offline, and if
so, recommends using arcsight_cluster online.
b. Suggests that you run crm cluster start, if appropriate.
c. Recovers from ifdown/ifup.
l If the disk state is Diskless, it recommends ways to get out of that state.
l Any failures associated with resources are cleared.
If the command returns a message such as "2015-11-30 15:07:10 Reconnect attempt failed.",
this may indicate a split-brain condition. See "Disks on Cluster System Fail to Connect"
for additional steps to evaluate whether that is the case.
increaseDisk
The increaseDisk action provides a way to increase the size of the shared disk. This
cannot be done directly because this partition contains disk-synchronization metadata,
which must be modified as well. Therefore use this command action as part of the
following procedure. You can increase the size of the shared disk without taking the disk
or ESM off line.
To increase the size of the disk:
1. Determine if the metadata volume needs to be increased in size using the following
formula:
The size in mebibytes (MiB, 1,048,576 bytes) can be calculated as
size=(P/32)+1
where P is the size of the shared disk partition in gibibytes. For example, if the
shared disk partition size is 1 TiB, then P=1,024 GiB, and the metadata partition
size would be 33 MiB.
If you ever need to increase the size of the shared disk partition, increase the size of
the metadata partition accordingly. Decreasing the size of the shared disk partition
is not supported.
Use the operating system’s Logical Volume Management (LVM) tools to simplify
changes. An LVM partition must be a multiple of the LVM chunk size. If you use 32
MiB for the chunk size, for example, then to get a 33 MiB partition, you would take a
64 MiB partition, because you would need two chunks.
Make sure to increase the size of the metadata on both the primary and secondary.
They must be the same size. If you are using LVM, the command lvresize provides
a simple way to do online resizing.
2. Increase the size of the backing device on both the primary and the secondary. Do
not increase the size of the file system at this point. This will be done later. The
backing device is listed in the file /etc/drbd.d/opt.res, on either the primary or the
secondary. The line looks like this:
disk /dev/mapper/vg00-lv_opt;
Increase the size so that the backing devices on the primary and secondary have
identical sizes. Again, if you are using LVM, the command lvresize provides a
simple way to do online resizing.
3. On the primary system run:
./arcsight_cluster increaseDisk
It will only allow you to proceed if both disks have been increased by the same
amount and the metadata volumes are big enough to accommodate this larger size.
4. Increase the size of the /dev/drbd1 filesystem on the primary. This filesystem is the
one mounted at /opt or /opt/arcsight. The type of the /dev/drbd1 filesystem is
the same as the type of the backing device. If the filesystem is of type ext4, use the
resize2fs command to change the size. If the filesystem is of type xfs, use the
command xfs_growfs.
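For example (the device follows the convention used elsewhere in this guide):
xfs_growfs /dev/drbd1       # for an XFS file system
resize2fs /dev/drbd1        # for an ext4 file system; run only the command that matches your file system type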
5. Verify that the command succeeded by running df -h /opt on the primary, and
noting that the available disk space has increased.
To take advantage of this increased disk space, you may also need to increase the size
of the ESM Default Storage Group. You can do this from the ArcSight Command Center
(navigate to Administration > Storage and Archive, and select the Storage tab). For more
information, see the ArcSight Command Center Users Guide on the ESM
documentation page.
license
The license action starts a wizard that allows you to update the license file. You can run
the wizard in GUI mode or console mode.
offline
The offline action lets you take any server out of service for the purpose of performing
maintenance on it. Taking the primary offline forces a failover to the secondary. You get
a “Do you want to continue?” prompt in that case.
A server won’t become “offline” automatically unless all communications with it are lost.
Typically, a server is only off line because someone issued the offline action. A server
can be in the “offline” state and be operating normally, for example, after the
maintenance is completed. A server cannot act as secondary while it is off line. This
means that even if it is operating normally, it cannot take over as primary in a failover.
To bring it back on line use the online action.
online
The online command brings the specified server back online, if it is in the offline state. If
that server is already online, no action is taken. Changing a server state to online does
not make it the primary; it is merely eligible to be the primary.
If there is already a primary server online, then [hostname] is optional; the action brings
the server that is not the primary online as the secondary. If both servers are off line, you
must specify [hostname].
If you specify online [hostname] for an offline server that is not fully operational, the
server’s state is changed to online. In that state, it automatically becomes the secondary
when it becomes fully operational.
Sometimes the APHA Module hesitates to start a resource that has recently and
frequently failed. You can clear the memory of all failures with the diagnose action. This
may help to start resources.
status
The status action provides the current status of the cluster. The output varies
depending on whether DRBD 8 or DRBD 9 is in use. The version of DRBD that is in use
depends on the operating system version of the cluster. Examples of DRBD 8 output
and DRBD 9 output are shown below.
OK Network-prod01.acme.com
OK Network-prod02.acme.com
Started Audit-Event-prod01.acme.com
Started Audit-Event-prod02.acme.com
Started ESM
Started Filesystem
Started Ping-prod01.acme.com
Started Ping-prod02.acme.com
Started Service-IP
OK Network-prod01.test.acme.com
OK Network-prod02.test.acme.com
Started Audit-Event-prod01.test.acme.com
Started Audit-Event-prod02.test.acme.com
Started ESM
Started Filesystem
Started Ping-prod01.test.acme.com
Started Ping-prod02.test.acme.com
Started Service-IP
The summary provides the current date and time followed by the overall cluster status.
In this example, FAIL indicates that there are problems with the cluster status: the
secondary disk is out-of-date (primary status/secondary status). FAIL appears if one or
more of the following conditions exist:
l The cluster service is down.
l One of the servers is not online.
l The disk communication state is other than Connected.
l One or more of the pacemaker resources is stopped.
l Network communication to one or more servers failed.
This action (including all options) returns an exit code of zero when the status is OK and
non-zero if there is a failure.
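Because of this, the status action can be used in simple monitoring scripts, for example (a sketch; the notification command and address are placeholders, and the command must run as user root):
if ! /usr/lib/arcsight/highavail/bin/arcsight_cluster status > /dev/null 2>&1; then
    echo "APHA cluster check failed on $(hostname)" | mail -s "APHA alert" admin@example.com
fi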
The following example indicates that the cluster function failed:
Tue Sep 30 14:48:32 PDT 2014 FAIL Disk: Unconfigured
Cluster is stopped. Run "crm cluster start" to restart it.
Disk: Unconfigured
It is possible that even though the server on which you ran this command is reporting
this issue, the other server is running as primary without any problems.
Server Status
The next lines give the status of the servers in the network. Each is either online or
offline:
prod01.test.acme.com: online
prod02.test.acme.com: online Primary
Offline might mean that the administrator placed the server in offline mode or that a
failure caused the server to go offline. Primary indicates that this server is the primary
server.
If the secondary server is offline or its cluster function stopped, the status is as follows:
prod01.test.acme.com: offline
prod02.test.acme.com: online Primary
Disk Status
If the disks are up to date, this section contains only one line. If the disks are
inconsistent, the next line shows a progress bar with the percent synchronized and the
bytes synchronized out of the total:
Disk: SyncSource UpToDate/Inconsistent
[>....................] sync'ed: 0.2% (1047096/1048576)M
finish: 4:08:11 speed: 71,988 (72,092) K/sec
The first line shows the disk connection state. SyncSource means that synchronization is
underway from this server to the other server; SyncTarget means that synchronization is
underway from the other server to this server. The next two lines appear only while the
disk is synchronizing. They contain information about how much space requires
synchronization, how much remains, an estimate of how long the synchronization will
take, and how quickly the synchronization is running.
If the secondary server is offline or its cluster function stopped, the output is as follows:
Disk: WFConnection UpToDate/Outdated
The Disk: line indicates the Communication state. The shared disk can have the
following communication states:
Communication State   Description
SyncSource            Disk synchronization is underway from the local server to the other server (this server is the primary).
SyncTarget            Disk synchronization is underway from the other server to this server (this server is the secondary).
WFConnection          This server is waiting for the other server to connect to it.
The second part of this line provides the disk state of this server, followed by the disk
state of the other server. Common disk states are as follows:
Outdated              The data on the disk is out of date. Synchronization is not occurring.
Inconsistent          The data on the disk is out of date, and a synchronization is occurring to correct this.
Diskless              No data can be accessed on the disk. This state might indicate disk failure.
DUnknown              The disk state of the other server is unknown because there is no communication between the servers.
Consistent            This server's disk state is correct, but until communication is re-established, it will not be known if it is current.
Network Status
OK means that the server can ping at least one of the hosts that are specified as a cluster
parameter. FAIL means that all pings to all hosts on the list failed. When a server is
offline, its network connectivity is reported as FAIL.
Resource Status
The remaining lines report on internal resources that the APHA module is managing. In
parentheses after each item is the string you can use to search the logs for these entries.
l ESM is the ESM instance on the primary server. When the startup process begins, the
status is Started. ESM takes several minutes to complete the startup process and
become accessible. During this interval, ESM is not available, even though the status
is Started. Wait a few minutes and try again. (ESM services)
l Audit-Event-<hostname> is an agent that provides status information that ESM uses
to generate audit events.
l Filesystem refers to the shared disk filesystem that is mounted on the ESM server.
(Filesystem)
l Service-IP is the service IP of the ESM server. (IPAddr2)
l Ping-<hostname> is a program that checks this server’s connectivity to the network
using a ping command. An instance runs on each machine. (ping)
An F after Started means that the resource has a positive failure count. You can reset the
counter by using the diagnose action, which also restarts the resource.
tuneDiskSync
The tuneDiskSync action adjusts the disk synchronization parameters to match the speed
of the interconnect cable. It needs to be run only when the speed of the cable changes,
and running it does not interrupt service. The tuning is performed automatically at
installation. If it is not run after the interconnect cable configuration changes, then
background synchronization performance (synchronization after the systems have been
disconnected) may suffer. In particular, if the speed of the interconnect cable is increased,
the increase does not translate into improved synchronization performance until this
command is run.
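For example, after replacing the interconnect cable with a faster one, run the action once from the directory that contains the arcsight_cluster script:
./arcsight_cluster tuneDiskSync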
Log Output
The APHA Module produces log output in the locations described below.
The APHA Module writes its own log files to /usr/lib/arcsight/highavail/logs. These
logs record user-initiated operations. The APHA Module configures the operating
system to rotate these log files.
This folder contains the following log files:
l upgrade.log contains information about the upgrade process.
l arcsight_cluster.log contains descriptions of arcsight_cluster requests and
responses to the user.
l install-console.log contains console output for installations run on this machine.
l install.log is the installation file for installations run on this machine and contains
more detail than install-console.log.
l secondaryHelper.log contains detailed installation output for installation operations
run on this machine, which were actually initiated when the other machine was the
primary.
Log rotation occurs at most weekly. Logs are rotated when their size exceeds 1Mbyte.
Rotated logs are named <log-name>-YYYYMMDD, for example, install.log-20140501.
The original log plus five rotated logs are kept. The oldest log is removed each time a
new log is created.
Cluster log messages from resources other than DRBD are placed in
/var/log/cluster/corosync.log. Some of these messages, plus the DRBD messages, are
sent to the syslog facility local5. The storage location of that output depends on the
rsyslog configuration (rsyslog.conf). By default, this output goes to /var/log/messages.
In the subtopic "Resource Status" on the previous page, each resource description is
followed by a string you can use to search /var/log/cluster/corosync.log and
/var/log/messages to find messages from each of the resources.
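For example, to find messages from the Service-IP and Filesystem resources, you could search those files for the strings listed in "Resource Status" (assuming the default log locations described above):
grep IPAddr2 /var/log/cluster/corosync.log /var/log/messages
grep Filesystem /var/log/cluster/corosync.log /var/log/messages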
There is a field for the Service hostname on the Parameter Configuration panel.
Finish the First Boot Wizard.
2. Stop the Manager by running (as user arcsight):
/etc/init.d/arcsight_services stop manager
3. While logged in as user arcsight, run the following command from the
/opt/arcsight/manager/bin directory to start the setup program for the Manager:
./arcsight managersetup
information to generate the self-signed certificate with the new service IP address.
If ESM is configured for FIPS mode, you will not get this option. The key-pair must
be generated manually using the runcertutil utility.
4. Start the Manager and all other processes by running (as user arcsight):
/etc/init.d/arcsight_services start
5. As the user arcsight, see if the manager is running yet by running the command
/etc/init.d/arcsight_services status manager
Run this command about once a minute; a sketch of a polling loop that does this follows
this procedure. Go on to the next step when you see the line “manager service is available”.
6. Make sure you can start the ArcSight Command Center by browsing to the following
URL:
https://<hostname>:8443/
Where <hostname> is the new hostname (note that hostnames with underscores do
not work on IE, so use the IP address.)
7. Import the Manager’s newly-generated certificate on all clients (ArcSight Console
and connectors) that access the Manager. Use keytoolgui. For more information
about this tool, see the ESM Administrator’s Guide on the ESM documentation
page. Use runcertutil if you are running ESM using FIPS mode. For more
information about this tool, see the ESM Administrator’s Guide.
8. Test to make sure that:
l The clients can connect to the Manager.
l Peer configuration works as expected. If not, redo the peer configuration.
9. As user arcsight, run the First Boot Wizard:
/usr/lib/arcsight/highavail/bin/arcsight firstBootWizard
In the First Boot Wizard, specify the new hostname or IP address for the new
secondary (System A). When the First Boot Wizard completes, the cluster restarts.
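The polling mentioned in step 5 can also be scripted. This is only a sketch, run as user arcsight; the 60-second interval simply reflects the once-a-minute guidance:
while ! /etc/init.d/arcsight_services status manager | grep -q "manager service is available"; do
    sleep 60
done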
Replacing a Server
This topic describes how to use the First Boot Wizard to replace a server (for example, if
it has hardware issues).
To replace a server:
1. Power down the server to be replaced.
The remaining server becomes the primary server.
2. Prepare the new server as described in Existing ESM Installation: Configuring the
Active-Passive High Availability System.
The new server can have a different IP address and host name than the one it is
replacing.
3. As user root, stop ESM services on the primary system:
/opt/arcsight/manager/bin/remove_services.sh
4. As user arcsight, run the First Boot Wizard on the primary system and specify the
host name or IP address for the new secondary system if it is different from the
original.
5. If you are replacing a server in a distributed correlation environment where the
persistor is part of the APHA cluster, run the following command:
/etc/init.d/arcsight_services sshSetup
At this point, ESM should start on the primary system and the new server will become
the secondary system. The synchronization process between the primary system and
the new secondary system might take some time. For more information, see Planning for
the Initial Disk Synchronization.
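To follow the synchronization progress, you can run the status action periodically on the primary system and wait until both disk states report UpToDate. This is only a sketch, run from the directory that contains arcsight_cluster; the 30-second refresh interval is arbitrary:
watch -n 30 ./arcsight_cluster status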
Where:
l <file system type> must be ext4 or xfs, and cannot be changed.
l <new mount options> are the new options you want.
l <shared disk> is where the shared disk is mounted, which cannot be changed
(typically /opt or /opt/arcsight).
l /dev/drbd1 is the name of the mirrored volume.
Then run the following command as user root on the primary. This command makes
the changes permanent across failovers:
./arcsight_cluster tuneDiskSync
Message: Fatal error on <hostname>. See <log file>.
Resolution: An unexpected error caused SSH to fail to <hostname>. For more information, check the specified log file.

Message: Timeout on SSH to <hostname>. SSH access to <hostname> failed to connect quickly.
Resolution: Resolve the SSH communication problem.

Message: Incorrect root password for <hostname> - please enter correct one.
Resolution: The password that you specified is not correct. Provide the correct password.

Message: Failed to set up SSH access. See <log file> for details.
Resolution: SSH access was not successful. For more information, check the specified log file.

Message: No arcsight user on secondary. Please create one identical to that on primary.
Resolution: On the secondary, create an arcsight user that is identical to the arcsight user on the primary.

Message: arcsight users on primary and secondary must be set up identically.
Resolution: The user or group IDs of the arcsight users differ on the primary and the secondary. Ensure that they are identical.

Message: arcsight users on primary and secondary must have the same home directory.
Resolution: Ensure that the arcsight users on the primary and the secondary have the same home directory.

Message: Speed of primary end of crossover cable is <primaryCableSpeed>M - must be at least 1000M.
Resolution: The primary interface for the interconnect is slower than Gigabit Ethernet. Use a faster interface.

Message: Speed of secondary end of crossover cable is <secondaryCableSpeed>M - must be at least 1000M.
Resolution: The secondary interface for the interconnect is slower than Gigabit Ethernet. Use a faster interface.

Message: Primary Cable IP <primaryCableCIDR> and Secondary Cable IP <secondary_cable_ip> must be in the same subnet.
Resolution: Ensure that the primary cable IP address and the secondary cable IP address are in the same subnet.

Message: No interface found for <primary_cable_ip> on Primary.
Resolution: The primary cable IP address does not correspond to an interface. This was probably a data entry error in the First Boot Wizard.

Message: No interface found for <secondary_cable_ip> on Secondary.
Resolution: The secondary cable IP address does not correspond to an interface. This was probably a list selection error in the First Boot Wizard.

Message: Unmount of <shared_disk> failed. Fix the problem, and re-run this script.
Resolution: Correct the problem, and then run the First Boot Wizard again.

Message: Permanently unmount the following mounts on <shared_disk>, and then retry installation: <mount name>
Resolution: The listed mounts mount on top of /opt or /opt/arcsight. This is not supported. Unmount them and then remove them from /etc/fstab.

Message: <metadata_vol> should not be mounted.
Resolution: The metadata volume is mounted. Unmount the volume. The APHA module will probably also generate the "<metadata_vol> appears to be in use" error. Follow the instructions for that error.
Message: <metadata_vol> appears to be in use. See the following output from blkid -o export <metadata_vol>:
--- blkid output here ---
If this volume is not in use, run dd if=/dev/zero of=<metadata_vol> as user root on hostname <hostname> to clear this volume and then rerun the First Boot Wizard.
Resolution: The metadata volume is in use. Ensure that this is not the case, run the command specified in the message, and then run the First Boot Wizard again.

Message: Disk status must be Connected to reconfigure cluster.
Resolution: The APHA module is already installed on both machines, so this call to the First Boot Wizard is to reconfigure the installation. This can only be done if the disk status is Connected (normal). Run ./arcsight_cluster diagnose and then run the First Boot Wizard again.

Message: Please mount <shared_disk partition>, and re-run installation.
Resolution: Mount the shared disk and then run the installation again.

Message: Size of metadata volume <metadata_vol> is less than required minimum of <megabytes>M.
Resolution: The metadata volume is too small to support the shared disk. Increase the size of the metadata volume.

Message: The size of <volume> on the secondary is <megabytes>M. It must be the same as the primary - <megabytes>M.
Resolution: This could refer either to the shared disk volume or to the metadata volume. The size of each must be the same on each server (rounded to the nearest Mbyte). Change the sizes to be identical.

Message: <volume> is not a valid disk volume.
Resolution: Either the shared disk or the metadata volume is not really a volume. Check whether there is a typographical error in the name you specified.

Message: Found <megabytes>M disk space used on <shared_disk>. The installation will not proceed with these files in place. If these files are not important, run "rm -rf <shared_disk>/*" as root on <hostname> and re-run the First Boot Wizard.
Resolution: The installation found more than 10 MB of files on <shared_disk> on the secondary. The installation is terminated. Remove the files, and then run the First Boot Wizard again.

Message: <shared disk volume> mounted on <shared_disk> on the primary and on <secondary_disk> on the secondary. It must be mounted on the same mount point on both machines.
Resolution: Ensure that the volume of the shared disk is mounted on the same mount point on both machines.

Message: Cannot do a Reconfiguration when disks are in <status> status. Please correct the disk status before doing reconfiguration.
Resolution: The <status> value in the message is either StandAlone or WFConnection. The reconfiguration will not work unless disk mirroring is functioning. You can usually use the arcsight_cluster script, ./arcsight_cluster diagnose, to fix this problem.
Message: DRBD Connection State is 'WFConnection' - should be Connected.
Resolution: This message indicates that the shared disk software on the primary and secondary systems cannot communicate. Typically, this is because a firewall is running. To resolve the issue:
1. Ensure that firewall software is not blocking TCP port 7789 on the interconnect cable.
2. As user root, complete the following steps on the primary and secondary systems:
a. Run drbdadm down opt.
b. Edit /etc/fstab to uncomment the mount statement for the shared disk (typically /opt or /opt/arcsight).
c. Run mount -a.
3. As user arcsight, run the First Boot Wizard on the primary system:
/usr/lib/arcsight/highavail/bin/arcsight firstBootWizard

Message: No interface found for <primary_ip> on Primary.
Resolution: The primary IP address or host name must be the first IP address on an interface. Configure the primary host name to correspond to an interface.

Message: No interface found for <secondary_ip> on Secondary.
Resolution: The secondary IP address or host name must be the first IP address on an interface. Configure the secondary host name to correspond to an interface.

Message: unsupported kernel version <version>
Resolution: The kernel version on this server does not correspond to an operating system that the APHA module supports. Upgrade the operating system to a supported version.

Message: ERROR installing RPMs - Please check the log file <logfile> on <hostname> for details about the error.
Resolution: RPMs were not installed. For more information, check the specified log file.

Message: Primary IP <primary IP> and Secondary IP <secondary IP> must be in the same subnet.
Resolution: Change the host IP addresses so that they are in the same subnet.

Message: <hostname> - the hostname of this host does not correspond to the hostname given for either the Primary or the Secondary.
Resolution: Correct the host name.

Message: <host> does not resolve to <IP>.
Resolution: Correct the DNS or /etc/hosts so that <host> resolves to the IP address on the server.

Message: OS version on primary and secondary are different.
Resolution: Ensure that the primary and the secondary are on the same operating system version.

Message: Could not send and return test string using ssh. Expected "test", saw "$returnedString"
Resolution: There is a problem with the SSH login. Ensure that the root user can SSH between systems in both directions (i.e., from System A to System B and from System B to System A).

Message: remove added message of the day or login string from root logins. Expected "test" saw <returnedString>
Resolution: A message of the day string has been detected. This might cause problems with SSH communication. Disable the SSH banner by creating an empty hushlogin file in the root user's home directory: # touch /root/.hushlogin

Message: Cluster did not come up after installation. See the status output above this message.
Resolution: This happens rarely. Check the install.log file for details. This message might have been generated because of a temporary condition, and within a few minutes the system will work as expected. If the problem persists, contact Technical Support.

Message: Host IP addresses are all <IPv4/IPv6> and cable IP addresses are all <IPv6/IPv4>. Change both to either IPv4 or IPv6 to support redundant pacemaker communication. Otherwise, pacemaker communication will not be redundant.
Resolution: This is a warning only. You can improve the system's redundancy by using exclusively IPv4 or IPv6 addresses. If you do so, the system will be more resistant to communications failures.

Message: SELinux on <primary/secondary> is Enforcing - APHA does not support SELinux. Please disable it.
Resolution: The APHA module does not support SELinux. For information about disabling it, see the documentation for your operating system.
Cluster Upgrade Issues
Message: Cluster should not be running during upgrade. Run "crm cluster stop" as root to stop cluster.
Resolution: The system should not be running during the upgrade process. As user root, run crm cluster stop to stop the cluster.
General Problems
Your first resort for troubleshooting cluster problems should be the command:
./arcsight_cluster diagnose
This command clears some common problems automatically and provides simple
solutions for others.
Audit Events
Audit events are events generated within the Manager to mark a wide variety of routine
actions that can occur manually or automatically, such as adding an event to a case or
when synchronization of the two systems begins. Audit events have many applications,
which can include notifications, task validation, compliance tracking, automated
housekeeping, and system administration.
This topic lists the APHA audit events you can use in rules, filters, and other analytical
or administrative resources. Observe the way these events are used in the standard
system-related content for examples of how to apply them.
From the table below, use the Device Event Class (DEC) ID string in rules and filters.
The Audit Event Description reflects the event name you see in active channel grids.
highavailability:100
This event occurs when there is a failover causing the secondary system to take over
and become the primary machine. It also occurs every time ESM starts up, with or
without a failover.
Severity: 3
Device event category: /Monitor/Manager/HighAvailability/Primary/Up
highavailability:200
This is a system-failure event that occurs if the secondary system becomes unavailable
and cannot assume the role of the primary system. This event is generated every five
minutes until the secondary system is restored. The event includes a reason field that
provides more detailed information. There are numerous possible causes:
l Failure of either network interface card (NIC)
l Cross-over cable failure or disconnect
l Secondary system failure or shutdown
highavailability:300
This event occurs when the Distributed Replicated Block Device (DRBD) storage
system begins the process of synchronizing the primary and secondary hard drives and
continues every five minutes (by default) until the synchronization is complete. Each
event includes the amount of data between the two systems that has been synchronized
as a percentage until it reaches 100 percent. You can change the interval using the
highavailability.notification.interval property as described in “Configurable
Properties” on page 1.
Severity: 4
Device event category: /Monitor/Manager/HighAvailability/Sync/InProgress
highavailability:500
The APHA system is restored. This event occurs when the secondary system changes
from a failed status (highavailability:200 or 300) to OK. It may take 30 seconds for this
event to be generated after the secondary system and high-availability service are restored.
Severity: 3
Device event category: /Monitor/Manager/HighAvailability/Status/OK
Failover Triggers
The following situations can trigger a failover:
l You place the primary in offline mode using the arcsight_cluster command.
l The primary operating system goes down. In the case of a routine system restart, the
machine doing the restart might continue to be the primary. This is true when the system
starts again before the failover has had time to trigger.
l The hard disk on the primary system fails.
l The primary system loses its network connection.
The following situations do not trigger a failover:
l You manually stop the ArcSight Manager or any of its services. For example,
changing a property in the server.properties file and starting the Manager again
does not trigger a failover.
l The network switch fails, causing a communications failure to both primary and
secondary systems. Users will immediately detect that the ArcSight Console or
ArcSight Command Center has lost communication with the Manager. The primary
continues to run and connectors cache events until communications are restored, at
which time the primary ESM continues as usual.
l The primary system runs out of disk space and the secondary system also runs out of
space because of mirroring.
To check whether this is a split brain condition, run the following command as the root
user:
grep Split-Brain /var/log/messages
If the 'Split-Brain' keyword appears in recent messages, this confirms that the split brain
condition has occurred. You must choose which machine has the most up-to-date data,
called System A in the following procedure. The machine with the older data is called
System B in the following procedure.
Perform the following steps to correct the split brain condition. When these steps are
complete, data from System A will be synced to System B.
1. On System B, as the root user run crm cluster stop. It may take up to 10 minutes
for ESM to stop.
2. On System B, make sure that the shared disk (e.g. /opt) is unmounted before you
perform the next steps.
3. On System B, run the following commands as the root user:
drbdadm up opt
drbdadm disconnect opt
drbdadm secondary opt
drbdadm connect --discard-my-data opt
4. On System B, as the root user, run crm cluster start.
5. On System A (the machine with up-to-date data), run the following command: