As we move to the next slide,
we can clearly see that Apache Ambari and ZooKeeper
are the main technologies of the management,
coordination and scheduling compartment. Through this
presentation, we’re going to understand why.
Before getting into the detailed explanation,
let’s have an overview of the Ambari user interface.
As we previously saw in class, when you log into the user interface,
the first thing that you will see is the Ambari dashboard and all
the services that are deployed.
For instance, this cluster has HDFS, YARN and MapReduce.
On the right side you see reports related to the cluster,
like the HDFS disk usage and the ResourceManager heap.
If it’s in green, the component is in a good state.
Apache Ambari is one of the components in the Hadoop ecosystem that is used for
Ambari provides a user interface which is very easy to use
It also has an intuitive, web browser based user interface built on top of REST APIs.
The main benefit of these REST APIs is that they allow application developers and system integrators
to easily integrate Hadoop provisioning
Let’s try to understand what these functionalities mean.
For the provisioning of a Hadoop cluster, Ambari provides a wizard
Ambari also helps in the
For managing the
Ambari provides
When it comes to Hadoop monitoring,
Ambari provides a dashboard that shows information
like which nodes are down and which ones are running without errors.
All these calculations are done via a component called the Ambari Metrics System (AMS),
which collects the required monitoring metrics from the different hosts.
There’s also an alerting mechanism in place called the Ambari Alert Framework,
which helps create alerts for critical health issues.
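To make the alerting idea concrete, here is a minimal sketch in Python of how a threshold-based alert could be evaluated. It is a toy model and not the actual Ambari Alert Framework code: the function name, the metric and the threshold values are assumptions made up for illustration. Only the four state names (OK, WARNING, CRITICAL, UNKNOWN) mirror the states that Ambari alerts report.

```python
def evaluate_alert(value, warning, critical):
    """Map a metric value to an alert state using two thresholds.

    Hypothetical helper: the thresholds are illustrative, not Ambari defaults.
    """
    if value is None:
        # No data was collected for this metric, so the state cannot be decided.
        return "UNKNOWN"
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"


# Example: disk usage in percent, with assumed thresholds of 80 and 90.
for usage in (45.0, 85.0, 95.0, None):
    print(usage, evaluate_alert(usage, warning=80.0, critical=90.0))
```

In this sketch, only the CRITICAL state would be the one that calls for immediate action from the admin.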
Now let’s see the architecture of Ambari
Apache Ambari consists mainly of the Ambari server.
It is considered to be the entry point for all the admin-related activities on the master server,
like software updates, for example.
It also has an authorization provider like LDAP.
Then we have the agent interface, which is the AMS.
The next component is the Ambari agent.
A given host will have an Ambari agent,
which consists of the metrics monitor
and the Hadoop sink running.
The metrics monitor collects and sends system-level metrics,
while the Hadoop sink collects and sends the Hadoop-level metrics
from the host to the Ambari Metrics System.
Similarly, all the hosts in the same cluster will do the same.
All these metrics flowing from the different hosts are stored
and aggregated by AMS.
The data is either stored in the local file system,
called the embedded mode,
or in an external HDFS, called the distributed mode.
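The flow described above, where each host sends metrics and one central component stores and aggregates them, can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class names, host names and metric names are hypothetical, the aggregation is a simple average, and none of this is the real AMS implementation.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Metric:
    host: str     # host that produced the metric
    name: str     # metric name (hypothetical names below)
    value: float


class ToyCollector:
    """Stands in for the central collector: stores metrics, then aggregates."""

    def __init__(self):
        self._values = defaultdict(list)

    def receive(self, metric):
        # Store every incoming value under its metric name.
        self._values[metric.name].append(metric.value)

    def aggregate(self):
        # Cluster-wide view: average of each metric over all reporting hosts.
        return {name: mean(values) for name, values in self._values.items()}


collector = ToyCollector()
# System-level metrics, as a metrics monitor would send them.
collector.receive(Metric("host1", "cpu_user", 20.0))
collector.receive(Metric("host2", "cpu_user", 40.0))
# Hadoop-level metrics, as a Hadoop sink would send them.
collector.receive(Metric("host1", "dfs_capacity_used_gb", 100.0))
collector.receive(Metric("host2", "dfs_capacity_used_gb", 300.0))

cluster_view = collector.aggregate()
print(cluster_view)
```

Here everything is kept in memory. In the sketch's terms, the embedded and distributed modes would only change where the stored values end up, not how the hosts report them.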
Based on the information collected on critical health issues and errors,
the Ambari Alert Framework alerts the admin to take action.
Ambari supports multiple relational database management systems
to keep track of the entire Hadoop infrastructure.
The REST API acts as the connecting link between the Ambari core and the user
interfaces.
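As a small illustration of how a developer could talk to that REST API, here is a sketch in Python that builds, without sending, a request listing the services of a cluster. The host name, cluster name and credentials are made up for the example, and port 8080 with HTTP basic authentication is assumed; a real deployment may differ.

```python
import base64
import urllib.request


def build_services_request(host, cluster, user, password, port=8080):
    """Build (but do not send) a GET request for a cluster's services."""
    url = f"http://{host}:{port}/api/v1/clusters/{cluster}/services"
    # HTTP basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical host, cluster name and credentials, for illustration only.
req = build_services_request("ambari.example.com", "demo_cluster", "admin", "admin")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually send it. That step is left out here so the sketch runs without a live Ambari server.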
All the metrics and alerts are sent to the last major component, which is the Ambari Web UI.
This application is protected by authentication. Through this interface the user can
manage clusters, integrate their own custom apps and deploy their services.