186 | Big Data Simplified
7.6.3 Apache Solr—Basic Commands
Starting Solr
hduser@sayan:~$ cd Solr/
hduser@sayan:~/Solr$ cd bin/
hduser@sayan:~/Solr/bin$ ./solr start
Waiting up to 30 seconds to see Solr running on port 8983 []
Started Solr server on port 8983 (pid = 6035). Happy searching!
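A wrapper script may need the port and process ID that the start message reports, for example to check later that the same process is still running. The following is a minimal sketch, assuming the message has the shape shown above; the function names `solr_port_from_message` and `solr_pid_from_message` are hypothetical helpers, not part of Solr.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: extract the port and PID from a Solr start message.
# Assumes the message format shown in the transcript above.

solr_port_from_message() {
  # Print the digits that follow "on port ".
  printf '%s\n' "$1" | sed -n 's/.*on port \([0-9][0-9]*\).*/\1/p'
}

solr_pid_from_message() {
  # Print the digits inside "(pid = N)"; spaces around "=" are optional.
  printf '%s\n' "$1" | sed -n 's/.*(pid *= *\([0-9][0-9]*\)).*/\1/p'
}

start_message='Started Solr server on port 8983 (pid = 6035). Happy searching!'
echo "port=$(solr_port_from_message "$start_message")"   # prints port=8983
echo "pid=$(solr_pid_from_message "$start_message")"     # prints pid=6035
```

Both helpers print nothing when the message does not match, so a caller can treat an empty result as "Solr did not report a successful start".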
Stopping Solr
hduser@sayan:~/Solr/bin$ ./solr stop
Sending stop command to Solr running on port 8983 ... waiting 5 seconds to allow Jetty process 6035 to stop gracefully.
Solr – help command
hduser@sayan:~/Solr/bin$ ./solr -help
7.7 ZOOKEEPER
Apache ZooKeeper provides operational services for a Hadoop cluster. It is a distributed
coordination service for managing a large set of hosts. Coordinating and managing a service in
a distributed environment is a complicated process, and ZooKeeper addresses this with a simple
architecture and API, so developers can focus on core application logic without worrying about
the distributed nature of the application.
ZooKeeper is especially fast for workloads in which reads are more common than writes; the
ideal read/write ratio is about 10:1. ZooKeeper is replicated over a set of hosts, and the
servers are aware of each other. As long as a majority of the servers (a quorum) is available,
the ZooKeeper service remains available, so there is no single point of failure. Hadoop NoSQL
databases such as HBase use ZooKeeper to manage internal services.
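The availability condition above can be made concrete with a little arithmetic. ZooKeeper's default quorum is a strict majority of the ensemble, so an ensemble of N servers stays available while at least floor(N/2) + 1 of them are up. The sketch below works this out for a few ensemble sizes; the function names are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Illustrative arithmetic only: majority quorum for an ensemble of N servers.

quorum_size() {
  # Smallest strict majority of N servers: floor(N/2) + 1.
  local n=$1
  echo $(( n / 2 + 1 ))
}

tolerated_failures() {
  # Servers that can fail while a majority still remains.
  local n=$1
  echo $(( n - (n / 2 + 1) ))
}

for n in 3 4 5; do
  echo "ensemble=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
# prints:
# ensemble=3 quorum=2 tolerated_failures=1
# ensemble=4 quorum=3 tolerated_failures=1
# ensemble=5 quorum=3 tolerated_failures=2
```

A four-server ensemble tolerates no more failures than a three-server one, which is why ensembles are usually sized with an odd number of servers.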
7.8 APACHE NIFI
Apache NiFi is an open-source project that automates the flow of data between systems, a task
known as ‘data logistics’. It is based on the flow-based programming model and provides a
web-based user interface for managing data flows in real time. The project was created by the
United States National Security Agency (NSA) and was originally named NiagaraFiles. In 2014,
the NSA released it as open-source software.
7.8.1 What Apache NiFi Does
Apache NiFi is an integrated data logistics platform for automating the movement of data between
disparate systems. It is data source agnostic and supports sources of different formats, schemas,
protocols, speeds and sizes. Some of the common formats are geolocation devices, click streams,