Short-answer Type Questions (5 Marks Questions)

152 | Big Data Simplied

a. False

b. Error

c. True

d. None of the above

5. What is immutability in Spark?

a. Once a value is created and assigned, it cannot be changed; this property is called immutability.

b. Spark is immutable by default; it does not allow updates and modifications.

c. The data collection is not immutable, but the data value is immutable.

d. All the above
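
As a study aid for this question, the idea of immutability can be sketched in plain Python. This is only a toy model under stated assumptions: `ToyDataset` and its methods are hypothetical names invented for illustration and are not part of Spark's API. The sketch shows a transformation producing a new collection while the original stays unchanged.

```python
from typing import Callable, Iterable, List


class ToyDataset:
    """Hypothetical stand-in for an immutable collection (not Spark's API)."""

    def __init__(self, items: Iterable):
        # Store the items in a tuple, which cannot be modified in place.
        self._items = tuple(items)

    def map(self, fn: Callable) -> "ToyDataset":
        # A transformation never edits this dataset; it builds a new one.
        return ToyDataset(fn(x) for x in self._items)

    def collect(self) -> List:
        # Return a copy of the contents as a list.
        return list(self._items)


numbers = ToyDataset([1, 2, 3])
doubled = numbers.map(lambda x: x * 2)

print(numbers.collect())  # [1, 2, 3]  (original is unchanged)
print(doubled.collect())  # [2, 4, 6]  (result lives in a new dataset)
```

The design choice to illustrate is that "updating" data means deriving a new dataset rather than modifying the existing one.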

6. How does Spark store data?

a. Spark is a processing engine; it has no storage engine.

b. It can retrieve data from any storage engine, such as HDFS, S3 and other data sources.

c. Not applicable

d. Only a is true.

e. Both a and b are true.

7. What is the role of ZooKeeper in Kafka?

a. Kafka uses ZooKeeper to store the offsets of messages consumed for a specific topic and partition by a specific consumer group.

b. To maintain the Kafka cluster.

c. ZooKeeper establishes the channel between producer and consumer.

d. All the above

8. Is it possible to use Kafka without ZooKeeper?

a. Yes, it is possible.

b. It is possible if Kafka runs locally.

c. No, it is not possible to bypass ZooKeeper and connect directly to the Kafka server. If, for some reason, ZooKeeper is down, you cannot service any client request.

d. Not applicable

9. What is RDD lineage?

a. Spark does not support data replication in memory; thus, if any data is lost, it is rebuilt using RDD lineage.

b. RDD lineage is a process that reconstructs lost data partitions. The best part is that an RDD always remembers how to build itself from other datasets.

c. Only a is true

d. Both a and b are true.
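
As a study aid for this question, the notion of rebuilding lost partitions from lineage can be sketched in plain Python. This is a toy model under stated assumptions, not Spark's implementation: `ToyRDD`, `partition` and `lose` are hypothetical names invented for illustration. Each derived dataset remembers its parent and the function applied to it, so a lost partition can be recomputed instead of being restored from a replica.

```python
class ToyRDD:
    """Hypothetical toy model of lineage (not Spark's API)."""

    def __init__(self, source=None, parent=None, fn=None):
        self.source = source  # list of partitions, set only for the root dataset
        self.parent = parent  # lineage: the dataset this one was derived from
        self.fn = fn          # lineage: per-element function applied to the parent
        self.cache = {}       # partition index -> values currently held in memory

    def map(self, fn):
        # Record how to build the new dataset; nothing is computed yet.
        return ToyRDD(parent=self, fn=fn)

    def partition(self, i):
        if self.source is not None:
            return self.source[i]  # root data is read from stable storage
        if i not in self.cache:    # never computed, or lost: rebuild from lineage
            self.cache[i] = [self.fn(x) for x in self.parent.partition(i)]
        return self.cache[i]

    def lose(self, i):
        # Simulate losing one in-memory partition.
        self.cache.pop(i, None)


base = ToyRDD(source=[[1, 2], [3, 4]])
squares = base.map(lambda x: x * x)

first = squares.partition(1)    # computed from the parent on first access
squares.lose(1)                 # the in-memory copy disappears
rebuilt = squares.partition(1)  # recomputed from the recorded lineage

print(first, rebuilt)  # [9, 16] [9, 16]
```

Only the lost partition is recomputed; partition 0 is never touched in this run.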

10. What is the Spark Driver?

a. The Spark Driver is the program that runs on the master node of the machine and declares transformations and actions on data RDDs.

b. The driver in Spark creates the SparkContext, connected to a given Spark Master.

c. The driver also delivers the RDD graphs to the Master.

d. Only b and c are true.

e. All the above

Short-answer Type Questions (5 Marks Questions)

1. What is Apache Spark? Explain some key features of Spark.

2. What are the benefits of Spark over MapReduce? Please explain.

3. Can you use Spark to access and analyse data stored in Cassandra databases? What are the languages supported by Apache Spark for developing big data applications?

4. Explain the different cluster managers in Apache Spark.

5. Explain the major libraries that constitute the Spark ecosystem.
