Introducing MapReduce | 95

7. Can the number of reducers be set to zero?

a. Yes

b. No

c. Not applicable

d. None of the above

8. Where is the Map Output (intermediate key-value data) stored?

a. HDFS

b. Local File System

c. Name Node

d. Data Node

9. When does the Reduce phase start in a MapReduce job?

a. Before any Map job starts

b. When the first Map job is completed

c. When all the child Map jobs are completed

d. None of the above

Short-answer Type Questions (5 Marks Questions)

1. Name the most common input formats defined in Hadoop. Which one is the default?

2. Rearrange the main configuration parameters that the user needs to specify to run a MapReduce job.

a. Job’s input locations in the distributed file system.

b. Input format

c. Class containing the map function.

d. Output format

e. Job’s output location in the distributed file system.

f. Class containing the reduce function.

g. Application JAR file containing the mapper, reducer and driver classes for execution and deployment.

3. What is an InputSplit in Hadoop? Please explain.

4. Assume that Hadoop spawned 100 tasks for a job and one of the tasks failed. What will the Hadoop MapReduce framework do?

5. What is the difference between an InputSplit and an HDFS block? Please explain.

6. Explain the difference between Job.submit() and waitForCompletion().

7. What will happen if we run a MapReduce job with an output directory that already exists? Please explain the root cause here.

8. How is an input file made ready from HDFS by the MapReduce framework? Please explain.

9. How are the keys grouped before reaching the Reduce phase? Explain in detail.

10. What will be the problem if the Reducer function does not receive the values (the values coming from Map) in a list? Why is this needed? Please explain.

11. What are the main configuration parameters specified in MapReduce?

Long-answer Type Questions (10 Marks Questions)

1. What is shuffling and sorting in MapReduce? Please explain in detail.

2. Explain the internal flow of a MapReduce job with a diagram.

3. What is Speculative Execution in MapReduce? What is the main reason behind it and how does the MapReduce framework handle it?

4. How can you troubleshoot a MapReduce job after getting an exception? Please explain in detail.

5. Explain in detail how YARN schedules a MapReduce job in the job queue.

6. How can we troubleshoot a MapReduce job? What action will you take if a MapReduce job is taking too much time to complete? How can you find out the root cause?

7. Explain the different types of events inside the Intermediate Event (between the Map and Reduce phases).

8. What are the parameters of mappers and reducers? Please explain the meaning of each parameter of Mapper<LongWritable, Text, Text, IntWritable> and Reducer<Text, IntWritable, Text, IntWritable>.

9. Explain the differences between a combiner and a reducer. When is it suggested to use a combiner in a MapReduce job?

10. What is the main difference between Mapper and Reducer? What will happen if the number of Reducers is set to 0 (zero)? Why are the Compute Nodes and the Storage Nodes the same? Please explain in detail.
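
As a study aid for the questions above on mapper and reducer parameters and on grouping keys before the Reduce phase, the following is a minimal plain-Java sketch of a word count. It does not use the Hadoop API: the class WordCountSketch and its methods are hypothetical, and long, String and Integer stand in for LongWritable, Text and IntWritable. It is only meant to show how the four type slots of Mapper<LongWritable, Text, Text, IntWritable> and Reducer<Text, IntWritable, Text, IntWritable> line up, under those assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free model of a word-count job (illustration only).
public class WordCountSketch {

    // Map step: input (line offset, line text) -> output pairs (word, 1).
    // Mirrors the slots <LongWritable, Text, Text, IntWritable>.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Shuffle and sort step: group all values by key, with keys in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: input (word, list of counts) -> output count for that word.
    // Mirrors the slots <Text, IntWritable, Text, IntWritable>.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs map, shuffle and reduce in sequence on in-memory lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> intermediate = new ArrayList<>();
        long offset = 0;
        for (String line : lines) {
            intermediate.addAll(map(offset, line));
            // Assumption: one byte per character plus a newline terminator.
            offset += line.length() + 1;
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(intermediate).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {big=2, cluster=1, data=1}
        System.out.println(run(List.of("big data", "big cluster")));
    }
}
```

In this sketch the map output types (String, Integer) are exactly the reduce input types, which is the constraint the two generic signatures in the question express. A real job would extend the Hadoop Mapper and Reducer classes instead of using static methods.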
