Index
A
B
- bad records
- benchmarks
- built-in data types
C
- <configuration> tag
- capacity scheduler
- classifiers
- CLI
- cluster deployments
- clustering
- clustering algorithm
- collaborative filtering-based recommendations
- compareTo() method / How it works...
- combiner
- completebulkload command
- complex dataset
- computational complexity / How it works...
- conf/core-site.xml
- conf/hdfs-site.xml
- conf/mapred-site.xml
- configuration files
- configuration properties, conf/core-site.xml
- configuration properties, conf/hdfs-site.xml
- configuration properties, conf/mapred-site.xml
- content-based recommendations
- createRecordReader() method
- custom Hadoop key type
- custom Hadoop Writable data type
- custom InputFormat
- custom Partitioner
- Cygwin / Getting ready
D
- data
- data de-duplication
- Dataflow language / How to do it...
- data mining algorithm
- DataNodes
- data preprocessing
- datasets
- debug scripts
- decommissioning process
- DFSIO
- distributed cache / How it works...
- distributed mode, Hadoop installation
- document classification
E
- EC2 console
- ElasticSearch
- EMR
- EMR Bootstrap actions
- EMR CLI
- EMR job flows
- exclude file / How to do it...
F
G
H
- Hadoop
- about / Introduction
- setting up / Setting up Hadoop on your machine, How to do it...
- URL / How to do it...
- MapReduce program, writing / Writing a WordCount MapReduce sample, bundling it, and running it using standalone Hadoop, How to do it...
- MapReduce program, executing / Writing a WordCount MapReduce sample, bundling it, and running it using standalone Hadoop
- setting, in distributed cluster environment / Setting Hadoop in a distributed cluster environment, Getting ready, How to do it...
- used, for parsing complex dataset / Parsing a complex dataset with Hadoop, How to do it..., How it works...
- content-based recommendations / Content-based recommendations
- hierarchical clustering / Hierarchical clustering
- Amazon sales dataset clustering / Clustering an Amazon sales dataset
- collaborative filtering-based recommendations / Collaborative filtering-based recommendations
- Adwords balance algorithm / Assigning advertisements to keywords using the Adwords balance algorithm
- Hadoop's Writable-based serialization framework
- Hadoop Aggregate package / How it works...
- Hadoop cluster
- Hadoop configurations
- Hadoop counters
- Hadoop data types
- Hadoop DistributedCache
- Hadoop GenericWritable data type / How to do it...
- Hadoop InputFormat
- Hadoop installation
- Hadoop intermediate data partitioning
- Hadoop Kerberos security
- Hadoop monitoring UI
- Hadoop OutputFormats
- Hadoop Partitioners
- Hadoop results
- Hadoop scheduler
- hadoop script / How to do it...
- Hadoop security
- Hadoop Streaming
- Hadoop streaming
- Hadoop Tool interface
- HADOOP_LOG_DIR
- hashCode() method / How it works..., How it works...
- HashPartitioner partitions
- HBase
- HBase cluster
- HBase data model
- HBase TableMapper / How it works...
- HDFS
- HDFS basic command-line file operations
- HDFS block size
- HDFS C API
- HDFS configuration files
- hdfsConnectAsUser command / How it works...
- hdfsConnect command / How it works...
- HDFS disk usage
- HDFS filesystem
- HDFS Java API
- HDFS monitoring UI
- hdfsOpenFile command / How it works...
- hdfsRead command / How it works...
- HDFS replication factor
- HDFS setup
- HDFS web console
- hierarchical clustering
- higher-level programming interfaces
- histograms
- Hive
- Hive interactive session
- Hive script
- Human Development Report (HDR) / Running a SQL-style query with Hive
- Human Development Report (HDR) data / Running your first Pig command
I
J
K
L
- large text dataset
- LDA
- libhdfs
- Libtool package
- local mode, Hadoop installation
- LogFileInputFormat
- LogFileRecordReader class
- LogWritable class
M
- machine learning algorithm
- Mahout
- Mahout installation
- Mahout K-Means algorithm / How it works...
- Mahout seqdumper command / How it works…
- Mahout split command
- map() function / How it works...
- MapFile
- mapper
- MapReduce
- about / Introduction
- used, for calculating simple analytics / Simple analytics using MapReduce, Getting ready, How to do it..., How it works...
- used, for grouping data / Performing Group-By using MapReduce, How to do it..., How it works...
- used, for calculating frequency distributions / Calculating frequency distributions and sorting using MapReduce, How it works...
- used, for calculating histograms / Calculating histograms using MapReduce, Getting ready, How to do it..., How it works...
- used, for calculating scatter plots / Calculating scatter plots using MapReduce, Getting ready, How to do it..., How it works...
- used, for joining datasets / Joining two datasets using MapReduce, How to do it..., How it works...
- used, for generating inverted index / Generating an inverted index using Hadoop MapReduce, How to do it..., How it works...
- MapReduce application
- MapReduce computations
- MapReduce computations results
- MapReduce jobs
- MapReduce monitoring UI
- MBOX format / Joining two datasets using MapReduce
- minSupport / How it works…
- modes, Hadoop installation
- mrbench / There's more...
- multi-dimensional space / Clustering an Amazon sales dataset
- multiple disks/volumes
- MultipleInputs feature
N
O
P
- <path> parameter / How it works...
- Partitioner / How it works...
- Pattern.compile() method / How it works...
- Pig
- Pig command
- Pig interactive session
- Pig script
- primitive data types
- principals
- Pseudo distributed mode, Hadoop installation
R
S
T
V
W
Z