Index

A

ACID
Actor model
1. Akka installation
2. Akka logos
3. OOP vs. actors
4. thread-based concurrency
Agents server
Aggregation techniques
1. materialized views
2. probabilistic data structures
3. windowed events
Akka Actors
1. actor communication
2. actor lifecycle methods
3. actor monitoring
4. actor reference
5. actorSelection() method
6. actor system
7. BadPerformer
8. deadlock
9. GoodPerformer
10. GreeterActor
11. import akka.actor.Actor
12. installation
13. kill Actors
14. match expression
15. receive() method
16. shutdown() method
17. starting actors
18. stopping actors
19. Thread.sleep
Apache Cassandra
1. cassandra.yaml
2. client-server architecture
  1. driver
  2. service petitioners
  3. service providers
  4. via CQLs
3. cluster booting
4. cluster setting
5. connection establishment
6. data model
7. GitHub
8. gossip
9. installation
  1. CQL commands
  2. CQL shell
  3. DESCRIBE command
  4. execution
  5. file download
  6. requirements
  7. validation
10. memory access
  1. column-family
  2. key-value
11. NoSQL
  1. characteristics
  2. data model
Apache Kafka
1. add servers
  1. amazingTopic
  2. cluster mirroring
  3. headers
  4. Kafka topics
  5. reAmazingTopic
  6. reassign-partition tool
  7. remove configuration
  8. replication factor
2. architecture
  1. design
  2. goals
  3. groups
  4. leaders
  5. log compaction
  6. message compression
  7. offset
  8. replication modes
  9. segment files
3. cluster
  1. broker property
  2. components
  3. multiple broker. See Multiple broker
  4. single broker. See Single broker
4. consumer
  1. consumer API
  2. multithreaded consumer. See Multithreaded consumer
  3. properties
  4. Scala consumer. See Scala consumer
5. GitHub project
6. Gradle compilation
7. installation
  1. importing
  2. install Java 1.7
  3. Linux
8. integration
  1. Apache Spark
  2. consumer parameters
  3. data processing algorithms
9. JDK validation
10. libmesos
11. message broker
  1. CEP
  2. distributed
  3. multiclient
  4. persistent
  5. scenario
  6. types of, actors
  7. uses
12. producers
  1. custom partitioning. See Custom partitioning
  2. Producer API
  3. Properties
  4. Scala Kafka producer. See Scala Kafka producer
13. tools
Apache Mesos
1. clusters
  1. Apache Kafka. See Apache Kafka
  2. Apache Spark
  3. indicators
  4. MASTER
  5. SLAVES
2. concurrency
3. coordinators
4. distributed systems
  1. characteristics
  2. complexity
  3. models
  4. types of, processes
5. dynamic process
6. Framework
  1. abstraction levels
  2. architecture
7. implementation
8. Mesos 101
  1. Aurora framework
  2. Chronos framework. See Chronos framework
  3. installation. See Mesos installation
  4. Marathon framework
  5. ZooKeeper framework
9. rule
Apache Spark
1. Amazon S3
2. architecture
  1. metadata
  2. methods
  3. object creation
  4. SparkContext
3. cluster manager
  1. administration commands
  2. Amazon EC2
  3. architecture
  4. cluster mode
  5. deploy-mode option
  6. driver
  7. environment variables
  8. execution
  9. master flag
  10. Mesos
  11. scheduling data
  12. Spark Master UI
  13. spark-submit flags
  14. spark-submit script
  15. variables
4. core module
5. download page
6. GraphX module
7. MLlib module
8. modern shells
9. Parallelism
10. RDDs
  1. dataframes API
  2. goals
  3. operations. See RDD operations
  4. rules
  5. standalone applications
  6. types
11. SQL module
12. Streaming
  1. 24/7 spark streaming
  2. architecture
  3. batch size
  4. checkpointing
  5. garbage collector
  6. module
  7. operation
  8. parallelism techniques
  9. Transformations. See Transformations
13. testing
14. Upload text file
Application programming interface (API)

B

Big Data
1. Akka model
2. Apache Cassandra
3. Apache Hadoop
4. Apache Kafka
5. Apache Mesos
6. data center operation
  1. DevOps
  2. open source technology
7. data engineers
8. ETL
9. infrastructure needs
10. lambda architecture
11. OLAP
12. prediction
13. SMACK stack
14. vs. Modern Big Data
15. vs. Traditional Big Data
16. vs. Traditional Data
Business intelligence (BI)

C

Cassandra Query Language (CQL)
cassandra.yaml
Chronos framework
1. architecture
2. installation process
3. .jar file
4. web interface
Client-server
Cloud
Cluster
Commutative operations
Complex event processing (CEP)
Concurrency
Conflict-free replicated data types (CRDTs)
Consistent, Available, and Partition Tolerant (CAP)
Coordinator
Cqlsh
Custom Partitioning
1. compile
2. consumer program
3. create topic
4. CustomPartitionProducer.scala
5. import
6. properties
7. RUN command
8. SimplePartitioner class

D

Dashboard
Data allocation
Data analyst
Data architects
Data feed
Data gravity
Data pipelines
1. Akka and Cassandra
  1. CassandraCluster
  2. ConfigCassandraCluster App
  3. TestActorRef class
  4. TweetScanActor downloads
  5. TweetWriteActor writes
  6. TwitterReadActor reads
2. Akka and Kafka
3. Akka and Spark
  1. ReceiverInputDStream
  2. remote actor system
  3. ssc.start() method
  4. StreamingContext
4. asynchronous message passing
5. checkpointing
6. consensus
7. data locality
8. data parallelism
9. Dynamo system
10. failure detection
11. gossip protocol
12. HDFS implementations
13. isolation
14. kafka-connect-cassandra
  1. bulk mode
  2. CQL types
  3. SinkRecords
  4. timestamp based mode
15. location transparency
16. masterless
17. network partition
18. replication
19. scalable infrastructure
20. shared nothing architecture
21. Spark-Cassandra connector
  1. Cassandra function
  2. CassandraOption.deleteIfNone
  3. CassandraOption.unsetIfNone
  4. collection of, Objects
  5. collection of, Tuples
  6. Enable Spark Streaming
  7. modify CQL collections
  8. save RDD
  9. saving data
  10. setting up Spark Streaming
  11. Stream creation
  12. user-defined types
22. SPOF
Data recovery
DBMS
Determinism
Development operations (DevOps)
Dimension data
Directed acyclic graph (DAG)
Distributed computing

E

Eventual consistency (EC)
Exponential backoff
Extract, Transform, and Load (ETL)

F

Failover
Fast data
1. ACID vs. CAP
  1. consistency
  2. CRDT
  3. properties
  4. theorem
2. Apache Hadoop
3. applications
4. big data
5. characteristics
  1. analysis streaming
  2. direct ingestion
  3. message queue
  4. per-event transactions
6. data enrichment
  1. advantages
  2. capacity
7. data pipelines
8. data recovery
9. data streams analysis
10. queries
11. real-time user interaction
12. Streaming Transformations
13. Tag data identifiers
  1. avoid idempotency
  2. idempotent operation
  3. ordered requests
  4. timestamp resolution
  5. unique id
  6. unordered requests
  7. use offset
  8. use upsert

G

gossip
Graph database

H

Hadoop Distributed File System (HDFS)
Hybrid Transaction Analytical Processing (HTAP)

I

Infrastructure as a Service (IaaS)
In-memory data grid (IMDG)
Internet of Things (IoT)

J

Java Message Service (JMS)

K

Keyspace
Key-value

L

Lambda architecture
Latency
Lazy evaluation
Literal functions

M

Map() method
Maps
1. immutable maps
2. mutable maps
master-slave
Mesos installation
1. libraries
2. master server
3. missing dependency
4. slave server
5. step-by-step installation
Metadata
Multiple broker
1. consumer client
2. reAmazingTopic
3. server.properties
4. start producers
5. ZooKeeper running
Multithreaded consumer
1. amazingTopic
2. Compile
3. import
4. MultiThreadConsumer class
5. properties
6. Run MultiThreadConsumer
7. Run SimpleProducer

N

NoSQL

O

Online analytical processing (OLAP)
Online transaction processing (OLTP)
Operational analytics

P, Q

Platform as a Service (PaaS)
Probabilistic data structures

R

Relational database management system (RDBMS)
RDD operations
1. main spark actions
2. persistence levels
3. Transformations
Real-time analytics
Recovery time objective (RTO)
reduce() method
Replication
1. modes
  1. asynchronous replication process
  2. synchronous replication process
Resilient distributed datasets (RDDs)

S

Software as a Service (SaaS)
Scala
1. Array
  1. creation
  2. type
2. ArrayBuffer
3. extract subsequences
4. filtering
5. flattening
6. functional programming
  1. implicit loops
  2. literal functions
  3. predicate
7. hierarchy collections
  1. map
  2. sequences
  3. set
8. Lazy evaluation
9. mapping
10. merging and subtracting
11. queues
12. ranges
13. sort method
14. split method
15. stacks
16. streams
17. traversing collections
  1. for loop
  2. foreach method
  3. iterators
18. unicity
Scalability
Scala consumer
1. amazingTopic
2. Compile
3. Import
4. properties
5. Run command
6. Run SimpleConsumer
7. SimpleConsumer class
Scala Kafka producer
1. compile command
2. consumer program
3. create topic
4. define properties
5. import
6. metadata.broker.list
7. request.required.acks
8. Run command
9. serializer.class
10. SimpleProducer.scala code
Sequence collections
1. immutable sequences
2. mutable sequences
Sets
1. immutable sets
2. mutable sets
Shared nothing
Single broker
1. amazingTopic
2. consumer client
3. producer.properties
4. start producers
5. start ZooKeeper
Single point of failure (SPOF)
SMACK stack model
Spark-Cassandra Connector
Streaming analytics
Streaming Transformations
Synchronization

T

Transformations
1. output operations
2. stateful transformations
  1. updateStateByKey() method
  2. Windowed operations
3. stateless transformations

U, V, W, X, Y, Z

Unstructured data
