Index

A

  1. ACID

  2. Actor model

    1. Akka installation

    2. Akka logos

    3. OOP vs. actors

    4. thread-based concurrency

  3. Agents server

  4. Aggregation techniques

    1. materialized views

    2. probabilistic data structures

    3. windowed events

  5. Akka Actors

    1. actor communication

    2. actor lifecycle methods

    3. actor monitoring

    4. actor reference

    5. actorSelection() method

    6. actor system

    7. BadPerformer

    8. deadlock

    9. GoodPerformer

    10. GreeterActor

    11. import akka.actor.Actor

    12. installation

    13. kill Actors

    14. match expression

    15. receive() method

    16. shutdown() method

    17. starting actors

    18. stopping actors

    19. Thread.sleep

  6. Apache Cassandra

    1. cassandra.yaml

    2. client-server architecture

      1. driver

      2. service petitioners

      3. service providers

      4. via CQLs

    3. cluster booting

    4. cluster setting

    5. connection establishment

    6. data model

    7. GitHub

    8. gossip

    9. installation

      1. CQL commands

      2. CQL shell

      3. DESCRIBE command

      4. execution

      5. file download

      6. requirements

      7. validation

    10. memory access

      1. column-family

      2. key-value

    11. NoSQL

      1. characteristics

      2. data model

  7. Apache Kafka

    1. add servers

      1. amazingTopic

      2. cluster mirroring

      3. headers

      4. Kafka topics

      5. reAmazingTopic

      6. reassign-partition tool

      7. remove configuration

      8. replication factor

    2. architecture

      1. design

      2. goals

      3. groups

      4. leaders

      5. log compaction

      6. message compression

      7. offset

      8. replication modes

      9. segment files

    3. cluster

      1. broker property

      2. components

      3. multiple broker. See Multiple broker

      4. single broker. See Single broker

    4. consumer

      1. consumer API

      2. multithreaded consumer. See Multithreaded consumer

      3. properties

      4. Scala consumer. See Scala consumer

    5. GitHub project

    6. Gradle compilation

    7. installation

      1. importing

      2. install Java 1.7

      3. Linux

    8. integration

      1. Apache Spark

      2. consumer parameters

      3. data processing algorithms

    9. JDK validation

    10. libmesos

    11. message broker

      1. CEP

      2. distributed

      3. multiclient

      4. persistent

      5. scenario

      6. types of, actors

      7. uses

    12. producers

      1. custom partitioning. See Custom partitioning

      2. Producer API

      3. Properties

      4. Scala Kafka producer. See Scala Kafka producer

    13. tools

  8. Apache Mesos

    1. clusters

      1. Apache Kafka. See Apache Kafka

      2. Apache Spark

      3. indicators

      4. MASTER

      5. SLAVES

    2. concurrency

    3. coordinators

    4. distributed systems

      1. characteristics

      2. complexity

      3. models

      4. types of, processes

    5. dynamic process

    6. Framework

      1. abstraction levels

      2. architecture

    7. implementation

    8. Mesos 101

      1. Aurora framework

      2. Chronos framework. See Chronos framework

      3. installation. See Mesos installation

      4. Marathon framework

      5. ZooKeeper framework

    9. rule

  9. Apache Spark

    1. Amazon S3

    2. architecture

      1. metadata

      2. methods

      3. object creation

      4. SparkContext

    3. cluster manager

      1. administration commands

      2. Amazon EC2

      3. architecture

      4. cluster mode

      5. deploy-mode option

      6. driver

      7. environment variables

      8. execution

      9. master flag

      10. Mesos

      11. scheduling data

      12. Spark Master UI

      13. spark-submit flags

      14. spark-submit script

      15. variables

    4. core module

    5. download page

    6. GraphX module

    7. MLlib module

    8. modern shells

    9. Parallelism

    10. RDDs

      1. dataframes API

      2. goals

      3. operations. See RDD operations

      4. rules

      5. standalone applications

      6. types

    11. SQL module

    12. Streaming

      1. 24/7 spark streaming

      2. architecture

      3. batch size

      4. checkpointing

      5. garbage collector

      6. module

      7. operation

      8. parallelism techniques

      9. Transformations. See Transformations

    13. testing

    14. Upload text file

  10. Application programming interface (API)

B

  1. Big Data

    1. Akka model

    2. Apache Cassandra

    3. Apache Hadoop

    4. Apache Kafka

    5. Apache Mesos

    6. data center operation

      1. DevOps

      2. open source technology

    7. data engineers

    8. ETL

    9. infrastructure needs

    10. lambda architecture

    11. OLAP

    12. prediction

    13. SMACK stack

    14. vs. Modern Big Data

    15. vs. Traditional Big Data

    16. vs. Traditional Data

  2. Business intelligence (BI)

C

  1. Cassandra Query Language (CQL)

  2. cassandra.yaml

  3. Chronos framework

    1. architecture

    2. installation process

    3. .jar file

    4. web interface

  4. Client-server

  5. Cloud

  6. Cluster

  7. Commutative operations

  8. Complex event processing (CEP)

  9. Concurrency

  10. Conflict-free replicated data types (CRDTs)

  11. Consistent, Available, and Partition Tolerant (CAP)

  12. Coordinator

  13. Cqlsh

  14. Custom Partitioning

    1. compile

    2. consumer program

    3. create topic

    4. CustomPartitionProducer.scala

    5. import

    6. properties

    7. RUN command

    8. SimplePartitioner class

D

  1. Dashboard

  2. Data allocation

  3. Data analyst

  4. Data architects

  5. Data feed

  6. Data gravity

  7. Data pipelines

    1. Akka and Cassandra

      1. CassandraCluster

      2. ConfigCassandraCluster App

      3. TestActorRef class

      4. TweetScanActor downloads

      5. TweetWriteActor writes

      6. TwitterReadActor reads

    2. Akka and Kafka

    3. Akka and Spark

      1. ReceiverInputDStream

      2. remote actor system

      3. ssc.start() method

      4. StreamingContext

    4. asynchronous message passing

    5. checkpointing

    6. consensus

    7. data locality

    8. data parallelism

    9. Dynamo system

    10. failure detection

    11. gossip protocol

    12. HDFS implementations

    13. isolation

    14. kafka-connect-cassandra

      1. bulk mode

      2. CQL types

      3. SinkRecords

      4. timestamp based mode

    15. location transparency

    16. masterless

    17. network partition

    18. replication

    19. scalable infrastructure

    20. shared nothing architecture

    21. Spark-Cassandra connector

      1. Cassandra function

      2. CassandraOption.deleteIfNone

      3. CassandraOption.unsetIfNone

      4. collection of, Objects

      5. collection of, Tuples

      6. Enable Spark Streaming

      7. modify CQL collections

      8. save RDD

      9. saving data

      10. setting up Spark Streaming

      11. Stream creation

      12. user-defined types

    22. SPOF

  8. Data recovery

  9. DBMS

  10. Determinism

  11. Development operations (DevOps)

  12. Dimension data

  13. Directed acyclic graph (DAG)

  14. Distributed computing

E

  1. Eventual consistency (EC)

  2. Exponential backoff

  3. Extract, Transform, and Load (ETL)

F

  1. Failover

  2. Fast data

    1. ACID vs. CAP

      1. consistency

      2. CRDT

      3. properties

      4. theorem

    2. Apache Hadoop

    3. applications

    4. big data

    5. characteristics

      1. analysis streaming

      2. direct ingestion

      3. message queue

      4. per-event transactions

    6. data enrichment

      1. advantages

      2. capacity

    7. data pipelines

    8. data recovery

    9. data streams analysis

    10. queries

    11. real-time user interaction

    12. Streaming Transformations

    13. Tag data identifiers

      1. avoid idempotency

      2. idempotent operation

      3. ordered requests

      4. timestamp resolution

      5. unique id

      6. unordered requests

      7. use offset

      8. use upsert

G

  1. gossip

  2. Graph database

H

  1. Hadoop Distributed File System (HDFS)

  2. Hybrid Transaction Analytical Processing (HTAP)

I

  1. Infrastructure as a Service (IaaS)

  2. In-memory data grid (IMDG)

  3. Internet of Things (IoT)

J

  1. Java Message Service (JMS)

K

  1. Keyspace

  2. Key-value

L

  1. Lambda architecture

  2. Latency

  3. Lazy evaluation

  4. Literal functions

M

  1. Map() method

  2. Maps

    1. immutable maps

    2. mutable maps

  3. master-slave

  4. Mesos installation

    1. libraries

    2. master server

    3. missing dependency

    4. slave server

    5. step-by-step installation

  5. Metadata

  6. Multiple broker

    1. consumer client

    2. reAmazingTopic

    3. server.properties

    4. start producers

    5. ZooKeeper running

  7. Multithreaded consumer

    1. amazingTopic

    2. Compile

    3. import

    4. MultiThreadConsumer class

    5. properties

    6. Run MultiThreadConsumer

    7. Run SimpleProducer

N

  1. NoSQL

O

  1. Online analytical processing (OLAP)

  2. Online transaction processing (OLTP)

  3. Operational analytics

P, Q

  1. Platform as a Service (PaaS)

  2. Probabilistic data structures

R

  1. Relational database management system (RDBMS)

  2. RDD operations

    1. main spark actions

    2. persistence levels

    3. Transformations

  3. Real-time analytics

  4. Recovery time objective (RTO)

  5. reduce() method

  6. Replication

    1. modes

      1. asynchronous replication process

      2. synchronous replication process

  7. Resilient distributed datasets (RDDs)

S

  1. Software as a Service (SaaS)

  2. Scala

    1. Array

      1. creation

      2. type

    2. ArrayBuffer

    3. extract subsequences

    4. filtering

    5. flattening

    6. functional programming

      1. implicit loops

      2. literal functions

      3. predicate

    7. hierarchy collections

      1. map

      2. sequences

      3. set

    8. Lazy evaluation

    9. mapping

    10. merging and subtracting

    11. queues

    12. ranges

    13. sort method

    14. split method

    15. stacks

    16. streams

    17. traversing collections

      1. for loop

      2. foreach method

      3. iterators

    18. unicity

  3. Scalability

  4. Scala consumer

    1. amazingTopic

    2. Compile

    3. Import

    4. properties

    5. Run command

    6. Run SimpleConsumer

    7. SimpleConsumer class

  5. Scala Kafka producer

    1. compile command

    2. consumer program

    3. create topic

    4. define properties

    5. import

    6. metadata.broker.list

    7. request.required.acks

    8. Run command

    9. serializer.class

    10. SimpleProducer.scala code

  6. Sequence collections

    1. immutable sequences

    2. mutable sequences

  7. Sets

    1. immutable sets

    2. mutable sets

  8. Shared nothing

  9. Single broker

    1. amazingTopic

    2. consumer client

    3. producer.properties

    4. start producers

    5. start ZooKeeper

  10. Single point of failure (SPOF)

  11. SMACK stack model

  12. Spark-Cassandra Connector

  13. Streaming analytics

  14. Streaming Transformations

  15. Synchronization

T

  1. Transformations

    1. output operations

    2. stateful transformations

      1. updateStateByKey() method

      2. Windowed operations

    3. stateless transformations

U, V, W, X, Y, Z

  1. Unstructured data
