How to do it...

  1. Start the Spark shell with the Kafka integration package:
$ spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.1.1
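The Scala version (2.11) and Spark version (2.1.1) in the package coordinates must match your Spark build; if you run a different release, adjust the suffix accordingly. For example, for Spark 2.4.0 built against Scala 2.11, the equivalent command would be:
$ spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.0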
  2. Create a stream to listen for messages on the oscars topic:
scala> val data = spark.readStream.format("kafka").option("kafka.bootstrap.servers","localhost:9092").option("subscribe","oscars").load()
  3. Verify that data is indeed a streaming DataFrame:
scala> data.isStreaming
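For a DataFrame created with readStream this returns true; the REPL output should look like the following (the res variable number will vary):
res0: Boolean = true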
  4. Get the schema of the data DataFrame:
scala> data.printSchema
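The Kafka source exposes a fixed schema regardless of the message contents. Note that key and value arrive as binary, which is why a later step casts them to strings. The output should resemble:
root
 |-- key: binary (nullable = true)
 |-- value: binary (nullable = true)
 |-- topic: string (nullable = true)
 |-- partition: integer (nullable = true)
 |-- offset: long (nullable = true)
 |-- timestamp: timestamp (nullable = true)
 |-- timestampType: integer (nullable = true)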
  5. Cast the key and value of the stream to the String datatype:
scala> val kvstream = data.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
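The same cast can equivalently be written with the typed Column API instead of a SQL expression; a minimal sketch, assuming the standard functions import:
scala> import org.apache.spark.sql.functions.col
scala> val kvstream = data.select(col("key").cast("string").alias("key"), col("value").cast("string").alias("value"))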
  6. Write the stream to the console-based receiver and keep it running until terminated:
scala> val feed = kvstream.writeStream.format("console").start
scala> feed.awaitTermination
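Note that awaitTermination blocks the shell until the query stops or fails. If you skip that call, the query keeps running in the background and you can stop it cleanly later with StreamingQuery's stop method:
scala> feed.stop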
  7. In another terminal window, publish a message on the oscars topic in Kafka:
$ kafka-console-producer.sh --broker-list localhost:9092 --topic oscars
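Type a few messages at the producer prompt, pressing Enter after each one; the messages below are purely illustrative:
>And the Oscar goes to...
>Best Picture: Moonlight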
  8. Publish messages in the producer window from step 7, pressing Enter after each message.
  9. As you publish messages to Kafka, you will see them appear in the Spark shell, as in the sample output below.
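The console sink prints each micro-batch as a small table of key/value rows. The output is shaped roughly like this (batch numbers and contents depend on what you publish; values longer than 20 characters are truncated):
-------------------------------------------
Batch: 1
-------------------------------------------
+----+--------------------+
| key|               value|
+----+--------------------+
|null|And the Oscar goe...|
+----+--------------------+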