Streaming Twitter data

Twitter is a well-known microblogging platform. It produces a massive amount of data, with around 500 million tweets sent each day. Twitter exposes its data through APIs, which makes it the poster child for testing big data streaming applications.

In this recipe, we will see how to live stream data into Spark using Twitter streaming libraries. Twitter is just one of many sources of streaming data for Spark and has no special status. Therefore, there are no built-in libraries for Twitter. Spark does, however, provide some APIs that facilitate integration with Twitter libraries.

One example use of a live Twitter data feed is finding the tweets that trended in the last 5 minutes.
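To make the idea of "trending in the last 5 minutes" concrete, here is a minimal sketch of a sliding-window hashtag count. It is plain Python rather than Spark code, and it is only an illustration: the `TrendingWindow` class, its method names, and the sample tweets are all hypothetical, timestamps are assumed to arrive in non-decreasing order, and hashtags are split naively on whitespace. In Spark Streaming, the same logic would normally be expressed with a windowed operation instead of hand-written bookkeeping.

```python
from collections import Counter, deque

WINDOW_SECONDS = 5 * 60  # the 5-minute window from the example above


class TrendingWindow:
    """Hypothetical helper: counts hashtags seen in the most recent window."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, hashtag) pairs, oldest first
        self.counts = Counter()  # hashtag -> occurrences inside the window

    def add(self, timestamp, text):
        """Record each hashtag in a tweet's text; timestamp is in seconds."""
        for word in text.split():
            if word.startswith("#") and len(word) > 1:
                tag = word.lower()  # treat #Spark and #spark as the same tag
                self.events.append((timestamp, tag))
                self.counts[tag] += 1

    def _expire(self, now):
        """Drop events older than the window, assuming ordered timestamps."""
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] < cutoff:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1
            if self.counts[tag] == 0:
                del self.counts[tag]

    def top(self, now, n=3):
        """Return up to n (hashtag, count) pairs, most frequent first."""
        self._expire(now)
        return self.counts.most_common(n)


# Made-up sample tweets, with timestamps in seconds
window = TrendingWindow()
window.add(0, "Learning #Spark today")
window.add(100, "#spark streaming is fun #BigData")
window.add(400, "More #bigdata news")
print(window.top(now=400, n=2))
```

At `now=400`, the tweet from second 0 has already fallen out of the 5-minute window, so only the two later tweets contribute to the counts.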
