Summary

In this chapter, we laid out the overall architecture of our app. We explained the two main paradigms of data processing: batch processing, which works on data at rest, and streaming analytics, which works on data in motion. We then established connections to three social networks of interest: Twitter, GitHub, and Meetup. We sampled the data and gave a preview of what we are aiming to build. The remainder of the book will focus on the Twitter dataset, but the tools and APIs provided here for all three networks let you create your own data mashups at a later stage. We are now ready to investigate the data collected.

In the next chapter, we will delve deeper into data analysis, extracting the key attributes of interest for our purposes and managing the storage of the information for batch and stream processing.
