Planning your next steps

Before we conclude by summarizing the chapter, there are a few things I highly recommend that you try out with both Amazon EMR and Amazon Redshift. First up, EMRFS.

We briefly touched upon EMRFS while deciding which filesystem to use when deploying the EMR cluster. The EMR File System (EMRFS) is an implementation of HDFS that lets Amazon EMR read and write files directly to Amazon S3. This essentially allows you to leverage the consistency provided by S3, as well as some of its other features, such as data encryption. To read more about EMRFS and how you can use it for your EMR clusters, visit: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-fs.html.
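As a rough illustration of the EMRFS features mentioned above, the following Python sketch builds an `emrfs-site` configuration entry that switches on consistent view and S3 server-side encryption. The helper name `emrfs_site_configuration` is hypothetical, and the property names `fs.s3.consistent` and `fs.s3.enableServerSideEncryption` are assumed to match your EMR release; check them against the EMRFS documentation before relying on them.

```python
import json


def emrfs_site_configuration(consistent_view=True, server_side_encryption=True):
    """Build an 'emrfs-site' configuration entry for an EMR cluster.

    Hypothetical helper: the property names are assumptions based on the
    EMRFS documentation and may differ between EMR releases.
    """
    properties = {
        # EMR expects string values, so booleans become "true"/"false".
        "fs.s3.consistent": str(consistent_view).lower(),
        "fs.s3.enableServerSideEncryption": str(server_side_encryption).lower(),
    }
    return [{"Classification": "emrfs-site", "Properties": properties}]


if __name__ == "__main__":
    # Print the JSON you could supply as a cluster configuration.
    print(json.dumps(emrfs_site_configuration(), indent=2))
```

A list shaped like this can typically be supplied as the cluster's software configuration at launch time, for example through the `Configurations` argument of boto3's `run_job_flow` call.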

Secondly, Amazon EMR also provides an enterprise-grade Hadoop distribution in the form of MapR. The MapR distribution of Hadoop provides a range of features that enhance your overall experience when building distributed applications, as well as when managing the Hadoop cluster itself. For example, selecting MapR as the Hadoop distribution provides support for industry-standard interfaces, such as NFS and ODBC, through which you can connect your EMR cluster to major BI tools, including Tableau and Toad. MapR also provides built-in high availability, data protection, higher performance, and a whole list of additional features. You can read more about the MapR distribution for Hadoop on EMR at: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-mapr.html.

Last but not least, I would also recommend that you try out some of Amazon Redshift's advanced features, such as reserved nodes and parameter groups. A parameter group is essentially a named set of parameters that apply to every database you create in the cluster that the group is associated with. You can find the parameter group for your existing cluster by selecting the Parameter Group option from the Redshift navigation pane. You can use and tweak these parameter groups based on your requirements to fine-tune and customize the database. To learn how to leverage parameter groups for database tuning, visit: https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-parameter-groups.html.
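To make the idea of tweaking a parameter group more concrete, here is a minimal Python sketch that converts a plain dictionary of settings into the list-of-dictionaries shape that boto3's `modify_cluster_parameter_group` call accepts. The helper names `to_redshift_parameters` and `apply_parameters`, the group name you pass in, and the default region are all hypothetical; `require_ssl` and `statement_timeout` are used only as example parameters.

```python
def to_redshift_parameters(settings):
    """Convert {name: value} into the Parameters list boto3 expects.

    Hypothetical helper. Values are sent as strings; booleans are
    lowercased to "true"/"false".
    """
    parameters = []
    for name, value in sorted(settings.items()):
        if isinstance(value, bool):
            text = str(value).lower()
        else:
            text = str(value)
        parameters.append({"ParameterName": name, "ParameterValue": text})
    return parameters


def apply_parameters(group_name, settings, region_name="us-east-1"):
    """Apply settings to an existing custom parameter group.

    Assumes boto3 is installed, AWS credentials are configured, and
    group_name refers to a custom (non-default) parameter group.
    """
    import boto3  # imported lazily so the helper above works without it

    client = boto3.client("redshift", region_name=region_name)
    return client.modify_cluster_parameter_group(
        ParameterGroupName=group_name,
        Parameters=to_redshift_parameters(settings),
    )


if __name__ == "__main__":
    # Example settings only: require SSL and cap statements at 60 seconds (in ms).
    print(to_redshift_parameters({"require_ssl": True, "statement_timeout": 60000}))
```

Note that the default parameter group cannot be edited, so you would first create a custom group and associate it with your cluster. Some parameter changes take effect only after the cluster is rebooted.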
