Chapter 21

Ten Essential Data Science Resource Collections

In This Chapter

  • Getting the lowdown on essential learning resources at Data Science Weekly
  • Finding resources at U Climb Higher
  • Learning about data mining and data science with KDnuggets
  • Locating an obscure resource on Data Science Central
  • Educating yourself about open source data science through Masters
  • Obtaining a free education with Quora
  • Discovering answers for advanced topics at Conductrics
  • Reading the Aspirational Data Scientist blog posts
  • Discovering data intelligence and analytics resources at AnalyticBridge
  • Getting the developer resource you need with Jonathan Bower

In reading this book, you discover quite a lot about data science and Python. Before your head explodes from all the new knowledge you gain, it’s important to realize that this book is really just the tip of the iceberg. Yes, there really is more information available out there, and that’s what this chapter is all about. The following sections introduce you to a wealth of data science resource collections that you really need to make the best use of your new knowledge.

In this case, a resource collection is simply a listing of really cool links with some text to tell you why they’re so great. In some cases, you gain access to articles about data science; in other cases, you’re exposed to new tools. In fact, data science is such a huge topic that you could easily find more resources than those discussed here, but the following sections provide a good place to start.

Remember: As with anything else on the Internet, links break, sites go out of business, and new sites take their place. If you find that a link is broken, please let me know about it at [email protected].
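Because links do break, you may want to test a link before you rely on it. The following is only a minimal sketch that uses Python's standard urllib module; the check_link name, the User-Agent string, and the opener parameter are assumptions made for this illustration, not part of any site's tooling.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def check_link(url, opener=urlopen, timeout=10):
    """Return (url, status), where status is an HTTP code or an error string.

    check_link is a hypothetical helper for this sketch. The opener
    parameter defaults to urllib's urlopen and exists only so that the
    function can be tried out without network access.
    """
    # A HEAD request asks for the headers only, not the whole page.
    request = Request(url, method="HEAD",
                      headers={"User-Agent": "link-check-sketch"})
    try:
        with opener(request, timeout=timeout) as response:
            return url, response.status
    except HTTPError as err:
        # The server answered, but with an error code such as 404.
        return url, err.code
    except OSError as err:
        # DNS failures, refused connections, and timeouts end up here.
        return url, f"unreachable: {err}"
```

Calling check_link() with any URL from this chapter returns the URL paired with either a numeric status code or a short description of the failure. Some servers refuse HEAD requests, so treat an error code as a hint to check the link in your browser rather than as proof that the resource is gone.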

Gaining Insights with Data Science Weekly

The Data Science Weekly is a free newsletter that you can sign up for to obtain the latest information about changes in data science. However, for this chapter, the most important element is the list of resources you find at http://www.datascienceweekly.org/data-science-resources. The resources cover the following broad range of topics:

  • Data Science Books
  • Data Science Meetups
  • Data Science Massive Open Online Courses (MOOCs)
  • Data Science Datasets
  • Data Science Most Read Articles
  • Data Scientist Talks
  • Data Scientists on Twitter
  • Data Science Blogs

Obtaining a Resource List at U Climb Higher

Even with the right connections online and a good search engine, trying to find just the right resource can be hard. U Climb Higher has published a list of 24 data science resources at http://blog.udacity.com/2014/12/24-data-science-resources-keep-finger-pulse.html that can help you keep your finger on the pulse of new strategies and technologies. This resource broaches the following topics: trends and happenings; places to learn more about data science; joining a community; data science news; people who really know data science well; and all the latest research.

Getting a Good Start with KDnuggets

Learning about data mining and data science is a process. KDnuggets breaks the learning process down into a series of steps at http://www.kdnuggets.com/faq/learning-data-mining-data-science.html. Each step provides you with an overview of what you should be doing and why. You also find links to a variety of resources online to make the learning process considerably easier. Even though the site emphasizes the use of R, Python, and SQL (in that order) to perform data science tasks, the steps will actually work for any of a number of approaches that you might take. (Clicking the Getting Started With Python For Data Science link shows that even Kaggle prefers Python 2.7.)

Remember: As with any other learning experience, a procedure like the one shown on the KDnuggets site will work for some people and not for others. Everyone learns a little differently. Don’t be afraid to improvise. The resources on this site might provide insights into other things that you can do to make your learning process easier.

Accessing the Huge List of Resources on Data Science Central

Many of the resources you find online cover mainstream topics. Data Science Central (http://www.datasciencecentral.com/) provides access to a relatively large number of data science experts who will tell you about the most obscure facts of data science. One of the more interesting blog posts appears at http://www.datasciencecentral.com/profiles/blogs/huge-trello-list-of-great-data-science-resources.

This resource points you to a Trello list (https://trello.com/) of some truly amazing resources. Navigating the huge list can be a bit difficult, but the process is aided by the treelike structure that Trello provides for organizing information. You want to meander through this sort of list when you have time and simply want to see what is available. The categories include the following (with possibly more by the time you read this book):

  • Data news
  • Data business people track
  • Data journalist track
  • Data padawan track
  • Data scientist track
  • Statistics
  • R
  • Python
  • Big data and other tools
  • Data
  • Others

Obtaining the Facts of Open Source Data Science from Masters

Many organizations now focus on open source for data science solutions. The focus has become so prevalent that you can now get an Open-Source Data Science Masters (OSDSM) education at http://datasciencemasters.org/. The emphasis is on providing you with the materials that are normally lacking from a purely academic education. In other words, the site provides pointers to courses that fill in gaps in your education so that you become more marketable in today’s computing environment. The various links provide you with access to online courses, books, and other resources that help you gain a better understanding of just how OSDSM works.

Locating Free Learning Resources with Quora

It’s really hard to resist the word free, especially when it comes to education, which normally costs many thousands of dollars. The Quora site at http://www.quora.com/What-are-the-best-free-resources-to-learn-data-science provides a listing of the best nonpaid learning resources for data science.

Most of the links take on a question format, such as, “How do I become a data scientist?” The question-and-answer format is helpful because you might be asking the questions that the site answers. The resulting list of sites, courses, and resources is introductory, for the most part, but it is a good way to get started working in the data science field.

Tip: A few of the links are for prestigious institutions such as Harvard. These links provide you with access to course materials such as lecture videos and blackboards. However, you don’t get the actual course free of charge. If you want the benefits of the course, you still need to pay for it. Even so, just by viewing the course materials, you can obtain a lot of useful data science knowledge.

Receiving Help with Advanced Topics at Conductrics

The Conductrics site (http://conductrics.com/) as a whole is devoted to selling products that help you perform various data science tasks. However, the site includes a blog with a couple of useful posts that answer the sorts of advanced questions you might find difficult to answer elsewhere. The two posts appear at http://conductrics.com/data-science-resources/ and http://conductrics.com/data-science-resources-2.

The author of the blog posts, Matt Gershoff, makes it clear that the listings are the result of answering people’s questions in the past. The list is huge, which is why it appears in two posts rather than one, so Matt must answer many questions. The list focuses mostly on machine learning rather than hardware or specific coding issues. Therefore, you can expect to see entries for topics such as Latent Semantic Indexing (LSI); Singular Value Decomposition (SVD); Linear Discriminant Analysis (LDA); non-parametric Bayesian approaches; statistical machine translation; Reinforcement Learning (RL); Temporal Difference (TD) learning; and contextual bandits.

Tip: The list goes on and on. Many of these entries won’t make much sense to you right now unless you’re already heavily involved in data science. However, the authors write many of the articles in a way that helps you pick up the information even if you aren’t completely familiar with it. In most cases, your best course of action is to at least scan the article to see whether you can understand it. If the article starts to make sense, read it in detail. Otherwise, hold on to the article reference for later use. You might be surprised to discover that the article you can’t completely understand today becomes something you understand with ease tomorrow.

Learning New Tricks from the Aspirational Data Scientist

The Aspirational Data Scientist (http://newdatascientist.blogspot.com/) blog site provides you with an amazing array of essays on various data science topics. The author splits the posts into these areas: data science commentary; online course reviews; and becoming a data scientist.

Data science attracts practitioners from all sorts of existing fields. The site seems mainly devoted to serving the needs of social scientists moving into the data science field. In fact, the most interesting post, which appears at http://newdatascientist.blogspot.com/p/useful-links.html, provides a listing of resources to help the social scientist move into the data science field. The list of resources is organized by author, so you may find names that you already recognize as potential informational resources.

Tip: As with any other resource, even if an article is meant for one audience, it often serves the needs of another audience with equal ease. Even if you aren’t a social scientist, you might find that the articles contain helpful information as you progress on the road to fully discovering the wonders of data science.

Finding Data Intelligence and Analytics Resources at AnalyticBridge

The AnalyticBridge site (http://www.analyticbridge.com/) contains an amazing array of helpful resources for the data scientist. One of the more helpful resources is the list of data intelligence and analytics resources at http://www.analyticbridge.com/page/links. This page contains a wealth of resources you won’t find anywhere else, organized into the following categories: general resources; big data; visualization; best and worst of data science; new analytics startup ideas; rants about healthcare, education, and other topics; career stuff, training, and salary surveys; and miscellaneous.

Zeroing In on Developer Resources with Jonathan Bower

More than a few interesting resources appear on GitHub (https://github.com/), a site devoted to collaboration, code review, and code management. One of the sites you need to check out is Jonathan Bower’s listing of data science resources at https://github.com/jonathan-bower/DataScienceResources. The majority of these resources will appeal to the developer, but just about anyone can benefit from them. You find resources categorized into the following topics:

  • Data science, getting started
  • Data pipeline and tools
  • Product
  • Career resources
  • Open source data science resources

The hierarchical formatting of the various topics makes finding just what you need easier. Each major category divides into a list of topics. Within each topic, you find a list of resources that apply to that topic. For example, within Data Pipeline & Tools, you find Python, which includes a link for Anyone Can Code. This is one of the most usable sites in the list.
