Keeping the House Clean – The Data Mining Architecture

In the previous chapter, we defined the dynamic part of our data mining activities, looking at how a data mining project should be organized in terms of phases, inputs, and outputs. In this chapter, we set the scene by defining the static part of our data mining projects: the data mining architecture.

How do we organize databases, scripts, and output within our project? This is the question this chapter answers. We are going to look at:

  • The general structure of a data mining architecture
  • How to build this kind of structure with R

This chapter is especially useful if you are approaching data mining for the first time, whatever programming language you use, since it gives you a first look at what you will typically find in a data mining environment. Whether you are dealing with a one-person project or a whole-team initiative, you will almost always find the elements we are going to introduce here.

It is therefore useful, every time you approach a new problem, to try to associate the real elements you find in your environment with the abstract ones we will discover in the following pages. A useful map for this job is reproduced in the next paragraph; you can make a copy of it and keep it as a reference point when you start a new data mining journey.

As the final destination of this chapter, we will see how to implement a data mining architecture with our beloved R language.
