Exploring Data with RapidMiner
by Andrew Chisholm
Table of Contents
Credits
About the Author
About the Reviewer
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Setting the Scene
A process framework
Data volume and velocity
Data variety, formats, and meanings
Missing data
Cleaning data
Visualizing data
Resource constraints
Terminology
Accompanying material
Summary
2. Loading Data
Reading files
Alternative delimiters
Reading complete lines
Reading large numbers of attributes
Splitting files into smaller pieces
Databases
The Read Database operator
Large datasets
Using macros
Summary
3. Visualizing Data
Getting started
Statistical summaries
Relationships between attributes
Scatter plots
Scatter 3D color
Parallel and deviation
Quartile color
Time series data
Plotting series
Using the survey plotter
Relations between examples
Using histograms
Using block plots
Summary
4. Parsing and Converting Attributes
Generating attributes
Date functions
Regular expression functions
Generating extracts
Regular expressions
XPath
Renaming attributes
Searching and replacing attribute values
Using the Map operator
Using the Replace operator
Using Replace (Dictionary)
Summary
5. Outliers
Manual inspection
Increasing the data volume
Rules for handling outliers
Automated detection of example outliers
Detect Outlier (Distances)
Detect Outlier (Densities)
Detect Outlier (LOF)
Detect Outliers (COF)
Summary
6. Missing Values
Missing or empty?
Types of missing data
Missing completely at random
Missing at random
Not missing at random
Categorizing missing data
Finding MCAR data
Finding MAR data
Finding NMAR data
A cautionary note
Effect of missing data
Options for handling missing data
Returning to the root cause
Ignore it
Manual editing
Deletion of examples
Deletion of attributes
Imputation with single values
Modeling
Summary
7. Transforming Data
Creating new attributes
Aggregation
Using pivoting
Using de-pivoting
Windowing data
Summary
8. Reducing Data Size
Removing examples using sampling
Removing attributes
Removing useless attributes
Weighting attributes
Selecting attributes using models
Summary
9. Resource Constraints
Measuring and estimating performance
Measuring performance
Adding memory
Parallel processing
Restructuring processes
Summary
10. Debugging
Breakpoints in RapidMiner Studio
Logging data in RapidMiner Studio
RapidMiner Studio console printing
Groovy scripts
Outputting macros example
Console logging with Groovy
Regex tools
Using XPath effectively
Summary
11. Taking Stock
Exploring new techniques
Time series
Web mining
Using R
Java or Groovy
Third-party components
RapidMiner Server
Where to go next
Index