Terminology

The following table contains some common terms that RapidMiner uses:

Term

Explanation

Process

A process is an executable unit of functionality. The user builds a process from operators and connects them in whatever way is required.

Operator

An operator is a single block of functionality available from the RapidMiner Studio GUI that can be arranged in a process and connected to other operators. Each operator has parameters that can be configured to suit the specific requirements of the process.

Macro

A macro is a global variable that can be set and used by most operators to modify operator behavior.

Repository

A repository is a location where processes, data, models, and files can be stored and read either from the RapidMiner Studio GUI or from a process.

Example

An example is a single row of data.

Example set

This is a set of one or more examples.

Attribute

An attribute is a column of data.

Type

This is the type of an attribute. It can be real, integer, date_time, nominal (both polynominal and binominal), or text.

Role

An attribute's role dictates how operators will use the attribute. The most common role is regular. The other standard roles are known as special roles, and attributes that carry them are called special attributes; these roles include label, id, cluster, prediction, and outlier. It is also possible to give an attribute a custom role, in which case the attribute is ignored by most operators (there are exceptions).

Label

A label is the target attribute to be predicted in a data mining classification context. This is one of the special roles for an attribute.

ID

This is a special role that indicates an identifier for an example. Some operators use the ID as part of their operation.

This table is given here so that readers are aware of the terminology up front and can find it easily later.
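To make the data-related terms concrete, the following minimal sketch models an example set, its attributes, their types, and their roles in plain Python. It is only an illustration of the vocabulary and not the RapidMiner API: the class names `Attribute` and `ExampleSet`, the helper methods `by_role` and `column`, and the sample customer data are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout: this models the glossary terms only,
# it is not how RapidMiner itself is implemented or called.


@dataclass
class Attribute:
    name: str
    value_type: str        # the attribute's type, e.g. "real", "integer", "nominal"
    role: str = "regular"  # "regular", or a special role such as "label" or "id"


@dataclass
class ExampleSet:
    attributes: list
    examples: list = field(default_factory=list)  # each example is one row (a dict)

    def by_role(self, role):
        """Return the names of all attributes that carry the given role."""
        return [a.name for a in self.attributes if a.role == role]

    def column(self, name):
        """Return one attribute (column) as a list of values, one per example."""
        return [row[name] for row in self.examples]


example_set = ExampleSet(
    attributes=[
        Attribute("customer_id", "integer", role="id"),
        Attribute("age", "integer"),
        Attribute("income", "real"),
        Attribute("churned", "nominal", role="label"),
    ],
    examples=[
        {"customer_id": 1, "age": 34, "income": 52000.0, "churned": "no"},
        {"customer_id": 2, "age": 51, "income": 38500.0, "churned": "yes"},
    ],
)

print(example_set.by_role("regular"))  # ['age', 'income']
print(example_set.by_role("label"))    # ['churned']
print(example_set.column("age"))       # [34, 51]
```

In this sketch, a learning step would read only the regular attributes as inputs, treat the label as the value to predict, and use the id solely to identify each example, which mirrors how the roles in the table steer operator behavior.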
