Adjusting values – data preprocessing

Raw data collected from a data source usually has its own particularities, such as data range, sampling, and category. Some variables result directly from measurements, while others are summaries or calculated values. Preprocessing means adapting these variables' values to a form that neural networks can handle properly.

Regarding weather variables, let's take a look at their range, sampling, and type, shown in the following table:

Variable	Unit	Range	Sampling	Type
Mean temperature	°C	23.86–29.25	Hourly	Average of hourly measurements
Precipitation	mm	0–161.20	Daily	Accumulation of daily rain
Insolation	h	0–10.40	Daily	Count of hours receiving sun radiation
Mean humidity	%	65.50–96.00	Hourly	Average of hourly measurements
Mean wind speed	km/h	0.00–3.27	Hourly	Average of hourly measurements

Except for insolation and precipitation, which are daily totals, the variables are all measured hourly and share the same sampling. If we wanted to use an hourly dataset, for example, we would have to preprocess all the variables to the same sample rate. Three of the variables (mean temperature, mean humidity, and mean wind speed) are summarized as daily averages of hourly measurements. We could use the hourly measurements instead, but their range would very likely be wider.
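As a minimal sketch of bringing variables to the same sample rate, the following Python snippet summarizes hourly readings per day. It assumes the readings are available as (day, value) pairs; the function name `daily_summary` and the sample readings are hypothetical and serve only as an illustration. Measured variables are averaged, while accumulated variables are summed.

```python
from collections import defaultdict
from statistics import mean


def daily_summary(hourly, how):
    """Group (day, value) pairs by day and summarize each group.

    how="mean" suits measured variables such as temperature;
    how="sum" suits accumulated variables such as precipitation.
    """
    if how not in ("mean", "sum"):
        raise ValueError("how must be 'mean' or 'sum'")
    groups = defaultdict(list)
    for day, value in hourly:
        groups[day].append(value)
    reducer = mean if how == "mean" else sum
    return {day: reducer(values) for day, values in sorted(groups.items())}


# Made-up readings, for illustration only (not taken from the dataset).
temperature = [("day1", 24.0), ("day1", 26.0), ("day2", 25.0)]
rain = [("day1", 0.0), ("day1", 2.5), ("day2", 1.0)]

daily_temperature = daily_summary(temperature, "mean")
daily_rain = daily_summary(rain, "sum")
```

Going the other way, from daily to hourly values, would require interpolation or some other assumption about how the values are distributed within a day.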

Equalizing data – normalization

Normalization is the process of bringing all the variables into the same data range, usually with smaller values, between 0 and 1 or between -1 and 1. This helps keep the values presented to the neural network within the region where activation functions such as sigmoid or hyperbolic tangent actually vary.

Input values that are too high or too low may push a neuron into the saturated regions of its activation function, where the output is pinned near its upper or lower limit and the derivative is too small, near zero. With a derivative that small, the weights of these neurons barely change during training.
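The effect can be checked numerically. The sketch below compares the derivative of the sigmoid function at a raw value and at a normalized one. It assumes, for simplicity, that the value reaches the neuron unchanged (a weight of 1 and no bias); the helper names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


raw_value = 96.0    # e.g. mean humidity in percent, not normalized
norm_value = 0.96   # a value already inside the 0..1 range

d_raw = sigmoid_derivative(raw_value)    # effectively zero: saturated
d_norm = sigmoid_derivative(norm_value)  # clearly non-zero
```

With the raw value the derivative is vanishingly small, so gradient-based training has almost nothing to work with, whereas the normalized value stays in the region where the sigmoid still responds to changes.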

The normalization should consider a predefined range of the dataset. It is performed as follows:

$$X_{norm} = N_{min} + \frac{X - X_{min}}{X_{max} - X_{min}} \left(N_{max} - N_{min}\right)$$

Where N_min and N_max represent the normalized minimum and maximum limits, respectively; X_min and X_max denote the X variable's minimum and maximum limits, respectively; X indicates the original value; and X_norm refers to the normalized value. If we want the normalization to be between 0 and 1, for example, the equation is simplified as follows:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$
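As a minimal sketch, min-max normalization with the symbols defined here could be written in Python as follows. The function name `normalize` and its parameter names are hypothetical; the temperature limits are taken from the table of weather variables.

```python
def normalize(x, x_min, x_max, n_min=0.0, n_max=1.0):
    """Map x from [x_min, x_max] to [n_min, n_max] (min-max normalization)."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return n_min + (x - x_min) / (x_max - x_min) * (n_max - n_min)


# Mean temperature limits from the table: 23.86 to 29.25 degrees Celsius.
T_MIN, T_MAX = 23.86, 29.25

low = normalize(T_MIN, T_MIN, T_MAX)           # lower limit maps to 0
high = normalize(T_MAX, T_MIN, T_MAX)          # upper limit maps to 1
sym = normalize(T_MIN, T_MIN, T_MAX, -1.0, 1.0)  # lower limit maps to -1
```

The limits should come from the predefined range of the dataset. A value outside that range would be mapped outside the normalized limits, since this sketch does not clip its result.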

By applying the normalization, a new "normalized" dataset is produced and fed to the neural network. One should also take into account that a neural network fed with normalized values will be trained to produce normalized values on the output, so the inverse (denormalization) process becomes necessary as well:

$$X = X_{min} + \frac{X_{norm} - N_{min}}{N_{max} - N_{min}} \left(X_{max} - X_{min}\right)$$

or:

$$X = X_{min} + X_{norm} \left(X_{max} - X_{min}\right)$$

for the normalization between 0 and 1.
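A minimal sketch of the denormalization step in Python, assuming the same limits that were used to normalize the data are still available. The function name `denormalize` is hypothetical; the limits are taken from the table of weather variables.

```python
def denormalize(x_norm, x_min, x_max, n_min=0.0, n_max=1.0):
    """Map x_norm from [n_min, n_max] back to the original [x_min, x_max]."""
    if n_max == n_min:
        raise ValueError("n_max and n_min must differ")
    return x_min + (x_norm - n_min) / (n_max - n_min) * (x_max - x_min)


# Precipitation limits from the table: 0 to 161.20 mm.
P_MIN, P_MAX = 0.0, 161.20

# A network output of 0.5 on the 0..1 scale corresponds to half the range.
rain_mm = denormalize(0.5, P_MIN, P_MAX)

# On the -1..1 scale, an output of 0 is the middle of the temperature range.
temp_c = denormalize(0.0, 23.86, 29.25, n_min=-1.0, n_max=1.0)
```

The limits passed to the denormalization must be the ones used when the training data was normalized; otherwise the network's outputs are mapped back to the wrong scale.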
