Dataset (medical)

The dataset used in this chapter is publicly available from the UCI Machine Learning Repository, which is maintained by the School of Information and Computer Science at the University of California, Irvine. You can access it at https://archive.ics.uci.edu/ml/datasets/cardiotocography.

Note that this URL lets you download the data as an Excel file. You can convert it to .csv format by saving the file as a .csv file.

Once the data is in .csv format, we can read it into R and inspect its structure, as shown in the following code:

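# Read the CTG data from the .csv file (adjust the path to your own copy)
library(keras)
data <- read.csv('~/Desktop/data/CTG.csv', header = TRUE)
# Inspect the structure of the data frame
str(data)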

OUTPUT
 ## 'data.frame':   2126 obs. of  22 variables:
 ##  $ LB      : int  120 132 133 134 132 134 134 122 122 122 ...
 ##  $ AC      : num  0 0.00638 0.00332 0.00256 0.00651 ...
 ##  $ FM      : num  0 0 0 0 0 0 0 0 0 0 ...
 ##  $ UC      : num  0 0.00638 0.00831 0.00768 0.00814 ...
 ##  $ DL      : num  0 0.00319 0.00332 0.00256 0 ...
 ##  $ DS      : num  0 0 0 0 0 0 0 0 0 0 ...
 ##  $ DP      : num  0 0 0 0 0 ...
 ##  $ ASTV    : int  73 17 16 16 16 26 29 83 84 86 ...
 ##  $ MSTV    : num  0.5 2.1 2.1 2.4 2.4 5.9 6.3 0.5 0.5 0.3 ...
 ##  $ ALTV    : int  43 0 0 0 0 0 0 6 5 6 ...
 ##  $ MLTV    : num  2.4 10.4 13.4 23 19.9 0 0 15.6 13.6 10.6 ...
 ##  $ Width   : int  64 130 130 117 117 150 150 68 68 68 ...
 ##  $ Min     : int  62 68 68 53 53 50 50 62 62 62 ...
 ##  $ Max     : int  126 198 198 170 170 200 200 130 130 130 ...
 ##  $ Nmax    : int  2 6 5 11 9 5 6 0 0 1 ...
 ##  $ Nzeros  : int  0 1 1 0 0 3 3 0 0 0 ...
 ##  $ Mode    : int  120 141 141 137 137 76 71 122 122 122 ...
 ##  $ Mean    : int  137 136 135 134 136 107 107 122 122 122 ...
 ##  $ Median  : int  121 140 138 137 138 107 106 123 123 123 ...
 ##  $ Variance: int  73 12 13 13 11 170 215 3 3 1 ...
 ##  $ Tendency: int  1 0 0 1 1 0 0 1 1 1 ...
 ##  $ NSP     : int  2 1 1 1 1 3 3 3 3 3 ...

This dataset consists of 2,126 fetal cardiotocograms (CTGs), one per row. Each CTG was classified by three expert obstetricians, and a consensus label was assigned to it: normal (N), represented by 1; suspect (S), represented by 2; or pathological (P), represented by 3. This label is stored in the target variable, NSP. The remaining 21 columns are the independent variables, and the main objective is to develop a classification model that correctly classifies each case into one of the three categories N, S, and P.
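
To make the coding of the target variable concrete, here is a minimal sketch of how the numeric NSP codes map to the N, S, and P labels and how the class counts can be tabulated. It does not read the real file. The vector nsp is a toy stand-in holding only the first ten NSP values visible in the str() output, and the names nsp_label and class_counts are illustrative, not part of the original code.

```r
# Toy stand-in for data$NSP: the first ten values shown in the str() output.
# With the real data, you would use data$NSP instead (assumption: the file
# has been loaded into `data` as shown earlier).
nsp <- c(2, 1, 1, 1, 1, 3, 3, 3, 3, 3)

# Map the numeric codes to the class labels described in the text:
# 1 = normal (N), 2 = suspect (S), 3 = pathological (P).
nsp_label <- factor(nsp, levels = c(1, 2, 3), labels = c("N", "S", "P"))

# Count how many records fall into each class.
class_counts <- table(nsp_label)
print(class_counts)
```

Running the same two steps on the full NSP column would show how the 2,126 records are distributed across the three classes, which is worth checking before fitting any classification model.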
