For disease diagnosis, we are going to use the free proben1 dataset collection, which is available on the Web (http://www.filewatcher.com/m/proben1.tar.gz.1782734-0.html). Proben1 is a benchmark set of several datasets from different domains. We are going to use the cancer and the diabetes datasets, and we add a class, DiagnosisExample, to run the experiments of each case.
The breast cancer dataset is composed of 10 variables, of which nine are inputs and one is a binary output. The dataset has 699 records; 16 of them were found to be incomplete and were excluded, so 683 records were used to train and test the neural network.
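The filtering step above can be sketched as follows. This is a minimal illustration, assuming the raw file marks missing values with a `?`, as the original UCI source of this dataset does; the class and method names are our own, not part of proben1:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the record-filtering step: drop every row that contains a
// missing value (assumed to be encoded as "?" in the raw data file).
public class RecordFilter {

    static List<String[]> dropIncomplete(List<String[]> rows) {
        List<String[]> complete = new ArrayList<>();
        for (String[] row : rows) {
            // keep only rows with no "?" entry
            if (!Arrays.asList(row).contains("?")) {
                complete.add(row);
            }
        }
        return complete;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[]{"5", "1", "1", "2"}); // complete
        rows.add(new String[]{"8", "?", "2", "2"}); // incomplete -> dropped
        rows.add(new String[]{"4", "1", "3", "4"}); // complete
        System.out.println(dropIncomplete(rows).size()); // prints 2
    }
}
```

Applied to the full file, this is the step that reduces the 699 raw records to the 683 complete ones.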
The following table shows a configuration of this dataset:
| Variable Name | Type | [Minimum; Maximum] Value |
|---|---|---|
| Diagnosis result | OUTPUT | [0; 1] |
| Clump Thickness | INPUT #1 | [1; 10] |
| Uniformity of Cell Size | INPUT #2 | [1; 10] |
| Uniformity of Cell Shape | INPUT #3 | [1; 10] |
| Marginal Adhesion | INPUT #4 | [1; 10] |
| Single Epithelial Cell Size | INPUT #5 | [1; 10] |
| Bare Nuclei | INPUT #6 | [1; 10] |
| Bland Chromatin | INPUT #7 | [1; 10] |
| Normal Nucleoli | INPUT #8 | [1; 10] |
| Mitoses | INPUT #9 | [1; 10] |
So, the proposed neural topology will be that of the following figure:
The dataset division was made as follows:
As in the previous cases, we ran many experiments to find the best neural network for classifying whether a cancer is benign or malignant. We conducted 12 different experiments (1,000 epochs per experiment), in which the MSE and accuracy values were analyzed. After that, the confusion matrix, sensitivity, and specificity were generated with the test dataset and analyzed. Finally, an analysis of generalization was performed. The neural networks involved in the experiments are shown in the following table:
| Experiment | Number of neurons in hidden layer | Learning rate | Activation Function |
|---|---|---|---|
| #1 | 3 | 0.1 | Hidden Layer: SIGLOG; Output Layer: LINEAR |
| #2 | 3 | 0.1 | Hidden Layer: HYPERTAN; Output Layer: LINEAR |
| #3 | 3 | 0.5 | Hidden Layer: SIGLOG; Output Layer: LINEAR |
| #4 | 3 | 0.5 | Hidden Layer: HYPERTAN; Output Layer: LINEAR |
| #5 | 3 | 0.9 | Hidden Layer: SIGLOG; Output Layer: LINEAR |
| #6 | 3 | 0.9 | Hidden Layer: HYPERTAN; Output Layer: LINEAR |
| #7 | 5 | 0.1 | Hidden Layer: SIGLOG; Output Layer: LINEAR |
| #8 | 5 | 0.1 | Hidden Layer: HYPERTAN; Output Layer: LINEAR |
| #9 | 5 | 0.5 | Hidden Layer: SIGLOG; Output Layer: LINEAR |
| #10 | 5 | 0.5 | Hidden Layer: HYPERTAN; Output Layer: LINEAR |
| #11 | 5 | 0.9 | Hidden Layer: SIGLOG; Output Layer: LINEAR |
| #12 | 5 | 0.9 | Hidden Layer: HYPERTAN; Output Layer: LINEAR |
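The 12 configurations above are simply the cross product of two hidden-layer sizes, three learning rates, and two hidden-layer activation functions. A minimal sketch of how such a grid can be enumerated is shown below; the class and method names are ours for illustration and do not come from the book's code:

```java
import java.util.ArrayList;
import java.util.List;

// Enumerates the 12 experiment configurations from the table above:
// {3, 5} hidden neurons x {0.1, 0.5, 0.9} learning rates x {SIGLOG, HYPERTAN}.
public class ExperimentGrid {

    static List<String> configurations() {
        int[] hiddenNeurons = {3, 5};
        double[] learningRates = {0.1, 0.5, 0.9};
        String[] hiddenActivations = {"SIGLOG", "HYPERTAN"};
        List<String> configs = new ArrayList<>();
        for (int n : hiddenNeurons)
            for (double lr : learningRates)
                for (String act : hiddenActivations)
                    configs.add(n + " neurons, lr=" + lr
                            + ", hidden=" + act + ", output=LINEAR");
        return configs;
    }

    public static void main(String[] args) {
        List<String> configs = configurations();
        for (int i = 0; i < configs.size(); i++)
            System.out.println("#" + (i + 1) + ": " + configs.get(i));
    }
}
```

The enumeration order matches the table: experiment #1 is 3 neurons, learning rate 0.1, SIGLOG hidden layer, and experiment #12 is 5 neurons, learning rate 0.9, HYPERTAN hidden layer.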
After each experiment, we collected the MSE values (see the following table); experiments #4, #8, #9, #10, and #11 were equivalent, because they combine low MSE values with the same total accuracy (99.25%). From these five, we selected experiments #4 and #11, because they have the lowest MSE values:
| Experiment | Training MSE | Total accuracy |
|---|---|---|
| #1 | 0.01067 | 96.29% |
| #2 | 0.00443 | 98.50% |
| #3 | 9.99611E-4 | 97.77% |
| #4 | 9.99913E-4 | 99.25% |
| #5 | 9.99670E-4 | 96.26% |
| #6 | 9.92578E-4 | 97.03% |
| #7 | 0.01392 | 98.49% |
| #8 | 0.00367 | 99.25% |
| #9 | 9.99928E-4 | 99.25% |
| #10 | 9.99951E-4 | 99.25% |
| #11 | 9.99926E-4 | 99.25% |
| #12 | NaN | 3.44% |
Graphically, the MSE falls very quickly over time, as can be seen in the following chart of the fourth experiment. Although training was set to 1,000 epochs, the experiment stopped earlier, because the minimum overall error (0.001) was reached:
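The stop criterion described above can be sketched as a simple training loop: run for at most 1,000 epochs, but stop as soon as the overall MSE falls below 0.001. In this sketch the per-epoch error decay is simulated (each epoch reduces the MSE by 10%), standing in for the book's real training call:

```java
// Sketch of the early-stopping criterion: max epochs OR minimum overall error,
// whichever is reached first. The 10% decay per epoch is a placeholder for
// one real training epoch of the network.
public class EarlyStop {

    static int trainedEpochs(int maxEpochs, double minOverallError) {
        double mse = 1.0; // simulated initial error
        int epoch = 0;
        while (epoch < maxEpochs && mse > minOverallError) {
            mse *= 0.9;   // placeholder for one real training epoch
            epoch++;
        }
        return epoch;
    }

    public static void main(String[] args) {
        int epochs = trainedEpochs(1000, 0.001);
        // stops well before the 1,000-epoch limit
        System.out.println("stopped after " + epochs + " epochs");
    }
}
```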
The following table shows the confusion matrix, sensitivity, and specificity for both experiments. It is possible to check that the measures are the same for both:
| Experiment | Confusion Matrix | Sensitivity | Specificity |
|---|---|---|---|
| #4 | [[34.0, 1.0], [0.0, 99.0]] | 97.22% | 100.0% |
| #11 | [[34.0, 1.0], [0.0, 99.0]] | 97.22% | 100.0% |
If we had to choose between the models generated by experiments #4 and #11, we would recommend #4, because it is the simpler of the two (it has fewer neurons in the hidden layer).
An additional example to explore is the diagnosis of diabetes. This dataset has eight inputs and one output, shown in the table below. There are 768 records, all complete. However, proben1 notes that several variables contain senseless zero values, which probably indicate missing data. We handle these values as if they were real anyway, thereby introducing some errors (or noise) into the dataset:
| Variable Name | Type | [Minimum; Maximum] Value |
|---|---|---|
| Diagnosis result | OUTPUT | [0; 1] |
| Number of times pregnant | INPUT #1 | [0.0; 17] |
| Plasma glucose concentration at 2 hours in an oral glucose tolerance test | INPUT #2 | [0.0; 199] |
| Diastolic blood pressure (mm Hg) | INPUT #3 | [0.0; 122] |
| Triceps skin fold thickness (mm) | INPUT #4 | [0.0; 99] |
| 2-hour serum insulin (mu U/ml) | INPUT #5 | [0.0; 744] |
| Body mass index (weight in kg/(height in m)^2) | INPUT #6 | [0.0; 67.1] |
| Diabetes pedigree function | INPUT #7 | [0.078; 2.42] |
| Age (years) | INPUT #8 | [21; 81] |
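The amount of noise introduced by the suspicious zero values can be quantified by counting zeros per input column before training. The sketch below uses three made-up sample rows in the layout of the table above (columns in INPUT #1..#8 order); the class and method names are ours for illustration:

```java
// Counts zero values per input column, so the "zero as missing data"
// problem mentioned by proben1 can be measured before training.
public class ZeroCounter {

    static int[] countZerosPerColumn(double[][] rows, int columns) {
        int[] zeros = new int[columns];
        for (double[] row : rows)
            for (int c = 0; c < columns; c++)
                if (row[c] == 0.0) zeros[c]++;
        return zeros;
    }

    public static void main(String[] args) {
        // illustrative sample rows in INPUT #1..#8 order
        double[][] sample = {
            {6, 148, 72, 35, 0, 33.6, 0.627, 50},  // insulin == 0 (suspicious)
            {1,  85, 66, 29, 0, 26.6, 0.351, 31},  // insulin == 0
            {8, 183, 64,  0, 0, 23.3, 0.672, 32},  // skin fold and insulin == 0
        };
        int[] zeros = countZerosPerColumn(sample, 8);
        System.out.println(java.util.Arrays.toString(zeros));
        // prints [0, 0, 0, 1, 3, 0, 0, 0]
    }
}
```

A column such as serum insulin with many zeros is a likely source of the noise the text mentions.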
The dataset division was made as follows:
To discover the best neural network topology for classifying diabetes, we used the same set of network configurations and the same analysis described in the last section. However, this time we use multiple-class classification in the output layer: two neurons are used, one for the presence of diabetes and one for its absence.
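With two output neurons, the binary label must be one-hot encoded for training and decoded back from the network's outputs for evaluation. A minimal sketch, with names of our own choosing:

```java
// One-hot encoding for the two-neuron output layer described above:
// neuron 0 fires for "diabetes present", neuron 1 for "absent".
public class OneHotOutput {

    static double[] encode(int label) {        // label: 1 = present, 0 = absent
        return label == 1 ? new double[]{1.0, 0.0}
                          : new double[]{0.0, 1.0};
    }

    static int decode(double[] output) {       // pick the most active neuron
        return output[0] >= output[1] ? 1 : 0;
    }

    public static void main(String[] args) {
        double[] net = {0.83, 0.21};           // e.g. raw network outputs
        System.out.println(decode(net));       // prints 1 (diabetes present)
        System.out.println(decode(encode(0))); // prints 0 (round trip)
    }
}
```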
So, the proposed neural architecture looks like that of the following figure:
The table below shows the MSE training value and accuracy of all twelve experiments:
| Experiment | Training MSE | Total accuracy |
|---|---|---|
| #1 | 0.00807 | 60.54% |
| #2 | 0.00590 | 71.03% |
| #3 | 9.99990E-4 | 75.49% |
| #4 | 9.98840E-4 | 74.17% |
| #5 | 0.00184 | 61.58% |
| #6 | 9.82774E-4 | 59.86% |
| #7 | 0.00706 | 63.57% |
| #8 | 0.00584 | 72.41% |
| #9 | 9.99994E-4 | 74.66% |
| #10 | 0.01047 | 72.14% |
| #11 | 0.00316 | 59.86% |
| #12 | 0.43464 | 40.13% |
The MSE falls quickly in both cases, although in experiment #9 the error rises slightly over the first epochs, as shown in the following figure:
Analyzing the confusion matrices, it can be seen that the measures are very similar:
| Experiment | Confusion Matrix | Sensitivity | Specificity |
|---|---|---|---|
| #3 | [[35.0, 12.0], [25.0, 79.0]] | 74.46% | 75.96% |
| #9 | [[34.0, 12.0], [26.0, 78.0]] | 73.91% | 75.00% |
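The sensitivity and specificity columns follow directly from the confusion matrices. Reading each matrix with row 0 as the actual positives [TP, FN] and row 1 as the actual negatives [FP, TN], a minimal sketch of the computation (class and method names are ours) is:

```java
// Computes sensitivity and specificity from a 2x2 confusion matrix laid
// out as in the tables above: row 0 = actual positives [TP, FN],
// row 1 = actual negatives [FP, TN].
public class ConfusionMetrics {

    static double sensitivity(double[][] m) {
        return m[0][0] / (m[0][0] + m[0][1]);   // TP / (TP + FN)
    }

    static double specificity(double[][] m) {
        return m[1][1] / (m[1][0] + m[1][1]);   // TN / (FP + TN)
    }

    public static void main(String[] args) {
        double[][] exp3 = {{35.0, 12.0}, {25.0, 79.0}};  // experiment #3 (diabetes)
        System.out.printf("sensitivity = %.4f%n", sensitivity(exp3)); // ~0.745
        System.out.printf("specificity = %.4f%n", specificity(exp3)); // ~0.760
    }
}
```

For experiment #3 this gives 35/47 for sensitivity and 79/104 for specificity, matching the table.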
Once again, we suggest choosing the simplest model; in the diabetes example, that is the artificial neural network generated by experiment #3.