CHAPTER 2
OBSERVATIONS AND THEIR ANALYSIS

2.1 INTRODUCTION

Sets of data can be represented and analyzed using either graphical or numerical methods. Simple graphical analyses to depict trends commonly appear in newspapers or on television. A plot of the daily variation of the closing Dow Jones industrial average over the past year is an example. A bar chart showing daily high temperatures over the past month is another. Also, data can be presented in numerical form and be subjected to numerical analysis. Instead of using the bar chart, the daily high temperatures could be tabulated and their mean computed. In surveying, observational data can also be represented and analyzed either graphically or numerically. In this chapter, some rudimentary methods for doing so are discussed.

2.2 SAMPLE VERSUS POPULATION

Due to time and financial constraints in statistical analyses, generally, only a small sample of data is collected from a much larger, possibly infinite population. For example, political parties may wish to know the percentage of voters who support their candidate. It would be prohibitively expensive to query the entire voting population to obtain the desired information. Instead, polling agencies select a subset of voters from the voting population. This is an example of population sampling.

As another example, suppose that an employer wishes to determine the relative measuring capabilities of two prospective new employees. The candidates could theoretically spend days or even weeks demonstrating their abilities. Obviously, this would not be very practical, so instead, the employer could have each person record a sample of readings, and from the readings predict the person's abilities. For instance, the employer could have each candidate read a micrometer 30 times. The 30 readings would represent a sample of the entire population of possible readings. In fact, in surveying, every time that distances, angles, or elevation differences are measured, samples are being collected from an infinite population of measurements.

From the preceding discussion, the following definitions can be made:

  1. Population. A population consists of all possible measurements that can be made on a particular item or procedure. Often, a population has an infinite number of data elements.
  2. Sample. A sample is a subset of data selected from the population.

TABLE 2.1 Fifty Readings

22.7 25.4 24.0 20.5 22.5
22.3 24.2 24.8 23.5 22.9
25.5 24.7 23.2 22.0 23.8
23.8 24.4 23.7 24.1 22.6
22.9 23.4 25.9 23.1 21.8
22.2 23.3 24.6 24.1 23.2
21.9 24.3 23.8 23.1 25.2
26.1 21.2 23.0 25.9 22.8
22.6 25.3 25.0 22.8 23.6
21.7 23.9 22.3 25.3 20.1

2.3 RANGE AND MEDIAN

Suppose that a one-second (1″) micrometer theodolite is used to read a direction 50 times. The second's portions of the readings are shown in Table 2.1. These readings constitute what is called a data set. How can these data be organized to make them more meaningful? How can one answer the question: Are the data representative of readings that should reasonably be expected with this instrument and a competent operator? What statistical tools can be used to represent and analyze this data set?

One quick numerical method used to analyze data is to compute its range, also called dispersion. Range is the difference between the highest and lowest values. It provides an indication of the precision of the data. From Table 2.1, the lowest value is 20.1 and the highest is 26.1. Thus, the range is 26.1−20.1, or 6.0. The range for this data set can be compared with ranges of other sets, but this comparison has little value when the two sets differ in size. For instance, would a set of 100 data points with a range of 8.5 be better than the set in Table 2.1? Clearly, other methods of statistically analyzing data sets would be useful.

To assist in analyzing data, it is often helpful to list the values in order of increasing size. This was done with the data of Table 2.1 to produce the results shown in Table 2.2. By looking at this ordered set, it is possible to determine quickly the data's middle value or midpoint. In this example, it lies between the values of 23.4 and 23.5. The midpoint value is also known as the median. Since there is an even number of values in this example, the median is given by the average of the two values closest to (which straddle) the midpoint. That is, the median is assigned the average of the 25th and 26th entries in the ordered set of 50 values, and thus for the data set of Table 2.2, the median is the average of 23.4 and 23.5 or 23.45.
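These two computations can be checked with a short script. The following sketch (Python; the variable names are my own, and the 50 readings of Table 2.1 are entered directly) computes the range and the median:

```python
# The 50 micrometer readings (seconds' portions) from Table 2.1.
readings = [
    22.7, 25.4, 24.0, 20.5, 22.5, 22.3, 24.2, 24.8, 23.5, 22.9,
    25.5, 24.7, 23.2, 22.0, 23.8, 23.8, 24.4, 23.7, 24.1, 22.6,
    22.9, 23.4, 25.9, 23.1, 21.8, 22.2, 23.3, 24.6, 24.1, 23.2,
    21.9, 24.3, 23.8, 23.1, 25.2, 26.1, 21.2, 23.0, 25.9, 22.8,
    22.6, 25.3, 25.0, 22.8, 23.6, 21.7, 23.9, 22.3, 25.3, 20.1,
]

# Range (dispersion): difference between the highest and lowest values.
data_range = max(readings) - min(readings)

# Median: midpoint of the ordered set; with an even number of values,
# the average of the two entries that straddle the midpoint.
ordered = sorted(readings)
n = len(ordered)
median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
```

For this data set the script gives a range of 6.0 and a median of 23.45, agreeing with the values found in the text.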

TABLE 2.2 Data in Ascending Order

20.1 20.5 21.2 21.7 21.8
21.9 22.0 22.2 22.3 22.3
22.5 22.6 22.6 22.7 22.8
22.8 22.9 22.9 23.0 23.1
23.1 23.2 23.2 23.3 23.4
23.5 23.6 23.7 23.8 23.8
23.8 23.9 24.0 24.1 24.1
24.2 24.3 24.4 24.6 24.7
24.8 25.0 25.2 25.3 25.3
25.4 25.5 25.9 25.9 26.1

2.4 GRAPHICAL REPRESENTATION OF DATA

Although an ordered numerical tabulation of data allows for some data distribution analysis, it can be improved with a frequency histogram, usually simply called a histogram. Histograms are bar graphs that show the frequency distributions in data. To create a histogram, the data are divided into classes. These are subregions of data that usually have a uniform range in values, or class width. Although there are no universally applicable rules for the selection of class width, generally 5 to 20 classes are used.

As a rule of thumb, a data set of 30 values may have only 5 or 6 classes, whereas a data set of 100 values may have as many as 15 to 20 classes. In general, the smaller the data set, the lower the number of classes used.

The histogram class width (range of data represented by each histogram bar) is determined by dividing the total range by the number of classes to be used. For example, consider the data of Table 2.2. If they were divided into seven classes, the class width would be the range divided by the number of classes, or 6.0/7 = 0.857, or 0.86. The first class interval is found by adding the class width to the lowest data value. For the data in Table 2.2, the first class interval is from 20.1 to (20.1 + 0.86), or 20.96. This class interval includes all data from 20.1 up to, but not including, 20.96. The next class interval is from 20.96 up to (20.96 + 0.86), or 21.82. Remaining class intervals are found by adding the class width to the upper boundary value of the preceding class. The class intervals for the data of Table 2.2 are listed in column (1) of Table 2.3.

TABLE 2.3 Frequency Table

(1) Class Interval    (2) Class Frequency    (3) Class Relative Frequency
20.10 – 20.96                  2                     2/50 = 0.04
20.96 – 21.82                  3                     3/50 = 0.06
21.82 – 22.67                  8                     8/50 = 0.16
22.67 – 23.53                 13                    13/50 = 0.26
23.53 – 24.38                 11                    11/50 = 0.22
24.38 – 25.24                  6                     6/50 = 0.12
25.24 – 26.10                  7                     7/50 = 0.14
                          ∑ = 50                ∑ = 50/50 = 1

After creating class intervals, the number of data values in each interval is tallied. This is called the class frequency. Obviously, having data ordered consecutively as shown in Table 2.2 aids greatly in this counting process. Column (2) of Table 2.3 shows the class frequency for each class interval of the data in Table 2.2.

Often, it is also useful to calculate the class relative frequency for each interval. This is found by dividing the class frequency by the total number of observations. For the data in Table 2.2, the class relative frequency for the first class interval is 2/50 = 0.04. Similarly, the class relative frequency of the fourth interval (from 22.67 to 23.53) is 13/50 = 0.26. The class relative frequencies for the data of Table 2.2 are given in column (3) of Table 2.3. Notice that the sum of all class relative frequencies is always 1. The class relative frequency enables easy determination of percentages, which are called percentage points. For instance, the class interval from 21.82 to 22.67 contains 16% (0.16 × 100%) of the sample observations.
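The class frequencies and relative frequencies of Table 2.3 can be tallied programmatically. The sketch below is one way of doing it in Python (the data and the choice of seven classes come from the text; the helper names are arbitrary):

```python
# Tally the frequency table of Table 2.3: seven classes spanning the range.
readings = [
    22.7, 25.4, 24.0, 20.5, 22.5, 22.3, 24.2, 24.8, 23.5, 22.9,
    25.5, 24.7, 23.2, 22.0, 23.8, 23.8, 24.4, 23.7, 24.1, 22.6,
    22.9, 23.4, 25.9, 23.1, 21.8, 22.2, 23.3, 24.6, 24.1, 23.2,
    21.9, 24.3, 23.8, 23.1, 25.2, 26.1, 21.2, 23.0, 25.9, 22.8,
    22.6, 25.3, 25.0, 22.8, 23.6, 21.7, 23.9, 22.3, 25.3, 20.1,
]

num_classes = 7
low = min(readings)                                    # 20.1
width = (max(readings) - low) / num_classes            # 6.0/7, about 0.857

class_freq = [0] * num_classes
for y in readings:
    # Each class runs from its lower bound up to, but not including, its
    # upper bound; the top class also includes the maximum value.
    k = min(int((y - low) / width), num_classes - 1)
    class_freq[k] += 1

# Class relative frequency: class frequency over total observations.
rel_freq = [f / len(readings) for f in class_freq]
```

The resulting frequencies, [2, 3, 8, 13, 11, 6, 7], match column (2) of Table 2.3, and the relative frequencies sum to 1.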

A histogram is a bar graph plotted with either class frequencies or relative class frequencies on the ordinate, versus values of the class interval bounds on the abscissa. Using the data from Table 2.3, the histogram shown in Figure 2.1 was constructed. Notice that in this figure, relative frequencies have been plotted as ordinates.

FIGURE 2.1 Frequency histogram.

Histograms drawn with the same ordinate and abscissa scales can be used to compare two different data sets. If one data set is more precise than the other, it will have comparatively tall bars in the center of the histogram, with relatively short bars near its edges. Conversely, the less precise data set will yield a wider range of abscissa values, with shorter bars at the center.

Items easily seen on a histogram include:

  • Whether the data are symmetrical about a central value
  • The range or dispersion in the measured values
  • The frequency of occurrence of the measured values
  • The steepness of the histogram, which is an indication of measurement precision

Figure 2.2 shows several possible histogram shapes. Figure 2.2(a) depicts a histogram that is symmetric about its central value with a single peak in the middle. Figure 2.2(b) is also symmetric about the center but has a steeper slope than Figure 2.2(a), with a higher peak for its central value. Assuming the ordinate and abscissa scales to be equal, the data used to plot Figure 2.2(b) are more precise than those used for Figure 2.2(a). Symmetric histogram shapes are common in surveying practice, as well as in many other fields. In fact, they are so common that the shapes are said to be examples of a normal distribution. In Chapter 3, the reasons why these shapes are so common are discussed.

FIGURE 2.2 Common histogram shapes.

Figure 2.2(c) has two peaks and is said to be a bimodal histogram. In the histogram of Figure 2.2(d), there is a single peak with a long tail to the left; this results from a skewed data set, and in particular, these data are said to be skewed to the left (negatively skewed). The data of the histogram in Figure 2.2(e) are skewed to the right.

In surveying, the varying histogram shapes just described result from variations in personnel, physical conditions, and equipment. For example, consider repeated observations of a long distance made both with an EDM instrument and by taping. The EDM procedure would probably produce data having a very narrow range, and thus the resulting histogram would be narrow and steep with a tall central bar, like that in Figure 2.2(b). The histogram of the same distance measured by tape and plotted at the same scales would probably be wider, with neither its sides as steep nor its central bar as tall, like that shown in Figure 2.2(a). Since observations in surveying practice tend to be normally distributed, bimodal or skewed histograms from measured data are not expected, and the appearance of such a histogram should prompt an investigation into its cause. For instance, if a data set from an EDM calibration plots as a bimodal histogram, it could raise questions about whether the instrument or reflector was moved during the measuring process, or whether atmospheric conditions changed dramatically during the session. Similarly, a skewed histogram in EDM work may indicate the arrival of a weather front that stabilized over time. The existence of multipath errors in GNSS observations could also produce these types of histogram plots.

2.5 NUMERICAL METHODS OF DESCRIBING DATA

Numerical descriptors are values computed from a data set that are used to interpret the data's precision or quality. Numerical descriptors fall into three categories: (1) measures of central tendency, (2) measures of data variation, and (3) measures of relative standing. These categories are all called statistics. Simply described, a statistic is a numerical descriptor computed from sample data.

2.6 MEASURES OF CENTRAL TENDENCY

Measures of central tendency are computed statistical quantities that give an indication of the value within a data set that tends to exist at the center. The arithmetic mean, median, and mode are three such measures. They are described as follows:

  1. Arithmetic mean: For a set of n observations, y1, y2, …, yn, the arithmetic mean is the average of the observations. Its value, ȳ, is computed from the following equation:
    (2.1) ȳ = (y1 + y2 + ⋯ + yn) / n = Σyi / n

    Typically, the symbol ȳ is used to represent the sample's arithmetic mean, and the symbol μ is used to represent the population mean; otherwise, the same equation applies. Using Equation (2.1), the mean of the observations in Table 2.2 is 23.5.

  2. Median: As mentioned previously, this is the midpoint of a sample set when arranged in ascending or descending order. One-half of the data are above the median and one-half are below it. When there are an odd number of quantities, only one such value satisfies this condition. For a data set with an even number of quantities, the average of the two observations that straddle the midpoint is used to represent the median. Due to the relatively small number of observations in surveying, it is seldom used.
  3. Mode: Within a sample of data, the mode is the most frequently occurring value. It is seldom used in surveying because of the relatively small number of values observed in a typical set of observations. And in small sample sets, several different values may occur with the same frequency, and hence, the mode can be meaningless as a measure of central tendency. The mode for the data in Table 2.2 is 23.8. It is possible for a set of data to have more than one mode. A common example is a data set with two modes, which is said to be bimodal.
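For sample data, these three measures are available directly in Python's standard statistics module; the short sketch below applies them to the 50 readings of Table 2.1:

```python
import statistics

readings = [
    22.7, 25.4, 24.0, 20.5, 22.5, 22.3, 24.2, 24.8, 23.5, 22.9,
    25.5, 24.7, 23.2, 22.0, 23.8, 23.8, 24.4, 23.7, 24.1, 22.6,
    22.9, 23.4, 25.9, 23.1, 21.8, 22.2, 23.3, 24.6, 24.1, 23.2,
    21.9, 24.3, 23.8, 23.1, 25.2, 26.1, 21.2, 23.0, 25.9, 22.8,
    22.6, 25.3, 25.0, 22.8, 23.6, 21.7, 23.9, 22.3, 25.3, 20.1,
]

mean = statistics.mean(readings)      # arithmetic mean, Equation (2.1)
median = statistics.median(readings)  # midpoint of the ordered set
mode = statistics.mode(readings)      # most frequently occurring value
```

This returns a mean of 23.5, a median of 23.45, and a mode of 23.8, matching the values cited above.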

2.7 ADDITIONAL DEFINITIONS

Nine other terms, pertinent to the study of observations and their analysis, are listed and defined here:

  1. True value, μ: A quantity's theoretically correct or exact value. As noted in Section 1.3, the true value can never be determined.
  2. Error, ε: The error is the difference between any individual observed quantity and its true value. The true value is simply the population's arithmetic mean if all repeated observations have equal precision. Since the true value of an observed quantity is indeterminate, errors are also indeterminate and are therefore only theoretical quantities. As given in Equation (1.1), and repeated here for convenience, errors are expressed as
    (2.2) εi = yi − μ

    where yi is the individual observation associated with εi, and μ is the true value for that quantity.

  3. Most probable value, ȳ: The most probable value is the value for a measured quantity that, based upon the observations, has the highest probability of occurrence. It is derived from a sample set of data rather than from the population, and it is simply the mean if the repeated observations have the same precision.
  4. Residual, v: A residual is the difference between the most probable value for a quantity and any individual observation of that quantity. Residuals are the values used in adjustment computations, since most probable values can be determined but true values cannot. The term error is frequently used when residual is meant; although the two are very similar and behave in the same manner, the theoretical distinction above remains. The mathematical expression for a residual is
    (2.3) vi = ȳ − yi

    where vi is the residual in the ith observation, yi, and ȳ is the most probable value for the unknown.

  5. Degrees of freedom: Also called redundancies, the degrees of freedom are the number of observations that are in excess of the number necessary to solve for the unknowns. In other words, the number of degrees of freedom equals the number of redundant observations (see Section 1.6). As an example, if a distance between two points is measured three times, one observation would determine the unknown distance and the other two are redundant. These redundant observations reveal the discrepancies and inconsistencies in observed values. This, in turn, makes possible the practice of adjustment computations for obtaining the most probable values based on the measured quantities.
  6. Variance, σ²: This is a value by which the precision of a data set is given. The population variance applies to a data set consisting of an entire population. It is the mean of the squares of the errors and is given by
    (2.4) σ² = Σεi² / n

    The sample variance applies to a sample set of data. It is an unbiased estimate of the population variance given in Equation (2.4) and is calculated as

    (2.5) S² = Σvi² / (n − 1)

    Note that Equations (2.4) and (2.5) are identical except that ε has been changed to v, and n has been changed to (n − 1), in Equation (2.5). The validity of these modifications is demonstrated in Section 2.11.

    It is important to note that the simple algebraic average of all errors in a data set cannot be used as a meaningful precision indicator. This is because random errors are as likely to be positive as negative, and thus the algebraic average will equal zero. This fact is shown for a population of data in the following simple proof. Summing Equation (2.2) over n values gives

    (a) Σεi = Σ(yi − μ) = Σyi − nμ

    Then substituting the expression for the mean from Equation (2.1) into Equation (a) yields

    (b) Σεi = Σyi − n(Σyi / n) = Σyi − Σyi = 0

    Similarly, it can be shown that the mean of all residuals of a sample data set equals zero.

  7. Standard error, σ: This is the square root of the population variance. From Equation (2.4) and this definition, the following equation is written for the standard error:
    (2.6) σ = √(Σε² / n)

    where n is the number of observations and Σε² is the sum of the squares of the errors. Note that the population variance, σ², and the standard error, σ, are indeterminate because true values, and hence errors, are indeterminate.

    As will be explained in Section 3.5, 68.3% of all observations in a population data set lie within ±σ of the true value, μ. Thus, the larger the standard error, the more dispersed are the values in the data set and the less precise is the measurement.

  8. Standard deviation, S: This is the square root of the sample variance. It is calculated using the expression
    (2.7) S = √(Σv² / (n − 1))

    where S is the standard deviation, n − 1 is the degrees of freedom, and Σv² is the sum of the squared residuals. The standard deviation is an estimate for the standard error of the population. Since the standard error cannot be determined, the standard deviation is a practical expression for the precision of a sample set of data. Residuals are used rather than errors because residuals can be calculated from most probable values, whereas errors cannot be determined. As discussed in Section 3.5, for a sample set of data, 68.3% of the observations will theoretically lie between the most probable value plus and minus the standard deviation, S. The meaning of this statement will be clarified in the example that follows.

  9. Standard deviation of the mean: Because all observed values contain errors, the mean, which is computed from a sample set of measured values, will also contain error. The standard deviation of the mean is computed from the sample standard deviation according to the equation
    (2.8) Sȳ = S / √n

    Notice that as n approaches infinity, Sȳ approaches zero. This illustrates that as the size of the sample set approaches the total population, the computed mean ȳ will approach the true mean, μ. This equation is derived in Section 6.2.3.
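As a compact illustration of items 6, 8, and 9, the following Python sketch computes the sample variance, standard deviation, and standard deviation of the mean for a small hypothetical set of five distance observations (the values are invented for illustration only):

```python
import math

# Hypothetical sample: five observations of a distance, in meters.
obs = [100.005, 100.003, 100.008, 100.004, 100.005]
n = len(obs)

y_bar = sum(obs) / n                     # Equation (2.1): most probable value
v = [y_bar - y for y in obs]             # Equation (2.3): residuals
S2 = sum(vi * vi for vi in v) / (n - 1)  # Equation (2.5): sample variance
S = math.sqrt(S2)                        # Equation (2.7): standard deviation
S_mean = S / math.sqrt(n)                # Equation (2.8): std. dev. of the mean

# As shown by Equation (b), the residuals always sum to zero.
assert abs(sum(v)) < 1e-9
```

For these invented values, S is about ±0.0019 m and the standard deviation of the mean is about ±0.0008 m, illustrating how repetition tightens the estimate of the mean.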

2.8 ALTERNATIVE FORMULA FOR DETERMINING VARIANCE

From the definition of residuals, Equation (2.5) is rewritten as

(2.9) S² = Σ(ȳ − yi)² / (n − 1)

Expanding Equation (2.9) yields

(c) S² = [(ȳ − y1)² + (ȳ − y2)² + ⋯ + (ȳ − yn)²] / (n − 1)

Substituting Equation (2.1) for ȳ into Equation (c), and dropping the bounds for the summation,

(d) S² = [(Σy/n − y1)² + (Σy/n − y2)² + ⋯ + (Σy/n − yn)²] / (n − 1)

Expanding Equation (d),

(e) S² = [(Σy)²/n² − 2y1(Σy/n) + y1² + (Σy)²/n² − 2y2(Σy/n) + y2² + ⋯ + (Σy)²/n² − 2yn(Σy/n) + yn²] / (n − 1)

Rearranging Equation (e) and recognizing that (Σy)²/n² occurs n times in Equation (e) yields

(f) S² = [n(Σy)²/n² − 2(Σy/n)(y1 + y2 + ⋯ + yn) + (y1² + y2² + ⋯ + yn²)] / (n − 1)

Adding the summation symbol to Equation (f) yields

(g) S² = [(Σy)²/n − (2/n)(Σy)(Σy) + Σy²] / (n − 1)

Factoring and regrouping similar summations in Equation (g) produces

(h) S² = [Σy² − (Σy)²/n] / (n − 1)

Multiplying the last term in Equation (h) by n/n yields

(i) S² = [Σy² − n(Σy/n)²] / (n − 1)

Finally, by substituting Equation (2.1) in Equation (i), the following expression for the variance results:

(2.10) S² = (Σy² − nȳ²) / (n − 1)

Using Equation (2.10), the variance of a sample data set can be computed by subtracting n times the square of the data's mean from the summation of the squared individual observations. With this equation, the variance and the standard deviation can be computed directly from the data. However, it should be stated that with large numerical values, Equation (2.10) may overwhelm a hand-held calculator or a computer working in single precision. The data should be centered or Equation (2.5) used when this problem arises. Centering a data set involves subtracting a constant value (usually, the arithmetic mean or something near the mean) from all values in a data set. By doing this, the values are modified to a smaller, more manageable size.
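The equivalence of Equations (2.5) and (2.10), and the effect of centering, can be verified numerically. In the sketch below (Python; the function names are arbitrary), both formulas are applied to the Table 2.1 readings, and Equation (2.10) is applied again after centering the data on 23.5:

```python
readings = [
    22.7, 25.4, 24.0, 20.5, 22.5, 22.3, 24.2, 24.8, 23.5, 22.9,
    25.5, 24.7, 23.2, 22.0, 23.8, 23.8, 24.4, 23.7, 24.1, 22.6,
    22.9, 23.4, 25.9, 23.1, 21.8, 22.2, 23.3, 24.6, 24.1, 23.2,
    21.9, 24.3, 23.8, 23.1, 25.2, 26.1, 21.2, 23.0, 25.9, 22.8,
    22.6, 25.3, 25.0, 22.8, 23.6, 21.7, 23.9, 22.3, 25.3, 20.1,
]

def variance_eq25(data):
    # Equation (2.5): sum of squared residuals over (n - 1).
    n = len(data)
    m = sum(data) / n
    return sum((m - y) ** 2 for y in data) / (n - 1)

def variance_eq210(data):
    # Equation (2.10): (sum of y^2 minus n times the squared mean) over (n - 1).
    n = len(data)
    m = sum(data) / n
    return (sum(y * y for y in data) - n * m * m) / (n - 1)

s2_definition = variance_eq25(readings)
s2_alternative = variance_eq210(readings)

# Centering: subtract a constant near the mean to keep the squared terms
# small before applying Equation (2.10); the variance is unchanged.
s2_centered = variance_eq210([y - 23.5 for y in readings])
```

All three results agree: S² ≈ 1.885, so S ≈ 1.37.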

2.9 NUMERICAL EXAMPLES

By demonstration in Example 2.1, it can be seen that Equations (2.7) and (2.10) will yield the same standard deviation for a sample set. Notice that the number of observations within a single standard deviation from the mean, that is, between (23.5″ − 1.37″) and (23.5″ + 1.37″), or between 22.13″ and 24.87″, is 34. This represents 34/50 × 100% or 68% of all observations in the sample and matches the theory noted earlier. Also note that the algebraic sum of residuals is zero as was earlier demonstrated by Equation (b).
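Although Example 2.1 itself is not reproduced here, its key computation can be sketched as follows (Python): the mean and standard deviation are computed from the Table 2.1 readings, and the observations lying within one standard deviation of the mean are counted.

```python
import math

readings = [
    22.7, 25.4, 24.0, 20.5, 22.5, 22.3, 24.2, 24.8, 23.5, 22.9,
    25.5, 24.7, 23.2, 22.0, 23.8, 23.8, 24.4, 23.7, 24.1, 22.6,
    22.9, 23.4, 25.9, 23.1, 21.8, 22.2, 23.3, 24.6, 24.1, 23.2,
    21.9, 24.3, 23.8, 23.1, 25.2, 26.1, 21.2, 23.0, 25.9, 22.8,
    22.6, 25.3, 25.0, 22.8, 23.6, 21.7, 23.9, 22.3, 25.3, 20.1,
]
n = len(readings)

mean = sum(readings) / n
S = math.sqrt(sum((mean - y) ** 2 for y in readings) / (n - 1))

# Count the observations lying within one standard deviation of the mean,
# i.e., between 22.13 and 24.87 for this data set.
low, high = mean - S, mean + S
within = sum(1 for y in readings if low <= y <= high)
percent = 100 * within / n
```

This gives a mean of 23.5, S ≈ 1.37, and 34 observations (68% of the sample) within ȳ ± S, matching the figures quoted above.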

The histogram, shown in Figure 2.1, plots class relative frequencies versus class values. Notice how the values tend to be grouped about the central point. This is an example of a precise data set.

2.10 ROOT MEAN SQUARE ERROR AND MAPPING STANDARDS

Today, maps can be obtained in hardcopy and digital form. Thus, modern map accuracy standards are often based on a statistical quantity known as the root mean square error (RMSE). To check the accuracy of a map, the discrepancies between the coordinates of points determined from the map and the coordinates of those same points observed with a higher-order check survey must be determined. Obviously, these points must be well defined both on the map and on the surface of the earth so that coordinate values can be obtained at the same locations during the field survey.

Since the check survey must be of higher accuracy than the map, these discrepancies are often treated as residuals, where the results of the check survey are considered to be without error and thus represent the true values. This concept of values determined from a higher-order survey being considered true values is not without precedent. For example, the distances listed on the coordinate datasheet for a station, or the lengths listed on an EDM calibration baseline report, are often considered the true values, since the surveys determining these values are more accurate than what a typical field survey would yield.

RMSE is defined as the square root of the average of squared residuals for values tested. Thus, for map accuracy standards, the differences between coordinates and elevations of points obtained from a map and their values as determined by a check survey are used to determine the accuracies of the map. Mathematically, RMSE is denoted as

(2.11) RMSE = √( Σ[f(xi) − xi]² / n )

where n is the number of tested samples from the map, f(xi) is the position of the point obtained from the map, xi is the position obtained from the check survey, and f(xi) − xi is the residual error for the point.

As an example consider the data shown in Table 2.7. The map coordinate values for the well-defined points are shown in columns (1) – (3). The check survey coordinate values are shown in columns (4) – (6). The discrepancies shown in columns (7) – (9) are computed as the difference between the mapped and surveyed coordinate values, which are listed as residuals. Thus, the residual for the x coordinate of point 1 is computed as 672,571.819 – 672,571.777 = 0.042 m. Similarly all other coordinate differences are determined.

TABLE 2.7 Map Coordinates versus Surveyed Checkpoint Coordinates

Mapped Points Check Points Residuals
Point (1) x (m) (2) y (m) (3) z (m) (4) E (m) (5) N (m) (6) H (m) (7) ΔE (8) ΔN (9) ΔH
  1 672,571.819 410,943.912 79.832 672,571.777 410,943.930 79.865 0.042 −0.018 −0.033
  2 671,203.830 418,741.450 72.483 671,203.869 418,741.425 72.457 −0.039 0.025 0.026
  3 671,203.830 426,812.590 91.565 671,203.847 426,812.566 91.627 −0.017 0.024 −0.062
  4 660,396.717 427,222.982 99.340 660,396.666 427,222.983 99.377 0.051 −0.001 −0.037
  5 637,824.897 425,170.999 77.340 637,824.849 425,170.984 77.306 0.048 0.015 0.034
  6 638,372.093 409,165.527 71.839 638,372.105 409,165.507 71.830 −0.012 0.020 0.009
  7 638,782.490 416,963.064 70.133 638,782.487 416,963.064 70.137 0.003 0.000 −0.004
  8 651,504.788 426,128.591 78.531 651,504.796 426,128.575 78.571 −0.008 0.016 −0.040
  9 643,980.848 422,571.819 81.486 643,980.840 422,571.835 81.414 0.008 −0.016 0.072
10 645,212.038 408,755.130 86.276 645,212.063 408,755.129 86.282 −0.025 0.001 −0.006
11 667,236.662 409,575.923 93.552 667,236.659 409,575.935 93.514 0.003 −0.012 0.038
12 664,911.081 422,982.216 75.410 664,911.068 422,982.213 75.387 0.013 0.003 0.023
13 655,198.358 415,595.075 71.369 655,198.395 415,595.052 71.433 −0.037 0.023 −0.064
14 647,674.419 414,774.282 80.002 647,674.397 414,774.273 80.041 0.022 0.009 −0.039
15 654,787.962 421,340.629 89.366 654,787.971 421,340.648 89.315 −0.009 −0.019 0.051
16 663,269.494 416,279.070 78.303 663,269.486 416,279.044 78.328 0.008 0.026 −0.025
17 656,984.354 408,996.255 81.205 656,984.379 408,996.226 81.209 −0.025 0.029 −0.004
18 668,113.208 431,698.113 76.001 668,113.253 431,698.112 76.087 −0.045 0.001 −0.086
19 655,660.377 431,797.820 72.424 655,660.433 431,797.795 72.446 −0.056 0.025 −0.022
20 643,962.264 430,943.396 84.189 643,962.266 430,943.429 84.179 −0.002 −0.033 0.010
∑v² 0.017378 0.007031 0.033783
S ±0.030 ±0.019 ±0.042
RMSE ±0.029 ±0.019 ±0.041
RMSEr ±0.035

The standard deviations for each coordinate type are computed from Equation (2.7) using the sums of the squared residuals listed in Table 2.7:

S_ΔE = ±√(0.017378 / (20 − 1)) = ±0.030 m
S_ΔN = ±√(0.007031 / (20 − 1)) = ±0.019 m
S_ΔH = ±√(0.033783 / (20 − 1)) = ±0.042 m

Using Equation (2.11), the RMSE in ΔE (RMSE_ΔE), ΔN (RMSE_ΔN), and ΔH (RMSE_ΔH) are computed from their corresponding residuals as

RMSE_ΔE = ±√(0.017378 / 20) = ±0.029 m
RMSE_ΔN = ±√(0.007031 / 20) = ±0.019 m
RMSE_ΔH = ±√(0.033783 / 20) = ±0.041 m

From RMSE_ΔE and RMSE_ΔN, a radial value for the horizontal positional accuracy is determined as

RMSE_r = ±√(RMSE_ΔE² + RMSE_ΔN²) = ±√(0.029² + 0.019²) = ±0.035 m
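These computations can be reproduced directly from the residual columns of Table 2.7. The sketch below (Python; the names are my own) computes the three RMSE values and the radial value; because the residuals are listed only to the millimeter, the results agree with the tabulated values at that level.

```python
import math

# Residuals from Table 2.7, columns (7)-(9), in meters.
dE = [0.042, -0.039, -0.017, 0.051, 0.048, -0.012, 0.003, -0.008, 0.008,
      -0.025, 0.003, 0.013, -0.037, 0.022, -0.009, 0.008, -0.025, -0.045,
      -0.056, -0.002]
dN = [-0.018, 0.025, 0.024, -0.001, 0.015, 0.020, 0.000, 0.016, -0.016,
      0.001, -0.012, 0.003, 0.023, 0.009, -0.019, 0.026, 0.029, 0.001,
      0.025, -0.033]
dH = [-0.033, 0.026, -0.062, -0.037, 0.034, 0.009, -0.004, -0.040, 0.072,
      -0.006, 0.038, 0.023, -0.064, -0.039, 0.051, -0.025, -0.004, -0.086,
      -0.022, 0.010]

def rmse(residuals):
    # Equation (2.11): square root of the mean of the squared residuals.
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

rmse_E, rmse_N, rmse_H = rmse(dE), rmse(dN), rmse(dH)

# Radial horizontal positional accuracy from the two horizontal components.
rmse_r = math.hypot(rmse_E, rmse_N)
```

Rounded to the millimeter, this yields ±0.029 m, ±0.019 m, and ±0.041 m for the three components, and ±0.035 m for the radial value.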

Following these computations, standards usually require a certain confidence level for the horizontal and vertical components of the map.

2.11 DERIVATION OF THE SAMPLE VARIANCE (BESSEL'S CORRECTION)

Recall from Section 2.7 that the denominator of the equation for the sample variance was n − 1, while the denominator of the population variance was n. A simple explanation for this difference is that one observation is needed to compute the mean, ȳ, and thus only n − 1 observations remain for the variance's computation. A derivation of Equation (2.5) will clarify this point.

Consider a sample of size n drawn from a population with mean μ and standard error σ. Let yi be an observation from the sample; then

(j) yi − μ = (yi − ȳ) + (ȳ − μ) = (yi − ȳ) + ε̄

where ε̄ = ȳ − μ is the error, or deviation, of the sample mean. Squaring and expanding Equation (j) yields

(yi − μ)² = (yi − ȳ)² + 2ε̄(yi − ȳ) + ε̄²

Summing over all the observations in the sample, from i equaling 1 to n, yields

(k) Σ(yi − μ)² = Σ(yi − ȳ)² + 2ε̄ Σ(yi − ȳ) + nε̄²

Since, by definition of the sample mean ȳ,

(l) Σ(yi − ȳ) = Σyi − nȳ = 0

Equation (k) becomes

(m) Σ(yi − μ)² = Σ(yi − ȳ)² + nε̄²

Repeating this calculation for many samples, the mean value of the left-hand side of Equation (m) will, by definition of σ², tend to nσ². Similarly, by Equation (2.8), the mean value of nε̄² will tend to n times the variance of ȳ, since ε̄ represents the deviation of the sample mean from the population mean; that is, nε̄² tends to nσȳ² = n(σ²/n) = σ². The above discussion and Equation (m) result in

(n) nσ² = Σ(yi − ȳ)² + σ²

Rearranging Equation (n) produces

(o) Σ(yi − ȳ)² / (n − 1) = σ²

Thus, from Equation (o), and recognizing the left side of the equation as S² for a sample set of data, it follows that

(p) S² → σ²

In other words, for a large number of random samples, the value of Σ(yi − ȳ)² / (n − 1) tends to σ². That is, S² is an unbiased estimate of the population's variance.
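The unbiasedness of S² can also be illustrated by simulation. The following Python sketch (the population parameters, sample size, seed, and trial count are arbitrary choices) draws many small samples from a normal population and averages the two competing variance estimates:

```python
import random

random.seed(42)
mu, sigma = 23.5, 1.4        # hypothetical population parameters
n, trials = 5, 20000         # small samples make the bias easy to see

sum_var_n1 = 0.0             # Equation (2.5): divisor n - 1
sum_var_n = 0.0              # biased alternative: divisor n
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((y - m) ** 2 for y in sample)
    sum_var_n1 += ss / (n - 1)
    sum_var_n += ss / n

avg_var_n1 = sum_var_n1 / trials   # tends to sigma**2
avg_var_n = sum_var_n / trials     # tends to (n - 1) * sigma**2 / n
```

With these settings, the n − 1 estimate averages near σ² = 1.96, while the n-divisor estimate averages near (n − 1)σ²/n ≈ 1.57, demonstrating the bias that Bessel's correction removes.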

2.12 SOFTWARE

A Windows-based, statistical software package called STATS is available on the companion website for this book. It can be used to quickly perform statistical analysis of data sets as presented in this chapter. The data file used in STATS is simply a listing of the individual observations. For example, in Example 2.1, the data file can be entered as it is shown in Table 2.1. After saving this file, the “Histogram data” option under the programs menu is selected. After entering the appropriate file into the software, the software performs the computations discussed in this chapter and plots a frequency histogram of the data using the user-specified class interval or the desired number of classes.

Additionally, an electronic book is provided on the companion website for this book. To view the electronic book interactively, the Mathcad® software is required. However, for those who do not have a copy of Mathcad®, HTML files of the electronic book are also on the website. The electronic book demonstrates most of the numerical examples provided in this book. In particular, the electronic book c2.xmcd demonstrates the use of Mathcad® to solve Examples 2.1 and 2.2.

Also, a spreadsheet can be used to perform the computations in this chapter. For example, Microsoft Excel® has functions for determining the mean, median, mode, standard deviation, and histogram data. The average function computes the mean for a selected set of data. The stdev.s function computes the standard deviation for a selected sample set of data. Similarly, the mode and median functions determine these values for a set of data, and the min and max functions determine the minimum and maximum values. Additionally, with an available plug-in, the software can automatically tally histogram data based on user-specified class intervals, which are known as bins. These functions are demonstrated for Example 2.1 in the file c2.xls on the companion website.

Many of the chapters have programming problems listed at the end of each chapter. The electronic book demonstrates the rudiments of programming these problems. Other programs on the companion website include MATRIX and ADJUST. MATRIX can be used to solve problems involving matrices in this book. ADJUST has working least squares adjustment examples discussed in this book. ADJUST can be used to check solutions of many of the problems in this book.

The installation software for the programs ADJUST, MATRIX, and STATS is available in the zip file on the website. This software is provided as an aid in learning the material in this book, and purchasers of this book may install it on their computers. The spreadsheet and worksheet files discussed in this book can be copied from the companion website to your computer. The Mathcad® e-book should be copied to the handbook subdirectory of the Mathcad program. If you do not own Mathcad, HTML files of the e-book are provided, which can be copied and viewed once you have unpacked the zipped archive from the companion website. Readers should refer to Appendix G for specific details about the software on the website.

PROBLEMS

Note: Partial answers to problems marked with an asterisk are given in Appendix H.

  1. *2.1 The optical micrometer of a precise differential level is set and read 10 times as 8.801, 8.803, 8.798, 8.801, 8.799, 8.802, 8.802, 8.804, 8.800, and 8.802. What value would you assign to the operator's ability to set the micrometer on this instrument?
  2. 2.2 A distance measured in units of meters is observed 10 times as 186.499, 186.498, 186.495, 186.499, 186.498, 186.489, 186.489, 186.498, 186.500, and 186.491. What is the:
    1. (a) Range of the data?
    2. (b) Mean?
    3. (c) Median?
    4. *(d) Mode?
  3. 2.3 Using the data in Problem 2.2 tabulate the residuals and compute the variance, standard deviation, and standard deviation of the mean.
  4. 2.4 The seconds portions of 10 pointings and readings for a particular direction are 26.5, 27.4, 31.5, 27.4, 24.8, 25.7, 33.1, 29.0, 28.8, and 26.0. What is the:
    1. *(a) Largest discrepancy in the data?
    2. (b) Mean?
    3. (c) Median?
    4. (d) Mode?
  5. 2.5 Using the data in Problem 2.4, tabulate the residuals and compute the variance, standard deviation, and standard deviation of the mean.
  6. 2.6 The seconds portions of 32 pointings and readings for a particular direction made using a 1″ total station with a 0.1″ display are: 48.9, 48.8, 48.6, 49.0, 48.9, 47.8, 47.8, 48.8, 49.1, 48.0, 48.0, 48.2, 48.9, 48.6, 48.8, 48.9, 48.2, 48.5, 48.5, 49.1, 48.6, 47.8, 47.8, 48.1, 49.0, 48.0, 49.1, 48.4, 47.9, 48.2, 47.9, and 48.1.
    1. (a) What is the mean of the data set?
    2. (b) Construct a frequency histogram of the data using seven uniform-width class intervals.
    3. *(c) What are the variance and standard deviation of the data?
    4. (d) What is the standard deviation of the mean?
  7. 2.7 An EDM instrument and reflector are set at the ends of a baseline that is 200.014 m long. Its length is measured 21 times, with the following results: 200.014, 200.013, 200.007, 200.016, 200.011, 200.015, 200.012, 200.018, 200.014, 200.012, 200.011, 200.009, 200.019, 200.016, 200.009, 200.016, 200.015, 200.018, 200.016, 200.007, and 200.014.
    1. (a) What are the mean, median, and standard deviation of the data?
    2. (b) Construct a histogram of the data with seven intervals and describe its properties. On the histogram lay off the sample standard deviation from both sides of the mean.
    3. (c) How many observations fall within one sample standard deviation of the mean, and what percentage of the observations does this represent?
  8. 2.8 Answer Problem 2.7 with the following additional observations: 200.009, 200.015, 200.010, 200.016, 200.010, and 200.011.
  9. 2.9 Answer Problem 2.8 with the following additional observations: 200.010, 200.016, 200.016, and 200.015.
  10. 2.10 A distance was measured in two parts with a 100-ft. steel tape and then in its entirety with a 200-ft. steel tape. Five repetitions were made by each method. What are the mean, variance, and standard deviation for each method of measurement?
    Distances measured with 100-ft tape:
    Section 1: 100.006, 100.004, 100.001, 100.006, 100.005
    Section 2: 86.777, 86.779, 86.785, 86.778, 86.774
    Distance measured with 200-ft tape:
    186.778, 186.776, 186.781, 186.786, 186.782
  11. 2.11 Repeat Problem 2.10 using the following additional data for the 200-ft taped distance. 186.781, 186.784, 186.779, 186.778, and 186.776.
  12. 2.12 During a triangulation project, an observer made 16 readings for each direction. The seconds portions of the directions to Station Orion are listed as 26.9, 27.5, 27.1, 26.5, 25.6, 27.2, 27.4, 26.6, 26.9, 26.1, 27.4, 27.3, 27.7, 26.4, 28.4, and 27.4.
    1. (a) Using a 0.5″ class interval, plot the histogram using relative frequencies for the ordinates.
    2. (b) Analyze the data and note any abnormalities.
    3. (c) As a supervisor, would you recommend reobservation of the station?
  13. 2.13 A particular line in a survey is measured three times on four separate occasions. The resulting 12 observations in units of meters are 536.191, 536.189, 536.187, 536.202, 536.200, 536.203, 536.202, 536.201, 536.199, 536.196, 536.205, and 536.202.
    1. (a) Compute the mean, median, and mode of the data.
    2. (b) Compute the variance and standard deviation of the data.
    3. (c) Using a class width of 0.004 m, plot a histogram of the data, and note any abnormalities that may be present.
  14. 2.14 Repeat Problem 2.13, but use a class width of 0.003 m in part (c).
  15. 2.15 During a triangulation project, an observer made 32 readings for each direction using a 3″ total station. The seconds portions of the directions are listed below. Using seven class intervals, plot the histogram with relative frequencies for the ordinates. Analyze the data and state whether this set appears to be reasonable. 18, 17, 19, 13, 23, 18, 14, 18, 22, 21, 22, 17, 17, 20, 20, 16, 24, 16, 17, 20, 21, 20, 22, 16, 26, 21, 17, 21, 25, 24, 25, and 20.
  16. 2.16 Two students have an argument over who can turn an angle better. To resolve the argument, they agree to each measure a single angle 10 times. The results of the observations are:
    Student A: 108°26′10″, 108°26′10″, 108°26′08″, 108°26′10″, 108°26′10″, 108°26′05″, 108°26′04″, 108°26′10″, 108°26′11″, 108°26′05″
    Student B: 108°26′12″, 108°26′11″, 108°26′09″, 108°26′13″, 108°26′12″, 108°26′01″, 108°26′01″, 108°26′11″, 108°26′14″, 108°26′03″
    1. (a) What are the means and variances of both data sets?
    2. (b) Construct a histogram of each data set using a 3″ class width.
    3. (c) Which student performed better in this situation?

    Use the program STATS to do Problems 2.17–2.21.

  17. 2.17 Use the program STATS to compute the mean, median, mode, and standard deviation of the data in Table 2.2 and plot a centered histogram of the data using nine intervals.
  18. 2.18 Problem 2.6.
  19. 2.19 Problem 2.7.
  20. 2.20 Problem 2.10.
  21. 2.21 Problem 2.16.
  22. 2.22 Compute the standard deviation, root mean square error, and horizontal root mean square error for the map and check survey coordinates listed in the following table.
    Map Coordinates Check Survey
    Point e (m) n (m) h (m) E (m) N (m) H (m)
      1 643,012.990 382,012.235 151.012 643,012.978 382,012.236 151.029
      2 643,018.605 382,008.065 145.000 643,018.602 382,008.045 144.986
      3 643,018.538 382,001.677 157.675 643,018.525 382,001.672 157.676
      4 643,027.819 382,002.114 152.290 643,027.813 382,002.111 152.295
      5 643,025.532 382,007.696 148.788 643,025.534 382,007.700 148.776
      6 643,033.905 382,006.250 150.051 643,033.894 382,006.239 150.047
      7 643,034.443 382,002.517 159.903 643,034.430 382,002.489 159.916
      8 643,028.021 382,015.161 154.465 643,028.029 382,015.162 154.455
      9 643,034.510 382,010.184 154.197 643,034.501 382,010.175 154.200
    10 643,034.510 382,019.938 150.347 643,034.506 382,019.951 150.333
    11 643,026.138 382,022.895 153.941 643,026.142 382,022.899 153.959
    12 643,020.959 382,014.623 151.639 643,020.953 382,014.629 151.654
    13 643,014.268 382,022.626 147.369 643,014.250 382,022.639 147.359
    14 643,011.308 382,018.725 153.243 643,011.313 382,018.714 153.231
    15 643,002.498 382,020.575 151.955 643,002.502 382,020.556 151.954
    16 643,003.137 382,013.614 155.297 643,003.141 382,013.629 155.285
    17 643,008.080 382,009.344 145.239 643,008.076 382,009.349 145.241
    18 643,002.532 382,005.880 153.661 643,002.534 382,005.885 153.669
    19 643,006.231 382,002.988 149.267 643,006.240 382,002.997 149.270
    20 643,002.061 382,001.206 150.317 643,002.054 382,001.211 150.315
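The quantities asked for in Problem 2.22 can be sketched as follows. The definitions used here, RMSE as the root mean square of the map-minus-check coordinate differences and horizontal RMSE as the combination of the easting and northing components, are the common NSSDA-style ones and should be checked against the chapter's own formulas. Only the first three points of the table are used, for brevity.

```python
# Hedged sketch for Problem 2.22 using NSSDA-style definitions;
# only the first three table points are included here.
import math
import statistics

map_e = [643012.990, 643018.605, 643018.538]
chk_E = [643012.978, 643018.602, 643018.525]
map_n = [382012.235, 382008.065, 382001.677]
chk_N = [382012.236, 382008.045, 382001.672]

dE = [m - c for m, c in zip(map_e, chk_E)]   # easting differences
dN = [m - c for m, c in zip(map_n, chk_N)]   # northing differences

sd_E = statistics.stdev(dE)                  # sample standard deviation
sd_N = statistics.stdev(dN)
rmse_E = math.sqrt(sum(d * d for d in dE) / len(dE))
rmse_N = math.sqrt(sum(d * d for d in dN) / len(dN))
rmse_horizontal = math.sqrt(rmse_E**2 + rmse_N**2)

print(sd_E, sd_N, rmse_E, rmse_N, rmse_horizontal)
```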

PRACTICAL EXERCISES

  1. 2.23 Using a total station, point at a well-defined target and read the horizontal circle. With the tangent screw or jog-shuttle mechanism, move the instrument off the point and then repoint on the same target. Record this reading. Repeat this process 50 times. Perform the calculations of Problem 2.6 using this data set.
  2. 2.24 Determine your EDM/reflector constant, K, by observing the distances between three points that are on line, as shown in the figure. Distance AB should be roughly 60 m long and BC roughly 90 m long, with B situated between A and C. From the measured values AC, AB, and BC, the constant K can be determined as follows:
    Illustration of the distances between the three points A, B, and C on a line.

    Since

    AC + K = (AB + K) + (BC + K)

    then

    K = AC − (AB + BC)

    When establishing the line, be sure that AB ≠ BC and that all three points are precisely on a straight line. Use three tripods and tribrachs to minimize setup errors and be sure that all are in adjustment. Measure each line 20 times with the instrument in the metric mode. Be sure to adjust the distances for the appropriate temperature and pressure and for differences in elevation. Determine the 20 values of K and analyze the sample set. What is the mean value for K, and what is its standard deviation?
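    The reduction described above can be sketched in a few lines. The measured distances below are invented for illustration; the relation K = AC − (AB + BC) follows from AC + K = (AB + K) + (BC + K).

    ```python
    # Hypothetical sketch of the EDM/reflector constant reduction.
    # Since AC + K = (AB + K) + (BC + K), it follows that
    # K = AC - (AB + BC). The measured values below are invented.
    import statistics

    AB = [59.985, 59.986, 59.984, 59.985, 59.987]
    BC = [89.985, 89.984, 89.986, 89.985, 89.983]
    AC = [149.985, 149.986, 149.984, 149.985, 149.986]

    Ks = [ac - (ab + bc) for ab, bc, ac in zip(AB, BC, AC)]
    K_mean = statistics.mean(Ks)    # mean value of K
    K_stdev = statistics.stdev(Ks)  # its sample standard deviation

    print(K_mean, K_stdev)
    ```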
