Fahrenheit and Celsius conversion - linear regression on perfect data

For example, Fahrenheit and Celsius degrees are related in a linear way. Given a table with pairs of both Fahrenheit and Celsius degrees, we can estimate the constants to devise a conversion formula from degrees Fahrenheit to degrees Celsius or vice versa:

°F     °C
5      -15
14     -10
23     -5
32     0
41     5
50     10

Analysis from first principles:

We would like to derive a formula converting F (degrees Fahrenheit) to C (degrees Celsius) as follows:

C=a*F+b

Here, a and b are the constants to be found. The graph of the function C=a*F+b is a straight line, and a straight line is uniquely determined by two points. Therefore, we need only two points from the table, say the pairs (F1,C1) and (F2,C2). Then we have the following:

C1=a*F1+b
C2=a*F2+b

Now, C2-C1=(a*F2+b)-(a*F1+b)=a*(F2-F1). Therefore, we have the following:

a=(C2-C1)/(F2-F1)
b=C1-a*F1=C1-[(C2-C1)/(F2-F1)]*F1

So let us take for example the first two pairs (F1,C1)=(5,-15) and (F2,C2)=(14,-10), then we have the following:

a=(-10-(-15))/(14-5)=5/9
b=-15-(5/9)*5=-160/9

Therefore, the formula to calculate degrees Celsius from degrees Fahrenheit is C=(5/9)*F-160/9≈0.5556*F-17.7778. This is the familiar conversion C=(5/9)*(F-32) written in slope-intercept form.
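The two-point calculation above can be sketched in R as follows. This is only an illustration: the helper name line_through and the variable ab are hypothetical and are not part of the book's source code.

```r
# Hypothetical helper: slope a and intercept b of the line C = a*F + b
# passing through the two points (f1, c1) and (f2, c2).
line_through <- function(f1, c1, f2, c2) {
  a <- (c2 - c1) / (f2 - f1)  # a = (C2-C1)/(F2-F1)
  b <- c1 - a * f1            # b = C1 - a*F1
  c(a = a, b = b)
}

# The first two pairs from the table: (5,-15) and (14,-10).
ab <- line_through(5, -15, 14, -10)
print(ab)  # values close to 5/9 and -160/9
```

Any other two distinct pairs from the table should give the same constants, since all six points lie on one line.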

Let us verify it against the data in the table:

°F     °C     (5/9)*F-160/9
5      -15    -15
14     -10    -10
23     -5     -5
32     0      0
41     5      5
50     10     10

Therefore, the formula fits our input data exactly: every predicted value equals the value in the table. The data we worked with was perfect. In later examples, we will see data that no linear formula can fit exactly. The aim will then be to derive the formula that fits the data best, that is, the one that minimizes the error between the predictions and the actual data.
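The same verification can be done programmatically. The following is a minimal sketch; the variable names are assumptions chosen for illustration. Because 5/9 cannot be represented exactly in floating point, the comparison uses a tolerance rather than exact equality.

```r
# Data from the table.
fahrenheit <- c(5, 14, 23, 32, 41, 50)
celsius <- c(-15, -10, -5, 0, 5, 10)

# Apply the derived formula C = (5/9)*F - 160/9 to every Fahrenheit value.
predicted <- (5 / 9) * fahrenheit - 160 / 9

# Largest absolute deviation between prediction and table value.
max_error <- max(abs(predicted - celsius))
print(max_error)

# all.equal compares numerically with a small tolerance.
fits_exactly <- isTRUE(all.equal(predicted, celsius))
print(fits_exactly)
```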

Analysis using R:

We use the statistical analysis software R to calculate the linear dependence relation between the variables degrees Celsius and degrees Fahrenheit.

R provides the function lm (in its built-in stats package), which fits a linear model relating the variables. It can be used in the following form: lm(y ~ x, data = dataset_for_x_y), where y is the variable dependent on x. Here, the data frame temperatures contains the vectors with the values for x (fahrenheit) and y (celsius):

Input:

# source_code/6/frahrenheit_celsius.r
temperatures = data.frame(
    fahrenheit = c(5,14,23,32,41,50),
    celsius = c(-15,-10,-5,0,5,10)
)
model = lm(celsius ~ fahrenheit, data = temperatures)
print(model)

Output:

$ Rscript fahrenheit_celsius.r

Call:
lm(formula = celsius ~ fahrenheit, data = temperatures)

Coefficients:
(Intercept)   fahrenheit
   -17.7778       0.5556

Therefore, we can see the following approximate linear dependence relation between C (degrees Celsius) and F (degrees Fahrenheit):

C=fahrenheit*F+Intercept=0.5556*F-17.7778

Note that this agrees with our previous calculation.
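To check the agreement numerically rather than by reading the rounded printout, the fitted coefficients can be extracted with coef and compared with the exact values -160/9 and 5/9. This is a sketch under the assumption that the model is fitted as in the script above; the names fitted_coefficients, new_data, and boiling_celsius are hypothetical.

```r
temperatures <- data.frame(
  fahrenheit = c(5, 14, 23, 32, 41, 50),
  celsius = c(-15, -10, -5, 0, 5, 10)
)
model <- lm(celsius ~ fahrenheit, data = temperatures)

# coef returns a named vector: (Intercept) first, then the slope for fahrenheit.
fitted_coefficients <- coef(model)
print(fitted_coefficients)

# Compare with the constants b = -160/9 and a = 5/9 derived from first principles.
print(all.equal(unname(fitted_coefficients), c(-160 / 9, 5 / 9)))

# Use the model to convert a temperature that is not in the table.
new_data <- data.frame(fahrenheit = 212)
boiling_celsius <- predict(model, newdata = new_data)
print(boiling_celsius)  # should be very close to 100
```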

Visualization:

The linear model predicting degrees Celsius from degrees Fahrenheit is displayed as a straight green line. A point (F,C) lies on the green line if and only if F degrees Fahrenheit converts to C degrees Celsius.
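A plot of this kind can be sketched with base R graphics. The output file name, the axis labels, and the choice of a PDF device are assumptions made for illustration, not the book's plotting code.

```r
temperatures <- data.frame(
  fahrenheit = c(5, 14, 23, 32, 41, 50),
  celsius = c(-15, -10, -5, 0, 5, 10)
)
model <- lm(celsius ~ fahrenheit, data = temperatures)

# Hypothetical output path in the session's temporary directory.
plot_file <- file.path(tempdir(), "fahrenheit_celsius.pdf")

pdf(plot_file)                      # open a PDF graphics device
plot(temperatures$fahrenheit, temperatures$celsius,
     xlab = "degrees Fahrenheit", ylab = "degrees Celsius")
abline(reg = model, col = "green")  # draw the fitted line in green
invisible(dev.off())                # close the device and write the file
```

With perfect data, every plotted point should sit on the green line.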
