Teacher resources and professional development across the curriculum

Data Session 7, Part D: Fitting Lines to Data
 

Session 7, Part D:
Fitting Lines to Data

In This Part: Trend Lines | Error | The SSE | More Lines | Summary

Another way to see how close an individual's data point is to a line is to square the error. This is similar to how you calculated the variance in Session 5, where you squared the distances from the mean. Like the absolute value, a squared error is never negative. Again, for each individual point, the smaller the squared error, the closer the actual data point is to the line. Here are the squared errors for Persons 1 through 12:

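| Person # | Arm Span (X) | Height (Y) | YL = X | Error = Y - YL | (Error)² = (Y - YL)² |
|---|---|---|---|---|---|
| 1 | 156 | 162 | 156 | 6 | 36 |
| 2 | 157 | 160 | 157 | 3 | 9 |
| 3 | 159 | 162 | 159 | 3 | 9 |
| 4 | 160 | 155 | 160 | -5 | 25 |
| 5 | 161 | 160 | 161 | -1 | 1 |
| 6 | 161 | 162 | 161 | 1 | 1 |
| 7 | 162 | 170 | 162 | 8 | 64 |
| 8 | 165 | 166 | 165 | 1 | 1 |
| 9 | 170 | 170 | 170 | 0 | 0 |
| 10 | 170 | 167 | 170 | -3 | 9 |
| 11 | 173 | 185 | 173 | 12 | 144 |
| 12 | 173 | 176 | 173 | 3 | 9 |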
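The arithmetic behind this table can be reproduced with a short script. The following is only a minimal sketch: the function name `squared_errors` and the variable names are assumptions made for illustration, and the data values are copied from the table for Persons 1 through 12.

```python
# Arm span (X) and height (Y) for Persons 1-12, copied from the table above.
arm_span = [156, 157, 159, 160, 161, 161, 162, 165, 170, 170, 173, 173]
height = [162, 160, 162, 155, 160, 162, 170, 166, 170, 167, 185, 176]


def squared_errors(xs, ys, predict):
    """Return a list of (error, squared error) pairs, one per person.

    `predict` is the rule for the line: it maps an arm span X to the
    predicted height YL. The error is Y - YL, as in the table.
    """
    rows = []
    for x, y in zip(xs, ys):
        error = y - predict(x)
        rows.append((error, error ** 2))
    return rows


# The line Height = Arm Span, i.e. YL = X.
rows = squared_errors(arm_span, height, lambda x: x)

for person, (error, squared) in enumerate(rows, start=1):
    print(person, error, squared)
```

Passing a different rule, such as `lambda x: x - 1`, would give the errors for another candidate line without changing the rest of the script.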

Problem D6


Complete the table to find the squared error for the remaining 12 people.

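| Person # | Arm Span (X) | Height (Y) | YL = X | Error = Y - YL | (Error)² = (Y - YL)² |
|---|---|---|---|---|---|
| 13 | 177 | 173 | | | |
| 14 | 177 | 176 | | | |
| 15 | 178 | 178 | | | |
| 16 | 184 | 180 | | | |
| 17 | 188 | 188 | | | |
| 18 | 188 | 187 | | | |
| 19 | 188 | 182 | | | |
| 20 | 188 | 181 | | | |
| 21 | 188 | 192 | | | |
| 22 | 194 | 193 | | | |
| 23 | 196 | 184 | | | |
| 24 | 200 | 186 | | | |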

 

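Answers:

| Person # | Arm Span (X) | Height (Y) | YL = X | Error = Y - YL | (Error)² = (Y - YL)² |
|---|---|---|---|---|---|
| 13 | 177 | 173 | 177 | -4 | 16 |
| 14 | 177 | 176 | 177 | -1 | 1 |
| 15 | 178 | 178 | 178 | 0 | 0 |
| 16 | 184 | 180 | 184 | -4 | 16 |
| 17 | 188 | 188 | 188 | 0 | 0 |
| 18 | 188 | 187 | 188 | -1 | 1 |
| 19 | 188 | 182 | 188 | -6 | 36 |
| 20 | 188 | 181 | 188 | -7 | 49 |
| 21 | 188 | 192 | 188 | 4 | 16 |
| 22 | 194 | 193 | 194 | -1 | 1 |
| 23 | 196 | 184 | 196 | -12 | 144 |
| 24 | 200 | 186 | 200 | -14 | 196 |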



 
 

Another measure of how well a particular line describes the relationship in bivariate data is the total of the squared errors. When comparing two lines, the line with the smaller total of the squared errors is the "better" line in terms of how well it describes the linear relationship between the two variables. For the line Height = Arm Span, this total is the sum of the sixth column, (Error)², over all 24 people in the tables above, which is 784.

This quantity, the sum of squared errors (SSE), is what statisticians prefer to use when comparing different lines for potential fit. If you could consider all possible lines, then the one with the smallest SSE is called the least squares line; it may also be referred to as the line of best fit.
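As a rough illustration of this definition, the SSE for the line Height = Arm Span can be computed directly from the 24 data points. This is a sketch, not part of the original course materials: the function name `sse` and its `slope` and `intercept` parameters are assumptions chosen for this example, and the data are copied from the tables in this part.

```python
# Arm span (X) and height (Y) for all 24 people, copied from the tables above.
arm_span = [156, 157, 159, 160, 161, 161, 162, 165, 170, 170, 173, 173,
            177, 177, 178, 184, 188, 188, 188, 188, 188, 194, 196, 200]
height = [162, 160, 162, 155, 160, 162, 170, 166, 170, 167, 185, 176,
          173, 176, 178, 180, 188, 187, 182, 181, 192, 193, 184, 186]


def sse(xs, ys, slope, intercept):
    """Sum of squared errors for the line YL = slope * X + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Height = Arm Span corresponds to slope 1 and intercept 0.
sse_identity = sse(arm_span, height, slope=1, intercept=0)
print(sse_identity)  # 784, matching the total stated in the text
```

Describing a line by its slope and intercept makes it easy to try other candidate lines by changing only the two arguments.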

Before we determine the SSE for the line Height = Arm Span - 1 (i.e., YL = X - 1), let's take a look at Person 1 and the line YL = X - 1:

| Person # | Arm Span (X) | Height (Y) | YL = X - 1 | Error = Y - YL | (Error)² = (Y - YL)² |
|---|---|---|---|---|---|
| 1 | 156 | 162 | 155 | 7 | 49 |

Person 1's squared error can be represented on the graph as a square with a side whose length is |Y - YL|:

The following is the scatter plot for the data and a graph of the line YL = X - 1.

Note once again that a point above the line is indicated by a positive error; a point below the line is indicated by a negative error; and a point is on the line when the error is 0.

The following table shows the arm span (X), the observed height (Y), the predicted height based on the line Height = Arm Span - 1 (i.e., YL = X - 1), the error, and the squared error for Persons 1 through 6 in our study:

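| Person # | Arm Span (X) | Height (Y) | YL = X - 1 | Error = Y - YL | (Error)² = (Y - YL)² |
|---|---|---|---|---|---|
| 1 | 156 | 162 | 155 | 7 | 49 |
| 2 | 157 | 160 | 156 | 4 | 16 |
| 3 | 159 | 162 | 158 | 4 | 16 |
| 4 | 160 | 155 | 159 | -4 | 16 |
| 5 | 161 | 160 | 160 | 0 | 0 |
| 6 | 161 | 162 | 160 | 2 | 4 |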


 

Problem D7


Complete the table below for the remaining 18 people. Then compute the sum of the squared errors for the line Height = Arm Span - 1, and compare the result to the sum of squared errors for the line Height = Arm Span. Based on your calculations, which line provides the better fit?

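| Person # | Arm Span (X) | Height (Y) | YL = X - 1 | Error = Y - YL | (Error)² = (Y - YL)² |
|---|---|---|---|---|---|
| 7 | 162 | 170 | | | |
| 8 | 165 | 166 | | | |
| 9 | 170 | 170 | | | |
| 10 | 170 | 167 | | | |
| 11 | 173 | 185 | | | |
| 12 | 173 | 176 | | | |
| 13 | 177 | 173 | | | |
| 14 | 177 | 176 | | | |
| 15 | 178 | 178 | | | |
| 16 | 184 | 180 | | | |
| 17 | 188 | 188 | | | |
| 18 | 188 | 187 | | | |
| 19 | 188 | 182 | | | |
| 20 | 188 | 181 | | | |
| 21 | 188 | 192 | | | |
| 22 | 194 | 193 | | | |
| 23 | 196 | 184 | | | |
| 24 | 200 | 186 | | | |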

 

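Answers:

| Person # | Arm Span (X) | Height (Y) | YL = X - 1 | Error = Y - YL | (Error)² = (Y - YL)² |
|---|---|---|---|---|---|
| 7 | 162 | 170 | 161 | 9 | 81 |
| 8 | 165 | 166 | 164 | 2 | 4 |
| 9 | 170 | 170 | 169 | 1 | 1 |
| 10 | 170 | 167 | 169 | -2 | 4 |
| 11 | 173 | 185 | 172 | 13 | 169 |
| 12 | 173 | 176 | 172 | 4 | 16 |
| 13 | 177 | 173 | 176 | -3 | 9 |
| 14 | 177 | 176 | 176 | 0 | 0 |
| 15 | 178 | 178 | 177 | 1 | 1 |
| 16 | 184 | 180 | 183 | -3 | 9 |
| 17 | 188 | 188 | 187 | 1 | 1 |
| 18 | 188 | 187 | 187 | 0 | 0 |
| 19 | 188 | 182 | 187 | -5 | 25 |
| 20 | 188 | 181 | 187 | -6 | 36 |
| 21 | 188 | 192 | 187 | 5 | 25 |
| 22 | 194 | 193 | 193 | 0 | 0 |
| 23 | 196 | 184 | 195 | -11 | 121 |
| 24 | 200 | 186 | 199 | -13 | 169 |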

The sum of squared errors (SSE) is 49 + 16 + ... + 169 = 772. Since this is less than the sum of squared errors for the line Height = Arm Span (which was 784), the line Height = Arm Span - 1 is a slightly better fit.
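The comparison between the two lines can also be checked with a few lines of code. This is a minimal sketch under stated assumptions: the helper `sse`, the dictionary `candidate_lines`, and the other names are hypothetical and introduced only for this illustration; the data are the 24 (arm span, height) pairs from the tables in this part.

```python
# Arm span (X) and height (Y) for all 24 people, copied from the tables above.
arm_span = [156, 157, 159, 160, 161, 161, 162, 165, 170, 170, 173, 173,
            177, 177, 178, 184, 188, 188, 188, 188, 188, 194, 196, 200]
height = [162, 160, 162, 155, 160, 162, 170, 166, 170, 167, 185, 176,
          173, 176, 178, 180, 188, 187, 182, 181, 192, 193, 184, 186]


def sse(xs, ys, predict):
    """Sum of squared errors, where predict(X) gives the line's value YL."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


# The two candidate lines compared in Problem D7.
candidate_lines = {
    "Height = Arm Span": lambda x: x,
    "Height = Arm Span - 1": lambda x: x - 1,
}

totals = {name: sse(arm_span, height, rule)
          for name, rule in candidate_lines.items()}

# The line with the smaller SSE is the better fit of the two.
best = min(totals, key=totals.get)

for name, total in totals.items():
    print(name, total)  # 784 for Height = Arm Span, 772 for Height = Arm Span - 1
print("Better fit:", best)
```

Adding another entry to `candidate_lines` is enough to include a further line in the comparison.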




 

Video Segment
In this video segment, Professor Kader introduces two rules: the sum of errors and the sum of squared errors. He explains that these are used to evaluate how well any given line fits a data set and how well each line can predict the value of one variable when the value of the other variable is known.

If you're using a VCR, you can find this segment on the session video approximately 16 minutes and 44 seconds after the Annenberg Media logo.

 


© Annenberg Foundation 2013. All rights reserved. Privacy Policy