BSTAT 5325 – Exam 2 – Spring 2017 – Version 1
4. Strong multicollinearity causes …
a. the standard error of the estimates of the slope to be too large
b. the t-tests of the slopes to be too large
c. the coefficient of determination to be too large
d. the residuals to be random
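As a study aid, here is a minimal sketch of how multicollinearity affects slope standard errors. It assumes the standard two-predictor result that each slope's standard error is multiplied by the square root of the variance inflation factor, VIF = 1 / (1 - r^2), where r is the correlation between the two predictors. The function names and the data are hypothetical illustrations, not course material.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF for either slope in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical data: x2_strong is nearly a copy of x1, x2_weak is not.
x1 = [1, 2, 3, 4, 5, 6]
x2_strong = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
x2_weak = [3, 1, 4, 1, 5, 2]

vif_strong = vif_two_predictors(x1, x2_strong)
vif_weak = vif_two_predictors(x1, x2_weak)

# Factor by which the slope standard error grows relative to
# uncorrelated predictors.
se_inflation_strong = math.sqrt(vif_strong)
se_inflation_weak = math.sqrt(vif_weak)
print(vif_strong, vif_weak)
```

With nearly collinear predictors the VIF is in the hundreds, while weakly correlated predictors give a VIF close to 1.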
9. For one of your observations, you find a Cook's distance of 0.52, a hat value of 0.61, and a studentized deleted residual of -0.7671. If n = 30 and k = 4, the 50th percentile of an F with 5 and 25 degrees of freedom is 0.89. You would conclude the observation is …
a. an outlier in X
b. an outlier in Y
c. an influential point
d. none of the other answers are correct
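One way to organize the three diagnostics in a question like this is sketched below. It assumes common textbook rules of thumb, which may differ from the cutoffs used in the course: a hat value above 2(k+1)/n flags an outlier in X, a studentized deleted residual larger in absolute value than a t critical value with n-k-2 degrees of freedom flags an outlier in Y, and a Cook's distance above the 50th percentile of an F with k+1 and n-k-1 degrees of freedom flags an influential point. The function name `classify_observation` is hypothetical, and the t critical value (2.064, two-sided 5% with 24 degrees of freedom) is an assumed table value that the question does not supply.

```python
def classify_observation(hat, cooks_d, stud_del_resid, n, k, f_median, t_crit):
    """Return the set of flags raised by three rule-of-thumb checks.

    Assumed cutoffs (textbook rules of thumb, not stated in the exam):
      hat > 2(k+1)/n              -> outlier in X
      |stud_del_resid| > t_crit   -> outlier in Y
      cooks_d > f_median          -> influential
    """
    flags = set()
    if hat > 2 * (k + 1) / n:
        flags.add("outlier in X")
    if abs(stud_del_resid) > t_crit:
        flags.add("outlier in Y")
    if cooks_d > f_median:
        flags.add("influential")
    return flags

# Values from the question, plus the assumed t critical value.
result = classify_observation(
    hat=0.61,
    cooks_d=0.52,
    stud_del_resid=-0.7671,
    n=30,
    k=4,
    f_median=0.89,
    t_crit=2.064,  # assumption: two-sided 5% t value, 24 df
)
print(result)  # {'outlier in X'}
```

Under these assumed cutoffs only the hat-value check fires, because 0.61 exceeds 2(5)/30 = 0.333 while the other two statistics stay below their thresholds.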
10. If you have a first-order model, what would cause you to consider transforming the dependent variable? To fix:
a. the sample size
b. a violation of linearity
c. a violation of equal variance
d. a non-random sample
13. Which one of the following requires the value of n-k-1?
a. the calculation of the standard deviation of the Y values for any given value of X
b. the interpretation of R²
c. the interpretation of mean square regression
d. the calculation of the value of the sample mean of Y (i.e., Y̅)
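For reference, a minimal sketch of where n-k-1 enters a regression calculation: the estimated standard deviation of Y about the regression surface is the square root of SSE divided by the error degrees of freedom, n-k-1. The function name `residual_std_dev` and the residuals below are hypothetical.

```python
import math

def residual_std_dev(residuals, k):
    """Estimate the standard deviation of Y at a given X: sqrt(SSE / (n - k - 1)).

    residuals: observed minus fitted values from a model with k slopes
               plus an intercept.
    """
    n = len(residuals)
    df_error = n - k - 1  # error degrees of freedom
    if df_error <= 0:
        raise ValueError("need n > k + 1 observations")
    sse = sum(e ** 2 for e in residuals)
    return math.sqrt(sse / df_error)

# Hypothetical residuals from a simple regression (k = 1, n = 5).
s = residual_std_dev([1.0, -1.0, 2.0, -2.0, 0.0], k=1)
print(s)  # sqrt(10 / 3)
```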
17. Which of the following procedures estimates the distribution of a statistic by repeatedly sampling from the sample?
a. least squares
b. neural nets
c. bootstrapping
d. an experiment
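A minimal sketch of resampling from a sample to approximate the distribution of a statistic, using only the Python standard library. The function name `bootstrap_distribution`, the number of resamples, the seed, and the data are all hypothetical choices for illustration.

```python
import random
import statistics

def bootstrap_distribution(sample, statistic, n_resamples=1000, seed=0):
    """Approximate the sampling distribution of `statistic` by drawing
    resamples from `sample` with replacement, each the size of the sample."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)]

# Hypothetical sample of 10 observations.
sample = [12.1, 9.8, 11.4, 10.7, 13.2, 8.9, 10.1, 11.9, 12.5, 9.4]

boot_means = bootstrap_distribution(sample, statistics.mean)
# The spread of the resampled means estimates the standard error of the mean.
boot_se = statistics.stdev(boot_means)
print(boot_se)
```

The same function works for other statistics, such as `statistics.median`, by passing a different callable.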
26. In linear regression with k = 2, you have a random sample of 20 observations from normal distributions that have equal variances. From this you calculate the t statistic for the test of β2 to be -0.13. If all the assumptions are correct, which of the following could cause the result of the t-test to be misleading?
a. The sample size is too large.
b. The R² value is 0.98.
c. A t-test can never be negative. This is a calculation error.
d. One of the observations is an influential point
30. Before any data-driven model building is done, you should …
a. run all possible regressions
b. avoid talking with experts in the subject
c. check for bad data
d. run stepwise regression
BSTAT 5325 – 001 – Exam 2 – Version 1 – Fall, 2017
13. If you wanted to test H1: β1 < 0, what is one way to compensate for non-normality?
a. increase α
b. transform X
c. increase n
d. use model building
17. If the hat value = 0.54, studentized deleted residual = 1.6, Cook's distance = 0.25, n = 12, k = 2, and the F table value is 0.05, what would you conclude?
a. the model is useful
b. the model is not useful
c. the observation is an outlier in X
d. the observation is influential
18. Holding all other independent variables constant, your data say that increases in studying are associated with decreases in the average grade. Besides multicollinearity, what else could cause the sign of a slope to be the opposite of what you expected?
a. you have one or more influential points
b. this is not a problem; increases in studying should make your grade worse regardless of how much you studied
c. gremlins
d. a large positive t-test value
20. A house that has typical house characteristics but sells for a very large amount of money (large enough to be unlikely) would be considered to be
a. an outlier in X
b. an outlier in Y
c. influential
d. multicollinear
21. Under what conditions is it difficult or impossible for the definition of a slope to apply?
a. a violation of the e assumption of regression
b. quadratic effects
c. bootstrapping
d. strong multicollinearity
22. In data-driven model building, before using R² to reduce the set of all possible models to a small set, you would
a. fix any errors in the assumptions
b. test the model for significance
c. compare each model in detail
d. hire a statistician (Hint: good idea but bad answer)
23. Data-driven model building was one solution to
a. a significant F test
b. the interpretation of the assumption violations
c. the redundancy of information among independent variables
d. bootstrapping
24. The stepwise procedure is
a. what Dr. Eakin recommends
b. fast but could possibly not find the best combination of variables
c. used to find outliers
d. the change in the average value of Y when simultaneously increasing all independent variables by 1
BSTAT 5325 – Section 1 – Exam 2 – Spring, 2018
20. You repeatedly take random samples of size 10 from a random sample while recording Y and X. This approach is called …
a. convenience sampling
b. the population distribution of a statistic
c. bootstrapping
d. the regression equation
23. In the model-building notes, the two major categories were _______
a. the rejection region and the test statistic
b. using theory (or the opinion of experts) or examining data.
c. the confidence interval or the hypothesis test.
d. boredom and/or headaches
26. If the removal of an observation from a data set has little effect on the values of the sample estimates of the slope and intercept, the point is
a. not an outlier in the Y’s
b. not influential
c. not an outlier in the X’s
d. not multicollinear
28. In multiple linear regression (n = 10 and k = 5), you find a hat value of 1.1, a studentized deleted residual of -3.67, and a Cook's distance of 0.6. (If you need it, you are given that the 50th percentile of an F with 6 and 4 degrees of freedom is 1.06.) You would then conclude that the observation is
a. an influential point.
b. an outlier in X.
c. multicollinear.
d. an outlier in Y.
Spring, 2017 |     | Fall, 2017 |     | Spring, 2018 |
Question     | Key | Question   | Key | Question     | Key
4            | a   | 13         | C   | 20           | C
9            | a   | 17         | C   | 23           | B
10           | c   | 18         | A   | 26           | B
13           | a   | 20         | B   | 28           | D
17           | c   | 21         | D   |              |
26           | d   | 22         | A   |              |
30           | c   | 23         | C   |              |
             |     | 24         | B   |              |