# BSTAT 5325 Exam 2 (Spring 2017, Fall 2017, Spring 2018)

BSTAT 5325 – Exam 2 – Spring 2017 – Version 1

4. Strong multicollinearity causes …

a. the standard error of the estimates of the slope to be too large

b. the t-tests of the slopes to be too large

c. the coefficient of determination to be too large

d. the residuals to be random
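
As a rough illustration of how multicollinearity affects the standard errors of the slopes, the sketch below uses the variance inflation factor (VIF). The VIF is not mentioned in the exam; it is a standard diagnostic, and the two-predictor form VIF = 1 / (1 - r²) is assumed here, where r is the correlation between the two independent variables. The function name `vif_two_predictors` is hypothetical.

```python
import math

def vif_two_predictors(r):
    """Variance inflation factor for a model with exactly two predictors.

    r is the sample correlation between the two predictors (assumed |r| < 1).
    """
    return 1.0 / (1.0 - r ** 2)

# The standard error of a slope grows by sqrt(VIF) relative to the
# case where the two predictors are uncorrelated.
for r in (0.0, 0.5, 0.9, 0.99):
    vif = vif_two_predictors(r)
    print(f"r = {r:4.2f}  VIF = {vif:7.2f}  SE multiplier = {math.sqrt(vif):5.2f}")
```

As r approaches 1 the multiplier grows without bound, which is why t-tests of individual slopes tend to shrink rather than grow under strong multicollinearity.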

9. For one of your observations, you find a Cook's distance of 0.52, a hat value of 0.61, and a studentized deleted residual of -0.7671. If n = 30 and k = 4, you find F(0.50; 5, 25) = 0.89. You would conclude the observation is …

a. an outlier in X

b. an outlier in Y

d. none of the other answers are correct

10. If you have a first-order model, what would cause you to consider transforming the dependent variable? To fix:

a. the sample size

b. a violation of linearity

c. a violation of equal variance

d. a non-random sample

13. Which one of the following requires the value of n-k-1?

a. the calculation of the standard deviation of the Y values for any given value of X

b. the interpretation of R²

c. the interpretation of mean square regression

d. the calculation of the value of the sample mean of Y (i.e., Y̅)

17. Which of the following procedures estimates the distribution of a statistic by repeatedly sampling from the sample?

a. least squares

b. neural nets

c. bootstrapping

d. an experiment

26. In linear regression with k = 2, you have a random sample of 20 observations from normal distributions that have equal variances. From this you calculate the t statistic for β2 to be -0.13. If all the assumptions are correct, which of the following could cause the result of the t-test to be misleading?

a. The sample size is too large.

b. The R² value is 0.98.

c. A t-test can never be negative. This is a calculation error.

d. One of the observations is an influential point.

30. Before any data-driven model building is done, you should …

b. avoid talking with experts in the subject

d. run stepwise regression

BSTAT 5325 - 001 - Exam 2 Version 1 – Fall, 2017

13. If you wanted to test H1: β1 < 0, what is one way to compensate for non-normality?

a. increase α

b. transform X

c. increase n

d. use model building

17. If the hat value = 0.54, studentized deleted residual = 1.6, Cook's distance = 0.25, n = 12, k = 2, and the F table value is 0.05, what would you conclude?

b. the model is not useful

c. the observation is an outlier in X

d. the observation is influential

18. Holding all other independent variables constant, your data say that an increase in studying is associated with a decrease in the average grade. Besides multicollinearity, what else could cause the sign of a slope to be the opposite of what you expected?

a. you have one or more influential points

b. this is not a problem; increases in studying should make your grade worse regardless of how much you studied

c. gremlins

d. a large positive t-test value

20. A house that has typical house characteristics but sells for a very large amount of money (large enough to be unlikely) would be considered to be

a. an outlier in X

b. an outlier in Y

c. influential

d. multicollinear

21. Under what conditions is it difficult or impossible for the definition of a slope to apply?

a. a violation of the e assumption of regression

c. bootstrapping

d. strong multicollinearity

22. In data-driven model building, before using R² to reduce the set of all possible models to a small set, you would

b. test the model for significance

c. compare each model in detail

23. Data-driven model building was one solution to

a. a significant F test

b. the interpretation of the assumption violations

c. the redundancy of information among independent variables

d. bootstrapping

24. The stepwise procedure is

a. what Dr. Eakin recommends

b. fast but could possibly not find the best combination of variables

c. used to find outliers

d. the change in the average value of Y when simultaneously increasing all independent variables by 1

BSTAT 5325 – Section 1 – Exam 2 – Spring, 2018

20. You repeatedly take random samples of size 10 from a random sample while recording Y and X. This approach is called …

a. convenience sampling

b. the population distribution of a statistic

c. bootstrapping

d. the regression equation
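
The resampling idea described in this question can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the course's procedure: the ten (X, Y) pairs are made-up data, the helper names `slope` and `bootstrap_slopes` are hypothetical, each resample is drawn with replacement at the original sample size, and the statistic of interest is taken to be the least squares slope.

```python
import random

def slope(pairs):
    """Least squares slope of Y on X for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx

def bootstrap_slopes(pairs, n_resamples=2000, seed=0):
    """Resample the (x, y) pairs with replacement and record each slope."""
    rng = random.Random(seed)
    slopes = []
    while len(slopes) < n_resamples:
        resample = rng.choices(pairs, k=len(pairs))
        # Skip the rare resample in which every x is identical (slope undefined).
        if len({x for x, _ in resample}) > 1:
            slopes.append(slope(resample))
    return slopes

# Hypothetical sample of size 10 (not taken from the exam).
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.1), (5, 9.8),
        (6, 12.2), (7, 13.9), (8, 16.1), (9, 18.0), (10, 20.2)]

slopes = sorted(bootstrap_slopes(data))
lower = slopes[int(0.025 * len(slopes))]
upper = slopes[int(0.975 * len(slopes)) - 1]
print(f"Approximate 95% percentile interval for the slope: ({lower:.3f}, {upper:.3f})")
```

The collection of resampled slopes stands in for the sampling distribution of the slope, so its spread can be used without relying on a normality assumption.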

23. In the model building notes the two major categories were _______

a. the rejection region and the test statistic

b. using theory (or the opinion of experts) or examining data

c. the confidence interval or the hypothesis test

26. If the removal of an observation from a data set has little effect on the values of the sample estimates of the slope and intercept, the point is

a. not an outlier in the Y’s

b. not influential

c. not an outlier in the X’s

d. not multicollinear

28. In multiple linear regression (n = 10 and k = 5), you find a hat value of 1.1, a studentized deleted residual of -3.67, and a Cook's distance of 0.6. (If you need it, the 50th percentile of an F distribution with 6 and 4 degrees of freedom is 1.06.) You would then conclude that the observation is

a. an influential point.

b. an outlier in X.

c. multicollinear.

d. an outlier in Y.
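
The three diagnostics in questions like this one can be checked mechanically. The sketch below assumes commonly taught rule-of-thumb cutoffs, which the exam itself does not state: a hat value above 2(k + 1)/n flags an outlier in X, a studentized deleted residual larger in absolute value than a t critical value with n - k - 2 degrees of freedom flags an outlier in Y, and a Cook's distance above the 50th percentile of F with k + 1 and n - k - 1 degrees of freedom flags an influential point. The function name `classify_observation` and the choice of a two-sided 0.05 level for the t cutoff (about 3.182 for 3 degrees of freedom) are assumptions for illustration.

```python
def classify_observation(hat, t_deleted, cooks_d, n, k, f_median, t_crit):
    """Flag one observation using assumed rule-of-thumb cutoffs.

    f_median: 50th percentile of F with (k + 1, n - k - 1) degrees of freedom.
    t_crit:   t critical value with n - k - 2 degrees of freedom (looked up by the user).
    """
    flags = []
    # Leverage: hat value above twice the average hat value (k + 1) / n.
    if hat > 2 * (k + 1) / n:
        flags.append("outlier in X")
    # Unusual Y: studentized deleted residual beyond the t cutoff.
    if abs(t_deleted) > t_crit:
        flags.append("outlier in Y")
    # Influence: Cook's distance above the median of the F distribution.
    if cooks_d > f_median:
        flags.append("influential")
    return flags

# Values from the question above (n = 10, k = 5).
print(classify_observation(1.1, -3.67, 0.6, n=10, k=5, f_median=1.06, t_crit=3.182))
# prints ['outlier in Y']
```

Passing the F and t cutoffs in as arguments mirrors how the exam supplies table values, and keeps the sketch free of any statistics library.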

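| Spring 2017 question | Key | Fall 2017 question | Key | Spring 2018 question | Key |
|---|---|---|---|---|---|
| 4 | a | 13 | c | 20 | c |
| 9 | a | 17 | c | 23 | b |
| 10 | c | 18 | a | 26 | b |
| 13 | a | 20 | b | 28 | d |
| 17 | c | 21 | d | | |
| 26 | d | 22 | a | | |
| 30 | c | 23 | c | | |
| | | 24 | b | | |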