1. Quantitative Methods
Total Questions
233
Correct
0 (0.0%)
Incorrect
0 (0.0%)
Unattempted
233 (100.0%)
Reading 1: Multiple Regression (121 questions)
Question: During the course of a multiple regression analysis, an analyst has observed several items that she believes may render incorrect conclusions. For example, the coefficient standard errors are too small, although the estimated coefficients are accurate. She believes that these small standard error terms will result in the computed t-statistics being too big, resulting in too many Type I errors. The analyst has most likely observed which of the following assumption violations in her regression analysis?
- A) Positive serial correlation Correct
- B) Homoskedasticity
- C) Multicollinearity
Page 1 | Status: ⏸️ Unattempted
Question: When two or more of the independent variables in a multiple regression are correlated with each other, the condition is called:
- A) multicollinearity Correct
- B) conditional heteroskedasticity
- C) serial correlation
Page 1 | Status: ⏸️ Unattempted
Question: An analyst is trying to determine whether fund return performance is persistent. The analyst divides funds into three groups based on whether their return performance was in the top third (group 1), middle third (group 2), or bottom third (group 3) during the previous year. The manager then creates the following equation: R = a + b1D1 + b2D2 + b3D3 + ε, where R is return premium on the fund (the return minus the return on the S&P 500 benchmark) and Di is equal to 1 if the fund is in group i. Assuming no other information, this equation will suffer from:
- A) multicollinearity Correct
- B) serial correlation
- C)
Page 1 | Status: ⏸️ Unattempted
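A minimal sketch of why the three-dummy equation above is problematic. The group assignments are hypothetical and only serve to illustrate that, with an intercept and one dummy per group, the dummies always sum to the intercept column.

```python
# Hypothetical group assignments for six funds (1 = top, 2 = middle, 3 = bottom third).
groups = [1, 2, 3, 1, 3, 2]

def design_row(group):
    """Return [intercept, D1, D2, D3] for one fund."""
    return [1] + [1 if group == i else 0 for i in (1, 2, 3)]

rows = [design_row(g) for g in groups]

# In every row D1 + D2 + D3 equals the intercept column, so the regressors are
# perfectly linearly dependent (the "dummy variable trap").
trap = all(row[1] + row[2] + row[3] == row[0] for row in rows)
print(trap)

# Dropping one dummy (here D3) breaks the exact dependence: for group-3 funds,
# D1 + D2 is 0 while the intercept is 1.
reduced = [row[:3] for row in rows]
dependence_broken = any(row[1] + row[2] != row[0] for row in reduced)
print(dependence_broken)
```

With n categories, using n - 1 dummies plus an intercept avoids this exact linear dependence.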
Question: Which of the following is least likely to result in misspecification of a regression model?
- A) Inappropriate variable form
- B) Transforming a variable Correct
- C) Omission of an important independent variable
Page 2 | Status: ⏸️ Unattempted
Question: Which of the following statements regarding the results of a regression analysis is least accurate? The:
- A) slope coefficient in a multiple regression is the change in the dependent variable for a one-unit change in the independent variable, holding all other variables constant
- B) slope coefficients in the multiple regression are referred to as partial betas
- C) slope coefficient in a multiple regression is the value of the dependent variable for a given value of the independent variable Correct
Page 2 | Status: ⏸️ Unattempted
Question: One of the underlying assumptions of a multiple regression is that the variance of the residuals is constant for various levels of the independent variables. This quality is referred to as:
- A) homoskedasticity Correct
- B) a normal distribution
- C) a linear relationship
Page 2 | Status: ⏸️ Unattempted
Question: Consider the following estimated regression equation, with calculated t-statistics of the estimates as indicated: AUTO_t = 10.0 + 1.25 PI_t + 1.0 TEEN_t - 2.0 INS_t, with a PI calculated t-statistic of 0.45, a TEEN calculated t-statistic of 2.2, and an INS calculated t-statistic of 0.63. The equation was estimated over 40 companies. Using a 5% level of significance, which of the independent variables is significantly different from zero?
- A) TEEN only
- B) PI and INS only
- C) PI only
Page 3 | Status: ⏸️ Unattempted
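A short sketch of the significance check for the question above. The degrees of freedom follow from n - k - 1; the critical value of roughly 2.03 is an assumption taken from a standard two-tailed t-table at 5% with 36 degrees of freedom, not a figure given in the question.

```python
# t-statistics as stated in the question.
t_stats = {"PI": 0.45, "TEEN": 2.2, "INS": 0.63}

n, k = 40, 3                 # observations and independent variables
df = n - k - 1               # degrees of freedom for each coefficient t-test

# Assumed table value (two-tailed, 5%, 36 df); look up the exact figure if needed.
t_critical = 2.03

significant = [name for name, t in t_stats.items() if abs(t) > t_critical]
print(df, significant)
```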
Question: Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units. Hilton gathers data for the last 20 years and estimates the following equation (standard errors in parentheses): SALES = 0.000 (0.113) + 0.004 (0.005) POP + 1.031 (0.337) INCOME + 2.002 (2.312) ADV. For next year, Hilton estimates the following parameters: (1) the population under 20 will be 120 million, (2) disposable income will be $300,000,000, and (3) advertising expenditures will be $100,000,000. Based on these estimates and the regression equation, what are predicted sales for the industry for next year?
- A) $656,991,000
- B) $509,980,000
- C) $557,143,000
Page 3 | Status: ⏸️ Unattempted
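The prediction in the question above is a plain substitution into the fitted equation. A minimal sketch, assuming all inputs are expressed in millions as the question states:

```python
# Coefficients from the estimated equation in the question.
intercept = 0.000
coefficients = {"POP": 0.004, "INCOME": 1.031, "ADV": 2.002}

# Next-year forecasts, in millions (120 million people, $300 million, $100 million).
forecasts = {"POP": 120, "INCOME": 300, "ADV": 100}

predicted_sales = intercept + sum(
    coefficients[name] * forecasts[name] for name in coefficients
)
print(round(predicted_sales, 2))  # in millions of dollars
```

The result is in millions, so it maps to the dollar figure listed among the choices after multiplying by 1,000,000.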
Question: One possible problem that could jeopardize the validity of the employment growth rate model is multicollinearity. Which of the following would most likely suggest the existence of multicollinearity?
- A) The variance of the observations has increased over time
- B) The Durbin–Watson statistic is significant
- C) The F-statistic suggests that the overall regression is significant, however the regression coefficients are not individually significant Correct
Page 4 | Status: ⏸️ Unattempted
Question: Consider the following estimated regression equation: AUTO_t = 10.0 + 1.25 PI_t + 1.0 TEEN_t - 2.0 INS_t. The equation was estimated over 40 companies. The predicted value of AUTO if PI is 4, TEEN is 0.30, and INS = 0.6 is closest to:
- A) 14.90
- B) 17.50
- C) 14.10

Ben Sasse is a quantitative analyst at Gurnop Asset Managers. Sasse is interviewing Victor Sophie for a junior analyst position. Sasse mentions that the firm currently uses several proprietary multiple regression models and wants Sophie's opinion about regression models. Sophie makes the following statements:

Statement 1: Multiple regression models can be used to forecast independent variables.
Statement 2: Multiple regression models can be used to test existing theories of relationships among variables.

Sasse then discusses a model that the firm uses to forecast credit spread on investment-grade corporate bonds. Sasse states that while the current model parameters are a secret
Page 4 | Status: ⏸️ Unattempted
Shared Context:
Question: Regarding Sophie's statement on multiple regression:
- A) only Statement 1 is correct
- B) only Statement 2 is correct Correct
- C) both statements are correct
Page 5 | Status: ⏸️ Unattempted
Shared Context:
Question: Based on the credit spread model, if an issuer gets included in the CDX index and assuming everything else the same, which of the following statements most accurately describes the model's forecast?
- A) The credit spread on the firm’s issue would decrease by 10 bps Correct
- B) The credit spread on the firm’s issue will increase by 32 bps
- C) The credit spread on the firm’s issue will decrease by 32 bps
Page 5 | Status: ⏸️ Unattempted
Shared Context:
Question: Which of the following is least likely an assumption of multiple linear regression?
- A) The dependent variable is not serially correlated Correct
- B) There is no linear relationship between the independent variables
- C)
Page 5 | Status: ⏸️ Unattempted
Shared Context:
Question: Which assumption of multiple regression is most likely evaluated using a QQ plot?
- A) Serial correlation of residuals
- B) Conditional heteroskedasticity
- C) Error term is normally distributed Correct
Page 6 | Status: ⏸️ Unattempted
Question: Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units. Hilton gathers data for the last 20 years and estimates the following equation (standard errors in parentheses): SALES = α + 0.004 (0.005) POP + 1.031 (0.337) INCOME + 2.002 (2.312) ADV. The critical t-statistic for a 95% confidence level is 2.120. Which of the independent variables is statistically different from zero at the 95% confidence level?
- A) INCOME and ADV
- B) ADV only
- C) INCOME only Correct
Page 6 | Status: ⏸️ Unattempted
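A minimal sketch of the test in the question above: each t-statistic is the coefficient divided by its standard error, compared in absolute value with the critical value of 2.120 given in the question.

```python
# (coefficient, standard error) pairs from the estimated equation.
estimates = {
    "POP": (0.004, 0.005),
    "INCOME": (1.031, 0.337),
    "ADV": (2.002, 2.312),
}
t_critical = 2.120  # stated in the question

# t-statistic for H0: coefficient = 0.
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

significant = [name for name, t in t_stats.items() if abs(t) > t_critical]
for name, t in t_stats.items():
    print(f"{name}: t = {t:.3f}")
print(significant)
```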
Shared Context:
Question: What is the expected salary (in $1,000) of a woman with 16 years of education and 10 years of experience?
- A) 65.48
- B) 59.18
- C) 54.98
Page 8 | Status: ⏸️ Unattempted
Shared Context:
Question: If the return on the industry index is 4%, the stock's expected return would be:
- A) 9.7%
- B) 7.6%
- C) 11.2%
Page 8 | Status: ⏸️ Unattempted
Shared Context:
Question: The percentage of the variation in the stock return explained by the variation in the industry index return is closest to:
- A) 72.1%
- B) 63.2%
- C) 84.9%
Page 8 | Status: ⏸️ Unattempted
Question: An analyst runs a regression of monthly value-stock returns on five independent variables over 48 months. The total sum of squares is 430, and the sum of squared errors is 170. Test the null hypothesis at the 2.5% and 5% significance level that all five of the independent variables are equal to zero.
- A) Rejected at 5% significance only
- B) Not rejected at 2.5% or 5.0% significance
- C) Rejected at 2.5% significance and 5% significance Correct
Page 9 | Status: ⏸️ Unattempted
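A sketch of the joint F-test calculation for the question above, using F = (RSS / k) / (SSE / (n - k - 1)) with RSS = SST - SSE. The variable names are illustrative; critical values are not computed here and would come from an F-table for the stated degrees of freedom.

```python
sst = 430.0   # total sum of squares
sse = 170.0   # sum of squared errors (unexplained)
n, k = 48, 5  # observations and independent variables

rss = sst - sse                  # regression (explained) sum of squares
df_num = k                       # numerator degrees of freedom
df_den = n - k - 1               # denominator degrees of freedom

msr = rss / df_num               # mean square regression
mse = sse / df_den               # mean square error
f_stat = msr / mse

print(df_num, df_den, round(f_stat, 2))
```

The statistic is then compared with the tabulated F critical values for (5, 42) degrees of freedom at each significance level.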
Question: Jill Wentraub is an analyst with the retail industry. She is modeling a company's sales over time and has noticed a quarterly seasonal pattern. If she includes dummy variables to represent the seasonality component of the sales she must use:
- A) three dummy variables Correct
- B) four dummy variables
- C) one dummy variable
Page 9 | Status: ⏸️ Unattempted
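A small sketch of the n - 1 rule for seasonal dummies. The helper `seasonal_dummies` is hypothetical; it treats the last period as the omitted base case captured by the intercept.

```python
def seasonal_dummies(period, n_periods):
    """Return n_periods - 1 indicators; the last period is the omitted base case."""
    return [1 if period == p else 0 for p in range(1, n_periods)]

# Quarterly data: four seasons, three dummies.
for quarter in (1, 2, 3, 4):
    print(quarter, seasonal_dummies(quarter, 4))
```

The same helper with `n_periods=12` would produce eleven indicators for monthly data.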
Question: An analyst regresses the return of a S&P 500 index fund against the S&P 500, and also regresses the return of an active manager against the S&P 500. The analyst uses the last five years of data in both regressions. Without making any other assumptions, which of the following is most accurate? The index fund:
- A) regression should have higher sum of squares regression as a ratio to the total sum of squares Correct
- B) should have a higher coefficient on the independent variable
- C) should have a lower coefficient of determination
Page 9 | Status: ⏸️ Unattempted
Question: Assume that in a particular multiple regression model, it is determined that the error terms are uncorrelated with each other. Which of the following statements is most accurate?
- A) This model is in accordance with the basic assumptions of multiple regression analysis because the errors are not serially correlated
- B) Unconditional heteroskedasticity present in this model should not pose a problem, but can be corrected by using robust standard errors
- C) Serial correlation may be present in this multiple regression model, and can be confirmed only through a Durbin-Watson test

Vijay Shapule, CFA, is investigating the application of the Fama-French three-factor model (Model 1) for the Indian stock market for the period 2001–2011 (120 months). Using the dependent variable as annualized return (%), the results of the analysis are shown in Indian Equities - Fama-French Model.

Indian Equities - Fama-French Model

| Factor | Coefficient | P-Value |
| --- | --- | --- |
| Intercept | 1.22 | <0.001 |
| SMB | 0.23 | <0.001 |
| HML | 0.34 | 0.003 |
| Rm-Rf | 0.88 | <0.001 |

| Statistic | Value |
| --- | --- |
| R-squared | 0.36 |
| SSE | 38.00 |
| AIC | -129.99 |
| BIC | -118.84 |

Shapule then modifies the model to include a liquidity factor. Results for this four-factor model (Model 2) are shown in Revised Fama-French Model With Liquidity Factor.

Revised Fama-French Model With Liquidity Factor

| Factor | Coefficient | P-Value |
| --- | --- | --- |
| Intercept | 1.56 | <0.001 |
Page 10 | Status: ⏸️ Unattempted
Shared Context:
Question: The adjusted R2 of Model 2 is closest to:
- A) 0.39
- B) 0.37
- C) 0.36
Page 11 | Status: ⏸️ Unattempted
Shared Context:
Question: The model better suited for prediction is:
- A) Model 1 because it has a lower Bayesian information criterion Correct
- B) Model 2 because it has a higher Akaike information criterion
- C) Model 2 because it has a lower Akaike information criterion
Page 11 | Status: ⏸️ Unattempted
Shared Context:
Question: The F-statistic for testing H0: coefficient of LIQ = 0 versus Ha: coefficient of LIQ ≠ 0 is closest to:
- A) 13.33
- B) 5.45
- C) 2.11
Page 11 | Status: ⏸️ Unattempted
Shared Context:
Question: What is the predicted return for a stock using Model 1 when SMB = 3.30, HML = 1.25 and Rm-Rf = 5?
- A) 6.80%
- B) 7.88%
- C) 9.58%
Page 12 | Status: ⏸️ Unattempted
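A minimal sketch of the Model 1 prediction asked for above, using the three-factor coefficients reported in the vignette (intercept 1.22, SMB 0.23, HML 0.34, Rm-Rf 0.88).

```python
# Model 1 coefficients from the Fama-French table in the vignette.
intercept = 1.22
coefficients = {"SMB": 0.23, "HML": 0.34, "Rm-Rf": 0.88}

# Factor values given in the question.
factors = {"SMB": 3.30, "HML": 1.25, "Rm-Rf": 5.0}

predicted_return = intercept + sum(
    coefficients[name] * factors[name] for name in coefficients
)
print(round(predicted_return, 2))  # annualized return, in percent
```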
Question: A multiple regression model has included independent variables that are not linearly related to the dependent variable. The model is most likely misspecified due to:
- A) incorrect data pooling
- B) incorrect variable form Correct
- C) incorrect variable scaling
Page 12 | Status: ⏸️ Unattempted
Question: One of the main assumptions of a multiple regression model is that the variance of the residuals is constant across all observations in the sample. A violation of the assumption is most likely to be described as:
- A) positive serial correlation
- B) heteroskedasticity Correct
- C) unstable remnant deviation
Page 13 | Status: ⏸️ Unattempted
Question: Which of the following statements regarding heteroskedasticity is least accurate?
- A) Heteroskedasticity may occur in cross-sectional or time-series analyses
- B) The assumption of linear regression is that the residuals are heteroskedastic Correct
- C)
Page 13 | Status: ⏸️ Unattempted
Shared Context:
Question: If GDP rises 2.2% and the price of fuels falls $0.15, Baltz's model will predict Company sales to be (in $ millions) closest to:
- A) $82.00
- B) $128.00
- C) $206.00
Page 14 | Status: ⏸️ Unattempted
Shared Context:
Question: Presence of conditional heteroskedasticity is least likely to affect the:
- A) computed F-statistic
- B) coefficient estimates Correct
- C) computed t-statistic
Page 15 | Status: ⏸️ Unattempted
Question: Which of the following statements least accurately describes one of the fundamental multiple regression assumptions?
- A) The error term is normally distributed
- B) The independent variables are not random
- C) The variance of the error terms is not constant (i.e., the errors are heteroskedastic) Correct
Page 15 | Status: ⏸️ Unattempted
Shared Context:
Question: In regard to their conversation about the regression equation:
- A) Brent’s statement is correct; Johnson’s statement is incorrect Correct
- B) Brent’s statement is correct; Johnson’s statement is correct
- C) Brent’s statement is incorrect; Johnson’s statement is correct
Page 17 | Status: ⏸️ Unattempted
Shared Context:
Question: Regarding Brent's Statements 1 and 2:
- A) Both statements are correct
- B) Only Statement 2 is correct Correct
- C) Only Statement 1 is correct
Page 17 | Status: ⏸️ Unattempted
Shared Context:
Question: Assuming that next year's marketing expenditures are $3,500,000 and there are five salespeople, predicted sales for Mega Flowers should be:
- A) $11,600,000
- B) $24,000,000
- C)
Page 17 | Status: ⏸️ Unattempted
Shared Context:
Question: Brent would like to further investigate whether at least one of the independent variables can explain a significant portion of the variation of the dependent variable. Which of the following methods would be best for Brent to use?
- A) The F-statistic Correct
- B) The multiple coefficient of determination
- C) An ANOVA table

In preparing an analysis of HB Inc., Jack Stumper is asked to look at the company's sales in relation to broad based economic indicators. Stumper's analysis indicates that HB's monthly sales are related to changes in housing starts (H) and changes in the mortgage interest rate (M). The analysis covers the past ten years for these variables. The regression equation is: S = 1.76 + 0.23H - 0.08M

Number of observations: 123
Unadjusted R2: 0.77
F statistic: 9.80
Durbin Watson statistic: 0.50
p-value of Housing Starts: 0.017
p-value of Mortgage Rates: 0.033

Variable Descriptions
S = HB Sales (in thousands)
H = housing starts (in thousands)
M = mortgage interest rate (in percent)

November 20x6 Actual Data
HB's monthly sales: $55,000
Page 18 | Status: ⏸️ Unattempted
Shared Context:
Question: Using the regression model developed, the closest prediction of sales for December 20x6 is:
- A) $55,000
- B) $36,000
- C) $44,000
Page 19 | Status: ⏸️ Unattempted
Shared Context:
Question: Will Stumper conclude that the housing starts coefficient is statistically different from zero and how will he interpret it at the 5% significance level:
- A) different from zero; sales will rise by $100 for every 23 house starts
- B) not different from zero; sales will rise by $0 for every 100 house starts
- C) different from zero; sales will rise by $23 for every 100 house starts Correct
Page 19 | Status: ⏸️ Unattempted
Shared Context:
Question: Is the regression coefficient of changes in mortgage interest rates different from zero at the 5 percent level of significance?
- A) yes, because p-value < 0.05 Correct
- B) no, because coefficient is negative
- C) yes, because -0.08 < 0.05
Page 19 | Status: ⏸️ Unattempted
Shared Context:
Question: The regression statistics above indicate that for the period under study, the independent variables (housing starts, mortgage interest rate) together explained approximately what percentage of the variation in the dependent variable (sales)?
- A) 77.00
- B) 9.80
- C) 67.00
Page 20 | Status: ⏸️ Unattempted
Shared Context:
Question: In this multiple regression, if Stumper discovers that the residuals exhibit positive serial correlation, the most likely effect is:
- A) standard errors are too low but coefficient estimate is consistent Correct
- B) standard errors are too high but coefficient estimate is consistent
- C) standard errors are not affected but coefficient estimate is inconsistent

George Smith, an analyst with Great Lakes Investments, has created a comprehensive report on the pharmaceutical industry at the request of his boss. The Great Lakes portfolio currently has a significant exposure to the pharmaceuticals industry through its large equity position in the top two pharmaceutical manufacturers. His boss requested that Smith determine a way to accurately forecast pharmaceutical sales in order for Great Lakes to identify further investment opportunities in the industry as well as to minimize their exposure to downturns in the market. Smith realized that there are many factors that could possibly have an impact on sales, and he must identify a method that can quantify their effect. Smith used a multiple regression analysis with five independent variables to predict
Page 20 | Status: ⏸️ Unattempted
Shared Context:
Question: Sutter has detected the presence of conditional heteroskedasticity in Smith's report. This is evidence that:
- A) the error terms are correlated with each other
- B) the variance of the error term is correlated with the values of the independent variables Correct
- C) two or more of the independent variables are highly correlated with each other
Page 21 | Status: ⏸️ Unattempted
Shared Context:
Question: Which of the following is most likely to indicate that two or more of the independent variables, or linear combinations of independent variables, may be highly correlated with each other? Unless otherwise noted, significant and insignificant mean significantly different from zero and not significantly different from zero, respectively.
- A) The R2 is low, the F-statistic is insignificant and the Durbin-Watson statistic is significant
- B) The R2 is high, the F-statistic is significant and the t-statistics on the individual slope coefficients are insignificant
- C) The R2 is high, the F-statistic is significant and the t-statistics on the individual slope coefficients are significant
Page 22 | Status: ⏸️ Unattempted
Shared Context:
Question: Using the Durbin-Watson test statistic, Smith rejects the null hypothesis suggested by the test. This is evidence that:
- A) two or more of the independent variables are highly correlated with each other
- B) the error term is normally distributed
- C) the error terms are correlated with each other Correct
Page 22 | Status: ⏸️ Unattempted
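For reference, a sketch of how the Durbin-Watson statistic is computed from regression residuals: the sum of squared successive differences divided by the sum of squared residuals. Both residual series below are hypothetical and chosen only to show the direction of the statistic; values well below 2 point toward positive serial correlation and values well above 2 toward negative serial correlation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that drift slowly (positively correlated).
persistent = [1.0, 0.8, 0.9, 0.5, -0.2, -0.6, -0.9, -0.7]
# Hypothetical residuals that flip sign every period (negatively correlated).
alternating = [1.0, -1.0, 1.0, -1.0]

dw_persistent = durbin_watson(persistent)
dw_alternating = durbin_watson(alternating)
print(round(dw_persistent, 3), round(dw_alternating, 3))
```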
Question: Alex Wade, CFA, is analyzing the result of a regression analysis comparing the performance of gold stocks versus a broad equity market index. Wade believes that first lag serial correlation may be present and, in order to prove his theory, should use which of the following methods to detect its presence?
- A) The Breusch-Pagan test
- B) The Hansen method
- C) The Durbin-Watson statistic Correct

Phillip Lee works for Song Bank as a quantitative analyst. He is currently working on a model to explain the returns (in %) of 20 hedge funds for the past year. He includes three independent variables:

Market return = return on a broad-based stock index (in %)
Closed = dummy variable (= 1 if the fund is closed to new investors; 0 otherwise)
Prior period alpha = fund return for the prior 12 months – return on market (in %)

Estimated model: hedge fund return = 3.2 + 0.22 market return + 1.65 closed – 0.11 prior period alpha

Lee is concerned about the impact of outliers on the estimated regression model and collects the following information:
Page 23 | Status: ⏸️ Unattempted
Shared Context:
Question: What is the correct interpretation of the coefficient of closed in the first regression?
- A) If a model is closed to new investors, the expected excess fund return is 1.65%
- B) A closed fund is likely to generate a return of 1.65%
- C) A closed fund is estimated to have an extra return of 1.65% relative to funds that are not closed Correct
Page 24 | Status: ⏸️ Unattempted
Shared Context:
Question: To check for only the outliers in the sample, Lee should most appropriately use:
- A) Studentized residuals Correct
- B) leverage
- C) Cook’s D
Page 24 | Status: ⏸️ Unattempted
Shared Context:
Question: Which observations, when excluded, cause a significant change to model coefficients?
- A) Observations 1, 10, and 11 Correct
- B) Observations 10 and 19
- C) Observation 19

Toni Williams, CFA, has determined that commercial electric generator sales in the Midwest U.S. for Self-Start Company is a function of several factors in each area: the cost of heating oil, the temperature, snowfall, and housing starts. Using data for the most currently available year, she runs a cross-sectional regression where she regresses the deviation of sales from the historical average in each area on the deviation of each explanatory variable from the historical average of that variable for that location. She feels this is the most appropriate method since each geographic area will have different average values for the inputs, and the model can explain how current conditions explain how generator sales are higher or lower from the historical average in each area. In summary, she regresses current sales for each area minus its respective historical average on the following variables for each area:

- The difference between the retail price of heating oil and its historical average.
- The mean number of degrees the temperature is below normal in Chicago.
- The amount of snowfall above the average.
- The percentage of housing starts above the average.

Williams used a sample of 26 observations obtained from 26 metropolitan areas in the Midwest U.S. The results are in the tables below. The dependent variable is in sales of generators in millions of dollars.

Coefficient Estimates Table

| Variable | Estimated Coefficient | Standard Error of the Coefficient |
| --- | --- | --- |
| Intercept | 5.00 | 1.850 |
| $ Heating Oil | 2.00 | 0.827 |
| Low Temperature | 3.00 | 1.200 |
| Snowfall | 10.00 | 4.833 |
| Housing Starts | 5.00 | 2.333 |
Page 25 | Status: ⏸️ Unattempted
Shared Context:
Question: According to the model and the data for the Chicago metropolitan area, the forecast of generator sales is:
- A) $55 million above average
- B) $35.2 million above the average
- C) $65 million above the average
Page 26 | Status: ⏸️ Unattempted
Shared Context:
Question: Williams proceeds to test the hypothesis that none of the independent variables has significant explanatory power. Using the joint F-test for the significance of all slope coefficients, at a 5% level of significance:
- A) none of the independent variables has explanatory power Correct
- B) all of the independent variables have explanatory power
- C) at least one of the independent variables has explanatory power
Page 27 | Status: ⏸️ Unattempted
Shared Context:
Question: With respect to testing the validity of the model's results, Williams may wish to perform:
- A) both a Breusch-Godfrey test and a Breusch-Pagan test
- B) a Breusch-Pagan test, but not Breusch-Godfrey Correct
- C) a Breusch-Godfrey test, but not a Breusch-Pagan test
Page 27 | Status: ⏸️ Unattempted
Shared Context:
Question: When Williams ran the model, the computer said the R2 is 0.233. She examines the other output and concludes that this is the:
- A) adjusted R2 value
- B) neither the unadjusted nor adjusted R2 value, nor the coefficient of correlation Correct
- C) unadjusted R2 value
Page 27 | Status: ⏸️ Unattempted
Question: Which of the following conditions will least likely affect the statistical inference about regression parameters by itself?
- A) Unconditional heteroskedasticity Correct
- B) Model misspecification
- C) Multicollinearity
Page 28 | Status: ⏸️ Unattempted
Question: The management of a large restaurant chain believes that revenue growth is dependent upon the month of the year. Using a standard 12 month calendar, how many dummy variables must be used in a regression model that will test whether revenue growth differs by month?
- A) 13
- B) 12
- C) 11

Damon Washburn, CFA, is currently enrolled as a part-time graduate student at State University. One of his recent assignments for his course on Quantitative Analysis is to perform a regression analysis utilizing the concepts covered during the semester. He must interpret the results of the regression as well as the test statistics. Washburn is confident in his ability to calculate the statistics because the class is allowed to use statistical software. However, he realizes that the interpretation of the statistics will be the true test of his
Page 28 | Status: ⏸️ Unattempted
Shared Context:
Question: The percentage of the total variation in quarterly stock returns explained by the independent variables is closest to:
- A) 32%
- B) 47%
- C) 42%
Page 30 | Status: ⏸️ Unattempted
Shared Context:
Question: Using a 5% level of significance, there is:
- A) no evidence of serial correlation in the residuals Correct
- B) evidence of first-lag serial correlation in residuals
- C) evidence of second-lag serial correlation in residuals
Page 30 | Status: ⏸️ Unattempted
Shared Context:
Question: What is the predicted quarterly stock return, given the following forecasts? Employment growth = 2.0% GDP growth = 1.0% Private investment growth = -1.0%
- A) 4.4%
- B) 4.7%
- C) 5.0%
Page 30 | Status: ⏸️ Unattempted
Shared Context:
Question: Assuming a restricted model with all three variables removed and a 5% level of significance, the most appropriate conclusion is:
- A) With an F-statistic of 0.472, we fail to reject the null hypothesis of all coefficients equal to zero
- B) With an F-statistic of 2.66, we fail to reject the null hypothesis of all slope coefficients equal to zero
- C) With an F-statistic of 24.54, we reject the null hypothesis that all the slope coefficients are equal to zero
Page 31 | Status: ⏸️ Unattempted
Question: Consider the following estimated regression equation: Salesi = 10.0 + 1.25 R&Di + 1.0 ADVi − 2.0 COMPi + 8.0 CAPi Sales are in millions of dollars. An analyst is given the following predictions on the independent variables: R&D = 5, ADV = 4, COMP = 10, and CAP = 40. The predicted level of sales is closest to:
- A) $320.25 million
- B) $300.25 million
- C) $310.25 million

Miles Mason, CFA, works for ABC Capital, a large money management company based in New York. Mason has several years of experience as a financial analyst, but is currently working in the marketing department developing materials to be used by ABC's sales team for both existing and prospective clients. ABC Capital's client base consists primarily of large net worth individuals and Fortune 500 companies. ABC invests its clients' money in both publicly traded mutual funds as well as its own investment funds that are managed in-house. Five years ago, roughly half of its assets under management were invested in the publicly traded mutual funds, with the remaining half in the funds managed by ABC's investment team. Currently, approximately 75% of ABC's assets under management are invested in publicly traded funds, with the remaining 25% being distributed among ABC's private funds. The managing partners at ABC would like to shift more of its clients' assets
Page 31 | Status: ⏸️ Unattempted
Shared Context:
Question: Which of the following tests is least likely to be used to detect autocorrelation?
- A) Durbin-Watson
- B) Breusch-Godfrey
- C) Breusch-Pagan Correct
Page 32 | Status: ⏸️ Unattempted
Shared Context:
Question: One of the most popular ways to correct heteroskedasticity is to:
- A) improve the specification of the model
- B) adjust the standard errors Correct
- C)
Page 32 | Status: ⏸️ Unattempted
Shared Context:
Question: If a regression equation shows that no individual t-tests are significant, but the F-statistic is significant, the regression probably exhibits:
- A) serial correlation
- B) multicollinearity Correct
- C) heteroskedasticity

Lynn Carter, CFA, is an analyst in the research department for Smith Brothers in New York. She follows several industries, as well as the top companies in each industry. She provides research materials for both the equity traders for Smith Brothers as well as their retail customers. She routinely performs regression analysis on those companies that she follows to identify any emerging trends that could affect investment decisions. Due to recent layoffs at the company, there has been some consolidation in the research department. Two research analysts have been laid off, and their workload will now be distributed among the remaining four analysts. In addition to her current workload, Carter will now be responsible for providing research on the airline industry. Pinnacle Airlines, a leader in the industry, represents a large holding in Smith Brothers' portfolio. Looking back over past research on Pinnacle, Carter recognizes that the company historically has been a strong performer in what is considered to be a very competitive industry. The stock price over the last 52-week period has outperformed that of other industry leaders, although Pinnacle's net income has remained flat. Carter wonders if the stock price of Pinnacle has become overvalued relative to its peer group in the market, and wants to determine if the timing is right for Smith Brothers to decrease its position in Pinnacle. Carter decides to run a regression analysis, using the monthly returns of Pinnacle stock as the dependent variable and monthly returns of the airlines industry as the independent variable.

Analysis of Variance Table (ANOVA)

| Source | df | SS | Mean Square (SS/df) |
Page 33 | Status: ⏸️ Unattempted
Shared Context:
Question: Which of the following is least likely to be an assumption regarding linear regression?
- A) A linear relationship exists between the dependent and independent variables
- B) The variance of the residuals is constant
- C) The independent variable is correlated with the residuals Correct
Page 34 | Status: ⏸️ Unattempted
Shared Context:
Question: Based upon the information presented in the ANOVA table, what is the coefficient of determination?
- A) 0.084, indicating that the variability of industry returns explains about 8.4% of the variability of company returns
- B) 0.916, indicating that the variability of industry returns explains about 91.6% of the variability of company returns
- C) 0.839, indicating that company returns explain about 83.9% of the variability of industry returns
Page 34 | Status: ⏸️ Unattempted
Shared Context:
Question: Based upon her analysis, Carter has derived the following regression equation: Ŷ = 1.75 + 3.25X1. The predicted value of the Y variable equals 50.50, if the:
- A) predicted value of the independent variable equals 15 Correct
- B) predicted value of the dependent variable equals 15
- C)
Page 34 | Status: ⏸️ Unattempted
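A one-step check of the question above: rearranging Ŷ = 1.75 + 3.25 X1 for the value of X1 that yields a prediction of 50.50.

```python
intercept = 1.75
slope = 3.25
predicted_y = 50.50

# Solve predicted_y = intercept + slope * x1 for x1.
x1 = (predicted_y - intercept) / slope
print(x1)
```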
Shared Context:
Question: Carter realizes that although regression analysis is a useful tool when analyzing investments, there are certain limitations. Carter made a list of points describing limitations that Smith Brothers equity traders should be aware of when applying her research to their investment decisions. Point 1: Regression residuals may be homoskedastic. Point 2: Data from regression relationships tends to exhibit parameter instability. Point 3: Regression residuals may exhibit autocorrelation. Point 4: The variance of the error term may change with one or more independent variables. When reviewing Carter's list, one of the Smith Brothers' equity traders points out that not all of the points describe regression analysis limitations. Which of Carter's points most accurately describes the limitations to regression analysis?
- A) Points 1, 2, and 3
- B) Points 2, 3, and 4
- C) Points 1, 3, and 4
Question: When constructing a regression model to predict portfolio returns, an analyst runs a regression for the past five year period. After examining the results, she determines that an increase in interest rates two years ago had a significant impact on portfolio results for the time of the increase until the present. By performing a regression over two separate time periods, the analyst would be attempting to prevent which type of misspecification?
- A) Incorrectly pooling data Correct
- B) Inappropriate variable scaling
- C) Inappropriate variable form
Question: Which of the following statements most accurately interprets the following regression results at the given significance level?
Variable | p-value
Intercept | 0.0201
X1 | 0.0284
X2 | 0.0310
X3 | 0.0143
- A) The variable X2 is statistically significantly different from zero at the 3% significance level
- B) The variables X1 and X2 are statistically significantly different from zero at the 2% significance level
- C) The variable X3 is statistically significantly different from zero at the 2% significance level
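The decision rule behind this question is that a coefficient is significant at level alpha only when its p-value is below alpha. A minimal sketch using the p-values listed in the question; the helper name `significant_at` is illustrative.

```python
# p-values as listed in the question
p_values = {"Intercept": 0.0201, "X1": 0.0284, "X2": 0.0310, "X3": 0.0143}

def significant_at(alpha):
    """Names whose p-value is below the significance level alpha."""
    return [name for name, p in p_values.items() if p < alpha]

for alpha in (0.02, 0.03):
    print(alpha, significant_at(alpha))
```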
Question: A high-yield bond analyst is trying to develop an equation using financial ratios to estimate the probability of a company defaulting on its bonds. A technique that can be used to develop this equation is:
- A) multiple linear regression adjusting for heteroskedasticity
- B) dummy variable regression
- C) logistic regression model
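A logistic model maps a linear combination of the explanatory variables into a value between 0 and 1, which is why it suits a probability-of-default question. The sketch below is only illustrative: the coefficients, the ratio values, and the function name are hypothetical.

```python
import math

def default_probability(ratios, coefficients, intercept):
    """Logistic model: P(default) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, ratios))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and financial ratios, for illustration only
p = default_probability(ratios=[2.5, 0.8], coefficients=[0.9, -1.4], intercept=-2.0)
print(round(p, 3))
```

Unlike an ordinary linear regression, the output cannot fall outside the 0 to 1 range, whatever the inputs.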
Question: An analyst is estimating whether company sales is related to three economic variables. The regression exhibits conditional heteroskedasticity, serial correlation, and multicollinearity. The analyst uses White and Newey-West corrected standard errors. Which of the following is most accurate?
- A) The regression will still exhibit multicollinearity, but the heteroskedasticity and serial correlation problems will be solved
- B) The regression will still exhibit serial correlation and multicollinearity, but the heteroskedasticity problem will be solved
- C) The regression will still exhibit heteroskedasticity and multicollinearity, but the serial correlation problem will be solved
Question: Consider the following estimated regression equation, with the standard errors of the slope coefficients as noted: Salesi = 10.0 + 1.25 R&Di + 1.0 ADVi – 2.0 COMPi + 8.0 CAPi where the standard error for the estimated coefficient on R&D is 0.45, the standard error for the estimated coefficient on ADV is 2.2, the standard error for the estimated coefficient on COMP is 0.63, and the standard error for the estimated coefficient on CAP is 2.5. The equation was estimated over 40 companies. Using a 5% level of significance, which of the estimated coefficients are significantly different from zero?
- A) R&D, ADV, COMP, and CAP
- B) R&D, COMP, and CAP only
- C) ADV and CAP only
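Each coefficient's t-statistic is the estimate divided by its standard error, compared against a critical t-value with n − k − 1 degrees of freedom. A minimal sketch with the numbers from the question; the critical value is an assumption read from a standard t-table rather than computed.

```python
# Estimated slope coefficients and standard errors from the question
estimates = {
    "R&D": (1.25, 0.45),
    "ADV": (1.0, 2.2),
    "COMP": (-2.0, 0.63),
    "CAP": (8.0, 2.5),
}
n_obs, k = 40, 4
df = n_obs - k - 1      # 35 degrees of freedom
T_CRITICAL = 2.030      # assumption: two-tailed 5% table value for df = 35

t_stats = {name: coef / se for name, (coef, se) in estimates.items()}
significant = [name for name, t in t_stats.items() if abs(t) > T_CRITICAL]

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")
print(significant)
```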
Question: Consider the following regression equation: Salesi = 20.5 + 1.5 R&Di + 2.5 ADVi – 3.0 COMPi where Sales is dollar sales in millions, R&D is research and development expenditures in millions, ADV is dollar amount spent on advertising in millions, and COMP is the number of competitors in the industry. Which of the following is NOT a correct interpretation of this regression information?
- A) If R&D and advertising expenditures are $1 million each and there are 5 competitors, expected sales are $9.5 million
- B) If a company spends $1 more on R&D (holding everything else constant), sales are expected to increase by $1.5 million
- C) One more competitor will mean $3 million less in sales (holding everything else constant)
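The interpretations above can be checked by evaluating the equation directly. A minimal sketch; the function name and the unit conversion ($1 expressed as 0.000001 of a million) are illustrative choices, not part of the source.

```python
def expected_sales(rd, adv, comp):
    """Expected sales in $ millions; rd and adv in $ millions, comp = number of competitors."""
    return 20.5 + 1.5 * rd + 2.5 * adv - 3.0 * comp

base = expected_sales(rd=1.0, adv=1.0, comp=5)

# Effect of one extra dollar of R&D: $1 is 1e-6 when the unit is $ millions
one_more_dollar_rd = expected_sales(rd=1.0 + 1e-6, adv=1.0, comp=5) - base

# Effect of one extra competitor, holding everything else constant
one_more_competitor = expected_sales(rd=1.0, adv=1.0, comp=6) - base

print(base, one_more_dollar_rd, one_more_competitor)
```

Because R&D is measured in millions, a one-dollar change moves expected sales by about 1.5 millionths of a million (roughly $1.50), not by $1.5 million.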
Question: Which of the following questions is least likely answered by using a qualitative dependent variable?
- A) Based on the following company-specific financial ratios, will company ABC enter bankruptcy?
- B) Based on the following subsidiary and competition variables, will company XYZ divest itself of a subsidiary?
- C) Based on the following executive-specific and company-specific variables, how many shares will be acquired through the exercise of executive stock options?
Question: May Jones estimated a regression that produced the following analysis of variance (ANOV
- A) R2 = 0.25 and F = 0.909
- B) R2 = 0.20 and F = 10
- C) R2 = 0.25 and F = 10
Shared Context:
Question: If the number of analysts on NGR Corp. were to double to 4, the change in the forecast of NGR would be closest to?
- A) −0.035
- B) −0.055
- C) −0.019
Shared Context:
Question: Based on a R2 calculated from the information in Table 2, the analyst should conclude that the number of analysts and ln(market value) of the firm explain:
- A) 18.4% of the variation in returns Correct
- B) 84.4% of the variation in returns
- C) 15.6% of the variation in returns
Shared Context:
Question: What is the F-statistic for the hypothesis that all slope coefficients are not statistically significantly different from 0? And, what can be concluded from its value at a 1% level of significance?
- A) F = 5.80, reject a hypothesis that both of the slope coefficients are equal to zero
- B) F = 17.00, reject a hypothesis that both of the slope coefficients are equal to zero
- C) F = 1.97, fail to reject a hypothesis that both of the slope coefficients are equal to zero
Shared Context:
Question: Upon further analysis, Turner concludes that multicollinearity is a problem. What might have prompted this further analysis and what is the intuition behind the conclusion?
- A) At least one of the t-statistics was not significant, the F-statistic was significant, and a positive relationship between the number of analysts and the size of the firm would be expected
- B) At least one of the t-statistics was not significant, the F-statistic was not significant, and a positive relationship between the number of analysts and the size of the firm would be expected
- C) At least one of the t-statistics was not significant, the F-statistic was significant, and an intercept not significantly different from zero would be expected
Question: A regression with three independent variables has VIF values of 3, 4, and 2 for the first, second, and third independent variables, respectively. Which of the following conclusions is most appropriate?
- A) Only variable two has a problem with multicollinearity
- B) Multicollinearity does not seem to be a problem with the model Correct
- C) Total VIF of 9 indicates a serious multicollinearity problem
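The variance inflation factor for variable j is 1 / (1 − R_j²), where R_j² comes from regressing that variable on the other independent variables. VIFs are judged one variable at a time and are not summed. The sketch below applies commonly cited rule-of-thumb cutoffs (above 5 warrants investigation, above 10 indicates a serious problem); the cutoffs and function names are assumptions, not a formal test.

```python
def vif(r_squared_j):
    """VIF for variable j, given the R^2 from regressing X_j on the other X's."""
    return 1.0 / (1.0 - r_squared_j)

def classify(vif_value):
    # Rule-of-thumb cutoffs (an assumption, not a formal statistical test)
    if vif_value > 10:
        return "serious multicollinearity"
    if vif_value > 5:
        return "warrants further investigation"
    return "no concern flagged"

vifs = [3, 4, 2]                      # values from the question
labels = [classify(v) for v in vifs]
print(labels)
print(vif(0.75))                      # an R_j^2 of 0.75 corresponds to a VIF of 4
```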
Question: Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units. Hilton gathers data for the last 20 years. Which of the following regression equations correctly represents Hilton's hypothesis?
- A) SALES = α x β1 POP x β2 INCOME x β3 ADV x ε
- B) INCOME = α + β1 POP + β2 SALES + β3 ADV + ε
- C) SALES = α + β1 POP + β2 INCOME + β3 ADV + ε Correct
Jessica Jenkins, CFA, is looking at the retail property sector for her manager. She is undertaking a top down review as she feels this is the best way to analyze the industry segment. To predict U.S. property starts (housing), she has used regression analysis. Jessica included the following variables in her analysis: average nominal interest rates during each year (as a decimal); annual GDP per capita in $'000. Given these variables, the following output was generated from 30 years of data:
Exhibit 1 – Results from regressing housing starts (in millions) on interest rates and GDP per capita
Coefficient Standard Error T-statistic
Shared Context:
Question: Using the regression model represented in Exhibit 1, what is the predicted number of housing starts for 20X7?
- A) 1,751,000
- B) 1,394
- C) 1,394,420
Shared Context:
Question: Which of the following statements best describes the explanatory power of the estimated regression?
- A) The residual standard error of only 0.3 indicates that the regression equation is a good fit for the sample data
- B) The independent variables explain 61.58% of the variation in housing starts Correct
- C) The large F-statistic indicates that both independent variables help explain changes in housing starts
Question: A fund has changed managers twice during the past 10 years. An analyst wishes to measure whether either of the changes in managers has had an impact on performance. R is the return on the fund, and M is the return on a market index. Which of the following regression equations can appropriately measure the desired impacts?
- A) The desired impact cannot be measured
- B) R = a + bM + c1D1 + c2D2 + c3D3 + ε, where D1 = 1 if the return is from the first manager, and D2 = 1 if the return is from the second manager, and D3 = 1 if the return is from the third manager
- C) R = a + bM + c1D1 + c2D2 + ε, where D1 = 1 if the return is from the first manager, and D2 = 1 if the return is from the third manager Correct
Raul Gloucester, CFA, is analyzing the returns of a fund that his company offers. He tests the fund's sensitivity to a small capitalization index and a large capitalization index, as well as to whether the January effect plays a role in the fund's performance. He uses two years of monthly returns data, and runs a regression of the fund's return on the indexes and a January-effect qualitative variable. The "January" variable is 1 for the month of January and zero for all other months. The results of the regression are shown in the tables below.
Regression Statistics Multiple R 0.817088 R2 0.667632
Shared Context:
Question: Suppose the Breusch-Godfrey statistic is 3.22. At a 5% level of significance, which of the following is the most accurate conclusion regarding the presence of serial correlation (at two lags) in the residuals?
- A) No, because the BG statistic is less than the critical test statistic of 3.55, we don't have evidence of serial correlation
- B) Yes, because the BG statistic exceeds the critical test statistic of 3.16, there is evidence of serial correlation
- C) No, because the BG statistic is less than the critical test statistic of 3.49, we don't have evidence of serial correlation
Shared Context:
Question: Gloucester subsequently revises the model to exclude the small cap index and finds that the revised model has a RSS of 106.332. Which of the following statements is most accurate? At a 5% level of significance, the test statistic:
- A) of 13.39 indicates that we cannot reject the hypothesis that the coefficient of small-cap index is significantly different from 0
- B) of 1.30 indicates that we cannot reject the hypothesis that the coefficient of small-cap index is not significantly different from 0
- C) of 4.35 indicates that we cannot reject the hypothesis that the coefficient of small-cap index is significantly different from 0
Shared Context:
Question: In the month of January, if both the small and large capitalization index have a zero return, we would expect the fund to have a return equal to:
- A) 2.799
- B) 2.561
- C) 2.322
Shared Context:
Question: Assuming (for this question only) that the F-test was significant but that the t-tests of the independent variables were insignificant, this would most likely suggest:
- A) conditional heteroskedasticity
- B) serial correlation
- C) multicollinearity
Quin Tan Liu, CFA, is looking at the retail property sector for her manager. She is undertaking a top down review as she feels this is the best way to analyze the industry segment. To predict U.S. property starts (housing), she has used regression analysis. Liu included the following variables in her analysis: average nominal interest rates during each year (as a decimal); annual GDP per capita in $'000. Given these variables the following output was generated from 30 years of data:
Shared Context:
Question: Which of the following statements best describes the explanatory power of the estimated regression?
- A) The large F-statistic indicates that both independent variables help explain changes in housing starts
- B) The independent variables explain 61.58% of the variation in housing starts Correct
- C) The residual standard error of only 0.3 indicates that the regression equation is a good fit for the sample data
Shared Context:
Question: Which of the following is the least appropriate statement in relation to R-square and adjusted R-square:
- A) Adjusted R-square decreases when the added independent variable adds little value to the regression model
- B) R-square typically increases when new independent variables are added to the regression regardless of their explanatory power
- C) Adjusted R-square is a value between 0 and 1 and can be interpreted as a percentage
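The relationship between the two measures follows from the standard formula, adjusted R² = 1 − [(n − 1) / (n − k − 1)] × (1 − R²). A minimal sketch; the input numbers are hypothetical and chosen only to show the two behaviors the options refer to.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1.0 - ((n - 1) / (n - k - 1)) * (1.0 - r2)

# Hypothetical inputs, for illustration only
print(adjusted_r2(0.60, n=30, k=2))   # lower than the unadjusted 0.60
print(adjusted_r2(0.05, n=20, k=4))   # negative: adjusted R^2 is not bounded below by zero
```

A weak model with several regressors can therefore produce a negative adjusted R², so the measure is not confined to the 0 to 1 range.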
Question: When interpreting the results of a multiple regression analysis, which of the following terms represents the value of the dependent variable when the independent variables are all equal to zero?
- A) Slope coefficient
- B) Intercept term Correct
- C) p-value
Question: Consider the following regression equation: Salesi = 10.0 + 1.25 R&Di + 1.0 ADVi – 2.0 COMPi + 8.0 CAPi where Sales is dollar sales in millions, R&D is research and development expenditures in millions, ADV is dollar amount spent on advertising in millions, COMP is the number of competitors in the industry, and CAP is the capital expenditures for the period in millions of dollars. Which of the following is NOT a correct interpretation of this regression information?
- A) If a company spends $1 million more on capital expenditures (holding everything else constant), Sales are expected to increase by $8.0 million
- B) One more competitor will mean $2 million less in Sales (holding everything else constant)
- C) If R&D and advertising expenditures are $1 million each, there are 5 competitors, and capital expenditures are $2 million, expected Sales are $8.25 million
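The forecast in option C can be checked by plugging its inputs into the equation. A minimal sketch; the dictionary layout is an illustrative choice.

```python
# Slope coefficients from the question and the input values stated in option C
coefficients = {"R&D": 1.25, "ADV": 1.0, "COMP": -2.0, "CAP": 8.0}
inputs = {"R&D": 1.0, "ADV": 1.0, "COMP": 5, "CAP": 2.0}

# Intercept of 10.0 plus each coefficient times its input, in $ millions
expected_sales = 10.0 + sum(coefficients[name] * inputs[name] for name in coefficients)
print(expected_sales)
```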
Shared Context:
Question: Concerning the assumptions of multiple regression, Grimbles is:
- A) incorrect to agree with Voiku’s list of assumptions because one of the assumptions is stated incorrectly
- B) incorrect to agree with Voiku’s list of assumptions because two of the assumptions are stated incorrectly
- C) correct to agree with Voiku’s list of assumptions
Shared Context:
Question: The multiple regression, as specified, most likely suffers from:
- A) serial correlation of the error terms
- B) heteroskedasticity
- C) multicollinearity
Question: Jacob Warner, CFA, is evaluating a regression analysis recently published in a trade journal that hypothesizes that the annual performance of the S&P 500 stock index can be explained by movements in the Federal Funds rate and the U.S. Producer Price Index (PPI). Which of the following statements regarding his analysis is most accurate?
- A) If the p-value of a variable is less than the significance level, the null hypothesis cannot be rejected
- B) If the t-value of a variable is less than the significance level, the null hypothesis should be rejected
- C) If the p-value of a variable is less than the significance level, the null hypothesis can be rejected
A real estate agent wants to develop a model to predict the selling price of a home. The agent believes that the most important variables in determining the price of a house are its size (in square feet) and the number of bedrooms. Accordingly, he takes a random sample of 32 homes that have recently been sold. The results of the regression are:
Shared Context:
Question: The predicted price of a house that has 2,000 square feet of space and 4 bedrooms is closest to:
- A) $114,000
- B) $256,000
- C) $185,000
Shared Context:
Question: The conclusion from the hypothesis test of H0: b1 = b2 = 0, is that the null hypothesis should:
- A) not be rejected as the calculated F of 40.73 is greater than the critical value of 3.29
- B) be rejected as the calculated F of 40.73 is greater than the critical value of 3.33 Correct
- C) be rejected as the calculated F of 40.73 is greater than the critical value of 3.29
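The options differ only in the critical value, which depends on the degrees of freedom: k in the numerator and n − k − 1 in the denominator. The sketch below assumes the 32-home sample and two slope coefficients described for this regression, and a 5% significance level; the critical value is read from an F-table rather than computed.

```python
n_obs, k = 32, 2                  # 32 homes, two slope coefficients (size, bedrooms)
df_numerator = k
df_denominator = n_obs - k - 1    # 29

F_CALCULATED = 40.73              # value quoted in the answer choices
F_CRITICAL = 3.33                 # assumption: 5% F-table value for (2, 29) degrees of freedom

reject_null = F_CALCULATED > F_CRITICAL
print(df_numerator, df_denominator, reject_null)
```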
Shared Context:
Question: Which of the following is most likely to present a problem in using this regression for forecasting?
- A) Heteroskedasticity
- B) Multicollinearity Correct
- C) Autocorrelation
Question: Which of the following statements regarding serial correlation that might be encountered in regression analysis is least accurate?
- A) Serial correlation occurs least often with time series data Correct
- B) Serial correlation does not affect consistency of regression coefficients
- C) Positive serial correlation and heteroskedasticity can both lead to Type I errors
Peter Pun, an enrolled candidate for the CFA Level II examination, has decided to perform a calendar test to examine whether there is any abnormal return associated with investments and disinvestments made in blue-chip stocks on particular days of the week. As a proxy for blue-chips, he has decided to use the S&P 500 Index. The analysis will involve the use of dummy variables and is based on the past 780 trading days. Here are selected findings of his study: RSS 0.0039
Shared Context:
Question: What is most likely represented by the Y intercept of the regression?
- A) The drift of a random walk
- B) The intercept is not a driver of returns, only the independent variables
- C) The return on a particular trading day Correct
Shared Context:
Question: What can be said of the overall explanatory power of the model at the 5% significance?
- A) The coefficient of determination for the above regression is significantly higher than the standard error of the estimate, and therefore there is value to calendar trading
- B) There is no value to calendar trading
- C) There is value to calendar trading
Shared Context:
Question: The test mentioned by Jessica is known as the:
- A) Breusch-Pagan, which is a two-tailed test
- B) Durbin-Watson, which is a two-tailed test
- C) Breusch-Pagan, which is a one-tailed test Correct
Shared Context:
Question: Are Jessica and her son Jonathan correct in terms of the method used to correct for heteroskedasticity and the likely effects?
- A) Neither is correct Correct
- B) Both are correct
- C) One is correct
Question: Which of the following is a potential remedy for multicollinearity?
- A) Take first differences of the dependent variable
- B) Add dummy variables to the regression
- C) Omit one or more of the collinear variables Correct
Question: Which of the following statements regarding heteroskedasticity is least accurate?
- A) When not related to independent variables, heteroskedasticity does not pose any major problems with the regression
- B) Heteroskedasticity only occurs in cross-sectional regressions Correct
- C)
Shared Context:
Question: Salve runs a regression using the squared residuals from the model using the original dependent variables. The coefficient of determination of this model is 6%. Which of the following is the most appropriate conclusion at a 5% level of significance?
- A) Because the test statistic of 7.20 is higher than the critical value of 3.84, we reject the null hypothesis of no conditional heteroskedasticity in residuals
- B) Because the test statistic of 7.20 is lower than the critical value of 7.81, we fail to reject the null hypothesis of no conditional heteroskedasticity in residuals
- C) Because the test statistic of 3.60 is lower than the critical value of 3.84, we reject the null hypothesis of no conditional heteroskedasticity in residuals
Shared Context:
Question: Which of the following misspecifications is most likely to cause serial correlation in residuals?
- A) Data improperly pooled Correct
- B) Improper variable scaling
- C) Improper variable form
Shared Context:
Question: Should Salve be concerned about residual multicollinearity?
- A) Yes, and Salve should exclude either variable SMB or HML from the model
- B) Yes, and Salve should exclude variable Rm-Rf from the model
- C) No
Question: An analyst runs a regression of portfolio returns on three independent variables. These independent variables are price-to-sales (P/S), price-to-cash flow (P/CF), and price-to-book (P/B). The analyst discovers that the p-values for each independent variable are relatively high. However, the F-test has a very small p-value. The analyst is puzzled and tries to figure out how the F-test can be statistically significant when the individual independent variables are not significant. What violation of regression analysis has occurred?
- A) multicollinearity Correct
- B) conditional heteroskedasticity
- C) serial correlation
Question: Which of the following statements regarding the R2 is least accurate?
- A) The adjusted-R2 is greater than the R2 in multiple regression Correct
- B) It is possible for the adjusted-R2 to decline as more variables are added to the multiple regression
- C) The adjusted-R2 is not appropriate to use in simple regression
Question: An analyst is trying to estimate the beta for a fund. The analyst estimates a regression equation in which the fund returns are the dependent variable and the Wilshire 5000 is the independent variable, using monthly data over the past five years. The analyst finds that the correlation between the square of the residuals of the regression and the Wilshire 5000 is 0.2. Which of the following is most accurate, assuming a 0.05 level of significance? There is:
- A) no evidence that there is conditional heteroskedasticity or serial correlation in the regression equation
- B) evidence of conditional heteroskedasticity but not serial correlation in the regression equation
- C) evidence of serial correlation but not conditional heteroskedasticity in the regression equation
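The only statistic supplied here is the correlation between the squared residuals and the single independent variable, which is enough for a Breusch-Pagan check: with one regressor, R² equals the squared correlation, and the test statistic is n × R². A minimal sketch; the critical value is the standard 5% chi-square table value for one degree of freedom.

```python
n_obs = 5 * 12                 # monthly data over five years
correlation = 0.2              # between squared residuals and the index return
r2_resid = correlation ** 2    # with one regressor, R^2 is the squared correlation
bp_stat = n_obs * r2_resid     # Breusch-Pagan statistic, chi-square with 1 df
CHI2_CRITICAL = 3.84           # 5% chi-square table value for 1 degree of freedom

evidence_of_heteroskedasticity = bp_stat > CHI2_CRITICAL
print(round(bp_stat, 2), evidence_of_heteroskedasticity)
```

Nothing in the question bears on serial correlation, so this check speaks only to conditional heteroskedasticity.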
Question: When pooling the samples over multiple economic environments in a multiple regression model, which of the following errors is most likely to occur?
- A) Heteroskedasticity
- B) Model misspecification Correct
- C) Multicollinearity
Question: Which of the following statements regarding the R2 is least accurate?
- A) R2 is the coefficient of determination of the regression
- B) The R2 of a regression will be greater than or equal to the adjusted-R2 for the same regression
- C) The R2 is the ratio of the unexplained variation to the explained variation of the dependent variable Correct
Shared Context:
Question: Which model would be a better choice for making a forecast?
- A) Model TWO because it has a higher adjusted R2 Correct
- B) Model TWO because serial correlation is not a problem
- C) Model ONE because it has a higher R2
Shared Context:
Question: Using Model ONE, what is the sales forecast for the second quarter of the next year?
- A) $51.09 million Correct
- B) $46.31 million
- C) $56.02 million
Shared Context:
Question: Which model misspecification is most likely to cause multicollinearity?
- A) Inappropriate variable scaling Correct
- B) Inappropriate variable form
- C) Omission of important variable(s)
Shared Context:
Question: If it is determined that conditional heteroskedasticity is present in Model ONE, which of the following inferences is most accurate?
- A) Regression coefficients will be unbiased but standard errors will be biased Correct
- B) Regression coefficients will be biased but standard errors will be unbiased
- C) Both the regression coefficients and the standard errors will be biased
Shared Context:
Question: If Mercado determines that Model TWO is the appropriate specification, then he is essentially saying that for each year, the value of sales from quarter three to quarter four is expected to:
- A) grow by more than $1,000,000
- B) remain approximately the same
- C)
Shared Context:
Question: Using the regression model developed, the closest prediction of sales for December 20X6 is:
- A) $55,000
- B) $44,000
- C) $36,000
Shared Context:
Question: Will Jack conclude that the housing starts coefficient is statistically different from zero and how will he interpret it at the 5% significance level?
- A) Different from zero; sales will rise by $100 for every 23 house starts Correct
- B) Different from zero; sales will rise by $23 for every 100 house starts
- C) Not different from zero; sales will rise by $0 for every 100 house starts
Shared Context:
Question: The regression statistics indicate that for the period under study, the independent variables (housing starts, mortgage interest rate) together explain approximately what percentage of the variation in the dependent variable (sales)?
- A) 67.00
- B) 9.80
- C) 77.00
Shared Context:
Question: For this question only, assume that the regression of squared residuals on the independent variables has R2 = 11%. At a 5% level of significance, which of the following conclusions is most accurate?
- A) With a test statistic of 13.53, we can conclude the presence of conditional heteroskedasticity
- B) With a test statistic of 0.22, we cannot reject the null hypothesis of no conditional heteroskedasticity
- C) Because the critical value is 3.84, we reject the null hypothesis of no conditional heteroskedasticity
Reading 2 Time-Series Analysis 82 questions
Question: An analyst wants to model quarterly sales data using an autoregressive model. She has found that an AR(1) model with a seasonal lag has significant slope coefficients. She also finds that when a second and third seasonal lag are added to the model, all slope coefficients are significant too. Based on this, the best model to use would most likely be an:
- A) ARCH(1)
- B) AR(1) model with no seasonal lags
- C) AR(1) model with 3 seasonal lags Correct
Question: The model xt = b0 + b1 xt − 1 + b2 xt − 2 + b3 xt −12 + εt is an autoregressive model of type:
- A) AR(2)
- B) AR(1)
- C) AR(12)
Question: An AR(1) autoregressive time series model:
- A) can be used to test for a unit root, which exists if the slope coefficient equals one
- B) cannot be used to test for a unit root
- C) can be used to test for a unit root, which exists if the slope coefficient is less than one
Question: The model xt = b0 + b1 xt-1 + b2 xt-2 + b3 xt-3 + b4 xt-4 + εt is:
- A) an autoregressive conditional heteroskedastic model, ARCH
- B) a moving average model, MA(4)
- C) an autoregressive model, AR(4) Correct
Question: The procedure for determining the structure of an autoregressive model is:
- A) test autocorrelations of the residuals for a simple trend model, and specify the number of significant lags
- B) estimate an autoregressive model (for example, an AR(1) model), calculate the autocorrelations for the model's residuals, test whether the autocorrelations are different from zero, and add an AR lag for each significant autocorrelation
- C) estimate an autoregressive model (e.g., an AR(1) model), calculate the autocorrelations for the model's residuals, test whether the autocorrelations are different from zero, and revise the model if there are significant autocorrelations
Question: The primary concern when deciding upon a time series sample period is which of the following factors?
- A) The length of the sample time period
- B) Current underlying economic and market conditions Correct
- C) The total number of observations
Diem Le is analyzing the financial statements of McDowell Manufacturing. He has modeled the time series of McDowell's gross margin over the last 16 years. The output is shown below. Assume a 5% significance level for all statistical tests.
Autoregressive Model: Gross Margin – McDowell Manufacturing, Quarterly Data: 1st Quarter 1985 to 4th Quarter 2000
Regression Statistics
R-squared: 0.767
Standard error of forecast: 0.049
Observations: 64
Durbin-Watson: 1.923 (not statistically significant)
Variable | Coefficient | Standard Error | t-statistic
Constant | 0.155 | 0.052 | ?????
Lag 1 | 0.240 | 0.031 | ?????
Lag 4 | 0.168 | 0.038 | ?????
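The t-statistics shown as "?????" in the output above can be recovered as coefficient divided by standard error. A minimal sketch; the cutoff of roughly 2.0 is an approximation for a two-tailed 5% test with about 60 degrees of freedom, stated here as an assumption.

```python
# Coefficients and standard errors from the regression output
rows = {
    "Constant": (0.155, 0.052),
    "Lag 1": (0.240, 0.031),
    "Lag 4": (0.168, 0.038),
}
T_CRITICAL_APPROX = 2.0   # assumption: approximate two-tailed 5% value for ~60 df

t_stats = {name: coef / se for name, (coef, se) in rows.items()}
for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}, significant = {abs(t) > T_CRITICAL_APPROX}")
```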
Shared Context:
Question: Le can conclude that the model is:
- A) properly specified because the Durbin-Watson statistic is not significant
- B) properly specified because there is no evidence of autocorrelation in the residuals Correct
- C) not properly specified because there is evidence of autocorrelation in the residuals and the Durbin-Watson statistic is not significant
Shared Context:
Question: What is the forecast for the gross margin in the first quarter of 2004?
- A) 0.246
- B) 0.250
- C) 0.256
Shared Context:
Question: With respect to heteroskedasticity in the model, we can definitively say:
- A) nothing
- B) an ARCH process exists because the autocorrelation coefficients of the residuals have different signs
- C) heteroskedasticity is not a problem because the DW statistic is not significant
Shared Context:
Question: Supposing the time series is actually a random walk, which of the following approaches would be appropriate prior to using an autoregressive model?
- A) First differencing the time series Correct
- B) ARCH
- C) Convert the time series by taking a natural log of the series
Question: In the time series model: yt=b0 + b1 t + εt, t=1,2,...,T, the:
- A) change in the dependent variable per time period is b1 Correct
- B) disturbance terms are autocorrelated
- C) disturbance term is mean-reverting
Question: Which of the following is a seasonally adjusted model?
- A) Salest = b1 Sales t-1+ εt
- B) (Salest - Sales t-1)= b0 + b1 (Sales t-1 - Sales t-2) + b2 (Sales t-4 - Sales t-5) + εt
- C) Salest = b0 + b1 Sales t-1 + b2 Sales t-2 + εt
Question: Trend models can be useful tools in the evaluation of a time series of data. However, there are limitations to their usage. Trend models are not appropriate when which of the following violations of the linear regression assumptions is present?
- A) Model misspecification
- B) Serial correlation Correct
- C) Heteroskedasticity
Question: A time series that has a unit root can be transformed into a time series without a unit root through:
- A) first differencing Correct
- B) mean reversion
- C) calculating moving average of the residuals
Page 6 | Status: ⏸️ Unattempted
Question: An analyst modeled the time series of annual earnings per share in the specialty department store industry as an AR(3) process. Upon examination of the residuals from this model, she found that there is a significant autocorrelation for the residuals of this model. This indicates that she needs to:
- A) revise the model to include at least another lag of the dependent variable Correct
- B) switch models to a moving average model
- C) alter the model to an ARCH model
Page 7 | Status: ⏸️ Unattempted
Question: Dianne Hart, CFA, is considering the purchase of an equity position in Book World, Inc, a leading seller of books in the United States. Hart has obtained monthly sales data for the past seven years, and has plotted the data points on a graph. Hart notices that the revenues are growing at approximately 4.5% per year. Which of the following statements regarding Hart's analysis of the data time series of Book World's sales is most accurate? Hart should utilize a:
- A) mean-reverting model to analyze the data because the time series pattern is covariance stationary
- B) linear model to analyze the data because the mean appears to be constant
- C) log-linear model to analyze the data because it is likely to exhibit a compound growth trend Correct
Page 9 | Status: ⏸️ Unattempted
Question: Alexis Popov, CFA, wants to estimate how sales have grown from one quarter to the next on average. The most direct way for Popov to estimate this would be:
- A) an AR(1) model with a seasonal lag
- B) a linear trend model Correct
- C) an AR(1) model
Page 10 | Status: ⏸️ Unattempted
Question: The main reason why financial and economic time series intrinsically exhibit some form of nonstationarity is that:
- A) serial correlation, a contributing factor to nonstationarity, is always present to a certain degree in most financial and economic time series
- B) most financial and economic time series have a natural tendency to revert toward their means
- C) most financial and economic relationships are dynamic and the estimated regression coefficients can vary greatly between periods
Page 11 | Status: ⏸️ Unattempted
Question: One choice a researcher can use to test for nonstationarity is to use a:
- A) Breusch-Pagan test, which uses a modified t-statistic
- B) Dickey-Fuller test, which uses a modified χ² statistic
- C) Dickey-Fuller test, which uses a modified t-statistic Correct
Page 11 | Status: ⏸️ Unattempted
Question: Which of the following statements regarding the instability of time-series models is most accurate? Models estimated with:
- A) shorter time series are usually more stable than those with longer time series Correct
- B) a greater number of independent variables are usually more stable than those with a smaller number
- C) longer time series are usually more stable than those with shorter time series
Page 12 | Status: ⏸️ Unattempted
Question: Which of the following is NOT a requirement for a series to be covariance stationary? The:
- A) expected value of the time series is constant over time
- B) covariance of the time series with itself (lead or lag) must be constant
- C) time series must have a positive trend Correct
Page 13 | Status: ⏸️ Unattempted
Question: The table below shows the autocorrelations of the lagged residuals for quarterly theater ticket sales that were estimated using the AR(1) model: $\ln(\text{sales}_t) = b_0 + b_1 \ln(\text{sales}_{t-1}) + e_t$. Assuming the critical t-statistic at 5% significance is 2.0, which of the following is the most likely conclusion about the appropriateness of the model?
Lagged Autocorrelations of the Log of Quarterly Theater Ticket Sales
Lag | Autocorrelation | Standard Error | t-Statistic
1 | −0.0738 | 0.1667 | −0.44271
2 | −0.1047 | 0.1667 | −0.62807
3 | −0.0252 | 0.1667 | −0.15117
4 | 0.5528 | 0.1667 | 3.31614
The time series:
- A) contains seasonality Correct
- B) contains ARCH (1) errors
- C) would be more appropriately described with an MA(4) model
Winston Collier, CFA, has been asked by his supervisor to develop a model for predicting the warranty expense incurred by Premier Snowplow Manufacturing Company in servicing its plows. Three years ago, major design changes were made on newly manufactured plows in an effort to reduce warranty expense. Premier warrants its snowplows for 4 years or 18,000 miles, whichever comes first. Warranty expense is higher in winter months, but some of Premier's customers defer maintenance issues that are not essential to keeping the machines functioning to spring or summer seasons. The data that Collier will analyze is in the following table (in $ millions):
Quarter | Warranty Expense | Change in Warranty Expense (y_t) | Lagged Change in Warranty Expense (y_{t-1}) | Seasonal Lagged Change in Warranty Expense (y_{t-4})
2002.1 | 103 | | |
2002.2 | 52 | –51 | |
2002.3 | 32 | –20 | –51 |
2002.4 | 68 | +36 | –20 |
Page 14 | Status: ⏸️ Unattempted
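The lag-by-lag check in the theater ticket sales question can be sketched in a few lines of Python. This is only an illustration: the autocorrelations, standard error, and critical value come from the question, while the function name `significant_lags` is a hypothetical helper introduced here.

```python
# Residual autocorrelations and standard error copied from the question's table.
autocorrelations = {1: -0.0738, 2: -0.1047, 3: -0.0252, 4: 0.5528}
standard_error = 0.1667
critical_t = 2.0  # 5% critical value stated in the question


def significant_lags(rhos, se, crit):
    """Return {lag: t-statistic} for lags whose |t| exceeds the critical value."""
    return {lag: rho / se for lag, rho in rhos.items() if abs(rho / se) > crit}


flagged = significant_lags(autocorrelations, standard_error, critical_t)
print(flagged)
```

With quarterly data, a significant residual autocorrelation at lag 4 and nowhere else is the usual signature of seasonality that the AR(1) specification has not captured.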
Shared Context:
Question: Collier's supervisors would probably not want to use the results from the trend model for all of the following reasons EXCEPT:
- A) the model is a linear trend model and log-linear models are always superior Correct
- B) the slope coefficient is not significant
- C)
Page 15 | Status: ⏸️ Unattempted
Shared Context:
Question: For this question only, assume that Winston also ran an AR(1) model with the following results (the values in parentheses are printed beneath the intercept and the slope coefficient, respectively):
$y_t = -0.9 - 0.23\,y_{t-1} + e_t$
(0.823) (0.0222)
R-squared = 78.3%
The mean reverting level of this model is closest to:
- A) 1.16
- B) −0.73
- C) 0.77
Page 16 | Status: ⏸️ Unattempted
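For an AR(1) model x_t = b0 + b1 x_{t-1} + e_t, the mean-reverting level is the value at which the forecast equals the current value, b0 / (1 − b1). A minimal sketch follows, using the coefficients from the question above; the function name `mean_reverting_level` is an assumption made for illustration.

```python
def mean_reverting_level(b0, b1):
    """Mean-reverting level of an AR(1) model x_t = b0 + b1 * x_{t-1} + e_t."""
    if abs(b1) >= 1:
        # With |b1| >= 1 the series is not covariance stationary,
        # so no finite mean-reverting level exists.
        raise ValueError("requires |b1| < 1")
    return b0 / (1 - b1)


level = mean_reverting_level(-0.9, -0.23)
print(round(level, 2))
```

The guard on b1 reflects that a unit root (b1 = 1) makes the denominator zero, which is why a random walk has no mean-reverting level.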
Shared Context:
Question: Based on the autoregressive model, expected warranty expense in the first quarter of 2005 will be closest to:
- A) $51 million
- B) $60 million
- C) $65 million
Page 16 | Status: ⏸️ Unattempted
Shared Context:
Question: Based on the results, is there a seasonality component in the data?
- A) Yes, because the coefficient on yt–4 is large compared to its standard error Correct
- B) Yes, because the coefficient on yt is small compared to its standard error
- C)
Page 16 | Status: ⏸️ Unattempted
Shared Context:
Question: After discussing the above matter with a colleague, Cranwell finally decides to use an autoregressive model of order one, i.e. AR(1), for the above data. Below is a summary of the findings of the model:
b0 | 0.4563
b1 | 0.6874
Standard error | 0.3745
R-squared | 0.7548
Durbin-Watson | 1.23
F | 12.63
Observations | 180
Calculate the mean reverting level of the series.
- A) 1.26
- B) 1.46
- C) 1.66
Page 17 | Status: ⏸️ Unattempted
Shared Context:
Question: Cranwell is aware that the Dickey-Fuller test can be used to discover whether a model has a unit root. He is also aware that the test would use a revised set of critical t-values. What would it mean to Bert to reject the null of the Dickey-Fuller test (H0: g = 0)?
- A) There is no unit root Correct
- B) There is a unit root and the model cannot be used in its current form
- C)
Page 17 | Status: ⏸️ Unattempted
Shared Context:
Question: Cranwell would also like to test for serial correlation in his AR(1) model. To do this, Cranwell should:
- A) use the provided Durbin Watson statistic and compare it to a critical value
- B) use a t-test on the residual autocorrelations over several lags Correct
- C) determine if the series has a finite and constant covariance between leading and lagged terms of itself
Page 18 | Status: ⏸️ Unattempted
Shared Context:
Question: When using the root mean squared error (RMSE) criterion to evaluate the predictive power of the model, which of the following is the most appropriate statement?
- A) Use the model with the highest RMSE calculated using the in-sample data
- B) Use the model with the lowest RMSE calculated using the out-of-sample data Correct
- C) Use the model with the lowest RMSE calculated using the in-sample data
Page 18 | Status: ⏸️ Unattempted
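RMSE is the square root of the average squared forecast error, and for judging predictive power it is computed on observations that were not used to estimate the model. The sketch below uses entirely hypothetical hold-out values and forecasts (`holdout_actuals`, `forecasts`), invented purely to show the comparison.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error between actual values and forecasts."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical out-of-sample data, not taken from the question.
holdout_actuals = [10.0, 12.0, 11.0]
forecasts = {
    "model_a": [11.0, 11.0, 12.0],
    "model_b": [10.5, 12.5, 10.5],
}

scores = {name: rmse(holdout_actuals, f) for name, f in forecasts.items()}
best_model = min(scores, key=scores.get)  # lowest out-of-sample RMSE
print(scores, best_model)
```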
Shared Context:
Question: The WPM model was specified as a(n):
- A) Moving Average (MA) Model Correct
- B) Autoregressive (AR) Model
- C) Autoregressive (AR) Model with a seasonal lag
Page 20 | Status: ⏸️ Unattempted
Shared Context:
Question: The mean reverting level of monthly sales is closest to:
- A) 381.29 million
- B) 8.83 million
- C) 43.2 million
Page 20 | Status: ⏸️ Unattempted
Shared Context:
Question: Morris concludes that the current price of Car-tel stock is consistent with single stage constant growth model (with g=3%). Based on this information, the sales model is most likely:
- A) Incorrectly specified and first differencing the data would be an appropriate remedy
- B) Correctly specified
- C) Incorrectly specified and first differencing the natural log of the data would be an appropriate remedy
Page 20 | Status: ⏸️ Unattempted
Shared Context:
Question: The preceding table will be used by Johnson to forecast values using:
- A) an autoregressive model with a seasonal lag Correct
- B) a serially correlated model with a seasonal lag
- C) a log-linear trend model with a seasonal lag
Page 21 | Status: ⏸️ Unattempted
Shared Context:
Question: The value that Johnson should enter in the table in place of "w" is:
- A) 164
- B) −115
- C) −48
Page 21 | Status: ⏸️ Unattempted
Shared Context:
Question: Imagine that Johnson prepares a change-in-sales regression analysis model with seasonality, which includes the following:
Variable | Coefficient
Intercept | −6.032
Lag 1 | 0.017
Lag 4 | 0.983
Based on the model, expected sales in the first quarter of 2015 will be closest to:
- A) 190
- B) 210
- C) 155
Page 22 | Status: ⏸️ Unattempted
Shared Context:
Question: Johnson's model was most likely designed to correct for:
- A) heteroskedasticity of model residuals Correct
- B) nonstationarity in time series data
- C) cointegration in the time series
Page 22 | Status: ⏸️ Unattempted
Shared Context:
Question: To test for covariance-stationarity in the data, Johnson would most likely use a:
- A) Durbin-Watson test
- B) t-test
- C) Dickey-Fuller test Correct
Page 22 | Status: ⏸️ Unattempted
Shared Context:
Question: The presence of conditional heteroskedasticity of residuals in Johnson's model would most likely lead to:
- A) invalid standard errors of regression coefficients, but statistical tests will still be valid
- B) invalid standard errors of regression coefficients and invalid statistical tests Correct
- C) invalid estimates of regression coefficients, but the standard errors will still be valid
Page 23 | Status: ⏸️ Unattempted
Question: Suppose you estimate the following model of residuals from an autoregressive model: $\hat{\varepsilon}_t^2 = 0.25 + 0.6\,\hat{\varepsilon}_{t-1}^2 + \mu_t$, where $\hat{\varepsilon}$ denotes the estimated residual. If the residual at time t is 0.9, the forecasted variance for time t+1 is:
- A) 0.790
- B) 0.736
- C) 0.850
Page 23 | Status: ⏸️ Unattempted
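In an ARCH(1) model the forecast of next period's error variance is the intercept plus the slope times the square of the current residual. A small sketch with the numbers from the question above; `arch1_variance_forecast` is a hypothetical name chosen for this illustration.

```python
def arch1_variance_forecast(a0, a1, residual):
    """One-step-ahead variance forecast: a0 + a1 * residual**2."""
    return a0 + a1 * residual ** 2


forecast = arch1_variance_forecast(0.25, 0.6, 0.9)
print(round(forecast, 3))
```

Note that the residual is squared before the slope is applied; plugging in 0.9 directly instead of 0.81 is the common slip.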
Question: David Brice, CFA, has tried to use an AR(1) model to predict a given exchange rate. Brice has concluded the exchange rate follows a random walk without a drift. The current value of the exchange rate is 2.2. Under these conditions, which of the following would be least likely?
- A) The forecast for next period is 2.2
- B) The residuals of the forecasting model are autocorrelated Correct
- C) The process is not covariance stationary
Housing industry analyst Elaine Smith has been assigned the task of forecasting housing foreclosures. Specifically, Smith is asked to forecast the percentage of outstanding
Page 23 | Status: ⏸️ Unattempted
Shared Context:
Question: The most appropriate interpretation from the foreclosure share regression equation model is:
- A) Multiple-R of the model is 0.75 Correct
- B) Multiple-R of the model is 0.87
- C) Variable STIM explains 37.5% of the variation in foreclosure share
Page 26 | Status: ⏸️ Unattempted
Shared Context:
Question: Based on her regression results in Exhibit 2, using a 5% level of significance, Smith should conclude that:
- A) stimulus packages do not have significant effects on foreclosure percentages, but housing crises do have significant effects on foreclosure percentages
- B) both stimulus packages and housing crises have significant effects on foreclosure percentages
- C) stimulus packages have significant effects on foreclosure percentages, but housing crises do not have significant effects on foreclosure percentages
Page 26 | Status: ⏸️ Unattempted
Shared Context:
Question: The standard error of estimate for Smith's regression is closest to:
- A) 0.16
- B) 0.53
- C) 0.56
Page 27 | Status: ⏸️ Unattempted
Shared Context:
Question: Is Smith correct or incorrect regarding Concerns 1 and 2?
- A) Correct on both Concerns
- B) Only correct on one concern and incorrect on the other Correct
- C) Incorrect on both Concerns
Page 27 | Status: ⏸️ Unattempted
Shared Context:
Question: The most recent change in foreclosure share was +1 percent. Smith decides to base her analysis on the data and methods provided in Exhibit 4 and Exhibit 5, and determines that the two-step ahead forecast for the change in foreclosure share (in percent) is 0.125, and that the mean reverting value for the change in foreclosure share (in percent) is 0.071. Is Smith correct?
- A) Smith is correct on both the forecast and the mean reverting level Correct
- B) Smith is correct on the mean-reverting level for forecast of change in foreclosure share only
- C) Smith is correct on the two-step ahead forecast for change in foreclosure share only
Page 27 | Status: ⏸️ Unattempted
Question: Suppose you estimate the following model of residuals from an autoregressive model: $\hat{\varepsilon}_t^2 = 0.4 + 0.80\,\hat{\varepsilon}_{t-1}^2 + \mu_t$, where $\hat{\varepsilon}$ denotes the estimated residual. If the residual at time t is 2.0, the forecasted variance for time t+1 is:
- A) 2.0
- B) 3.6
- C) 3.2
Page 28 | Status: ⏸️ Unattempted
Question: Barry Phillips, CFA, has the following time series observations from earliest to latest: (5, 6, 5, 7, 6, 6, 8, 8, 9, 11). Phillips transforms the series so that he will estimate an autoregressive process on the following data (1, -1, 2, -1, 0, 2, 0, 1, 2). The transformation Phillips employed is called:
- A) first differencing Correct
- B) beta drift
- C) moving average
Page 28 | Status: ⏸️ Unattempted
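The transformation in the question above can be reproduced directly: each new observation is the current value minus the previous one, so the transformed series is one element shorter. The helper name `first_difference` is an assumption for this sketch.

```python
def first_difference(series):
    """Return the period-to-period changes x_t - x_{t-1}."""
    return [current - previous for previous, current in zip(series, series[1:])]


levels = [5, 6, 5, 7, 6, 6, 8, 8, 9, 11]
changes = first_difference(levels)
print(changes)  # [1, -1, 2, -1, 0, 2, 0, 1, 2]
```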
Question: Given an AR(1) process represented by xt+1 = b0 + b1×xt + et, the process would not be a random walk if:
- A) the long run mean is b0 / (1-b1)
- B) E(et)=0
- C) b1 = 1
Page 29 | Status: ⏸️ Unattempted
Question: David Wellington, CFA, has estimated the following log-linear trend model: LN(xt) = b0 + b1t + εt. Using six years of quarterly observations, 2001:I to 2006:IV, Wellington gets the following estimated equation: LN(xt) = 1.4 + 0.02t. The first out-of-sample forecast of xt for 2007:I is closest to:
- A) 1.88
- B) 4.14
- C) 6.69
Page 29 | Status: ⏸️ Unattempted
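A log-linear trend forecast is obtained by evaluating the fitted line at the next time index and exponentiating. The sketch below assumes the first observation, 2001:I, is indexed t = 1, so six years of quarterly data give t = 1..24 and the first out-of-sample period is t = 25. The function name is hypothetical.

```python
import math


def log_linear_forecast(b0, b1, t):
    """Forecast x_t from ln(x_t) = b0 + b1 * t by exponentiating."""
    return math.exp(b0 + b1 * t)


n_observations = 6 * 4            # 2001:I through 2006:IV, assumed t = 1..24
next_period = n_observations + 1  # 2007:I
forecast = log_linear_forecast(1.4, 0.02, next_period)
print(round(forecast, 2))
```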
Shared Context:
Question: Are either of the slope coefficients statistically significant?
- A) The simple trend regression is not, but the log-linear trend regression is
- B) Yes, both are significant
- C) The simple trend regression is, but not the log-linear trend regression
Page 32 | Status: ⏸️ Unattempted
Shared Context:
Question: With respect to the possible problems of autocorrelation and nonstationarity, using the log-linear transformation appears to have:
- A) not improved the results for either possible problems Correct
- B) improved the results for nonstationarity but not autocorrelation
- C) improved the results for autocorrelation but not nonstationarity
Page 32 | Status: ⏸️ Unattempted
Shared Context:
Question: Using the simple linear trend model, the forecast of sales for Very Vegan for the first out-of-sample period is:
- A) $97.6 million
- B) $113.0 million
- C) $123.0 million
Page 32 | Status: ⏸️ Unattempted
Shared Context:
Question: Using the log-linear trend model, the forecast of sales for Very Vegan for the first out-of-sample period is:
- A) $109.4 million
- B) $117.0 million
- C)
Page 32 | Status: ⏸️ Unattempted
Question: Barry Phillips, CFA, is analyzing quarterly data. He has estimated an AR(1) relationship (xt = b0 + b1 × xt-1 + et) and wants to test for seasonality. To do this he would want to see if which of the following statistics is significantly different from zero?
- A) Correlation(et, et-5)
- B) Correlation(et, et-4) Correct
- C) Correlation(et, et-1)
Page 33 | Status: ⏸️ Unattempted
Question: To qualify as a covariance stationary process, which of the following does not have to be true?
- A) Covariance(xt, xt-2) = Covariance(xt, xt+2)
- B) E[xt] = E[xt+1]
- C) Covariance(xt, xt-1) = Covariance(xt, xt-2) Correct
Page 33 | Status: ⏸️ Unattempted
Question: Modeling the trend in a time series of a variable that grows at a constant rate with continuous compounding is best done with:
- A) simple linear regression
- B) a log-linear transformation of the time series Correct
- C) a moving average model
Page 33 | Status: ⏸️ Unattempted
Question: Suppose that the time series designated as Y is mean reverting. If Yt+1 = 0.2 + 0.6 Yt, the best prediction of Yt+1 is:
- A) 0.5
- B) 0.8
- C) 0.3
Page 34 | Status: ⏸️ Unattempted
Question: The data below yields the following AR(1) specification: $x_t = 0.9 - 0.55\,x_{t-1} + \varepsilon_t$, and the indicated fitted values and residuals.
Time | x_t | Fitted value | Residual
1 | 1 | - | -
2 | -1 | 0.35 | -1.35
3 | 2 | 1.45 | 0.55
4 | -1 | -0.2 | -0.8
5 | 0 | 1.45 | -1.45
6 | 2 | 0.9 | 1.1
7 | 0 | -0.2 | 0.2
8 | 1 | 0.9 | 0.1
9 | 2 | 0.35 | 1.65
The following sets of data are ordered from earliest to latest. To test for ARCH, the researcher should regress:
- A) (1, 4, 1, 0, 4, 0, 1, 4) on (1, 1, 4, 1, 0, 4, 0, 1)
- B) (-1.35, 0.55, -0.8, -1.45, 1.1, 0.2, 0.1, 1.65) on (0.35, 1.45, -0.2, 1.45, 0.9, -0.2, 0.9, 0.35)
- C) (1.8225, 0.3025, 0.64, 2.1025, 1.21, 0.04, 0.01) on (0.3025, 0.64, 2.1025, 1.21, 0.04, 0.01, 2.7225)
Page 34 | Status: ⏸️ Unattempted
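Testing for ARCH(1) means regressing the squared residuals on their own first lag. The sketch below builds those two series from the residuals in the question and fits the regression with a hand-written least-squares helper (`ols_fit`, a hypothetical name). It stops short of the significance test on the slope, which would also need the slope's standard error.

```python
# Residuals for times 2..9, copied from the table in the question.
residuals = [-1.35, 0.55, -0.8, -1.45, 1.1, 0.2, 0.1, 1.65]
squared = [e * e for e in residuals]

dependent = squared[1:]     # squared residual at time t
independent = squared[:-1]  # squared residual at time t-1


def ols_fit(x, y):
    """Simple one-variable least squares; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


a0, a1 = ols_fit(independent, dependent)
print([round(v, 4) for v in squared])
print(a0, a1)
```

A slope that is statistically different from zero would indicate ARCH(1) errors; with only seven usable pairs, as here, such a test has very little power.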
Question: Consider the estimated model xt = -6.0 + 1.1 xt-1 + 0.3 xt-2 + εt that is estimated over 50 periods. The value of the time series for the 49th observation is 20 and the value of the time series for the 50th observation is 22. What is the forecast for the 51st observation?
- A) 23
- B) 24.2
- C) 30.2
Page 35 | Status: ⏸️ Unattempted
Question: Which of the following statements regarding seasonality is least accurate?
- A) The presence of seasonality makes it impossible to forecast using a time-series model
- B) A time series that is first differenced can be adjusted for seasonality by incorporating the first-differenced value for the previous year's corresponding period
- C) Not correcting for seasonality when, in fact, seasonality exists in the time series results in a violation of an assumption of linear regression
Page 35 | Status: ⏸️ Unattempted
Question: Which of the following statements regarding a mean reverting time series is least accurate?
- A) If the time-series variable is x, then xt = b0 + b1xt-1
- B) If the current value of the time series is above the mean reverting level, the prediction is that the time series will decrease
- C) If the current value of the time series is above the mean reverting level, the prediction is that the time series will increase Correct
Page 38 | Status: ⏸️ Unattempted
Question: Alexis Popov, CFA, is analyzing monthly data. Popov has estimated the model xt = b0 + b1 × xt-1 + b2 × xt-2 + et. The researcher finds that the residuals have a significant ARCH process. The best solution to this is to:
- A) re-estimate the model using only an AR(1) specification
- B) re-estimate the model using a seasonal lag
- C) re-estimate the model with generalized least squares
Page 38 | Status: ⏸️ Unattempted
Question: Suppose that the following time-series model is found to have a unit root: $\text{Sales}_t = b_0 + b_1\,\text{Sales}_{t-1} + \varepsilon_t$. What is the specification of the model if first differences are used?
- A) $\text{Sales}_t = b_0 + b_1\,\text{Sales}_{t-1} + b_2\,\text{Sales}_{t-2} + \varepsilon_t$
- B) $(\text{Sales}_t - \text{Sales}_{t-1}) = b_0 + b_1(\text{Sales}_{t-1} - \text{Sales}_{t-2}) + \varepsilon_t$
- C) $\text{Sales}_t = b_1\,\text{Sales}_{t-1} + \varepsilon_t$
Page 39 | Status: ⏸️ Unattempted
Shared Context:
Question: How many dummy variables should Rathod use?
- A) Four
- B) Six
- C) Five
Page 40 | Status: ⏸️ Unattempted
Shared Context:
Question: What is most likely represented by the intercept of the regression?
- A) The drift of a random walk
- B) The return on a particular trading day Correct
- C) The intercept is not a driver of returns, only the independent variables
Page 41 | Status: ⏸️ Unattempted
Shared Context:
Question: What can be said of the overall explanatory power of the model at the 5% significance?
- A) The coefficient of determination for the above regression is significantly higher than the standard error of the estimate, and therefore there is value to calendar trading
- B) There is value to calendar trading
- C) There is no value to calendar trading
Page 41 | Status: ⏸️ Unattempted
Shared Context:
Question: The test mentioned by Jessica is known as the:
- A) Breusch-Pagan, which is a two-tailed test Correct
- B) Breusch-Pagan, which is a one-tailed test
- C) Durbin-Watson, which is a two-tailed test
Page 41 | Status: ⏸️ Unattempted
Shared Context:
Question: Are Jessica and her son Jonathan, correct in terms of the method used to correct for heteroskedasticity and the likely effects?
- A) Neither is correct Correct
- B) Both are correct
- C)
Page 41 | Status: ⏸️ Unattempted
Shared Context:
Question: Assuming the a1 term of an ARCH(1) model is significant, the following can be forecast:
- A) The variance of the error term Correct
- B) A significant a1 implies that the ARCH framework cannot be used
- C) The square of the error term
Page 42 | Status: ⏸️ Unattempted
Question: Which of the following statements regarding time series analysis is least accurate?
- A) If a time series is a random walk, first differencing will result in covariance stationarity
- B) We cannot use an AR(1) model on a time series that consists of a random walk
- C) An autoregressive model with two lags is equivalent to a moving-average model with two lags Correct
Page 42 | Status: ⏸️ Unattempted
Question: Which of the following statements regarding an out-of-sample forecast is least accurate?
- A) Out-of-sample forecasts are of more importance than in-sample forecasts to the analyst using an estimated time-series model
- B) There is more error associated with out-of-sample forecasts, as compared to in-sample forecasts
- C) Forecasting is not possible for autoregressive models with more than two lags
Page 42 | Status: ⏸️ Unattempted
Question: The regression results from fitting an AR(1) to a monthly time series are presented below. What is the mean-reverting level for the model?
Model: $\Delta\text{Exp}_t = b_0 + b_1\,\Delta\text{Exp}_{t-1} + \varepsilon_t$
Variable | Coefficient | Standard Error | t-Statistic | p-value
Intercept | 1.3304 | 0.0089 | 112.2849 | < 0.0001
Lag-1 | 0.1817 | 0.0061 | 30.0125 | < 0.0001
- A) 0.6151
- B) 7.3220
- C) 1.6258
Page 43 | Status: ⏸️ Unattempted
Question: Consider the estimated model xt = −6.0 + 1.1 xt − 1 + 0.3 xt − 2 + εt that is estimated over 50 periods. The value of the time series for the 49th observation is 20 and the value of the time series for the 50th observation is 22. What is the forecast for the 52nd observation?
- A) 42
- B) 24.2
- C) 27.22
Page 43 | Status: ⏸️ Unattempted
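Multi-period forecasts from an AR(2) model use the chain rule of forecasting: the one-step forecast is fed back in as if it were an observation to produce the two-step forecast. A sketch with the coefficients and last two observations from the question above; `ar2_forecasts` is a hypothetical helper.

```python
def ar2_forecasts(b0, b1, b2, last_two, steps):
    """Chain-rule forecasts for x_t = b0 + b1*x_{t-1} + b2*x_{t-2}.

    last_two is (x_{T-1}, x_T), the two most recent observations.
    """
    older, newer = last_two
    forecasts = []
    for _ in range(steps):
        nxt = b0 + b1 * newer + b2 * older
        forecasts.append(nxt)
        older, newer = newer, nxt  # roll the window forward
    return forecasts


# Observation 49 is 20 and observation 50 is 22.
f51, f52 = ar2_forecasts(-6.0, 1.1, 0.3, (20, 22), steps=2)
print(round(f51, 2), round(f52, 2))
```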
Question: Alexis Popov, CFA, has estimated the following specification: xt = b0 + b1 × xt-1 + et. Which of the following would most likely lead Popov to want to change the model's specification?
- A) Correlation(et, et-2) is significantly different from zero Correct
- B) b0 < 0
- C) Correlation(et, et-1) is not significantly different from zero
Page 43 | Status: ⏸️ Unattempted
Shared Context:
Question: If his assumption about a constant is correct, which of the following models is most appropriate for modeling these data?
- A) $\text{LuxCarSales}_t = b_0 + b_1\,\text{LuxCarSales}_{t-1} + e_t$
- B) $\ln(\text{LuxCarSales}_t) = b_0 + b_1 t + e_t$
- C) $\text{LuxCarSales}_t = b_0 + b_1 t + e_t$
Page 44 | Status: ⏸️ Unattempted
Shared Context:
Question: Bert is aware that the Dickey-Fuller test can be used to discover whether a model has a unit root. He is also aware that the test would use a revised set of critical t-values. What would it mean to Bert to reject the null of the Dickey-Fuller test (H0: g = 0)?
- A) There is a unit root and the model cannot be used in its current form
- B) There is a unit root but the model can be used if covariance-stationary
- C) There is no unit root Correct
Page 45 | Status: ⏸️ Unattempted
Shared Context:
Question: Bert would also like to test for serial correlation in his AR(1) model. How could this be done?
- A) use the provided Durbin-Watson statistic and compare it to a critical value
- B) determine if the series has a finite and constant covariance between leading and lagged terms of itself
- C) use a t-test on the residual autocorrelations over several lags
Page 45 | Status: ⏸️ Unattempted
Shared Context:
Question: When using the root mean squared error (RMSE) criterion to evaluate the predictive power of the model, which of the following is the most appropriate statement?
- A) Use the model with the lowest RMSE calculated using the in-sample data
- B) Use the model with the lowest RMSE calculated using the out-of-sample data Correct
- C)
Page 45 | Status: ⏸️ Unattempted
Shared Context:
Question: Bert would like to use his AR(1) model to forecast future sales of luxury automobiles. What is the annualized growth rate between today and 20X3?
- A) 11%
- B) 12%
- C) 10%
Page 46 | Status: ⏸️ Unattempted
Question: Which of the following statements regarding covariance stationarity is CORRECT?
- A) The estimation results of an AR model involving a time series that is not covariance stationary are meaningless
- B) A time series that is covariance stationary may have residuals whose mean changes over time
- C) A time series may be both covariance stationary and heteroskedastic
Page 46 | Status: ⏸️ Unattempted
Question: Which of the following statements regarding unit roots in a time series is least accurate?
- A) A time series that is a random walk has a unit root
- B) A time series with a unit root is not covariance stationary
- C) Even if a time series has a unit root, the predictions from the estimated model are valid Correct
Page 47 | Status: ⏸️ Unattempted
Reading 3 Machine Learning 20 questions
Question: The technique in which a machine learns to model a set of output data from a given set of inputs is best described as:
- A) supervised learning Correct
- B) deep learning
- C) unsupervised learning
Joyce Tan manages a medium-sized investment fund at Marina Bay Advisors that specializes in international large cap equities. Over the four years that she has been portfolio manager, Tan has been invested in approximately 40 stocks at a time. Tan has used a number of methodologies to select investment opportunities from the universe of investable stocks. In some cases, Tan uses quantitative measures such as accounting ratios to find the most promising investment candidates. In other cases, her team of analysts suggest investments based on qualitative factors and various investment hypotheses. Tan begins to wonder if her team could leverage financial technology to make better decisions. Specifically, she has read about various machine learning techniques to extract useful information from large financial datasets, in order to uncover new sources of alpha
Page 1 | Status: ⏸️ Unattempted
Shared Context:
Question: Tan is interested in using a supervised learning algorithm to analyze stocks. This task is least likely to be a classification problem if the target variable is:
- A) continuous Correct
- B) ordinal
- C) categorical
Page 1 | Status: ⏸️ Unattempted
Shared Context:
Question: At first Tan bases her stock picks on the results of a single machine-learning model, but then begins to wonder if she should instead be using the predictions of a group of models. Compared to a single machine-learning model, an ensemble machine learning algorithm is most likely to produce predictions that are:
- A) more accurate and more stable Correct
- B) more precise but less dependable
- C) less reliable but more steady
Page 2 | Status: ⏸️ Unattempted
Shared Context:
Question: Tan is interested in applying neural networks, deep learning nets, and reinforcement learning to her investment process. Regarding these techniques, which of the following statements is most accurate?
- A) Neural networks with one or more hidden layers would be considered deep learning nets (DLNs)
- B) Reinforcement learning algorithms achieve maximum performance when they stay as far away from their constraints as possible
- C) Neural networks work well in the presence of non-linearities and complex interactions among variables
Page 2 | Status: ⏸️ Unattempted
Question: Considering the various supervised machine learning algorithms, a penalized regression where the penalty term is the sum of the absolute values of the regression coefficients best describes:
- A) support vector machine (SVM)
- B) least absolute shrinkage and selection operator (LASSO) Correct
- C) k-nearest neighbor (KNN)
Page 3 | Status: ⏸️ Unattempted
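The LASSO objective described in the question above adds a penalty of lambda times the sum of absolute slope coefficients to the usual sum of squared residuals. The sketch below only evaluates that objective for a given coefficient vector; it does not perform the minimization. All data values and the name `lasso_objective` are hypothetical.

```python
def lasso_objective(y, X, intercept, coefs, lam):
    """Sum of squared residuals plus lam * sum of |slope coefficients|."""
    sse = 0.0
    for yi, row in zip(y, X):
        prediction = intercept + sum(b * x for b, x in zip(coefs, row))
        sse += (yi - prediction) ** 2
    penalty = lam * sum(abs(b) for b in coefs)  # intercept is not penalized
    return sse + penalty


# Hypothetical toy data: three observations, two features.
y = [1.0, 2.0, 3.0]
X = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]

with_penalty = lasso_objective(y, X, 0.0, [1.0, -0.5], lam=2.0)
without_penalty = lasso_objective(y, X, 0.0, [1.0, -0.5], lam=0.0)
print(with_penalty, without_penalty)
```

Because the penalty grows with every nonzero coefficient, minimizing this objective tends to shrink weak coefficients all the way to zero, which is why LASSO also acts as a feature-selection method.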
Question: The unsupervised machine learning algorithm that reduces highly correlated features into fewer uncorrelated composite variables by transforming the feature covariance matrix best describes:
- A) k-means clustering
- B) hierarchical clustering
- C) principal components analysis Correct
Page 3 | Status: ⏸️ Unattempted
Question: Overfitting is least likely to result in:
- A) higher number of features included in the data set
- B) higher forecasting accuracy in out-of-sample data Correct
- C) inclusion of noise in the model
Page 3 | Status: ⏸️ Unattempted
Question: An algorithm that involves an agent that performs actions that will maximize its rewards over time, taking into consideration the constraints of its environment, best describes:
- A) deep learning nets
- B) neural networks
- C) reinforcement learning Correct
Page 4 | Status: ⏸️ Unattempted
Question: What is the appropriate remedy in the presence of excessive number of features in a data set?
- A) Big data analysis
- B) Dimension reduction Correct
- C) Unsupervised learning
Page 4 | Status: ⏸️ Unattempted
Question: Which of the following statements about supervised learning is most accurate?
- A) Supervised learning requires human intervention in machine learning process Correct
- B)
- C) Supervised learning does not differentiate between tag and features.
Page 4 | Status: ⏸️ Unattempted
Question: Dimension reduction is most likely to be an example of:
- A) clustering
- B) supervised learning
- C) unsupervised learning Correct
Page 5 | Status: ⏸️ Unattempted
Question: In machine learning, out-of-sample error equals:
- A) Standard error plus data error plus prediction error
- B) bias error plus variance error plus base error Correct
- C) forecast error plus expected error plus regression error
Hanna Kowalski is a senior fixed-income portfolio analyst at Czarnaskala BP. Kowalski supervises Lena Nowak, who is a junior analyst. Over the past several years, Kowalski has become aware that investment firms are increasingly using technology to improve their investment decision making. Kowalski has become particularly interested in machine learning techniques and how they might be applied to investment management applications. Kowalski has read a number of articles about machine learning in various journals for financial analysts. However, she has only a minimal knowledge of how she might source appropriate model inputs, interpret model outputs, and translate those outputs into investment actions. Kowalski and Nowak meet to discuss plans for incorporating machine learning into their investment model. Kowalski asks Nowak to research machine learning and report back on
Page 5 | Status: ⏸️ Unattempted
Shared Context:
Question: Nowak first tries to explain classification and regression tree (CART) to Kowalski. CART is least likely to be applied to predict a:
- A) discrete target variable, producing a cardinal tree Correct
- B) continuous target variable, producing a regression tree
- C) categorical target variable, producing a classification tree
Page 6 | Status: ⏸️ Unattempted
Shared Context:
Question: Which of the following statements Nowak makes about hierarchical clustering is most accurate?
- A) Bottom-up hierarchical clustering begins with each observation being its own cluster
- B) In divisive hierarchical clustering, the algorithm seeks out the two closest clusters
- C) Hierarchical clustering is a supervised iterative algorithm that is used to build a hierarchy of clusters
Page 6 | Status: ⏸️ Unattempted
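For readers who want to see the bottom-up (agglomerative) idea in action, here is a minimal sketch in Python. It assumes one-dimensional observations and measures distance between cluster centroids; the function names `centroid` and `agglomerative` and the sample data are illustrative only and do not come from the question bank.

```python
from itertools import combinations


def centroid(cluster):
    """Mean of a cluster of 1-D observations."""
    return sum(cluster) / len(cluster)


def agglomerative(points, n_clusters):
    """Bottom-up clustering: merge the two closest clusters until n_clusters remain."""
    # Start with every observation as its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters whose centroids are closest (i < j).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: abs(centroid(clusters[ij[0]]) - centroid(clusters[ij[1]])),
        )
        # Merge cluster j into cluster i; removing j does not shift index i.
        clusters[i] = clusters[i] + clusters.pop(j)
    return clusters


# Hypothetical sample data, chosen only to illustrate the merging order.
result = agglomerative([1.0, 1.2, 5.0, 5.1, 10.0], n_clusters=3)
print(result)
```

A divisive (top-down) method would run the other way: start with one cluster containing every observation and split it repeatedly.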
Shared Context:
Question: Nowak tries to explain the reinforcement learning (RL) algorithm to Kowalski and makes a number of statements about it. The reinforcement learning (RL) algorithm involves an agent that is most likely to:
- A) perform actions that will minimize costs over time
- B) take into consideration the constraints of its environment Correct
- C) make use of direct labeled data and instantaneous feedback
Page 7 | Status: ⏸️ Unattempted
Question: A random forest is least likely to:
- A) reduce signal-to-noise ratio Correct
- B) provide a solution to the overfitting problem
- C) be a classification tree
Page 7 | Status: ⏸️ Unattempted
Question: A rudimentary way to think of machine learning algorithms is that they:
- A) “develop the pattern, interpret the pattern.” Correct
- B) “synthesize the pattern, review the pattern.”
- C)
Page 7 | Status: ⏸️ Unattempted
Question: Which of the following about unsupervised learning is most accurate?
- A) There is no labeled data Correct
- B) Unsupervised learning has lower forecasting accuracy as compared to supervised learning
- C) Classification is an example of an unsupervised learning algorithm
Page 8 | Status: ⏸️ Unattempted
Question: The degree to which a machine learning model retains its explanatory power when predicting out-of-sample is most commonly described as:
- A) hegemony
- B) generalization Correct
- C) predominance
Page 8 | Status: ⏸️ Unattempted
Question: Considering the various supervised machine learning algorithms, a linear classifier that seeks the optimal hyperplane and is typically used for classification, best describes:
- A) classification and regression tree (CART)
- B) support vector machine (SVM) Correct
- C) k-nearest neighbor (KNN)
Page 8 | Status: ⏸️ Unattempted
Reading 4 Big Data Projects 10 questions
Question: Which of the following uses of data is most accurately described as curation?
- A) An investor creates a word cloud from financial analysts’ recent research reports about a company
- B) An analyst adjusts daily stock index data from two countries for their different market holidays
- C) A data technician accesses an offsite archive to retrieve data that has been stored there
Page 1 | Status: ⏸️ Unattempted
Shared Context:
Freja Karlsson is a bond analyst with Storbank AB. Over the past several months, Karlsson has been working to develop her own machine learning (ML) model that she plans to use to predict default of the various bonds that she covers. The inputs to the model are various pieces of financial data that Karlsson has compiled from multiple sources. After Karlsson has constructed the model using her knowledge of appropriate variables, Karlsson runs the model on the training set. Each firm's bonds are classified as predicted-to-default or predicted-not-to-default. When Karlsson's model predicts that a bond will default and the bond actually defaults, Karlsson considers this to be a true positive. Karlsson then evaluates the performance of her model using error analysis. The confusion matrix that results is shown in Exhibit 1.
Exhibit 1: Confusion Matrix (N = 474)
| Model Prediction | Actual Bond Status: Bond Default | Actual Bond Status: No Default |
|---|---|---|
| Bond Default | 307 | 31 |
| No Default | 23 | 113 |
Question: Karlsson is especially concerned about the possibility that her model may indicate that a bond will not default, but then the bond actually defaults. Karlsson decides to use the model's recall to evaluate this possibility. Based on the data in Exhibit 1, the model's recall is closest to:
- A) 93%
- B) 83%
- C) 73%
Page 2 | Status: ⏸️ Unattempted
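As a worked check, recall can be computed directly from the Exhibit 1 counts. This is a minimal sketch that treats "bond default" as the positive class, as the vignette does; the variable names are illustrative.

```python
# Counts from Exhibit 1, with "bond default" as the positive class.
true_positives = 307   # model predicted default, bond defaulted
false_negatives = 23   # model predicted no default, bond defaulted

# Recall = TP / (TP + FN): the share of actual defaults the model caught.
recall = true_positives / (true_positives + false_negatives)
print(f"Recall: {recall:.1%}")  # Recall: 93.0%
```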
Shared Context:
Question: Karlsson would like to gain a sense of her model's overall performance. In her research, Karlsson learns about the F1 score, which she hopes will provide a useful measure. Based on Exhibit 1, Karlsson's model's F1 score is closest to:
- A) 82%
- B) 92%
- C) 72%
Page 2 | Status: ⏸️ Unattempted
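The F1 score is the harmonic mean of precision and recall, so it can be worked out from the same Exhibit 1 counts. The sketch below again takes "bond default" as the positive class; the variable names are illustrative.

```python
# Counts from Exhibit 1, with "bond default" as the positive class.
tp = 307  # predicted default, actually defaulted
fp = 31   # predicted default, did not default
fn = 23   # predicted no default, actually defaulted

precision = tp / (tp + fp)  # correct positive predictions / all positive predictions
recall = tp / (tp + fn)     # correct positive predictions / all actual positives

# F1 = 2 * P * R / (P + R), the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"Precision: {precision:.1%}, Recall: {recall:.1%}, F1: {f1:.1%}")
# Precision: 90.8%, Recall: 93.0%, F1: 91.9%
```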
Shared Context:
Question: Karlsson also learns of the model measure of accuracy. Based on Exhibit 1, Karlsson's model's accuracy metric is closest to:
- A) 79%
- B) 89%
- C) 69%
Page 2 | Status: ⏸️ Unattempted
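Accuracy uses all four cells of Exhibit 1: correct predictions of either class divided by all predictions. A minimal sketch, with illustrative variable names:

```python
# All four cells of the Exhibit 1 confusion matrix.
tp, fp = 307, 31   # model predicted default: actual default / actual no default
fn, tn = 23, 113   # model predicted no default: actual default / actual no default

total = tp + fp + fn + tn  # should equal N = 474
# Accuracy = (TP + TN) / (TP + FP + FN + TN).
accuracy = (tp + tn) / total
print(f"N = {total}, Accuracy: {accuracy:.1%}")  # N = 474, Accuracy: 88.6%
```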
Question: Big data is most likely to suffer from low:
- A) veracity Correct
- B) velocity
- C) variety
Page 3 | Status: ⏸️ Unattempted
Question: In big data projects, data exploration is least likely to encompass:
- A) feature design Correct
- B) feature engineering
- C) feature selection
Page 3 | Status: ⏸️ Unattempted
Question: Under which of these conditions is a machine learning model said to be underfit?
- A) The input data are not labelled
- B) The model treats true parameters as noise Correct
- C) The model identifies spurious relationships
Page 3 | Status: ⏸️ Unattempted
Question: The process of splitting a given text into separate words is best characterized as:
- A) stemming
- B) tokenization Correct
- C)
Page 3 | Status: ⏸️ Unattempted
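To make the idea of tokenization concrete, here is a minimal sketch that splits a sentence into word tokens with a regular expression. The `tokenize` helper and the sample sentence are illustrative assumptions; production text pipelines usually rely on a dedicated NLP library and more careful rules.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())


tokens = tokenize("The bond's issuer missed a coupon payment.")
print(tokens)
# ['the', "bond's", 'issuer', 'missed', 'a', 'coupon', 'payment']
```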
Question: An executive describes her company's "low latency, multiple terabyte" requirements for managing Big Data. To which characteristics of Big Data is the executive referring?
- A) Volume and variety
- B) Volume and velocity Correct
- C) Velocity and variety
Page 4 | Status: ⏸️ Unattempted
Question: When evaluating the fit of a machine learning algorithm, it is most accurate to state that:
- A) accuracy is the ratio of correctly predicted positive classes to all predicted positive classes
- B) recall is the ratio of correctly predicted positive classes to all actual positive classes
- C) precision is the percentage of correctly predicted classes out of total predictions
Page 4 | Status: ⏸️ Unattempted