 STAT 5511 (Spring 2020) Homework 6

Charles R. Doss
General formatting guidelines (the usual rules):
• Your homework (HW) should be formatted to be easily readable by the grader.
• You may use knitr or Sweave in general to produce the code portions of the HW. However, the output from knitr/Sweave that you include should be
only what is necessary to answer the question, rather than just any automatic output that R produces. (You may thus need to avoid using default R
functions if they output too much unnecessary material, and/or should make use of invisible() or capture.output().)
– For example: for output from regression, the main things we would want to see are the estimates for each coefficient (with appropriate labels
of course) together with the computed OLS/linear regression standard errors and p-values.
• Code snippets that directly answer the questions can be included in your main homework document; ideally these should be preceded by comments
or text at least explaining what question they are answering. Extra code can be placed in an appendix.
• All plots produced in R should have titles and appropriate axis labels. The accompanying text should clearly explain what each plot
shows.
• Plots and figures should be appropriately sized, meaning they should not be too large, so that the page length is not too long. (The arguments
fig.height and fig.width to knitr chunks can achieve this.)
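As an illustration of the trimmed regression output described above, here is a minimal sketch on simulated data. The variable names (x, y, fit, coef_table) are placeholders chosen for this example and are not part of the homework data; it is one possible way to show only labeled estimates, standard errors, and p-values, not a required format.

```r
# Simulated data; x, y, fit and coef_table are placeholder names for this sketch.
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

# coef(summary(fit)) is a matrix with columns
# "Estimate", "Std. Error", "t value", "Pr(>|t|)"; keep only three of them.
coef_table <- coef(summary(fit))[, c("Estimate", "Std. Error", "Pr(>|t|)")]
colnames(coef_table) <- c("Estimate", "Std. Error", "p-value")

# Print just the labeled table instead of the full summary() output.
print(round(coef_table, 4))
```

Printing a small matrix like this avoids the residual summaries and other automatic output that print(summary(fit)) would add.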
Instructions: For this homework, you will analyze three data sets. Find the file hw6dat.rsav on the
course webpage. Load it into R by running load("path/hw6dat.rsav") where ‘path/hw6dat.rsav’ is
replaced by the full path on your hard drive to the file hw6dat.rsav (the syntax for which is operating
system dependent). The file contains three data objects: dat1, dat2, and dat3. Each dataset is a separate
homework question. The analysis for each dataset should begin on a new page and should have as label
the name of the dataset (dat1, dat2, dat3). Your tasks are different for the three datasets and they are as
follows.
1. For dat1: Your job is to fit the best SARIMA(p, d, q) × (P, D, Q)_s model you can to the dataset.
2. For dat2:
(a) Compute the (unsmoothed) periodogram in R ‘by hand’, meaning that you should use the fast
Fourier transform and not use spec.pgram(). (You may use fft().) You should apply the FFT
to the centered or de-meaned data series.
(b) Write code to compute the averaged periodogram ‘by hand’ (i.e., do not use spec.pgram()).
You will then use your code to make two plots (not on the log scale) of the averaged periodogram
with two different values of the smoothing parameter L: L = 5 and L = 17. Explain why we wish
to plot the averaged periodogram with two L values.
[Note: you do not need to compute the averaged periodogram for any bandwidths that are too
close to 0 for the formula to be well-defined.]
(c) Plot the logged (base 10) periodogram with L = 17 from the previous part. (Do this ‘by hand’,
meaning: do not use someone else’s code or function that does periodogram plotting for you.)
Plot two dotted lines above and below the periodogram giving 95% confidence bounds for the
periodogram.[1] State in your text the width of the confidence intervals.
3. For dat3: dat3 has two columns, yy and xx, which are both time series. Your job is to regress yy
on xx using the (“transfer modelling”) methodology we discussed in class.
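The periodogram steps in item 2 can be sketched as follows on a simulated series. This is a rough illustration under stated assumptions, not the expected solution: the series is simulated rather than dat2, the scaling (squared modulus of the FFT divided by n) is one common textbook convention, the helper avg_pgram() is a hypothetical name, and the 95% bounds use the standard large-sample chi-squared approximation with 2L degrees of freedom, which may differ from the formula used in class.

```r
# Simulated stand-in for the series; the homework data are not used here.
set.seed(2)
xx <- as.numeric(arima.sim(model = list(ar = 0.7), n = 200))
n <- length(xx)

# (a) Raw periodogram from the FFT of the de-meaned series.
dft <- fft(xx - mean(xx)) / sqrt(n)
pgram_all <- Mod(dft)^2            # ordinates at frequencies 0, 1/n, ..., (n - 1)/n
J <- floor(n / 2)
freq <- (1:J) / n                  # Fourier frequencies in (0, 1/2]
pgram <- pgram_all[2:(J + 1)]      # drop frequency 0, which is 0 after centering

# (b) Averaged periodogram: equal-weight mean of L neighbouring ordinates.
# Positions without a full window of L ordinates are returned as NA.
avg_pgram <- function(pgram, L) {
  stopifnot(L %% 2 == 1, L <= length(pgram))
  as.numeric(stats::filter(pgram, rep(1 / L, L), sides = 2))
}
fbar5 <- avg_pgram(pgram, 5)
fbar17 <- avg_pgram(pgram, 17)

# (c) Approximate 95% bounds (assumed chi-squared approximation, df = 2L).
df <- 2 * 17
lower <- df * fbar17 / qchisq(0.975, df)
upper <- df * fbar17 / qchisq(0.025, df)
# On the log10 scale the interval width is the same at every frequency.
width_log10 <- log10(qchisq(0.975, df) / qchisq(0.025, df))

plot(freq, log10(fbar17), type = "l",
     ylim = range(log10(c(lower, upper)), na.rm = TRUE),
     xlab = "Frequency (cycles per observation)",
     ylab = "log10 averaged periodogram",
     main = "Averaged periodogram, L = 17 (simulated data)")
lines(freq, log10(lower), lty = 3)
lines(freq, log10(upper), lty = 3)
```

Leaving NA where a full window of L ordinates is unavailable is one simple way to handle the frequencies near 0 (and near 1/2) mentioned in the note to part (b).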
Presentation/formatting rules: for questions 1 and 3, your output should be in the following format.
Points will be deducted if it is not. (This formatting requirement does not apply to the spectral analysis
of dat2.)
[1] You do not need to adjust for multiple comparisons.
• On the first page of output for each problem, you should first have a summary (labeled “Summary”)
that provides the model chosen, parameter estimates, standard errors, and p-values in that model.
Specify explicitly if you exclude a constant term. For example, “For the series Yt = X
an SARIMA(1, 2, 3) × (4, 5, 6)_7 model, including an intercept term. The parameter estimates were ...”.
If you believe the data cannot distinguish between two (or more) models you should describe both
(all) of them in this manner here. In the case of a regression model, you should explain the full
model, meaning which lags of which variables are included in the regression model as well as what
the ARMA model of the errors is.
• After the summary should be an explanation (labeled “Explanation”). Provide a clear explanation of
why you selected the model you selected. Refer to the output of your analysis, which will be below.
The model selection and diagnostic techniques we have discussed in class can be discussed here. You
do not need to (and should not) provide an exhaustive list of all possible models, but should rather
provide explanation for which models were reasonable contenders (and why), and which model (or
models) were the best out of those contenders (and why).
• After the explanation is the “Output” you refer to in your explanation. (The output may be plots
or output from various commands.) All of it should be clearly formatted, and labeled or described.
You do not need to provide exhaustive output from every command you have run, but you should
include enough to justify all the arguments you make in your summary.
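For the “Summary” part, a table of estimates, standard errors, and p-values can be assembled from a fitted arima() object, which does not print p-values itself. The sketch below uses a simulated AR(1) series rather than the homework data, and the p-values are Wald-type values from a normal approximation, which is an assumption of this example rather than a stated course convention. The name summary_table is a placeholder.

```r
# Simulated stand-in series; a real analysis would use the homework data.
set.seed(3)
xx <- arima.sim(model = list(ar = 0.6), n = 300)

# Non-seasonal fit with an intercept. A seasonal model would add, e.g.,
# seasonal = list(order = c(P, D, Q), period = s) to the arima() call.
fit <- arima(xx, order = c(1, 0, 0), include.mean = TRUE)

est <- coef(fit)
se <- sqrt(diag(vcov(fit)))        # standard errors from the estimated covariance
z <- est / se
p_value <- 2 * pnorm(-abs(z))      # two-sided normal-approximation p-values

summary_table <- cbind(Estimate = est, `Std. Error` = se, `p-value` = p_value)
print(round(summary_table, 4))
```

The same construction applies to a regression with ARMA errors fitted through the xreg argument of arima(), since the regression coefficients appear in coef(fit) alongside the ARMA terms.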
Finally, in Questions 1 and 2 please refer to the original/raw (untransformed) time series as Xt in your
descriptions and as xx in your code. Refer to any transformed series as Zt in your descriptions and zz in
your code. In Question 3, the two series are named xx and yy and you should not rename them.