

Sta37w1 Sick Test 1 2025

WSU STA test

Uploaded by

akhanyamzazi

FACULTY OF NATURAL SCIENCES

DEPARTMENT OF MATHEMATICAL SCIENCES AND COMPUTING

SICK TEST 1
LINEAR REGRESSION & MULTIVARIATE STATISTICS
STA37W1

EXAMINER : MR. I. JUBANE

INTERNAL MODERATOR : DR O. BODHLYERA

DATE : 09 MAY 2025

MARKS : 78

This test consists of 9 pages including the cover page

INSTRUCTIONS

1. Write neatly and legibly.


2. Once completed, scan all pages into a single PDF file.
3. Do not upload image files (e.g., JPG, PNG). Only one PDF file will be accepted.
4. Ensure that your scanned PDF is clear, and that all pages are visible. Use a well-lit environment
when using a camera.
5. Each page of your test must clearly display your full name and student number.
6. Begin your first page with a cover section that includes your full name, student number, and the
title of the test.
7. Clearly number your answers according to the question numbers (e.g., 1.1, 1.2, etc.).
8. Read each question carefully and answer only what is asked. Avoid adding unnecessary
information.
9. Unless otherwise specified, round all numerical answers to four decimal places.
QUESTION ONE [12 MARKS]

Consider a regression problem in which, for each value $x$ of a certain variable $X$, the random variable
$Y$ has the normal distribution with mean $\beta x$ and variance $\sigma^2$, where
the values of $\beta$ and $\sigma^2$ are unknown. Suppose that $n$ independent pairs of observations $(x_i, Y_i)$ are
obtained.

1.1 Show that the M.L.E. of $\beta$ is $\hat{\beta} = \dfrac{\sum_i x_i Y_i}{\sum_i x_i^2}$. [6]

1.2 The measured values of x and y are given in the table below.

x   1.9   0.8   1.1   0.1  -0.1   4.4   4.6   1.6   5.5   3.4
y   0.7  -1.0  -0.2  -1.2  -0.1   3.4   0.0   0.8   3.7   2.0

Determine the values of $\hat{\beta}$ and $\operatorname{var}(\hat{\beta})$. [6]
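
The arithmetic for 1.2 can be checked with a short script. This is a minimal sketch, not a model answer: it uses the estimator from 1.1 and the standard result $\operatorname{var}(\hat{\beta}) = \sigma^2 / \sum_i x_i^2$, and it assumes that the unknown $\sigma^2$ is replaced by the unbiased estimate RSS/(n - 1). The M.L.E. of $\sigma^2$ would divide by n instead. All variable names are illustrative.

```python
# Data from the table in Question 1.2.
x = [1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 4.6, 1.6, 5.5, 3.4]
y = [0.7, -1.0, -0.2, -1.2, -0.1, 3.4, 0.0, 0.8, 3.7, 2.0]
n = len(x)

# Sufficient statistics for regression through the origin.
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)

# Estimator from 1.1: beta_hat = sum(x_i * Y_i) / sum(x_i^2).
beta_hat = sum_xy / sum_xx

# Residual sum of squares around the fitted line y = beta_hat * x.
rss = sum((yi - beta_hat * xi) ** 2 for xi, yi in zip(x, y))

# ASSUMPTION: sigma^2 is estimated by RSS / (n - 1), the unbiased choice
# for a one-parameter model; use RSS / n for the M.L.E. instead.
sigma2_hat = rss / (n - 1)

# Estimated variance of beta_hat: sigma^2 / sum(x_i^2).
var_beta_hat = sigma2_hat / sum_xx

print(round(beta_hat, 4), round(var_beta_hat, 4))
```

Rounding to four decimal places follows instruction 9 of the test.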

QUESTION TWO [15 MARKS]

Let $Y \sim \mathrm{MVN}(X\beta, \sigma^2 I)$, where $X$ is an $n \times p$ matrix with linearly independent columns. For least
squares estimation, recall that $\hat{\varepsilon} = (I - H)Y$.

2.1 What is the distribution of $(I - H)Y$, where $H$ is the projection onto the column space of $X$? [6]
2.2 What is the distribution of $Y'(I - H)Y / \sigma^2$? [6]
2.3 Evaluate $E\left[\left(\frac{1}{n-p}\, Y'AY - \sigma^2\right)^2\right]$ for $A = I - X(X'X)^{-1}X'$. [5]

Derive
2.4 $E[\hat{\varepsilon}]$ [3]
2.5 $\operatorname{cov}[\hat{\varepsilon}]$ [6]
2.6 $\operatorname{cov}[\hat{\varepsilon}, HY]$ [6]
QUESTION THREE [16 MARKS]

3.1 Let $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ $(i = 1, \ldots, n)$, where $E[\varepsilon] = 0$ and $\operatorname{var}[\varepsilon] = \sigma^2 I$. Show that the least
squares estimates of $\beta_0$ and $\beta_1$ are uncorrelated if and only if $\bar{x} = 0$. [4]

3.2 What does BLUE mean? Show that $\hat{\beta}$ is the BLUE of $\beta$. [6]

3.3 Suppose that $\beta_1 = \beta_2 = \cdots = \beta_{p-1} = 0$. Express the F-statistic in terms of $R^2$ and hence show
that

$$E[R^2] = \frac{p-1}{n-1}.$$

Show all working. [6]
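
The relations that 3.3 relies on can be summarised as follows. This is a sketch of the standard results, assuming normally distributed errors and a model with an intercept plus $p-1$ slope coefficients. It outlines the route only and is not a full worked answer.

```latex
% Overall F-statistic written in terms of R^2 (intercept plus p-1 slopes):
F = \frac{R^2/(p-1)}{(1-R^2)/(n-p)}
\quad\Longleftrightarrow\quad
R^2 = \frac{(p-1)F}{(p-1)F + (n-p)}.

% Under H_0: \beta_1 = \cdots = \beta_{p-1} = 0 with normal errors,
% F \sim F_{p-1,\,n-p}, which implies
R^2 \sim \mathrm{Beta}\!\left(\tfrac{p-1}{2},\, \tfrac{n-p}{2}\right),
\qquad
E[R^2] = \frac{(p-1)/2}{(p-1)/2 + (n-p)/2} = \frac{p-1}{n-1}.
```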

QUESTION FOUR [20 MARKS]

Let
2  1  2  1
1  21  22  2
1  1  2  3
0  21  32  4
where i iid N (0,  2 ), i  1,2,3, 4.

4.1 Write the regression model in matrix notation. [1]

4.2 Obtain the estimate $\hat{\beta}$. [5]

4.3 Write down the hat matrix, H and show that H is an idempotent matrix, hence test the
hypothesis H0 : 2  1, [8]

4.4 Derive the F-statistic for testing the hypothesis H0 : 1  2 . [6]
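
The idempotency property asked for in 4.3 can be explored numerically before it is proved algebraically. This is a minimal sketch in plain Python. It assumes the usual definition $H = X(X'X)^{-1}X'$ and uses a made-up 4 x 2 design matrix `X_demo`, which is hypothetical and is not the design matrix of this question. The helper names are also illustrative.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def inverse_2x2(M):
    """Invert a 2 x 2 matrix with the closed-form adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def hat_matrix(X):
    """Hat matrix H = X (X'X)^{-1} X' for an n x 2 design matrix X."""
    Xt = transpose(X)
    return matmul(matmul(X, inverse_2x2(matmul(Xt, X))), Xt)


# HYPOTHETICAL design matrix, for illustration only.
X_demo = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]

H = hat_matrix(X_demo)
HH = matmul(H, H)

# Idempotency: H @ H should equal H up to floating-point error.
max_gap = max(abs(H[i][j] - HH[i][j]) for i in range(4) for j in range(4))

# The trace of a projection matrix equals the rank of X (here 2).
trace_H = sum(H[i][i] for i in range(4))

print(max_gap, trace_H)
```

Replacing `X_demo` with the design matrix written down in 4.1 gives the same check for the model in this question.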


QUESTION FIVE [15 MARKS]

An experiment involved a quantitative analysis of factors found in high-density lipoprotein (HDL) in
a sample of human blood serum. The dataset consists of 22 observations in which three variables
thought to be predictive of or associated with HDL measurements were recorded: total cholesterol,
total triglyceride concentration, and the presence or absence of a sticky component called sinking
pre-beta (SPB). The data correspond specifically to samples where SPB was absent.

 963   0.6165839376 0.0017115921 0.0008212202 


   
Given that X Y   250611 , X X   0.0017115921 0.6165839376 8.951846106 
   
141832 0.0008212202 8.951846106 2.155576105 
   
and

F-statistic: 0.4394 on 2 and 19 DF, p-value: 0.6508

Calculate the $R^2$ value and obtain a 95% confidence interval for the mean HDL level when total
cholesterol is 250 and total triglyceride concentration is 100. [15]
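
The $R^2$ part can be checked from the reported F-statistic alone. This is a minimal sketch that assumes the standard identity $F = \dfrac{R^2/k}{(1-R^2)/(n-k-1)}$ for a model with an intercept and $k$ predictors, rearranged to $R^2 = \dfrac{kF}{kF + (n-k-1)}$. The variable names are illustrative, and the confidence-interval part is not covered here.

```python
# Values quoted in the question: F-statistic 0.4394 on 2 and 19 DF.
f_stat = 0.4394
df_model = 2      # numerator degrees of freedom (number of predictors, k)
df_resid = 19     # denominator degrees of freedom (n - k - 1 with n = 22)

# Rearranged identity: R^2 = k*F / (k*F + (n - k - 1)).
r_squared = (df_model * f_stat) / (df_model * f_stat + df_resid)

print(round(r_squared, 4))
```

A small $R^2$ is consistent with the large p-value reported alongside the F-statistic.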
