Journal of Statistics, Vol. 12, No. 1 (2005), ISSN 1684-8403

An Empirical Study of some Unequal Probability Sampling Estimators

Usman Ali Khan* and Muhammad Qaiser Shahbaz**

* AC-Nilson Aftab Associates, Lahore.
** Department of Statistics, Government College University, Lahore.

Abstract

An empirical study has been carried out to compare the performance of various estimators used in unequal probability sampling without replacement for a sample of size 2. The Hansen–Hurwitz estimator and simple random sampling have also been included in the comparison. Some suggestions are given at the end.

Key Words: Unequal probability sampling without replacement.

1. Introduction

Unequal probability sampling emerged in the early 1940s, when Hansen and Hurwitz (1943) first introduced the concept. Their sampling design applies to with-replacement sampling only. The estimator they proposed for the population total is

$$ y'_{HH} = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{p_i} \tag{1.1} $$

where $p_i$ is the probability of selection of the i-th unit. The sampling variance of the Hansen–Hurwitz estimator can be written in two equivalent forms:

$$ \operatorname{Var}(y'_{HH}) = \frac{1}{2n} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} P_i P_j \left( \frac{Y_i}{P_i} - \frac{Y_j}{P_j} \right)^2 \tag{1.2} $$

$$ \phantom{\operatorname{Var}(y'_{HH})} = \frac{1}{n} \sum_{i=1}^{N} \frac{1}{P_i} \left( Y_i - P_i Y \right)^2 \tag{1.3} $$

where $Y = \sum_{i=1}^{N} Y_i$ is the population total.

The concept of unequal probability sampling without replacement was first introduced by Madow (1949), but no theoretical framework was given. Horvitz and Thompson (1952) gave the first theoretical framework of unequal probability sampling. They also proposed a selection procedure and an estimator of the population total. The estimator proposed by Horvitz and Thompson is

$$ y'_{HT} = \sum_{i \in s} \frac{Y_i}{\pi_i} \tag{1.4} $$

where $\pi_i$ is the probability of inclusion of the i-th unit in the sample. Horvitz and Thompson gave the following variance formula for estimator (1.4):
$$ V(y'_{HT}) = \sum_{i=1}^{N} \frac{1 - \pi_i}{\pi_i} Y_i^2 + \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j} Y_i Y_j \tag{1.5} $$

where $\pi_{ij}$ is the probability that the i-th and j-th units are both included in the sample. An alternative expression for fixed n, given by Sen (1953) and independently by Yates and Grundy (1953), is

$$ V(y'_{HT}) = \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j > i}}^{N} \left( \pi_i \pi_j - \pi_{ij} \right) \left( \frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j} \right)^2 \tag{1.6} $$

Since the emergence of the Horvitz–Thompson estimator, a number of selection procedures have been developed that can be used with it.

Raj (1956a) introduced an estimator based on the order in which units are selected:

$$ t_{mean} = \frac{1}{n} \sum_{r=1}^{n} t_r \tag{1.7} $$

where

$$ t_r = \sum_{i=1}^{r-1} y_i + \frac{y_r}{p_r} \left( 1 - \sum_{i=1}^{r-1} p_i \right) \tag{1.8} $$

For a sample of size 2, with unit i selected first and unit j second, the Raj (1956a) estimator becomes

$$ t_{mean} = \frac{1}{2} \left[ \frac{y_i}{p_i} (1 + p_i) + \frac{y_j}{p_j} (1 - p_i) \right] \tag{1.9} $$

The sampling variance of the estimator in (1.9) is

$$ \operatorname{Var}(t_{mean}) = \frac{1}{8} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} P_i P_j \left( 2 - P_i - P_j \right) \left( \frac{Y_i}{P_i} - \frac{Y_j}{P_j} \right)^2 \tag{1.10} $$

Raj's estimator has the defect that it depends on the order in which the units are selected. Murthy (1957) used the idea of sufficiency to overcome this defect: he symmetrized the Raj estimator to produce an unordered estimator. The estimator proposed by Murthy has the general form

$$ t_{symm} = \frac{1}{P(s)} \sum_{i=1}^{n} P(s \mid i) \, y_i \tag{1.11} $$

where $P(s \mid i)$ is the probability of obtaining the sample s given that the i-th unit was selected at the first draw, and $P(s)$ is the probability of obtaining the sample s. For a sample of size 2, the Murthy (1957) estimator is

$$ t_{symm} = \frac{1}{2 - p_i - p_j} \left[ \frac{y_i}{p_i} (1 - p_j) + \frac{y_j}{p_j} (1 - p_i) \right] \tag{1.12} $$

The sampling variance of the estimator in (1.12) is

$$ \operatorname{Var}(t_{symm}) = \frac{1}{2} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \frac{P_i P_j \left( 1 - P_i - P_j \right)}{2 - P_i - P_j} \left( \frac{Y_i}{P_i} - \frac{Y_j}{P_j} \right)^2 \tag{1.13} $$

The estimators given so far for unequal probability sampling without replacement are very hard to apply for a sample of size more than 2.
To overcome the difficulty of applying these estimators to samples of size more than 2, Rao, Hartley and Cochran (1962) proposed an estimator that can be used with a sample of any size. The population is split at random into n groups and one unit is selected from each group. The estimator is

$$ y'_{RHC} = \sum_{i=1}^{n} \frac{\pi_i \, y_{iT}}{p_{iT}} \tag{1.14} $$

where $y_{iT}$ and $p_{iT}$ are the value and the selection probability of the T-th unit, selected from the i-th group, $N_i$ is the number of units in the i-th group, $\pi_i = \sum_{T=1}^{N_i} p_{iT}$ and $\sum_{i=1}^{n} \pi_i = 1$. The sampling variance of the estimator in (1.14) is

$$ \operatorname{Var}(y'_{RHC}) = \frac{n \left( \sum_{i=1}^{n} N_i^2 - N \right)}{N (N - 1)} \left[ \sum_{i=1}^{n} \sum_{T=1}^{N_i} \frac{Y_{iT}^2}{n P_{iT}} - \frac{Y^2}{n} \right] \tag{1.15} $$

2. The Empirical Study

This section presents an empirical study of the performance of various estimators in unequal probability sampling without replacement. Fifty natural populations, taken from standard texts on sampling techniques, have been used. The sampling variance of each estimator given in Section 1 was obtained for every population, and the estimators were then ranked within each population according to their sampling variance.

The populations were grouped in two ways: by the rank of the coefficient of variation of the measure of size, CV(Z), and by the rank of the correlation coefficient between the study variable and the measure of size, ρYZ. The average rank of each estimator was calculated within each group. Average ranks were also calculated for ranges of the actual values of the coefficient of variation and of the correlation coefficient. An estimator with a smaller average rank performs better than one with a larger average rank. The results of the empirical study are given in the following tables.

Table 1: Average Ranks of Various Estimators with ranks of Coefficient of Variation.
CV (Z)          SRS    HH     HT (YG)   HT (Brewer)   RHC    Raj    Murthy
1 – 10          5.8    6.1    2.6       3.3           2.8    4.7    2.4
11 – 20         5.2    6.3    3.3       3.1           3.6    4.1    2.4
21 – 30         7.0    6.0    2.2       3.8           3.4    3.5    2.1
31 – 40         5.2    6.2    2.3       4.1           4.4    3.5    2.3
41 – 50         5.8    6.0    2.6       3.5           4.5    3.5    2.1

Table 2: Average Ranks of Various Estimators with ranks of Correlation Coefficient.

ρYZ             SRS    HH     HT (YG)   HT (Brewer)   RHC    Raj    Murthy
1 – 10          4.0    6.5    2.0       3.7           3.9    4.9    2.9
11 – 20         5.8    6.2    2.0       4.2           3.6    3.9    2.3
21 – 30         6.4    6.1    3.2       2.7           4.1    3.4    2.1
31 – 40         5.8    6.2    2.0       3.7           3.8    3.9    2.5
41 – 50         7.0    5.6    3.8       3.5           3.3    3.2    1.5

Table 3: Average Ranks of Various Estimators with various ranges of Coefficient of Variation.

CV (Z)              SRS    HH     HT (YG)   HT (Brewer)   RHC    Raj    Murthy
Less than 0.5       5.89   6.15   2.74      3.41          3.19   4.19   2.33
0.5 < CV < 1.0      5.67   6.11   2.28      3.78          4.44   3.44   2.28
Greater than 1      5.80   6.00   3.00      3.60          4.20   3.60   1.80

Table 4: Average Ranks of Various Estimators with various ranges of Correlation Coefficient.

ρYZ                 SRS    HH     HT (YG)   HT (Brewer)   RHC    Raj    Murthy
ρYZ < 0.5           2.71   6.71   2.00      4.14          4.14   5.00   3.14
0.5 < ρYZ < 0.9     6.14   6.14   2.67      3.38          3.67   3.81   2.19
0.9 < ρYZ < 1.0     6.45   5.91   2.73      3.55          3.68   3.55   2.05

3. Conclusions

The empirical study of various estimators has been given in Section 2, with the results reported in Table 1 through Table 4. Table 1 contains the average ranks of the estimators by rank group of the coefficient of variation of the measure of size. From this table we can see that the Murthy estimator clearly outperforms the other estimators and is closely followed by the Horvitz–Thompson estimator under the Yates–Grundy draw-by-draw procedure for all ranges of the coefficient of variation. Table 2 contains the average ranks by rank group of the correlation coefficient. From this table we can see that for the smaller range of the correlation coefficient the Horvitz–Thompson estimator under the Yates–Grundy draw-by-draw procedure performs reasonably well as compared to the other estimators.
For other ranges the Murthy estimator again outperforms the other estimators.

Table 3 contains the average ranks of the estimators by the actual value of the coefficient of variation, divided into three ranges: less than 0.5, between 0.5 and 1.0, and greater than 1.0. From this table we can again see that the Murthy estimator outperforms all other estimators and is again closely followed by the Horvitz–Thompson estimator under the Yates–Grundy draw-by-draw procedure.

Table 4 contains the average ranks of the estimators by the actual value of the correlation coefficient, again divided into three ranges: less than 0.5, between 0.5 and 0.9, and between 0.9 and 1.0. From this table it can be seen that for smaller values of the correlation coefficient, that is, less than 0.5, the Horvitz–Thompson estimator under the Yates–Grundy draw-by-draw procedure clearly outperforms all other estimators and is closely followed by the simple random sampling procedure. For other values of the correlation coefficient the Murthy estimator outperforms all other estimators.

In general, the Murthy estimator performs better than all other estimators on almost all criteria, and hence this estimator should be used for estimation of the population total. For populations with a small correlation coefficient between the variable of study and the measure of size, the Horvitz–Thompson estimator under the Yates–Grundy draw-by-draw procedure can produce reasonably good results. The simple random sampling procedure can also produce efficient results for such populations.

References

1. Brewer, K. R. W. (1963a) “A model of systematic sampling with unequal probabilities”, Aust. J. Stat., 5, 5–13.
2. Das, A. C. (1951) “On two phase sampling and sampling with varying probabilities”, Bull. Int. Stat. Inst., 33, Book 2, 105–112.
3. Hansen, M. H. and Hurwitz, W. N. (1943) “On the theory of sampling from a finite population”, Ann. Math. Stat., 14, 333–362.
4. Horvitz, D. G. and Thompson, D. J. (1952) “A generalization of sampling without replacement from a finite universe”, J. Amer. Stat. Assoc., 47, 663–685.
5. Murthy, M. N. (1957) “Ordered and unordered estimators in sampling without replacement”, Sankhya, 18, 379–390.
6. Raj, D. (1956a) “Some estimators in sampling with varying probabilities without replacement”, J. Amer. Stat. Assoc., 51, 269–284.
7. Rao, J. N. K., Hartley, H. O. and Cochran, W. G. (1962) “On a simple procedure of unequal probability sampling without replacement”, J. Roy. Stat. Soc., B, 24, 482–491.
8. Yates, F. and Grundy, P. M. (1953) “Selection without replacement from within strata with probability proportional to size”, J. Roy. Stat. Soc., B, 15, 153–161.