Monthly Archives: April 2015

MAP holds torch parade, Supports BBL

By Tu Alid Alfonso COTABATO City (April 29, 2015)—The participants of the ‘torch for peace parade’ organized by the Mindanao Alliance for Peace (MAP) have traversed streets of Sinsuat Avenue from Cotabato City State Polytechnic College (CCSPC) to Cotabato City Plaza and vice versa on the evening of Monday, 6:30pm-7:30pm, April 27, 2015. MAP Spokesperson

Parametric Inference: Likelihood Ratio Test by Example

Hypothesis testing have been extensively used on different discipline of science. And in this post, I will attempt on discussing the basic theory behind this, the Likelihood Ratio Test (LRT) defined below from Casella and Berger (2001), see reference 1.

Definition. The likelihood ratio test statistic for testing $H_0:thetainTheta_0$ versus $H_1:thetainTheta_0^c$ is begin{equation} label{eq:lrt} lambda(mathbf{x})=frac{displaystylesup_{thetainTheta_0}L(theta|mathbf{x})}{displaystylesup_{thetainTheta}L(theta|mathbf{x})}. end{equation} A likelihood ratio test (LRT) is any test that has a rejection region of the form ${mathbf{x}:lambda(mathbf{x})leq c}$, where $c$ is any number satisfying $0leq c leq 1$.

The numerator of equation (ref{eq:lrt}) gives us the supremum probability of the parameter, $theta$, over the restricted domain (null hypothesis, $Theta_0$) of the parameter space $Theta$, that maximizes the joint probability of the sample, $mathbf{x}$. While the denominator of the LRT gives us the supremum probability of the parameter, $theta$, over the unrestricted domain, $Theta$, that maximizes the joint probability of the sample, $mathbf{x}$. Therefore, if the value of $lambda(mathbf{x})$ is small such that $lambda(mathbf{x})leq c$, for some $cin [0, 1]$, then the true value of the parameter that is plausible in explaining the sample is likely to be in the alternative hypothesis, $Theta_0^c$.

Example 1. Let $X_1,X_2,cdots,X_noverset{r.s.}{sim}f(x|theta)=frac{1}{theta}expleft[-frac{x}{theta}right],x>0,theta>0$. From this sample, consider testing $H_0:theta = theta_0$ vs $H_1:theta

Solution:
The parameter space $Theta$ is the set $(0,Theta_0]$, where $Theta_0={theta_0}$. Hence, using the likelihood ratio test, we have $$ lambda(mathbf{x})=frac{displaystylesup_{theta=theta_0}L(theta|mathbf{x})}{displaystylesup_{thetaleqtheta_0}L(theta|mathbf{x})}, $$ where, $$ begin{aligned} sup_{theta=theta_0}L(theta|mathbf{x})&=sup_{theta=theta_0}prod_{i=1}^{n}frac{1}{theta}expleft[-frac{x_i}{theta}right]\ &=sup_{theta=theta_0}left(frac{1}{theta}right)^nexpleft[-displaystylefrac{sum_{i=1}^{n}x_i}{theta}right]\ &=left(frac{1}{theta_0}right)^nexpleft[-displaystylefrac{sum_{i=1}^{n}x_i}{theta_0}right], end{aligned} $$ and $$ begin{aligned} sup_{thetaleqtheta_0}L(theta|mathbf{x})&=sup_{thetaleqtheta_0}prod_{i=1}^{n}frac{1}{theta}expleft[-frac{x_i}{theta}right]\ &=sup_{thetaleqtheta_0}left(frac{1}{theta}right)^nexpleft[-displaystylefrac{sum_{i=1}^{n}x_i}{theta}right]=sup_{thetaleqtheta_0}f(mathbf{x}|theta). end{aligned} $$ Now the supremum of $f(mathbf{x}|theta)$ over all values of $thetaleqtheta_0$ is the MLE (maximum likelihood estimator) of $f(x|theta)$, which is $bar{x}$, provided that $bar{x}leq theta_0$.

So that, $$ begin{aligned} lambda(mathbf{x})&=frac{left(frac{1}{theta_0}right)^nexpleft[-displaystylefrac{sum_{i=1}^{n}x_i}{theta_0}right]} {left(frac{1}{bar{x}}right)^nexpleft[-displaystylefrac{sum_{i=1}^{n}x_i}{bar{x}}right]},quadtext{provided that};bar{x}leq theta_0\ &=left(frac{bar{x}}{theta_0}right)^nexpleft[-displaystylefrac{sum_{i=1}^{n}x_i}{theta_0}right]exp[n]. end{aligned} $$ And we say that, if $lambda(mathbf{x})leq c$, $H_0$ is rejected. That is, $$ begin{aligned} left(frac{bar{x}}{theta_0}right)^nexpleft[-displaystylefrac{sum_{i=1}^{n}x_i}{theta_0}right]exp[n]&leq c\ left(frac{bar{x}}{theta_0}right)^nexpleft[-displaystylefrac{sum_{i=1}^{n}x_i}{theta_0}right]&leq c’,quadtext{where};c’=frac{c}{exp[n]}\ nlogleft(frac{bar{x}}{theta_0}right)-frac{n}{theta_0}bar{x}&leq log c’\ logleft(frac{bar{x}}{theta_0}right)-frac{bar{x}}{theta_0}&leq frac{1}{n}log c’\ logleft(frac{bar{x}}{theta_0}right)-frac{bar{x}}{theta_0}&leq frac{1}{n}log c-1. end{aligned} $$ Now let $h(x)=log x – x$, then $h'(x)=frac{1}{x}-1$. So the critical point of $h'(x)$ is $x=1$. And to test if this is maximum or minimum, we apply second derivative test. That is, $$ h”(x)=-frac{1}{x^2} LRT and its Critical Value

SAS®: Getting Started with PROC IML

Another powerful procedure of SAS, my favorite one, that I would like to share is the PROC IML (Interactive Matrix Language). This procedure treats all objects as a matrix, and is very useful for doing scientific computations involving vectors and matrices. To get started, we are going to demonstrate and discuss the following:

  • Creating and Shaping Matrices;
  • Matrix Query;
  • Subscripts;
  • Descriptive Statistics;
  • Set Operations;
  • Probability Functions and Subroutine;
  • Linear Algebra;
  • Reading and Creating Data;

Above outline is based on the IML tip sheet (see Reference 1). So to begin on the first bullet, consider the following code:

scalar
5
row_vec
1 2 3 4 5 6
col_vec
1
2
3
4
5
6
num_mat
1 2 3
4 5 6
chr_mat
Hello,
world! 😀
i_mat
1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1
mat_2
2
2
2
trow_vec
1
2
3
4
5
6
mat1
1 2
3 4
5 6

With the help of the comments in the code, it wouldn’t be difficult to comprehend what each line tries to tell us, so I will only explain line 33. In SAS, variables defined are not automatically stored into the workspace unless one stores it first, and then call it on other procedures by loading the storage, which we’ll see on the next entry — Math Query. Functions we’ll discuss in math query involve extracting number of columns, rows, and so on, below is the sample code of this,

 SYMBOL     ROWS   COLS TYPE   SIZE                     
------ ------ ------ ---- ------
CHR_MAT 2 1 char 9
COL_VEC 6 1 num 8
I_MAT 6 6 num 8
MAT1 3 2 num 8
MAT_2 3 1 num 8
NUM_MAT 2 3 num 8
ROW_VEC 1 6 num 8
SCALAR 1 1 num 8
TROW_VEC 6 1 num 8
Number of symbols = 10 (includes those without values)

nmat_row
2
nmat_col
3
nmat_dim
2 3
cmat_len
6
9
cmat_nlen
9
nmat_typ
N
cmat_typ
C

So to load all variables stored in the workspace, we use line 3. Succeeding lines are not that difficult to understand, and this what I love about SAS, the statements and functions are self-explanatory — a good excuse for us to proceed with subscripting on matrices, below is the code of it

NUM_MAT
1 2 3
4 5 6
n22_mat
5
nr1_mat
1 2 3
ir12_mat
1 0 0 0 0 0
0 1 0 0 0 0
ic12_mat
1 0
0 1
0 0
0 0
0 0
0 0
ngm_mat
3.5
ncm_mat
2.5 3.5 4.5
nrm_mat
2
5
ngs_mat
21
nrs_mat
17 29 45
ncs_mat
14
77
nss_mat
91
nrs_mat
17 29 45
ncs_mat
14
77

Line 17 computes the grand mean of the matrix by simply inserting : symbol inside the place holder of the subscript. So that if we have num_mat[:, 1], then mean is computed over the row entries, giving us the column mean, particularly for first column. The same goes for num_mat[1, :], where it computes the mean over the column entries, giving us the row mean. If we replace the symbol in the place holder of the subscripts to +, then we are interested in the sum of the entries. Further, if we use ## symbol, the returned value will be the sum of square of the elements. And reducing this to #, the returned value will be the product of the elements.

Now let’s proceed to the next bullet, which is about Descriptive Statistics.

csr_vec
1 3 6 10 15 21
csn_mat
1 3 6
10 15 21
mnr_vec
1
mnn_mat
1
mxr_vec
6
mxn_mat
6
smr_vec
21
smn_mat
21
ssr_vec
91
ssn_mat
91

To generate random numbers from say normal distribution and computing the mean, standard deviation and other statistics, consider the following:

x1
0.2642335
1.0747269
0.8179241
-0.552775
1.5401449
-1.233822
-0.141535
1.0420036
0.0657322
1.225259
-0.148304
0.2901233
-1.149394
-0.482548
-0.452974
0.2738675
-0.224133
0.218553
-0.420015
0.246356
x2
54.993687
58.167325
59.147705
40.74794
45.813645
53.460273
57.877839
51.98273
49.875743
52.570553
54.097005
46.936325
57.509082
50.463228
42.775346
39.376643
53.303455
54.494482
55.747821
44.512206
x12
0.2642335 54.993687
1.0747269 58.167325
0.8179241 59.147705
-0.552775 40.74794
1.5401449 45.813645
-1.233822 53.460273
-0.141535 57.877839
1.0420036 51.98273
0.0657322 49.875743
1.225259 52.570553
-0.148304 54.097005
0.2901233 46.936325
-1.149394 57.509082
-0.482548 50.463228
-0.452974 42.775346
0.2738675 39.376643
-0.224133 53.303455
0.218553 54.494482
-0.420015 55.747821
0.246356 44.512206
x12_cor
1 -0.001531
-0.001531 1
x12_cov
0.5645625 -0.006864
-0.006864 35.614684
x1_mu
0.1126712
x2_std
5.967804

Line 2 above sets the initial random seed for random numbers to be generated in line 8. Line 5 allocates a matrix of dimension 20 by 1 to x1 variable, and that’s done by using the j function. The number of rows of x1 represents the sample size of the random numbers needed. One can also set x1 to a row vector, where in this case, the number of columns represents the sample size needed. The two sets of random sample, x1 and x2, generated from the same family of distribution, Gaussian/Normal, are then concatenated column-wise (||) to form a matrix of size 20 by 2 in line 13. Using this new matrix, x12, we can then compute the correlation and covariance of the two columns using corr and cov functions, respectively, which from the above output tells us that there is almost no relation between the two.

SAS can also perform set operations, and it’s easy. Consider the following:

B_comp
a i m x
A_comp
e h r t y
AuB
a e h i m o r t x y
AnB
o
AB_unq
a e h i m o r t x y

Next bullet is all about Probability Functions and Subroutine. For example, consider an experiment defined by the random variable $X$ which follows an exponential distribution with mean $beta = .5$. What is the probability of $X$ to be at most 2, $mathrm{P}(Xleq 2)$? To solve this we use the CDF function, but note that the exponential density in SAS is given by $$f(x|beta)=frac{1}{beta}expleft[-frac{x}{beta}right].$$ So to compute the probability, we solve for the following integration, $$ mathrm{P}(Xleq 2)=int_{0}^{2}frac{1}{.5}expleft[-frac{x}{.5}right]operatorname{d}x = 0.9816844 $$ To confirm this in SAS, run the following

px
0.9816844

If we take the derivative of the Cumulative Distribution Function (CDF), the returned expression is what we call the Probability Density Function (PDF). And in SAS, we play on this using the PDF function. For example, we can confirm the above probability by integrating the PDF. And to do so, run the following

px
0.9816844

To end this topic, consider the inverse of the CDF, which is the quantile. To compute for the quantile of the popular level of significance $alpha = 0.05$, from a standard normal distribution, which is $z_{alpha} = -1.645$ for lower tail, run

z_a
-1.644854

Next entry is about Linear Algebra, the topic on which this procedure is based upon. Linear algebra is very useful in Statistics, especially in Regression, Nonlinear Regression, and Multivariate Analysis. To perform this in SAS, consider

xm_det
-1
xm_inv
1 -3 2
-3 3 -1
2 -1 4.441E-16
x_evl
11.344814
0.1709152
-0.515729
x_evc
0.3279853 0.591009 0.7369762
0.591009 -0.736976 0.3279853
0.7369762 0.3279853 -0.591009
x_coef
3
-4
2

Finally, one of the coolest capabilities of SAS/IML is to Read and Create SAS Data. The following code demos how to read SAS data set.

x_dat
Acura
Acura
Acura
Acura
Acura
Acura
Acura
Audi
Audi
Audi
hp_mean
215.88551

And to create a SAS data set, run


Obs COL1 COL2 COL3
1 1 2 3
2 4 5 6

To end this post, I want to say, I am loving SAS because of IML. There are still hidden capabilities of this procedure that I would love to explore and share to my readers, so stay tuned. Another great blog about SAS/IML is The DO Loop, whose author, Dr. Rick Wicklin, is also the principal developer of the said procedure and SAS/IML Studio, do check that out.

Reference

  1. SAS/IML Tip Sheet. Frequently Used SAS/IML Functions and Subroutines.
  2. SAS/IML 13.2 User Guide.
  3. Rick Wicklin. The DO Loop. How to numerically integrate a function in SAS.

Modesty

Modesty in Islam is one of the principles of faith. It is freedom from vanity and showiness. It is decency and moderation in speech, manner, dress and total attitude and behavior towards life. It is shyness, simplicity and humility about our abilities and accomplishments. (http://www.answers.com/Q/What_is_modesty_in_Islam) 

Ever since I transferred platforms and decided that it’s time to bring blogging into the next level, I made a promise to myself that I will post OOTDs or What I Wore as much as I can here in  my humble blog. And since majority of the people here in the Philippines are Catholic, Modest/Muslim Fashion is often neglected. To tell you honestly, I still get that stare at malls or public places, I don’t know if they’re in awe that a hijabi or a woman who is fully covered is a fashionista or that racist kind of look. But I’d like to think it’s the first one.

So…

Recently, I collaborated with Kai with her project, the Hijabi Mag. This is all about spreading awareness on Modest Fashion here in the Philippines and to show that wearing hijab is NOT an oppression but a choice.  In short it’s “Hey, you can be all covered up and be fashionista at the same time”. 😀 I’m just so excited and I can’t wait for the launching! Maybe it’ll be launched before or during Ramadan. Inshallah. 

I’m still new at this What I Wore posts and I apologise in advance, albeit I will do my best to make my every post catchy and genuine. For now, I am leaving you with a fun shoot I had with my dearest cousin and best friend plus a quote I love by Angelina Jolie.


“The sun doesn’t lose its beauty when covered by clouds. 
The same way your beauty doesn’t fade when covered by Hijab.”
-Angelina Jolie

White sand at Isidro’s Beach, Initao, Misamis Oriental

So aside from jumping off an 8-storey base jump platform tower, I also celebrated my birthday this year at my happy place – THE BEACH! I am a certified beach baby Since it was my Cebu-based brother’s last day in town, my family and I decided to just forego our Camiguin Island or Siargao Island plans and just look for a nearby municipality that had white sand beaches. In Misamis Oriental, the nearest one we could think of was Initao. Initao is about a little over an hour away from Cagayan de Oro and about 45 minutes away from Iligan City. Apart

Design Your Own Havaianas At Make Your Own Havaianas 2015 CDO

If there’s one fashion event I look forward to every year, it’s Make Your Own Havaianas! My first MYOH experience was in 2009 and it was a blast! A Make Your Own Havaianas event is where you can create and design your very own pair of Havaianas. You can choose your preferred sole, strap and pin. Check out what happened during my second time joining MYOH. The whole process is fun and easy, trust me. In fact, Havaianas even has this to let you know just how cool and easy it is: This year, Make Your Own Havaianas is celebrating its

Welcome to UPCM Family, Class 2020!

Alhamdulillah! (All praise be to Allah!) After the loooong wait, the list of admitted applicants who passed the rigorous application process in UPCM is now up! We now have the initial list of LU3 students (1st year Medicine Proper) for the school year …

Just ask why

When your mind is full of ideas but you lack the words to express them.I simply keep asking myself “Why?”Why? really WHy?#post-duty

Category: Uncategorized

UNYPAD members help residents in putting out fire in Datu Paglas

By: Rudy S. Lumapinet DATU PAGLAS, Maguindanao (April 19, 2015) – On Friday, April 10, some members of United Youth for Peace and Development (UNYPAD)-Datu Paglas Chapter helped residents in firefighting to prevent a fire from affecting other houses near a rural bank in the town market of Datu Paglas, Maguindanao. “At about 10:00PM, the

#myKaffahfromCC Giveaway!

Classic Crown celebrates its 1st year as the Kaffah’s official reseller in the Philippines! 
Time flies!

We are beyond blessed on what’s our online store has become to our sisters, 

by that, we are conducting a giveaway!

Excited? We are, too!

Just a brief explanation to those who aren’t yet familiar with Kaffah, it is an imported brand of hijabs/scarfs/shawls from Indonesia owned by Siti Juwariyah, a blogger/entrepreneur.

We just decided to resell the brand here in the country because we personally love and use their Kaffah Shawls especially their different lovely colors perfect for everyday use even to formal events. And most importantly, their perfect length that will be great for several hijab styles (Muslimahs will understand!). 🙂

My friend/sister and business partner Marjan of Singapore (A Kaffah user for few years and had a great collection of them :p) introduced this brand to me and everything just started.

Some of her collection 🙂

Since the day we released the brand in our store it has always been sold out in just few hours. As we try to provide more stocks, we didn’t expect we also gain more Kaffah users. We are so thankful for the patience and understanding of our loyal customers of Kaffah. And we assure that we will provide more in the future In shaa Allah.

So here are the details of this giveaway:

WHAT: #myKaffahfromCC Giveaway Promo

WHO: Open to all our loyal buyers of Kaffah Shawl and inners.

HOW:

Here are the few steps to follow:

1. Grab your Kaffah shawls or inners from your closet.
2. Think about your own concept, flatlay or OOTD entry with your Kaffah shawl/inner.
3. Snap a photo and share it with your Instagram or Facebook account.
4. Tag us and don’t forget your hashtag #myKaffahfromCC to be qualified. 

If your account is in private mode just comment to our posts for us to follow you!
5. We will extend our giveaway until April 25 for more chance of winning! #excited

*drums* 


Winner will be getting:



A Light Mint Kaffah Shawl and a Dark Brown Kaffah Inner

What are the criterias that we’re looking for?

We will choose the best photo, with the most unique and beautiful way of sharing her Kaffah shawls/inners bought in our store!

Tip:
A good quality photo is a must, perfect lighting and creativity!

Our own entries! Lookie!

We are so excited to browse your photos! Don’t hesitate to join! You will definitely enjoy!



Love,
Jamila

Finally jumped off the 8-storey Dahilayan Skyjump tower

I finally did it! I jumped off the 8-storey Dahilayan Skyjump Base Jump Tower, the Philippines’ tallest parajump platform. I did this just a few days before my birthday this month of April and it was one of the most exhilarating things I ever did! I previously wrote about the Dahilayan Skytower Base Jump and that post has been shared over 10,000 times on Facebook alone. You might want to read that The Dahilayan Skyjump is the latest attraction at Dahilayan Adventure Park, located at Barangay Dahilayan, Manolo Fortich, Bukidnon. The park is about an hour away from Cagayan de

The 2nd Dahilayan Off Road Duathlon And Trail Run

IT’S CONFIRMED! Mindanaoan.com can exclusively confirm that the Second Dahilayan Off Road Duathlon and Trail Run will be held at both Dahilayan Adventure Park and Forest Park Dahilayan on May 31, 2015! Organized by ProSport Productions, the said event is expected to gather hundreds of participants and is the most-awaited follow-up of last year’s very successful event. Dahilayan Adventure Park and Forest Park Dahilayan are both located in Barangay Dahilayan, Manolo Fortich, Bukidnon, which is roughly 45 minutes away from Cagayan de Oro City. Here’s how to get there. The 2nd Dahilayan Off Road Duathlon and Trail Run is organized

Food and None Food Items Distribution to the IDP’s in Maguindanao

Province of Maguindanao: On/about 25 February 2015, Armed Forces (AFP) Chief Gen. Gregorio Pio Catapang ordered an all-out offensive against the BIFF. Gen. Catapang issued his order amid fighting between the MILF and BIFF, which broke out in Pagalungan town in Maguindanao about two weeks ago. IDPs from the municipality of Shariff Saydona Mustapha, Maguindanao […]

Category: Uncategorized

Python and R: Basic Sampling Problem

In this post, I would like to share a simple problem about sampling analysis. And I will demonstrate how to solve this using Python and R. The first two problems are originally from Sampling: Design and Analysis book by Sharon Lohr.

Problems

  1. Let $N=6$ and $n=3$. For purposes of studying sampling distributions, assume that all population values are known.


    $y_1 = 98$ $y_2 = 102$ $y_3=154$
    $y_4 = 133$ $y_5 = 190$ $y_6=175$


    We are interested in $bar{y}_U$, the population mean. Consider eight possible samples chosen.


    Sample No. Sample, $mathcal{S}$ $P(mathcal{S})$
    1 ${1,3,5}$ $1/8$
    2 ${1,3,6}$ $1/8$
    3 ${1,4,5}$ $1/8$
    4 ${1,4,6}$ $1/8$
    5 ${2,3,5}$ $1/8$
    6 ${2,3,6}$ $1/8$
    7 ${2,4,5}$ $1/8$
    8 ${2,4,6}$ $1/8$


    1. What is the value of $bar{y}_U$?
    2. Let $bar{y}$ be the mean of the sample values. For each sampling plan, find
      1. $mathrm{E}bar{y}$;
      2. $mathrm{Var}bar{y}$;
      3. $mathrm{Bias}(bar{y})$;
      4. $mathrm{MSE}(bar{y})$;
  2. Mayr et al. (1994) took an SRS of 240 children who visisted their pediatric outpatient clinic. They found the following frequency distribution for the age (in months) of free (unassisted) walking among the children:


    Age (months) 9 10 11 12 13 14 15 16 17 18 19 20
    Number of Children 13 35 44 69 36 24 7 3 2 5 1 1


    Find the mean and SE of the age for onset of free walking.
  3. Table 1 gives the cultivated area in acres in 1981 for 40 villages in a region. (Theory and Method of Survey) Using the arrangement (random) of data in the table, draw systematic sample of size 8. Use r ((random start) = 2,


    Village $Y_j$ Village $Y_j$ Village $Y_j$ Village $Y_j$
    1 105 11 319 21 70 31 16
    2 625 12 72 22 249 32 439
    3 47 13 109 23 384 33 123
    4 312 14 91 24 482 34 207
    5 327 15 152 25 378 35 145
    6 230 16 189 26 111 36 666
    7 240 17 365 27 534 37 338
    8 203 18 70 28 306 38 624
    9 535 9 249 29 655 39 501
    10 275 20 384 30 102 40 962

Solutions

In order to appreciate the codes, I will share some theoretical part of the solution. But our main focus here is to solve this problem computationally using Python and R.

    1. The value of $bar{y}_U$ is coded as follows:

      Python Code R Code

    2. To obtain the sample using the sample index given in the table in the above question, we do a combination of population index of three elements, ${6choose 3}$, first. Where the first two combinations are the samples, ${1,2,3}$ and ${1,2,4}$, and so on. Then from this list of all possible combinations of three elements, we draw those that are listed in the above table as our samples, with first sample index ${1,3,5}$, having population units, ${98, 154, 190}$. So that the following is the code of this sampling design:

      Python Code R Code

      1. Now to obtain the expected value of the average of the sample data, we compute it using $mathrm{E}bar{y}=sum_{k}bar{y}_kmathrm{P}(bar{y}_k)=sum_{k}bar{y_k}mathrm{P}(mathcal{S}_k)$, $forall kin{1,cdots,8}$. So for $k = 1$, $$ begin{aligned} bar{y}_1mathrm{P}(mathcal{S}_1)&=frac{98+154+190}{3}mathrm{P}(mathcal{S}_1)\ &=frac{98+154+190}{3}left(frac{1}{8}right)=18.41667. end{aligned} $$ Applying this to the remaining $n-1$ $k$s, and summing up the terms gives us the answer to $mathrm{E}bar{y}$. So that the following is the equivalent of this:

        Python Code R Code From the above code, the output tells us that $mathrm{E}bar{y}=140$.

      2. Next is to compute for the variance of $bar{y}$, which is $mathrm{Var}bar{y}=mathrm{E}bar{y}^{2}-(mathrm{E}bar{y})^2$. So we need a function for $mathrm{E}bar{y}^2$, where the first term of this, $k=1$, is $bar{y}_1^2mathrm{P}(mathcal{S}_1)=left(frac{98+154+190}{3}right)^2mathrm{P}(mathcal{S}_1)=left(frac{98+154+190}{3}right)^2(frac{1}{8})=2713.3889$. Applying this to other terms and summing them up, we have following code:

        Python Code R Code So that using the above output, 20182.94, and subtracting $(mathrm{E}bar{y})^2$ to it, will give us the variance. And hence the succeeding code:

        Python Code: R Code: So the variance of the $bar{y}$ is $18.9444$.

    3. The $mathrm{Bias}$ is just the difference between the estimate and the true value. And since the estimate is unbiased ($mathrm{E}bar{y}=142$), so $mathrm{Bias}=142-142=0$.
    4. $mathrm{MSE}=mathrm{Var}bar{y}-(mathrm{Bias}bar{y})^2$, and since the $mathrm{Bias}bar{y}=0$. So $mathrm{MSE}=mathrm{Var}bar{y}$.
  1. First we need to obtain the probability of each Age, that is by dividing the Number of Children with the total sum of it. That is why, we have p_s function defined below. After obtaining the probabilities, we can then compute the expected value using the expectation function we defined earlier.

    Python Code R Code It should be clear in the data that the average age is about 12 months old, where the plot of it is shown below, For the code of the above plot please click here. Next is to compute the standard error which is just the square root of the variance of the sample,

    Python Code R Code So the standard variability of the Age is 1.920824.

  2. Let me give you a brief discussion on the systematic sampling to help you understand the code. The idea in systematic sampling is that, given the population units numbered from 1 to $N$, we compute for the sampling interval, given by $k = frac{N}{n}$, where $n$ is the number of units needed for the sample. After that, we choose for the random start, number between $1$ and $k$. This random start will be the first sample, and then the second unit in the sample is obtained by adding the sampling interval to the random start, and so on. There are two types of systematic sampling namely, Linear and Circular Systematic Samplings. Circular systematic sampling treats the population units numbered from $1$ to $N$ in circular form, so that if the increment step is more than the number of $N$ units, say $N+2$, the sample unit is the $2^{nd}$ element in the population, and so on. The code that I will be sharing can be used both for linear and circular, but for this particular problem only. Since there are rules in linear that are not satisfied in the function, one of which is if $k$ is not a whole number, despite that, however, you can always extend it to a more general function.

    Python Code R Code You may notice in the output above, that the index returned in Python is not the same with the index returned in R. This is because Python index starts with 0, while that in R starts with 1. So that’s why we have the same population units sampled between the two language despite the differences between the index returned.

Reference

  1. Lohr, Sharon (2009). Sampling: Design and Analysis. Cengage Learning.