SU PHE5020 Projects Latest 2021 July (Full)
PHE5020 Biostatistical Methods
Week 1 Project

Hypothesis Testing and Inference
This assignment focuses on estimation and hypothesis testing with one-sample and two-sample inferences.
The essence of parametric testing is the use of standard normal distribution tables of probabilities. For each exercise, there will be a sample problem that shows how the calculations are done and at least one problem for you to work out.
For the first assignment, you will not need any statistical software. However, you will use a standardized normal distribution table (a z-score table) provided in the course textbook (Table 3—The normal distribution—in the Tables section in APPENDIX) to obtain your responses.
Click here to access the standardized normal distribution table from your course textbook.
Problem 1: Probability Using Standard Variable z and Normal Distribution Tables
Variables are the things we measure. A hypothesis is a prediction about the relationship between variables. Variables make up the words in a hypothesis.
In the hypothetical attention-deficit/hyperactivity disorder (ADHD) example provided in the tables below, the research question was: What is the most effective therapy for ADHD? One of the variables is type of therapy. Another variable is change in ADHD-related behavior, given exposure to therapy. You might measure change in the mean seconds of concentration time when children read. This experiment is designed to obtain children’s concentration times while they read a science textbook and to find out whether the therapy used worked on any of the children.
Use the stated µ and σ to calculate probabilities of the standard variable z to get the value of p (up to three decimal places). In addition, respond to the following questions for each pair of parameters:
Which child or children, if any, appeared to come from a significantly different population than the one used in the null hypothesis?
What happens to the “significance” of each child’s data as the data are progressively more dispersed?
In addition to the above, write a formal statement of conclusion for each child in APA style. A report template is provided for submission of your work.
Note: Tables 1 and 2 are practice tables with answers. Tables 3 and 4 are the assignment tables for you to work on.
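The z-score and p-value calculations behind the tables can be sketched in Python. This is only an illustration and not a substitute for the table lookup the assignment asks for. The function names `z_score` and `one_tailed_p` are made up for this sketch, and it assumes the p-value wanted is the one-tailed area beyond z (double it for a two-tailed value).

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize an observation: z = (x - mu) / sigma."""
    return (x - mu) / sigma

def one_tailed_p(z):
    """Area under the standard normal curve beyond z, in the tail where z falls."""
    return NormalDist().cdf(-abs(z))

# Child 1 from Table 1: 75 seconds, with mu = 100 and sigma = 10
z = z_score(75, mu=100, sigma=10)
p = one_tailed_p(z)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Changing `sigma` to 20, 30, or 40 while keeping the same observation shows how the same child’s z-score shrinks toward zero as the population becomes more dispersed.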
Table 1 (µ = 100 seconds and σ = 10)
Child | Mean seconds of concentration in an experiment of reading | z-score | p-value
1 | 75 | -2.50 | 0.0
2 | 81 | -1.90 | 0.0
3 | 89 | -1.10 | 0.1
4 | 99 | -0.10 | 0.4
5 | 115 | 1.50 | 0.0
6 | 127 | 2.70 | 0.0
7 | 138 | 3.80 | <0.0
8 | 139 | 3.90 | <0.0
9 | 142 | 4.20 | <0.0
10 | 148 | 4.80 | <0.0
Table 2 (µ = 100 seconds and σ = 20)
Child | Mean seconds of concentration in an experiment of reading | z-score | p-value
1 | 75 | -1.25 | 0.1
2 | 81 | -0.95 | 0.1
3 | 89 | -0.55 | 0.2
4 | 99 | -0.05 | 0.4
5 | 115 | 0.75 | 0.2
6 | 127 | 1.35 | 0.0
7 | 138 | 1.90 | 0.0
8 | 139 | 1.95 | 0.0
9 | 142 | 2.10 | 0.0
10 | 148 | 2.40 | 0.0
Table 3 (µ = 100 seconds and σ = 30)
Child | Mean seconds of concentration in an experiment of reading | z-score | p-value
1 | 75 | -0.83 |
2 | 81 | -0.63 |
3 | 89 | -0.37 |
4 | 99 | -0.03 |
5 | 115 | 0.50 |
6 | 127 | 0.90 |
7 | 138 | 1.27 |
8 | 139 | 1.30 |
9 | 142 | 1.40 |
10 | 148 | 1.60 |
Table 4 (µ = 100 seconds and σ = 40)
Child | Mean seconds of concentration in an experiment of reading | z-score | p-value
1 | 75 | -0.63 |
2 | 81 | -0.48 |
3 | 89 | -0.28 |
4 | 99 | -0.03 |
5 | 115 | 0.38 |
6 | 127 | 0.68 |
7 | 138 | 0.95 |
8 | 139 | 0.98 |
9 | 142 | 1.05 |
10 | 148 | 1.20 |
Click here for a template to provide your answers and submit the assignment.
Refer to the Assignment Resources on this page, Two Independent Samples of t-Test, to view an example of calculating probability using the standard variable z and normal distribution tables. The same resource is also available under the lecture Estimation and Hypothesis Testing.
Submission Details:
Name your document SU_PHE5020_W1_A3a_LastName_FirstInitial.doc.
Submit your document to the Submissions Area by the due date assigned.
Problem 2: Two-Sample Inferences
Two-sample inference covers both dependent (paired) and independent samples. In a two-sample hypothesis testing problem, the underlying parameters of two different populations are compared. In a longitudinal (or follow-up) study, the same group of people is followed over time. Two samples are said to be paired when each data point in the first sample is matched and related to a unique data point in the second sample.
This problem demonstrates inference from two dependent (follow-up) samples using data from a hypothetical study of new cases of tuberculosis (TB) before and after vaccination in several geographical areas of a country in sub-Saharan Africa. The conclusion about the null hypothesis rests on the difference between the two samples.
Table 5: Cases of TB in Different Geographical Regions
Geographical regions | Before vaccination | After vaccination
1 | 85 | 11
2 | 77 | 5
3 | 110 | 14
4 | 65 | 12
5 | 81 | 10
6 | 70 | 7
7 | 74 | 8
8 | 84 | 11
9 | 90 | 9
10 | 95 | 8
Using the Minitab statistical analysis program to enter the data and perform the analysis, complete the following:
Construct a one-sided 95% confidence interval for the true difference in population means.
Test the null hypothesis that the population means are identical at the 0.05 level of significance.
Click here to install Minitab Software.
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format.
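As a cross-check on the Minitab output, the paired analysis can be sketched with the Python standard library. This is an illustrative sketch, not the required procedure. It assumes the one-sided interval of interest is a lower bound on the mean reduction (before minus after), and it takes the critical values for 9 degrees of freedom from a standard t table rather than computing them.

```python
from math import sqrt
from statistics import mean, stdev

before = [85, 77, 110, 65, 81, 70, 74, 84, 90, 95]
after = [11, 5, 14, 12, 10, 7, 8, 11, 9, 8]

# Paired design: analyze the within-region differences
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
mean_diff = mean(diffs)
std_err = stdev(diffs) / sqrt(n)   # stdev uses the n - 1 denominator

t_stat = mean_diff / std_err       # tests H0: mean difference = 0
df = n - 1

# Critical values of the t distribution with 9 df, from a standard t table
T_CRIT_TWO_SIDED = 2.262   # upper 2.5% point
T_CRIT_ONE_SIDED = 1.833   # upper 5% point

reject_h0 = abs(t_stat) > T_CRIT_TWO_SIDED
lower_bound = mean_diff - T_CRIT_ONE_SIDED * std_err  # one-sided 95% lower limit

print(f"mean difference = {mean_diff:.1f}, t = {t_stat:.2f}, df = {df}")
print(f"reject H0 at 0.05: {reject_h0}; one-sided 95% lower bound = {lower_bound:.1f}")
```

Working on the differences rather than on the two columns separately is what distinguishes the paired test from the independent-samples test used in Problem 3.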
Submission Details:
Name your Minitab output file SU_PHE5020_W1_A3b_LastName_FirstInitial.mtw.
Name your document SU_PHE5020_W1_A3c_LastName_FirstInitial.doc.
Submit your document to the Submissions Area by the due date assigned.
Problem 3: Cross-Sectional Study
In a cross-sectional study, the participants are seen at only one point of time. Two samples are said to be independent when the data points in one sample are unrelated to the data points in the second sample.
The problem that demonstrates inference from two independent samples will use hypothetical data from the American Association of Poison Control Centers.
The two groups of data were collected independently in different regions, which again calls for a t-test, this time for independent samples. The numbers represent the recorded cases of poisoning with chemicals in the homes of 100,000 people in two regions.
Table 6: Cases of Poisoning With Chemicals
Year Region 1 Region 2
1 150 11
2 160 10
3 132 14
4 110 12
5 85 10
6 45 11
7 123 9
8 180 11
9 143 10
10 150 14
Using the Minitab statistical analysis program to enter the data and perform the analysis, complete the following:
Formulate a null and an alternative hypothesis for a two-sided test.
Conduct the test at the 0.05 level of significance.
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format.
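The independent-samples test statistic can be sketched in Python as a check on the Minitab result. The sketch below uses Welch’s version of the t-test, which does not assume equal variances; that choice is an assumption made here, and a pooled-variance test would use 18 degrees of freedom instead. The resulting t is compared against a t table at the computed degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

region1 = [150, 160, 132, 110, 85, 45, 123, 180, 143, 150]
region2 = [11, 10, 14, 12, 10, 11, 9, 11, 10, 14]

n1, n2 = len(region1), len(region2)
# Each sample's contribution to the variance of the difference in means
v1 = variance(region1) / n1
v2 = variance(region2) / n2

# Welch's t statistic for H0: the two population means are equal
t_stat = (mean(region1) - mean(region2)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(f"t = {t_stat:.2f}, approximate df = {df:.1f}")
```

Because the spread in Region 1 is far larger than in Region 2, the approximate degrees of freedom fall close to those of a single sample of ten.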
Submission Details:
Name your Minitab output file SU_PHE5020_W1_A3d_LastName_FirstInitial.mtw.
Name your document SU_PHE5020_W1_A3e_LastName_FirstInitial.doc.
Submit your document to the Submissions Area by the due date assigned.
PHE5020 Biostatistical Methods
Week 2 Project
Instructions
Week 2: Project Assignment
This assignment focuses on nonparametric methods. When a researcher cannot assume that the requirements of parametric statistical methods are met, for example because the distribution is unknown or the sample size is small, nonparametric statistical methods should be used, because they make fewer assumptions about the distributional shape.
Click here to install Minitab Software.
Nonparametric Methods
In this assignment, we will use the following nonparametric methods:
The Wilcoxon signed-rank test: The Wilcoxon signed-rank test is the nonparametric test analog of the paired t-test.
The Wilcoxon rank-sum test or the Mann-Whitney U test: The Wilcoxon rank-sum test is an analog to the two-sample t-test for independent samples.
For each exercise, there will be a sample problem that shows how the calculations are done and the problems for you to work on.
Part 1: Wilcoxon Signed-Rank Test
Let’s take a hypothetical situation. The World Health Organization (WHO) wants to investigate whether building irrigation systems in an African region helped reduce the number of new cases of malaria and increased the public health level.
Data was collected for the following variables from ten different cities of Africa:
The number of new cases of malaria before the irrigation systems were built
The number of new cases of malaria after the irrigation systems were built
Table 1: Cases of Malaria
City | Before | After
1 | 110 | 55
2 | 240 | 75
3 | 68 | 15
4 | 100 | 10
5 | 120 | 21
6 | 110 | 11
7 | 141 | 41
8 | 113 | 5
9 | 112 | 13
10 | 110 | 8
Using the Minitab statistical analysis program to enter the data and perform the analysis, complete the following:
Run a sample Wilcoxon signed-rank test to show whether there is a statistically significant difference between the number of cases before and after the irrigation systems were built.
Obtain the rank-sum.
Determine the significance of the difference between the groups.
Determine whether building these systems helped reduce new cases of malaria.
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format. Refer to the Assignment Resources: Wilcoxon Signed-Rank Test Example to view an example of the Wilcoxon signed-rank test. The same resource is also available under lecture Nonparametric Methods.
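The ranking step of the signed-rank test can be sketched in Python. This is a minimal illustration of how the rank sums are formed, not a replacement for the Minitab procedure. The helper name `average_ranks` is made up for this sketch, and the sketch stops at the rank sums without computing a p-value.

```python
before = [110, 240, 68, 100, 120, 110, 141, 113, 112, 110]
after = [55, 75, 15, 10, 21, 11, 41, 5, 13, 8]

def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1   # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks

# Signed-rank procedure: drop zero differences, rank |d|, then sum ranks by sign
diffs = [b - a for b, a in zip(before, after) if b != a]
ranks = average_ranks([abs(d) for d in diffs])
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

print(f"W+ = {w_plus}, W- = {w_minus}")
```

The smaller of the two rank sums is the value usually compared against a signed-rank table for the number of nonzero differences.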
Submission Details:
Name your Minitab output file SU_PHE5020_W2_A2a_LastName_FirstInitial.mtw.
Name your document SU_PHE5020_W2_A2b_LastName_FirstInitial.doc.
Submit your document to the Submissions Area by the due date assigned.
Part 2: Wilcoxon Rank-Sum Test
Let us consider another hypothetical situation. The WHO wants to compare the mortality rates of children under the age of five years in underdeveloped and developed regions of the world. Two independent samples of ten countries, one from each group, were drawn at the same time, and the yearly mortality rates of children under the age of five years (per 100,000 inhabitants) were reported (MRate1 and MRate2).
Table 2: Mortality Rates of Children
Country | MRate1 | MRate2
1 | 120 | 11
2 | 110 | 9
3 | 105 | 13
4 | 61 | 11
5 | 45 | 14
6 | 114 | 11
7 | 118 | 10
8 | 138 | 8
9 | 85 | 6
10 | 70 | 6
Using the Minitab statistical analysis program to enter the data and perform the analysis, complete the following:
Run the Wilcoxon rank-sum test to show whether there is a statistically significant difference between the mortality rates of children under the age of five years of the regions. Results may be used in making decisions regarding which region needs to receive help to improve the public health issues of mortality.
Obtain the difference in the mortality rates and determine whether it is statistically significant.
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format.
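The rank sums for the two independent samples can be sketched in Python. This is an illustrative sketch only. The helper name `average_rank` is made up here, and the sketch reports the rank sums and the Mann-Whitney U values without computing a p-value.

```python
mrate1 = [120, 110, 105, 61, 45, 114, 118, 138, 85, 70]
mrate2 = [11, 9, 13, 11, 14, 11, 10, 8, 6, 6]

def average_rank(value, pool):
    """Rank of value within pool, with ties sharing the mean of their ranks."""
    less = sum(x < value for x in pool)
    equal = sum(x == value for x in pool)
    return less + (equal + 1) / 2

# Rank all observations together, then sum the ranks within each sample
combined = mrate1 + mrate2
r1 = sum(average_rank(x, combined) for x in mrate1)
r2 = sum(average_rank(x, combined) for x in mrate2)

# Mann-Whitney U values derived from the rank sums
n1, n2 = len(mrate1), len(mrate2)
u1 = r1 - n1 * (n1 + 1) / 2
u2 = r2 - n2 * (n2 + 1) / 2

print(f"R1 = {r1}, R2 = {r2}, U1 = {u1}, U2 = {u2}")
```

A useful check on the arithmetic is that the two U values always add up to the product of the sample sizes.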
Submission Details:
Name your Minitab output file SU_PHE5020_W2_A2c_LastName_FirstInitial.mtw.
Name your document SU_PHE5020_W2_A2d_LastName_FirstInitial.doc.
Submit your document to the Submissions Area by the due date assigned.
PHE5020 Biostatistical Methods
Week 3 Project
Week 3: Project Assignment
Statistics for Categorical Data: Odds Ratios and Chi-Square
This assignment focuses on categorical data. Two of the statistics most often used to test hypotheses about categorical data are odds ratios (ORs) and the chi-square. The disease-OR refers to the odds in favor of disease in the exposed group divided by the odds in favor of disease in the unexposed group. Chi-square statistics measure the difference between the observed counts and the corresponding expected counts. The expected counts are hypothetical counts that would occur if the null hypothesis were true.
Part 1: ORs
A study conducted by López-Carrillo, Avila, and Dubrow (1994) investigated health hazards associated with the consumption of food local to a particular geographic area, in this case chili peppers particular to Mexico. It was a population-based case-control study in Mexico City on the relationship between chili pepper consumption and gastric cancer risk. Subjects for the study consisted of 213 incident cases and 697 controls randomly selected from the general population. Interviews produced the following information regarding chili consumption:
Table 1: Chili Pepper Consumption and Gastric Cancer Risk
Chili pepper consumption | Cases of gastric cancer | Controls
Yes | A = 204 | B = 552
No | C = 9 | D = 145
Reference:
López-Carrillo, L., Avila, M. H., & Dubrow, R. (1994). Chili pepper consumption and gastric cancer in Mexico: A case-control study. American Journal of Epidemiology, 139(3), 263–271.
Note: You do not need to use the Minitab software to complete this assignment.
In a Microsoft Excel worksheet, calculate the odds of having gastric cancer.
In addition, provide a written interpretation of your results in APA format.
Refer to the Assignment Resources: Odds Ratio to view an example of odds ratio. The same resource is also available under lecture Testing Hypotheses.
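The odds ratio arithmetic for Table 1 can be sketched in Python before it is set up in Excel. This is an illustration under stated assumptions: the cell labels A to D follow the table, while the 95% confidence interval on the log scale (Woolf’s method) is an addition made here and is not something the assignment asks for.

```python
from math import exp, log, sqrt
from statistics import NormalDist

a, b = 204, 552   # chili consumers: cases, controls
c, d = 9, 145     # non-consumers: cases, controls

odds_exposed = a / b      # odds in favor of disease among chili consumers
odds_unexposed = c / d    # odds in favor of disease among non-consumers
odds_ratio = odds_exposed / odds_unexposed   # equivalent to (a * d) / (b * c)

# Approximate 95% confidence interval computed on the log scale (Woolf's method)
se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z = NormalDist().inv_cdf(0.975)
ci_low = exp(log(odds_ratio) - z * se_log_or)
ci_high = exp(log(odds_ratio) + z * se_log_or)

print(f"OR = {odds_ratio:.2f}, 95% CI ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1 would indicate an association between exposure and disease at the 0.05 level.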
Submission Details:
Name your worksheet SU_PHE5020_W3_A2a_LastName_FirstInitial.xls.
Name your document SU_PHE5020_W3_A2b_LastName_FirstInitial.doc.
Submit your document to the Submissions Area by the due date assigned.
Part 2: Chi-Square
Bain, Willett, Hennekens, Rosner, Belanger, and Speizer (1981) conducted a study of the association between current postmenopausal hormone use and risk of nonfatal myocardial infarction (MI), in which 88 women reporting a diagnosis of MI and 1,873 healthy control subjects were identified from a large population of married female registered nurses aged thirty to fifty-five years. To test the hypothesis that there is no association between use of postmenopausal hormones and risk of MI, chi-square statistics need to be calculated.
The data are presented as follows:
Table 2: Association between Postmenopausal Hormone Use and Risk of Nonfatal MI
Hormone use | Cases | Controls | Total
Currently use | 32 | 825 | 857
Never use | 56 | 1,048 | 1,104
Total | 88 | 1,873 | 1,961
Reference:
Bain, C., Willett, W., Hennekens, C. H., Rosner, B., Belanger, C., & Speizer, F. E. (1981). Use of postmenopausal hormones and risk of myocardial infarction. Circulation, 64(1), 42–46.
Using the Minitab procedure, enter the data, perform appropriate procedures, and provide calculations from the table.
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format.
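The chi-square calculation for Table 2 can be sketched in Python as a check on the Minitab output. This sketch computes the Pearson chi-square without a continuity correction, which is an assumption made here; a corrected statistic would be slightly smaller. For a 2 x 2 table there is 1 degree of freedom, and the 0.05 critical value from a chi-square table is 3.841.

```python
from math import sqrt
from statistics import NormalDist

# Observed counts: rows = currently use, never use; columns = cases, controls
observed = [[32, 825],
            [56, 1048]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n   # count expected under H0
        chi_square += (obs - expected) ** 2 / expected

# With 1 degree of freedom, chi-square is the square of a standard normal
# variable, so the p-value can be read from the normal distribution.
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi_square)))

print(f"chi-square = {chi_square:.3f}, p = {p_value:.3f}")
```

The expected count in each cell is the product of its row and column totals divided by the grand total, which is exactly the count that would occur if hormone use and MI were unrelated.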
Submission Details:
Name your Minitab output file SU_PHE5020_W3_A2c_LastName_FirstInitial.mtw.
Name your document SU_PHE5020_W3_A2d_LastName_FirstInitial.doc.
Submit your document to the Submissions Area by the due date assigned.
PHE5020 Biostatistical Methods
Week 4 Project
Regression and Correlation Methods: Correlation, ANOVA, and Least Squares
This is another way of assessing the possible association between a normally distributed variable y and a categorical variable x. These techniques are special cases of linear regression methods. The purpose of the assignment is to demonstrate methods of regression and correlation analysis in which two different variables in the same sample are related.
The following are three important statistics, or methodologies, for using correlation and regression:
Pearson’s correlation coefficient
ANOVA
Least squares regression analysis
In this assignment, solve problems related to these three methodologies.
Part 1: Pearson’s Correlation Coefficient
For the problem that demonstrates Pearson’s coefficient, you will use measures that represent characteristics of entire populations to describe disease in relation to some factor of interest, such as age; utilization of health services; or consumption of a particular food, medication, or other products. To describe a pattern of mortality from coronary heart disease (CHD) in year X, hypothetical death rates from ten states were correlated with per capita cigarette sales in dollar amount per month. Death rates were highest in states with the most cigarette sales, lowest in those with the least sales, and intermediate in the remainder. This observation contributed to the formulation of the hypothesis that cigarette smoking causes fatal CHD. The correlation coefficient, denoted by r, is the descriptive measure of association in correlational studies.
Table 1: Hypothetical Analysis of Cigarette Sales and Death Rates Caused by CHD
State | Cigarette sales | Death rate
1 | 102 | 5
2 | 149 | 6
3 | 165 | 6
4 | 159 | 5
5 | 112 | 3
6 | 78 | 2
7 | 112 | 5
8 | 174 | 7
9 | 101 | 4
10 | 191 | 6
Using the Minitab statistical procedure:
Calculate Pearson’s correlation coefficient.
Create a two-way scatter plot.
In addition to the above:
Explain the meaning of the resulting coefficient, paying particular attention to factors that affect the interpretation of this statistic, such as the normality of each variable.
Provide a written interpretation of your results in APA format.
Refer to the Assignment Resources: Dot Plots and Correlation and Resources: Performing Regression Analysis to view an example of Pearson’s correlation coefficient. The same resources are also available under the lecture Correlation and Regression Methods.
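Pearson’s r for Table 1 can be sketched in Python as a cross-check on the Minitab result. This is an illustrative sketch only. The t statistic for testing whether the population correlation is zero is an addition made here, and it would be compared with the tabled two-sided 0.05 critical value of 2.306 for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean

sales = [102, 149, 165, 159, 112, 78, 112, 174, 101, 191]
deaths = [5, 6, 6, 5, 3, 2, 5, 7, 4, 6]

mean_x, mean_y = mean(sales), mean(deaths)

# Sums of squares and cross-products about the means
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sales, deaths))
sxx = sum((x - mean_x) ** 2 for x in sales)
syy = sum((y - mean_y) ** 2 for y in deaths)

r = sxy / sqrt(sxx * syy)   # Pearson's correlation coefficient

# t statistic for H0: population correlation = 0, with n - 2 degrees of freedom
n = len(sales)
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)

print(f"r = {r:.3f}, t = {t_stat:.2f} on {n - 2} df")
```

The sign of r gives the direction of the linear relationship and its magnitude the strength, but r alone says nothing about causation.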
Submission Details:
Name your Minitab output file SU_PHE5020_W4_A2a_LastName_FirstInitial.mtw.
Name your document SU_PHE5020_W4_A2b_LastName_FirstInitial.doc.
Submit your document to the Submissions Area by the due date assigned.
Part 2: ANOVA
Let’s take hypothetical data presenting blood pres