# STATISTICAL LABORATORY

**Academic Year 2024/2025** - Lecturer:

**ANTONIO PUNZO**

## Expected Learning Outcomes

1. **Knowledge and understanding.** The course introduces the R language for statistical data analysis, with a special focus on descriptive statistics, probability distributions, statistical inference, and statistical modeling.

2. **Applying knowledge and understanding.** After completing the course, students will be able to use the R language to: i) carry out basic statistical analyses of data; ii) simulate data from given probability distributions; and iii) apply the main methods of statistical inference.

3. **Making judgements.** After completing the course, students will be able to extract insights from data by means of statistical analyses in R.

4. **Communication skills.** After completing the course, students will be able to communicate effectively the results of statistical analyses carried out with the R statistical software.

5. **Learning skills.** After completing the course, students will have acquired the skills to use the R statistical software for basic data analyses and statistical modeling.

## Course Structure

The course will include lectures delivered through slides and R code demonstrations. We will use the freely available R statistical software extensively. Practical activities and data analysis sessions in R will also be organized.

## Required Prerequisites

## Attendance of Lessons

## Detailed Course Content

**Getting started with R and RStudio**

**Descriptive statistics.** Simple statistical distributions. Data tables. Frequency distributions. Main summary statistics: arithmetic mean, geometric mean, harmonic mean. Median and percentiles. Variance, standard deviation, relative variation. Graphical representations. Multiple statistical distributions. Contingency tables. Joint distributions, marginal and conditional distributions. Covariance and correlation.
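As a rough illustration of the descriptive tools listed above, the following sketch uses only base R functions on a small invented data set. All variable names and values are hypothetical, and the coefficient of variation is shown as one common reading of "relative variation".

```r
# Hypothetical toy data, invented purely for illustration
x <- c(2, 4, 4, 5, 7, 9, 10, 12)
y <- c(1, 3, 2, 5, 6, 8, 9, 11)

# Means
arith_mean <- mean(x)
geom_mean  <- exp(mean(log(x)))   # requires strictly positive data
harm_mean  <- 1 / mean(1 / x)     # requires non-zero data

# Median and percentiles
med  <- median(x)
pcts <- quantile(x, probs = c(0.10, 0.25, 0.75, 0.90))

# Variability
s2 <- var(x)            # sample variance (n - 1 denominator)
s  <- sd(x)             # sample standard deviation
cv <- s / arith_mean    # coefficient of variation

# Frequency distribution (absolute and cumulative relative frequencies)
freq     <- table(x)
cum_freq <- cumsum(freq) / length(x)

# Covariance and correlation between two quantitative variables
cov_xy <- cov(x, y)
cor_xy <- cor(x, y)

# Contingency table for two hypothetical qualitative variables
sex    <- c("F", "M", "F", "M", "F", "M", "F", "M")
smoker <- c("yes", "no", "no", "yes", "no", "no", "yes", "no")
tab    <- table(sex, smoker)

joint    <- prop.table(tab)              # joint relative frequencies
marg_sex <- margin.table(tab, 1)         # marginal distribution of sex
cond     <- prop.table(tab, margin = 1)  # distribution of smoker given sex
```

Each row of `cond` sums to one, since it is a conditional distribution of `smoker` within a level of `sex`.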

**Probability.** Random number generation and data modeling according to different probability distributions: uniform, binomial, Poisson, and Gaussian.
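A minimal sketch of random number generation for the four distributions named above, using base R. The seed, sample size, and parameter values are arbitrary choices made for illustration only.

```r
set.seed(123)  # arbitrary seed, used only to make the draws reproducible
n <- 1000      # arbitrary sample size

# r* functions draw random values
u <- runif(n, min = 0, max = 1)        # continuous uniform on [0, 1]
b <- rbinom(n, size = 10, prob = 0.3)  # binomial with 10 trials
p <- rpois(n, lambda = 4)              # Poisson with mean 4
g <- rnorm(n, mean = 5, sd = 2)        # Gaussian with mean 5 and sd 2

# d*, p* and q* functions give densities/probabilities, cumulative
# probabilities and quantiles of the same distributions
prob_b3  <- dbinom(3, size = 10, prob = 0.3)  # P(X = 3) for the binomial
cdf_p2   <- ppois(2, lambda = 4)              # P(X <= 2) for the Poisson
q_norm   <- qnorm(0.975)                      # 97.5% quantile of N(0, 1)

# Empirical moments can be compared with the theoretical ones
c(mean(g), sd(g))
```

The same naming scheme (`r`, `d`, `p`, `q` followed by the distribution name) applies to the other distributions available in base R.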

**Statistical inference.** Sampling distributions: Student-t, chi-square. Interval estimation. Confidence level. Confidence intervals for means, variances, and proportions. Hypothesis testing. Null and alternative hypotheses. P-values. Statistical tests for means, variances, proportions, comparison of means, and comparison of proportions.
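The inferential procedures listed above can be sketched with base R functions as follows. The data are simulated, and the seed, sample sizes, counts, and hypothesised values are all illustrative assumptions.

```r
set.seed(1)  # arbitrary seed
x <- rnorm(30, mean = 10, sd = 2)  # hypothetical sample 1
y <- rnorm(35, mean = 11, sd = 2)  # hypothetical sample 2

# Mean: 95% confidence interval and t test of H0: mu = 10
tt <- t.test(x, mu = 10, conf.level = 0.95)
tt$conf.int
tt$p.value

# Variance: 95% confidence interval based on the chi-square distribution
n      <- length(x)
alpha  <- 0.05
ci_var <- (n - 1) * var(x) / qchisq(c(1 - alpha / 2, alpha / 2), df = n - 1)

# Comparison of two means (Welch test, the default of t.test)
t2 <- t.test(x, y)

# Comparison of two variances (F test)
vt <- var.test(x, y)

# Proportion: test of H0: p = 0.5 with 18 successes out of 50 trials
pt <- prop.test(x = 18, n = 50, p = 0.5)

# Comparison of two proportions
pt2 <- prop.test(x = c(18, 30), n = c(50, 60))
```

Each test object is a list of class `htest`, so components such as `$statistic`, `$p.value`, and `$conf.int` can be extracted in the same way.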

**Statistical models.** The simple linear regression model. Goodness of fit. Residual analysis. Inference on the parameters of a linear regression model.
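A minimal sketch of these steps on simulated data, using `lm()` from base R. The true coefficients, noise level, and seed are assumptions chosen only so that the example is reproducible.

```r
set.seed(42)  # arbitrary seed
n <- 50
x <- runif(n, min = 0, max = 10)
y <- 1 + 2 * x + rnorm(n, sd = 1.5)  # hypothetical true model

# Fit the simple linear regression model y = b0 + b1 * x + error
fit <- lm(y ~ x)

# Inference on the parameters: estimates, standard errors, t tests
summary(fit)
ci <- confint(fit, level = 0.95)  # confidence intervals for b0 and b1

# Goodness of fit
r2 <- summary(fit)$r.squared

# Residual analysis
res <- residuals(fit)
if (interactive()) {
  plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)  # residuals should scatter around zero
  qqnorm(res)             # check of the normality assumption
  qqline(res)
}
```

The plots are wrapped in `if (interactive())` only so that the script also runs in batch mode; in an RStudio session they are drawn as usual.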

## Textbook Information


## Course Planning

| | Subjects | Text References |
|---|---|---|
| 1 | Syllabus: illustration and explanation. Getting started with R and RStudio. Why use R? How to install R. | Slides |
| 2 | RStudio. RStudio orientation. Console. R script. Source. Run button. Environment/History/Connections. Files/Plots/Packages/Help/Viewer. | Slides |
| 3 | R packages (CRAN packages and GitHub packages). Using packages. | Slides |
| 4 | Projects in RStudio. Directory structure. File names. R style guide. Citing R. | Slides |
| 5 | Some R basics. Objects in R. Errors and warnings. Naming objects. | Slides |
| 6 | The use of the directory. Getting help. Set the number of digits to display. | Slides |
| 7 | Operators in R. Using functions in R. Assignment of objects. | Slides |
| 8 | Vectors. Different ways to create vectors. Extracting elements from a vector. Replacing elements. Searching for elements within a vector. | Slides |
| 9 | Workspace content and manipulation. Saving in R. Data types. Missing data. | Slides |
| 10 | Matrices and algebraic operations. Reserved words. Arrays. | Slides |
| 11 | Lists. Data frames. Attach and detach. | Slides |
| 12 | Frequency distributions. Contingency tables. Box-plot. | Slides |
| 13 | Graphical representations. Empirical distribution function. Basic statistics. Concentration index and Lorenz curve. | Slides |
| 14 | Sampling and ad hoc generators of discrete random variables. Q-Q plot. | Slides |
| 15 | Univariate constrained optimization with optimize(). Multivariate unconstrained optimization with optim(). Maximum likelihood estimation method. | Slides |
| 16 | Chi-square test of goodness of fit. Kolmogorov-Smirnov test (goodness-of-fit and distributional comparison between 2 samples). Chi-square test of independence. | Slides |
| 17 | Univariate and multivariate linear regression model. Nonparametric regression. Changes in scale. | Slides |
| 18 | Generalized linear models. Logistic regression. Poisson regression. Regression models with qualitative covariates. 1-way ANOVA. | Slides |
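As an indication of the kind of code covered in the final session of the plan, the sketch below fits a logistic regression, a Poisson regression with a qualitative covariate, and a one-way ANOVA on simulated data. It is not taken from the course slides; the data-generating values, variable names, and seed are all assumptions.

```r
set.seed(2024)  # arbitrary seed
n     <- 200
x     <- rnorm(n)                                        # quantitative covariate
group <- factor(sample(c("A", "B", "C"), n, replace = TRUE))  # qualitative covariate

# Logistic regression: binary response, logit link
eta       <- -0.5 + 1.2 * x                  # hypothetical linear predictor
yb        <- rbinom(n, size = 1, prob = plogis(eta))
fit_logit <- glm(yb ~ x, family = binomial(link = "logit"))
summary(fit_logit)

# Poisson regression: count response, log link, qualitative covariate
mu       <- exp(0.3 + 0.5 * x + 0.4 * (group == "B"))
yc       <- rpois(n, lambda = mu)
fit_pois <- glm(yc ~ x + group, family = poisson(link = "log"))
summary(fit_pois)

# One-way ANOVA: Gaussian response explained by a single factor
yn      <- 10 + 2 * (group == "C") + rnorm(n)
fit_aov <- aov(yn ~ group)
summary(fit_aov)
```

R codes the factor `group` through dummy variables automatically, with the first level ("A") as the reference category.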

## Learning Assessment

### Learning Assessment Procedures

The exam assesses whether the learning objectives have been achieved. It consists of a practical test in which the student writes suitable R code to solve a statistical problem and interprets the output produced by well-known R functions.

### Examples of frequently asked questions and/or exercises

· Writing R code to find the maximum likelihood estimates of the parameters of the log-normal distribution

· Writing R code to find the maximum likelihood estimates of the parameters of a linear model with covariates on both the mean and the variance of the normal distribution for the error

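For the first exercise, one possible approach (a sketch, not the official solution) is to minimise the negative log-likelihood numerically with `optim()`. The simulated data, starting values, and the log-scale parameterisation of the standard deviation are choices made here for illustration.

```r
set.seed(10)  # arbitrary seed
x <- rlnorm(500, meanlog = 1, sdlog = 0.5)  # hypothetical sample

# Negative log-likelihood of the log-normal distribution.
# par[1] is meanlog; par[2] is log(sdlog), so that sdlog stays positive
# during an unconstrained optimization.
negloglik <- function(par, data) {
  mu    <- par[1]
  sigma <- exp(par[2])
  -sum(dlnorm(data, meanlog = mu, sdlog = sigma, log = TRUE))
}

# Unconstrained numerical minimisation from arbitrary starting values
opt <- optim(par = c(0, 0), fn = negloglik, data = x, method = "BFGS")

# Maximum likelihood estimates on the original parameter scale
mle <- c(meanlog = opt$par[1], sdlog = exp(opt$par[2]))

# Closed-form estimates, available for the log-normal: mean and
# (n-denominator) standard deviation of log(x)
closed <- c(meanlog = mean(log(x)),
            sdlog   = sqrt(mean((log(x) - mean(log(x)))^2)))

rbind(mle, closed)
```

Comparing the numerical result with the closed-form estimates is a simple way to check that the optimizer has converged to the right point.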
