STATISTICAL LABORATORY

Academic Year 2023/2024 - Teacher: ANTONIO PUNZO

Expected Learning Outcomes

1. Knowledge and understanding. The course introduces the R language for statistical data analysis, with a special focus on descriptive statistics, probability distributions, and statistical inference.

2. Applying knowledge and understanding. On completion, the student will be able to use the R language for (i) performing basic statistical analyses of data; (ii) simulating data from given probability distributions; (iii) applying the main methods of statistical inference.

3. Making judgements. On completion, the student will be able to extract knowledge from data through statistical analyses in R.

4. Communication skills. On completion, the student will be able to present the results of statistical analyses carried out with the statistical software R.

5. Learning skills. On completion, the student will be able to use the statistical software R for basic data analysis and modeling.

Course Structure

Lectures, practical activities, and data analysis in R.

Required Prerequisites

Basic notions in statistics, linear algebra, and computing.

Attendance of Lessons

Highly recommended.

Detailed Course Content

Getting started with R and RStudio

Descriptive statistics. Simple statistical distributions. Data tables. Frequency distributions. Main summary statistics: arithmetic mean, geometric mean, harmonic mean. Median and percentiles. Variance, standard deviation, relative variation. Graphical representations. Multiple statistical distributions. Contingency tables. Joint, marginal, and conditional distributions. Covariance and correlation.

Probability. Random number generation and data modeling according to different probability distributions: uniform, binomial, Poisson, Gaussian.

Statistical inference. Sampling distributions: Student's t, chi-square. Confidence estimation. Confidence level. Confidence bounds for means, variances, and proportions. Hypothesis testing. Null and alternative hypotheses. P-values. Statistical tests for means, variances, proportions, comparison of means, and comparison of proportions.

Statistical models. The simple regression model. Goodness of fit. Residual analysis. Inference on the parameters of a linear regression model.

Textbook Information

· Dalgaard, P. (2008). Introductory Statistics with R. New York: Springer.

· Venables, W. N., Smith, D. M. (2009). An Introduction to R: A Programming Environment for Data Analysis and Graphics. United Kingdom: Network Theory.

· Verzani, J. (2018). Using R for Introductory Statistics. United States: CRC Press.

Learning Assessment

Learning Assessment Procedures

The exam evaluates the achievement of the learning objectives. It consists of an oral exam that includes questions on the course program and the discussion of a report on a real data analysis, carried out using the methodologies covered in the course and the R statistical software.

Examples of frequently asked questions and/or exercises

· Writing R code to find the maximum likelihood estimates of the parameters of the log-normal distribution

· Writing R code to find the maximum likelihood estimates of the parameters of a linear model with covariates on both the mean and the variance of the normal distribution of the error
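
As an illustration of the first exercise, the following is a minimal sketch and not an official solution. It assumes a simulated sample (the seed, sample size, and true parameter values are arbitrary choices made here) and minimizes the negative log-likelihood numerically with base R's optim(), then compares the result with the closed-form estimates based on log(x). The names negloglik, mle_numeric, and mle_closed are hypothetical.

```r
# Minimal sketch: maximum likelihood estimation for the log-normal distribution.
# The data are simulated; seed, sample size and true values are illustrative assumptions.
set.seed(123)
x <- rlnorm(500, meanlog = 1, sdlog = 0.5)

# Negative log-likelihood. The second parameter is log(sdlog),
# so that sdlog = exp(par[2]) stays positive during the optimization.
negloglik <- function(par, x) {
  -sum(dlnorm(x, meanlog = par[1], sdlog = exp(par[2]), log = TRUE))
}

# Numerical minimization starting from meanlog = 0, sdlog = 1
fit <- optim(par = c(0, 0), fn = negloglik, x = x, method = "BFGS")
mle_numeric <- c(meanlog = fit$par[1], sdlog = exp(fit$par[2]))

# Closed-form MLEs: mean and (uncorrected) standard deviation of log(x)
lx <- log(x)
mle_closed <- c(meanlog = mean(lx), sdlog = sqrt(mean((lx - mean(lx))^2)))

print(rbind(numeric = mle_numeric, closed_form = mle_closed))
```

The two rows should agree closely. The same pattern (write the negative log-likelihood, pass it to optim()) could be adapted to the second exercise by letting the mean and the log-variance depend on covariates.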
