Summary Sheet
Academic Year: 2019/2020
School: School of Industrial and Information Engineering
Course: 052911 - APPLIED STATISTICS
Lecturer: Secchi Piercesare
Credits (CFU): 5.00
Course type: Single-discipline
Innovative teaching: The course includes 1.0 CFU delivered through Innovative Teaching, as follows:
  • Blended Learning & Flipped Classroom

Programme | Pre-approved study plan code | From (inclusive) | To (exclusive) | Course
Des (Mag.)(ord. 270) - BV (1092) DESIGN DEGLI INTERNI - INTERIOR DESIGN | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Des (Mag.)(ord. 270) - BV (1097) DESIGN FOR THE FASHION SYSTEM - DESIGN PER IL SISTEMA MODA | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Des (Mag.)(ord. 270) - BV (1159) PRODUCT SERVICE SYSTEM DESIGN - DESIGN PER IL SISTEMA PRODOTTO SERVIZIO | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Des (Mag.)(ord. 270) - BV (1160) DESIGN DEL PRODOTTO PER L'INNOVAZIONE | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Des (Mag.)(ord. 270) - BV (1162) DESIGN DELLA COMUNICAZIONE | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Des (Mag.)(ord. 270) - BV (1163) DESIGN PER IL SISTEMA MODA | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Des (Mag.)(ord. 270) - BV (1164) PRODUCT SERVICE SYSTEM DESIGN | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Des (Mag.)(ord. 270) - BV (1260) INTERIOR AND SPATIAL DESIGN | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Des (Mag.)(ord. 270) - BV (1261) INTEGRATED PRODUCT DESIGN | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Des (Mag.)(ord. 270) - BV (1262) DIGITAL AND INTERACTION DESIGN | * | A | ZZZZ | 053791 - APPLIED STATISTICS
Ing Ind - Inf (Mag.)(ord. 270) - BV (479) MANAGEMENT ENGINEERING - INGEGNERIA GESTIONALE | * | A | ZZZZ | 052911 - APPLIED STATISTICS; 099433 - APPLIED STATISTICS FOR ENG4SD

Course objectives

The course covers new approaches in the areas of statistical modeling and data analysis, using ideas that bridge the gap between statistics and computer science and developing tools for the statistical mining of big data. The focus is on predictive learning, with particular emphasis on recent advances in regression and classification. The course takes advantage of a blended learning approach, making extensive use of the Statistical Learning MOOC by Hastie and Tibshirani referenced in the Bibliography.

The course fits into the overall program curriculum pursuing some of the defined general learning goals. In particular, the course contributes to the development of the following capabilities:

  • Design solutions applying a scientific and engineering approach (Analysis, Learning, Reasoning, and Modeling capability deriving from a solid and rigorous multidisciplinary background) to face problems and opportunities in a business and industrial environment

  • Interact in a professional, responsible, effective and constructive way in a working environment, also motivating group members


Expected learning outcomes

- At the end of the course, students are expected to be able to design and run in R a data-driven analysis addressing a classification problem, either supervised or unsupervised, or the fitting of a regression model, using classical methods (OLS, logistic regression) or more modern approaches to model building and selection (ridge regression and lasso, CART, random forests). Leveraging their engineering forma mentis and the data analysis skills acquired in the course, students are expected to be able to evaluate the practical and statistical significance of the final result of their analysis, to quantify its uncertainty (e.g. by applying cross-validation procedures), and to diagnose its potential shortcomings, whether the analysis is used to provide an empirical explanation of the industrial or scientific problem under study or its main goal is to formulate predictions.

- To prepare for responsible and efficient interactions in a working environment, every student is required to take part in a real data analysis project developed by an independently formed team of 2-4 members. Work in progress on the projects will be discussed collectively during general meetings scheduled throughout the course; final analyses and results will be presented in a workshop at the end of the course.

 


Topics covered

Program:

1) Introduction to statistical learning.

2) Dimension reduction. Principal Component Analysis.

3) Linear Models. Simple and multiple linear regression. Estimating the coefficients, assessing the accuracy of the coefficient estimates, assessing the accuracy of the model. Qualitative predictors. Model selection and regularization: subset selection, shrinkage methods (ridge regression and lasso), dimension reduction methods.

4) Supervised classification. Logistic regression. Linear and Quadratic discriminant analysis.

5) Unsupervised classification. Hierarchical clustering, K-means clustering.

6) Resampling methods. Cross-validation. The bootstrap.

7) Tree-based methods. Classification and regression trees. Bagging, random forests, boosting. 

 

Following a blended learning approach, the course will make extensive use of the Statistical Learning MOOC by Hastie and Tibshirani referenced in the Bibliography. All methods will be illustrated using applications from marketing, finance, biology and other areas; R, the free software environment for statistical computing and graphics (downloadable at www.r-project.org), will be used and illustrated throughout the course and its lab sessions.

 

Throughout the course, students are required to work in teams on a real data analysis project whose progress will be presented periodically to the class.

Prerequisites

A basic course in Statistics for Engineers.


Assessment methods

The exam consists of two parts:

(a) A written exam. The written exam is made up of a few (usually four or five) data analysis problems to be solved with R. The use of a personal computer is allowed, as is that of books, personal notes, etc.

(b) Team Project evaluation. Projects will be collectively evaluated by the teachers of the course and by the students participating in a final workshop at the end of the course.

To pass the exam the student must pass each part of the exam with a score greater than or equal to 18/30; the final score is then obtained as the weighted average of the two scores, with weights respectively equal to 0.6 for the written exam and 0.4 for the project evaluation.
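
As an illustration only, the grading rule above can be sketched in a few lines of Python (the course itself uses R; the function and constant names here are hypothetical, and returning None for a failed exam is an assumption of this sketch, not an official convention):

```python
from typing import Optional

PASS_MARK = 18.0       # minimum score (out of 30) required on each part
WRITTEN_WEIGHT = 0.6   # weight of the written exam
PROJECT_WEIGHT = 0.4   # weight of the team project evaluation


def final_score(written: float, project: float) -> Optional[float]:
    """Return the weighted final score, or None if either part is below 18/30."""
    if written < PASS_MARK or project < PASS_MARK:
        return None  # the exam is not passed, so no final score is computed
    return WRITTEN_WEIGHT * written + PROJECT_WEIGHT * project
```

For example, a written score of 24 and a project score of 28 give 0.6 × 24 + 0.4 × 28 = 25.6, while a written score of 17 fails the exam regardless of the project score. How the final score is rounded is not stated here, so the sketch leaves it unrounded.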


Bibliography

Software used
No software required

Teaching formats
Type of teaching format | Hours of in-class activity (hh:mm) | Hours of independent study (hh:mm)
Lecture | 25:00 | 37:30
Exercise session | 0:00 | 0:00
Computer laboratory | 25:00 | 37:30
Experimental laboratory | 0:00 | 0:00
Project laboratory | 0:00 | 0:00
Total | 50:00 | 75:00

Information in English to support internationalization
Course taught in English
Teaching material/slides available in English
Textbooks/bibliography available in English
Exam may be taken in English
Teaching support available in English
03/12/2023