Summary Sheet
Academic Year: 2017/2018
School: School of Industrial and Information Engineering
Course: 097383 - BUSINESS INTELLIGENCE
Instructor: Vercellis Carlo
Credits (CFU): 5.00
Course type: Single-discipline

Degree Programme | Pre-approved Study Plan Code | From (inclusive) | To (exclusive) | Course
Ing Ind - Inf (Mag.)(ord. 270) - BV (479) MANAGEMENT ENGINEERING - INGEGNERIA GESTIONALE | * | A | ZZZZ | 097383 - BUSINESS INTELLIGENCE

Detailed program and expected learning outcomes

Objectives

Is it possible to extract useful knowledge for decision making from the huge amount of data available in the data warehouses of companies and public administrations?

Business intelligence and big data analytics form a broad category of methods and technologies for gathering, providing access to, and analyzing data in order to help enterprise users make better business decisions. The term implies comprehensive knowledge of all the factors that affect a business, such as customers, competitors, business partners, the economic environment, and internal operations, so that optimal decisions can be made.

This course gives students detailed coverage of, and practical guidance on, the mathematical models and analysis methodologies of business intelligence and big data analytics. It covers current topics such as the big data revolution, data warehousing, data mining and its applications, and machine learning. It provides a systematic and rigorous treatment of each concept, combined with extensive use of examples and numerous real-life case studies.

Syllabus

Business Intelligence and Big Data Analytics

Effective and timely decisions. Data, information and knowledge. Development of business intelligence and big data architectures. Decision support systems. Decision-making process. Data warehousing. Data quality. OLAP and multidimensional analysis. Data mining, classical statistics and OLAP. Applications of data mining. Representation of input data. Data mining process.

Data preparation and exploratory analysis

Data validation. Data transformation. Feature extraction. Data reduction. Sampling. Feature selection. Principal component analysis. Data discretization. Univariate analysis: graphical analysis, measures of central tendency, dispersion, relative location, identification of outliers, measures of heterogeneity, analysis of the empirical density. Bivariate analysis: graphical analysis, measures of correlation, contingency tables. Multivariate analysis: graphical analysis, measures of correlation.

Regression

Structure of regression models. Simple linear regression. Multiple linear regression. Assumptions on the residuals. Treatment of categorical attributes. Ridge regression. Generalized linear regression. Validation of regression models: normality and independence of the residuals, significance of the coefficients, analysis of variance, coefficient of determination, coefficient of linear correlation, multicollinearity, confidence and prediction limits. Selection of predictive variables.
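
As an informal illustration of two topics listed above (simple linear regression and the coefficient of determination), the following minimal sketch fits a line by ordinary least squares. It is not taken from the course material; the function names and the sample data are hypothetical and chosen only for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def coefficient_of_determination(x, y, intercept, slope):
    """Return R^2 = 1 - SS_res / SS_tot for the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical sample data, roughly following y = 2x.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_linear_regression(x, y)
r2 = coefficient_of_determination(x, y, intercept, slope)
print(f"intercept={intercept:.3f} slope={slope:.3f} R^2={r2:.4f}")
```

A value of R^2 close to 1 indicates that the fitted line explains most of the variance in the response; validating a model in practice would also involve checking the residuals, as the topics above indicate.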

Classification

Taxonomy of classification models. Evaluation of classification models: holdout method, repeated random sampling, cross-validation, confusion matrices, ROC curve charts, cumulative gain and lift charts. Classification trees: splitting rules, stopping criteria and pruning rules. Bayesian methods: naive Bayesian classifiers, Bayesian networks. Logistic regression. Neural networks: Rosenblatt perceptron, multi-level feed-forward networks. Support vector machines: structural risk minimization, maximal margin hyperplane for linear separation, nonlinear separation.

Association rules

Motivation and evaluation of association rules. Single-dimension association rules. Apriori algorithm. Generation of frequent itemsets, generation of strong rules. General association rules.
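
To give a concrete feel for frequent itemsets, support, and confidence, here is a minimal level-wise sketch in the spirit of the Apriori algorithm. It is an illustration under simplifying assumptions, not the course's reference implementation; the function names and the toy transactions are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Return a dict mapping each frequent itemset to its support."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support([i], transactions) >= min_support]
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s, transactions)
        k += 1
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Pruning: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in frequent
                             for sub in combinations(c, k - 1))}
        current = [c for c in candidates
                   if support(c, transactions) >= min_support]
    return frequent


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    joint = set(antecedent) | set(consequent)
    return support(joint, transactions) / support(antecedent, transactions)


# Hypothetical market-basket data.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

frequent = frequent_itemsets(transactions, min_support=0.5)
butter_to_bread = confidence({"butter"}, {"bread"}, transactions)
bread_to_milk = confidence({"bread"}, {"milk"}, transactions)
```

With a minimum support of 0.5, the pair {milk, butter} is not frequent, so the three-item candidate is pruned without its support ever being counted. A rule is usually called strong when both its support and its confidence exceed chosen thresholds.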

Clustering

Taxonomy of clustering methods. Affinity measures. Partition methods: K-means, K-medoids. Hierarchical methods: agglomerative methods, divisive methods. Evaluation of clustering models.
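
The K-means partition method listed above can be sketched as an alternation of an assignment step and a centroid-update step. The version below is a minimal illustration assuming squared Euclidean distance and caller-supplied initial centroids; the function names and the sample points are hypothetical and not part of the course material.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, centroids, max_iterations=100):
    """Return (centroids, clusters) after iterating until the centroids stop changing."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-dimensional data with two well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The result depends on the initial centroids, which is one reason clustering models need to be evaluated rather than accepted after a single run.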

Applications and business case studies

Applications in relational marketing: lifetime value analysis, acquisition, retention, cross-selling and up-selling, market basket analysis. Web mining. Social market analysis. Text mining. Fraud and anomaly detection.


Notes on the assessment method

The exam is in written form. Note that only students officially registered for a given session will be allowed to take the examination in that session.


Bibliography
Mandatory bibliographic resource: Carlo Vercellis, Business Intelligence: Data Mining and Optimization for Decision Making, Publisher: Wiley, Year: 2009, ISBN: 9780470511381 http://onlinelibrary.wiley.com/book/10.1002/9780470753866

Software used
No software required

Mix of teaching forms
Type of teaching form | Teaching hours
lecture | 32.0
exercise session | 16.0
computer laboratory | 0.0
experimental laboratory | 0.0
project | 0.0
project laboratory | 0.0

Information in English to support internationalization
Course taught in English
Teaching material/slides available in English
Textbooks/bibliography available in English
Exam may be taken in English
Teaching support available in English
28/11/2023