logo-polimi
Loading...
Risorse bibliografiche
Risorsa bibliografica obbligatoria
Risorsa bibliografica facoltativa
Scheda Riassuntiva
Anno Accademico 2019/2020
Scuola Scuola di Ingegneria Industriale e dell'Informazione
Insegnamento 089183 - DATA BASES 2
Docente Braga Daniele Maria
Cfu 5.00 Tipo insegnamento Monodisciplinare

Corso di Studi Codice Piano di Studio preventivamente approvato Da (compreso) A (escluso) Insegnamento
Ing - Civ (Mag.)(ord. 270) - MI (495) GEOINFORMATICS ENGINEERING - INGEGNERIA GEOINFORMATICA*AZZZZ051891 - DATA BASES 2
089183 - DATA BASES 2
051267 - DATA BASES 2 AND SOFTWARE ENGINEERING 2
Ing Ind - Inf (Mag.)(ord. 270) - MI (263) MUSIC AND ACOUSTIC ENGINEERING*MZZZZ089183 - DATA BASES 2
Ing Ind - Inf (Mag.)(ord. 270) - MI (471) BIOMEDICAL ENGINEERING - INGEGNERIA BIOMEDICA*MZZZZ089183 - DATA BASES 2
Ing Ind - Inf (Mag.)(ord. 270) - MI (474) TELECOMMUNICATION ENGINEERING - INGEGNERIA DELLE TELECOMUNICAZIONI*MZZZZ089183 - DATA BASES 2
Ing Ind - Inf (Mag.)(ord. 270) - MI (481) COMPUTER SCIENCE AND ENGINEERING - INGEGNERIA INFORMATICA*MZZZZ089183 - DATA BASES 2

Obiettivi dell'insegnamento

The goal of the course is

  • to enable students to deepen the knowledge of the architecture and functionalities of relational database management systems (internal structure, modularity, capabilities, ...)
  • to achieve a well-established awareness of the implications of the transactional ACID properties on concurrency and reliability
  • to show to the students how relational data are stored within a DBMS and how the availability of suitable indexing structures and data statistics profoundly affect query execution
  • to learn how a rule-based computational paradigm can help to build incrementally additional features for a DBMS such as business rules, checking of general constraints, view maintenance of derived data, etc. via the trigger concept
  • to learn new data models (semistructured, ordered) and new languages to handle data expressed in such models (e.g., XPath, XQuery).

All in all, the main purpose of this course is to build on consolidated data management technologies to provide the learners with the ability to understand the opportunities and limitations of the plethora of new emerging technologies for data management. 


Risultati di apprendimento attesi

Dublin Descriptors

Expected learning outcomes

Knowledge and understanding

Students will learn:

  • The internal structure of the "black-box" that implements the SQL language
  • New data models (semistructured, ordered) and language paradigms (rule-based programming and functional data languages)
  • How abstract ACID transactional properties translate to system requirements

Applying knowledge and understanding

Given practical scenarios, students will be able to:

  • Undertake strategical suitable actions to improve the performances of a database management system w.r.t. a specific goal (e.g., speed-up of a specific family of queries)
  • Tune relational database system requirements to the actual needs of a data-centric application
  • Formulate data transformations tasks in different languages and over different data models

Making judgements

Students will be able to:

  • Evaluate whether a specific application or scenario requires the power and complexity of a full-fledged relational DBMS or can instead benefit of simpler, light-weighted alternative solutions 
  • Analyze and understand the probable causes of inefficiency in a complex multi-layered application built on a relational system (identifying bottlenecks and critical configurations) 
  • Compare and balance pros and cons of different solutions for the same problem

Lifelong learning skills

The course:

  • Provides a solid, time-tested theoretical foundations that capture the essentials of many technological solutions given over time to classical data management problems
  • Develops the ability to understand the opportunities and limitations of the plethora of new emerging technologies for data management
  • Contributes to broaden the mental agility and flexibility that is required by a data engineer to envision and design alternative solutions to a given problem based on the constraints of the environment in which the system is expected to be deployed

Argomenti trattati
  • Transactional Systems. Why transactional systems are relevant. Examples of transactional systems: financial applications, banks, online order-entry (e-commerce), online booking, "wall-street applications". Notion of "transaction". ACID properties: atomicity, consistency, isolation, durability.
  • Concurrency control theory. Histories (or schedules), serializability, various notions of equivalence, complexity of testing. View-serializability, conflict-serializability. Two phase locking. Hierarchical locking. Deadlock analysis and resolution. Timestamp-based concurrency control. Multi-version concurrency control. Implementation of locking in commercial systems.
  • Reliability control theory. Notion of: stable storage, logging, checkpointing, write-ahead log rule, commit rule. Recovery protocols: warm restart, cold restart. Implementation of reliability control in commercial systems. Commit protocols, theory of two-phase-commit, presumed-abort and read-only optimisations, non-blocking protocols (3 and 4 phase commit protocols). Implementation of commit protocols with heterogeneous DB servers in the X-open standard.
  • Database architectures. Distributed databases. Notion of fragmentation, allocation, transparent access. Query optimisation. Distributed transactions. Parallelism in database servers. Shared-memory vs shared-nothing approaches. Scale-up, speed-up, benchmarking of performance. Replicated databases. Synchronous vs asynchronous methods. Symmetric vs primary-secondary approaches. 
  • Internal structure of a database server. Buffer Management. Page management. Data organisations according to the sequential, direct, and indexed data structures. B and B+ trees. Hashing functions. Access methods: scans, ordering, joins. Join methods. Query optimisation fundamentals. Cost models and optimal query plan selection (branch&bound method for execution plan selection). Database administration in commercial systems. Hints to physical database design (index selection, primary storage method selection).
  • XML Databases. XML as a data modeling paradigm. Native vs relational storage. Query languages for XML. XPath, XQuery.
  • Active databases: The ECA Paradigm (event/condition/action) and data management. Execution methods for active databases. Trigger languages and systems. Formal properties of active rule sets. Termination, confluence, observable determinism. Rule analysis. Design of active rules for integrity maintenance, automatic data derivation, application business rules, etc.
  • Other advanced topics, such as: 
  •    Data analysis: OLTP vs OLAP. Data warehouses. Multidimensional model. Data cubes. Data analysis operators: cube, rollup, pivot.
  •    Hints to evolutions of database technology: noSQL, data streams.

Prerequisiti

In order to fully understand and be ready to apply the notions of the course, students are required to possess: 

  • Programming skills in any general-purpose language (design of data structure, simple classical algorithms, recursion, stack of activation records) 
  • Very good knowledge of the relational data model and of the SQL language (table creation, formulation of complex queries, update commands, integrity constraints)
  • Basic knowledge of computing system architectures (memory management, caching algorithms, file system) and operating systems
  • Reading Entity-Relationship diagrams 

 

 

 

 


Modalità di valutazione

The exam consists in a written verification covering all the topics of the course.

 

Type of assessment

Description

Dublin descriptor

Written test

Exercises focusing on comprehension 
  • Estimation of query execution costs  
  • Theoretical open question

Exercises focusing on design aspects

  • Design of trigger-based systems
  • Formulation of data transformations in XPath/XQuery
  • Design of data distribution/allocation schemes

 

1,2,3

 

1, 2, 3, 5

 

 

 


Bibliografia
Risorsa bibliografica facoltativaP. Atzeni, S. Ceri, P. Fraternali, S. Paraboschi, R. Torlone, Basi di dati, Editore: McGraw-Hill Italia, Anno edizione: 2018, ISBN: 978-8838694455
Risorsa bibliografica facoltativaP. Atzeni, S. Ceri, S. Paraboschi, R. Torlone, Database systems, Editore: McGraw Hill, Anno edizione: 1999, ISBN: 978-0077095000
Note:

English version of an older edition of the Italian textbook. PDF available at http://dbbook.dia.uniroma3.it/

Risorsa bibliografica facoltativaR. Elmasri, S. Navathe, Database systems, Editore: Pearson, Anno edizione: 2010, ISBN: 978-0132144988
Note:

General database textbook, for students who are not able to read the textbook in Italian. Many other textbooks can be used.


Forme didattiche
Tipo Forma Didattica Ore di attività svolte in aula
(hh:mm)
Ore di studio autonome
(hh:mm)
Lezione
30:00
45:00
Esercitazione
20:00
30:00
Laboratorio Informatico
0:00
0:00
Laboratorio Sperimentale
0:00
0:00
Laboratorio Di Progetto
0:00
0:00
Totale 50:00 75:00

Informazioni in lingua inglese a supporto dell'internazionalizzazione
Insegnamento erogato in lingua Inglese
Disponibilità di materiale didattico/slides in lingua inglese
Disponibilità di libri di testo/bibliografia in lingua inglese
Possibilità di sostenere l'esame in lingua inglese
Disponibilità di supporto didattico in lingua inglese
schedaincarico v. 1.6.1 / 1.6.1
Area Servizi ICT
14/12/2019