Summary Sheet
Academic Year: 2018/2019
School: School of Industrial and Information Engineering
Course: 099329 - WEB SCIENCE
Lecturer: Brambilla Marco
Credits (CFU): 5.00
Course type: Single-discipline

Degree programme | Code of the pre-approved study plan | From (inclusive) | To (exclusive) | Course

Course objectives

The Web Science course focuses on the study of large-scale socio-technical systems associated with the World Wide Web.

It considers the relationship between people and technology, the ways in which society and technology complement one another, and their impact on society at large.

It gives students the opportunity to apply in practice the analysis techniques they learn in other courses. These analyses are inherently associated with Big Data management issues.

Expected learning outcomes

The outcomes below are grouped by Dublin Descriptor.

Knowledge and understanding

Students will learn how to:

  • Identify problems that can be addressed with Web data analysis
  • Identify the basic technologies for big data analysis applicable to Web-related problems

Applying knowledge and understanding

Given specific project cases, students will be able to:

  • Define and implement the whole data science pipeline for the problem
  • Apply it on real datasets

Making judgements

Given specific project cases, students will be able to:

  • Decide which technique to apply and evaluate that decision


Communication

Students will learn to:

  • Write a report on a project describing and motivating the decisions taken and the results obtained
  • Present their work in front of their colleagues and teachers

Lifelong learning skills

  • Students will learn how to develop a realistic Web and data science project in all its phases

Topics covered

The course is organised in four parts.

1. Syntax

In the first part, the course introduces the basics of content analysis. It focuses on the syntactic aspects, covering the fundamentals of natural language processing and text mining. It describes the structure and typical characteristics of the different Web sources, spanning search results, social media content, social network structures, Web APIs, and so on. It also provides an overview of the basic Web analysis techniques applied in Web search and Web recommendation.
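
As an illustration of the kind of syntactic analysis mentioned above, the following is a minimal sketch of tokenisation and TF-IDF scoring for a toy search scenario. It is not taken from the course material: the function names, the tokenisation rule, and the three-document corpus are assumptions made for this example, and real search engines use considerably more refined ranking models.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_scores(query, documents):
    """Score each document against the query with a basic TF-IDF sum."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term in tf:
                idf = math.log(n_docs / df[term])
                score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores


# Hypothetical toy corpus, invented for this illustration.
docs = [
    "Semantic Web technologies integrate heterogeneous data",
    "Social media streams change quickly over time",
    "Web search ranks documents by relevance to a query",
]
scores = tf_idf_scores("web search", docs)
# Index of the highest-scoring document for the query.
best = max(range(len(docs)), key=lambda i: scores[i])
```

Here the third document scores highest because it contains both query terms, and the rarer term ("search") contributes more than the common one ("web").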

2. Semantics

In the second part, the course presents semantic technologies. These technologies are important because they address the "variety" dimension of Big Data, i.e., they enable the integration of multiple and diverse sources of information, which is typical of the modern Web platform. Covered topics include:

  • RDF - a flexible data model to represent heterogeneous data
  • OWL - a flexible ontological language to model heterogeneous data sources
  • SPARQL - a query language for RDF.

This part also shows how to put all the pieces together in order to achieve interoperability among heterogeneous information sources.
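
To give a feel for the triple-based data model and for pattern-based querying, here is a minimal sketch in plain Python. It deliberately uses no RDF library and is not a SPARQL engine: the `match` and `query` helpers, the `?variable` convention, and every `ex:` identifier are assumptions invented for this illustration.

```python
# Triples are (subject, predicate, object) tuples; names prefixed "ex:" are
# hypothetical identifiers invented for this illustration.
triples = {
    ("ex:alice", "ex:follows", "ex:bob"),
    ("ex:bob", "ex:follows", "ex:carol"),
    ("ex:alice", "ex:livesIn", "ex:Milan"),
    ("ex:carol", "ex:livesIn", "ex:Milan"),
}


def match(pattern, triple, binding):
    """Extend a variable binding so that pattern equals triple, or return None."""
    result = dict(binding)
    for p, value in zip(pattern, triple):
        if p.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if result.setdefault(p, value) != value:
                return None
        elif p != value:
            return None
    return result


def query(patterns, data):
    """Evaluate a conjunction of triple patterns over a set of triples."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in data
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


# Roughly: SELECT ?who WHERE { ?who ex:follows ?x . ?x ex:livesIn ex:Milan }
results = query(
    [("?who", "ex:follows", "?x"), ("?x", "ex:livesIn", "ex:Milan")],
    triples,
)
followers = sorted(b["?who"] for b in results)
```

Because all facts share the same three-part shape, data from different sources can be merged by taking the union of their triple sets, which is the intuition behind using this model for heterogeneous data.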

3. Time

The third part covers the realm of time-dependent data. The topics covered here address the "velocity" dimension of Big Data. This part shows why many Big Data analysis scenarios need to process data streams, coming for instance from Internet of Things (IoT) and social media sources, and it describes how to apply semantic and syntactic techniques in the context of time-dependent information. For instance, it shows how to extend RDF to model RDF streams, how to extend SPARQL to continuously process RDF streams, and how to reason on those RDF streams.
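
The idea of continuously querying a stream of timestamped triples can be sketched with a time-based sliding window. The example below is plain Python under stated assumptions: it does not follow the syntax or semantics of any specific RDF stream processing language, and the `windowed_counts` helper, the window width, and the `ex:` identifiers are hypothetical.

```python
from collections import deque


def windowed_counts(stream, width, predicate):
    """After each arrival at time t, yield (t, count) where count is the
    number of triples with the given predicate timestamped in (t - width, t].
    """
    window = deque()
    for timestamp, triple in stream:
        window.append((timestamp, triple))
        # Evict elements that have fallen out of the time window.
        while window and window[0][0] <= timestamp - width:
            window.popleft()
        yield timestamp, sum(1 for _, (_, p, _) in window if p == predicate)


# Hypothetical stream of (timestamp in seconds, triple) pairs, in time order.
stream = [
    (1, ("ex:sensor1", "ex:detects", "ex:car")),
    (2, ("ex:user7", "ex:posts", "ex:tweet42")),
    (4, ("ex:sensor2", "ex:detects", "ex:bus")),
    (9, ("ex:sensor1", "ex:detects", "ex:bike")),
]
counts = list(windowed_counts(stream, width=5, predicate="ex:detects"))
```

Unlike a one-off query over stored data, the answer is re-evaluated as new elements arrive and old ones expire, so the same query yields a different result at each point in time.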

4. Applications

In the fourth part, the course focuses on specific application scenarios and presents the typical settings and problems where the presented techniques can be applied. This part discusses settings such as: big data analysis for smart cities; data analytics for brand monitoring (marketing) and event monitoring; data analysis for trend detection and user engagement; and so on.


Exercise and Laboratory Classes

Exercise and laboratory classes describe how to use all those ingredients together in practice, and how to fuse and analyse data coming from multiple sensor networks (e.g. IoT), social network APIs, and information crawled from the Web and from mobile applications (e.g., through social login and log analysis).



Prerequisites

Students are expected to know the basics of Web application design and implementation, and of database management.

Assessment methods

The exam consists of a practical part (project work) and a theoretical part (written exam with a possible oral discussion).

The practical part consists of solving a realistic problem in Web science / data science, based on real or realistic datasets that are publicly available, accessible via Web APIs, or provided by the teachers.

The written exam is a mix of theoretical questions on any of the course subjects and exercises on the technical content and how to apply it in practice.

The oral examination consists of a discussion about the written test and the practical part of the exam. It can also include questions on any subject of the course.

Type of assessment and Dublin descriptors covered

Written test (Dublin descriptors 1, 2, 3)

  • Theoretical questions
  • Exercises focusing on big data, data analysis, and data processing aspects

Assessment of project artefacts (Dublin descriptors 2, 3, 5)

  • Assessment of the design and experimental work developed by students in groups

Oral presentation (Dublin descriptors 2, 3, 4, 5)

  • Assessment of the presentation of the work developed by students in groups


Bibliography

Optional reference: Grigoris Antoniou, Paul Groth, Frank van Harmelen, Rinke Hoekstra, A Semantic Web Primer (Third Edition), Publisher: Cambridge, Mass.: MIT Press, Year: 2012, ISBN: 0262018284
Required reference: Stefano Ceri, Alessandro Bozzon, Marco Brambilla, Emanuele Della Valle, Piero Fraternali, Silvia Quarteroni, Web Information Retrieval, Publisher: Springer, Year: 2013, ISBN: 978-3642393136
Required reference: http://www.opengeospatial.org/
Required reference: http://www.w3.org/RDF/
Required reference: http://www.w3.org/TR/owl-overview/
Required reference: http://www.w3.org/TR/sparql11-overview/
Required reference: http://sioc-project.org/
Required reference: http://streamreasoning.org/

Software used
No software required

Teaching formats
Type of teaching activity | Classroom hours | Self-study hours
Computer laboratory
Experimental laboratory
Project laboratory
Total | 50:00 | 75:00

Information in English to support internationalisation
Course taught in English
Teaching materials/slides available in English
Textbooks/bibliography available in English
Exam can be taken in English
Teaching support available in English