The course includes 1.0 CFU (university credits) delivered through Innovative Teaching, as follows:

Blended Learning & Flipped Classroom

MOOC

Programme | Code of the pre-approved study plan | From (inclusive) | To (exclusive) | Course
Ing Ind - Inf (Mag.)(ord. 270) - BV (478) NUCLEAR ENGINEERING - INGEGNERIA NUCLEARE | * | A | ZZZZ | 052496 - ALGORITHMS AND PARALLEL COMPUTING
Ing Ind - Inf (Mag.)(ord. 270) - MI (487) MATHEMATICAL ENGINEERING - INGEGNERIA MATEMATICA | * | A | ZZZZ | 052496 - ALGORITHMS AND PARALLEL COMPUTING

Course objectives

This course provides students with the skills needed to write efficient algorithms and large-scale software, so that they can solve complex, big data problems on parallel systems. The emphasis is on concepts that apply across a wide variety of problem domains (including scientific modelling, statistics, machine learning, operations research and quantitative finance) and that transfer across a broad set of frameworks. The goal of the course is to enable students to master the object-oriented programming, parallel programming and big data analysis methods that are now necessary to produce software systems of industrial-strength quality. The course covers Object Oriented Programming (OOP) principles, C++, OpenMPI for parallel programming, and Python and Spark for big data processing. Each topic is treated both theoretically and practically. Throughout the course, the flipped classroom teaching method will be used during lab sessions so that students acquire a good level of autonomy in learning new topics.

Expected learning outcomes

Dublin Descriptors

Expected learning outcomes

Knowledge and understanding

Students will learn how to:

Develop computational thinking skills to support the modelling, solution and analysis of advanced mathematical models

Structure a problem solution through OOP methods

Develop and compare parallel algorithms

Analyze code written by others and use software libraries

Implement advanced methods and solve or analyze mathematical models in a parallel programming framework

Applying knowledge and understanding

Given specific project cases or a complex problem, students will be able to:

Identify and evaluate software architectural choices

Apply complexity methods to evaluate multiple algorithms or data structure implementations

Develop and test code fulfilling problem requirements

Analyze and understand the goals, assumptions and requirements associated with a problem in the scientific computing, statistics or quantitative finance domains

Develop and structure software at large scale

Communication

Students will learn to:

Organize code in packages and classes for readability and reuse

Present their work in front of their colleagues during project labs

Topics covered

Description

Historically, parallel computing has been considered the high end of computing and has been used to model difficult problems in many areas of science and engineering. Today, commercial applications provide an equal or greater driving force in the development of faster programs. These applications require the processing of large amounts of data in sophisticated ways. Data-intensive applications such as data mining, recommender systems, financial modelling and multimedia processing have implications for the design of algorithms and pose a new challenge for the modern generation of computing platforms. Parallel processing is the only cost-effective method for the fast solution of these big data problems. The emergence of inexpensive parallel computers such as commodity desktop multiprocessors, graphics processors and clusters of entry-level servers has made parallel methods generally applicable, as have software standards for portable parallel programming.

Please refer to the official course website for further details.

Content

The course is structured in four parts.

The first part of the course covers modern OOP and introduces the fundamentals of the C++11 programming language. C++11 is used as the reference language throughout the rest of the course. Students will gain experience in designing simple but powerful object-oriented applications and in writing code in C++11. Example problems cover both traditional computer science algorithms (sorting, searching, lists) and simple scientific computing algorithms (matrix computations, gradient descent). This initial part provides the principles for developing industrial software at scale.

The second part covers the main aspects of parallel computing: parallel architectures, programming paradigms and parallel algorithms. Parallel architectures range from inexpensive commodity multicore desktops, to general-purpose graphics processors, to clusters of computers, to massively parallel computers containing tens of thousands of processors. Students learn how to analyze and classify these architectures in terms of their components (processor architecture, memory organization and interconnection network). The pros and cons of different parallel programming paradigms (e.g., functional programming, shared memory, message passing) are evaluated by means of simple case studies.
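As a rough illustration of how two of these paradigms differ, the sketch below computes the same sum in a shared-memory style (threads update one variable under a lock) and in a message-passing style (workers share nothing and send partial results through a queue). It is not taken from the course material: the function names `sum_shared_memory` and `sum_message_passing` and the `n_workers` parameter are hypothetical. Python threads are used here only to show the communication pattern, so no speed-up should be expected from this version.

```python
import queue
import threading


def sum_shared_memory(values, n_workers=4):
    """Shared-memory style: workers add partial sums into one shared variable."""
    total = [0]              # shared state, visible to every thread
    lock = threading.Lock()  # protects the shared accumulator

    def worker(chunk):
        partial = sum(chunk)  # local work, no synchronization needed
        with lock:            # critical section: update the shared state
            total[0] += partial

    chunks = [values[i::n_workers] for i in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def sum_message_passing(values, n_workers=4):
    """Message-passing style: workers share nothing and send results as messages."""
    results = queue.Queue()  # the only communication channel

    def worker(chunk):
        results.put(sum(chunk))  # send the partial sum to the coordinator

    chunks = [values[i::n_workers] for i in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The coordinator receives exactly one message per worker.
    return sum(results.get() for _ in range(n_workers))


data = list(range(1, 101))
print(sum_shared_memory(data), sum_message_passing(data))  # prints "5050 5050"
```

In the first version correctness depends on the lock around the shared accumulator; in the second, the only coordination point is the queue, which is the property that lets message-passing programs run across machines that do not share memory.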

The third part covers data-intensive algorithms for information retrieval and data mining problems. It focuses on Spark, an open source framework for in-memory big data computation, which also includes an extensive machine learning library and provides primitives for streaming data analytics. Python will be used as the reference language for Spark.
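The data-parallel pattern that frameworks of this kind build on can be sketched in plain Python, under the assumption that the input is split into partitions, each partition is processed independently (map), and the partial results are merged with an associative operation (reduce). The sketch below does not use the Spark API; the names `map_partition`, `merge_counts` and `word_count` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count the words within one partition of the input."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: merge two partial results (associative and commutative)."""
    return left + right


def word_count(lines, n_partitions=2):
    # Split the input into independent partitions (round-robin).
    partitions = [lines[i::n_partitions] for i in range(n_partitions)]
    # Each partition could be processed on a different worker.
    partials = map(map_partition, partitions)
    # Merge the partial counts into the final result.
    return reduce(merge_counts, partials, Counter())


docs = ["spark keeps data in memory", "data in memory is fast", "spark is fast"]
counts = word_count(docs)
print(counts["spark"], counts["memory"], counts["fast"])  # prints "2 2 2"
```

Because the merge step is associative, the result does not depend on how the input is partitioned, which is what allows the map step to be distributed freely.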

The fourth part of the course introduces MPI, one of the most widely used standards for writing portable parallel programs. This part includes a significant programming component in which students face concrete examples from big data and scientific domains, machine learning, and operations research.

The flipped classroom method will be used during lab sessions to help students discuss their solutions and identify their pros and cons relative to the solutions provided by the instructors and by their colleagues. The project laboratory will be supervised by the course instructor and by tutors.

The project is an optional part of the course. Its objective is to help students apply the approaches and principles taught in class and gain experience in teamwork.

Prerequisites

Students are required to know programming principles and to have a good background in the C programming language or in at least one other programming language. Knowledge of concepts such as data types and variables, functions and parameter passing, pointers and runtime memory management is a prerequisite for this course.

The instructors will review the prerequisites in a one-week crash course of about 10-12 hours, held the week before the official start of the course. Students who cannot attend the pre-course can access a subset of video lectures (in Italian) from the “Fondamenti di Informatica 1” MOOC of the “Laurea in Ingegneria Informatica Online”, available through the BeeP Politecnico di Milano system. Interested students are invited to contact the instructor.

Assessment methods

The assessment will be based on a written exam at the end of the course and on projects developed by teams of two or three students. Projects will be assigned after the exam only if the exam grade is greater than or equal to 27, and can increase the grade by up to 4 points. Deadlines for delivering project artifacts will be agreed with the instructors. Projects will be evaluated on the code produced and on a presentation.

The following table provides a detailed overview of the elements that will be considered in the various assessment activities.

Type of assessment

Description

Dublin descriptor

Written test

Exercises focusing on design aspects

· Design of data structures and algorithms

· Identification and evaluation of alternative algorithms to solve a given problem

· Definition of the high level architecture and main classes for a software system according to the OOP paradigm

· Definition of code fragments in C++, in OpenMPI for parallel programs and in Python and Spark for big data queries to fulfill specific requirements

1,2

Assessment of laboratory and project artefacts

Assessment of the presentation of the work developed as part of laboratory activities or as a project developed by students either individually or in groups