Summary Sheet
Academic Year: 2014/2015
School: Scuola di Ingegneria Industriale e dell'Informazione (School of Industrial and Information Engineering)
Course: 088946 - NATURAL LANGUAGE PROCESSING
Lecturer: Sbattella Licia
Credits (CFU): 5.00
Course type: Single-discipline

Degree programme: Ing Ind - Inf (Mag.)(ord. 270) - CO (482) COMPUTER SCIENCE AND ENGINEERING - INGEGNERIA INFORMATICA
Pre-approved study plan code: *
From (inclusive): A
To (exclusive): ZZZZ
Course: 088946 - NATURAL LANGUAGE PROCESSING

Detailed programme and expected learning outcomes

Course content and goals

Computational processing of written and spoken natural language (Natural Language Processing - NLP) refers to the analysis, interpretation, and production of natural language clauses. NLP is a growing, interdisciplinary research field of both theoretical and practical interest.

Technical innovation is shifting both theoretical and applied attention towards heterogeneous forms of language: verbal, iconic, and gestural.

After decades of theoretical and applied research, a vast collection of symbolic and stochastic models exists. Such models enable the development of applications in several fields: human-machine interaction; document analysis, search, and authoring (even in distributed settings); multimodal and multilingual linguistic authoring; etc.

The problems, models, and methodologies addressed by NLP are also relevant to the study of communication, expression, and interaction processes among human beings, as such topics are common to several disciplines (for example, communication, education, computer science, electronics, cognitive sciences, psychology, and medicine).

The history of NLP, full of huge efforts, intense discussions, and many failures, shows the complexity of the topic. Symbolic models, initially prevalent in the area, turned out to be unable to capture the intrinsic complexity of natural language. Today, such models are often augmented with stochastic models, especially for morphology, lexicon, syntax, semantics, pragmatics, and prosody. Moreover, stochastic models are also useful for the management, search, and retrieval of knowledge whenever it is expressed in natural language (as verbal, iconic, or gestural representations).

Current research directions in the NLP field, as in modern linguistics, tend to emphasise the relationships between the production and interpretation of written, spoken, iconic, and gestural languages.

Objectives

  • Introduce students to problems and suitable solution methodologies (with their strengths and weaknesses) related to the analysis and production of natural language clauses, both written and spoken.
  • Present the current role of stochastic models; highlight new opportunities for combining traditional models based on formal analysis with stochastic models for morphology, syntax, semantics, pragmatics, discourse, and dialogue analysis.
  • Provide hands-on, tutored practice sessions where students can test the models and techniques presented in class. Applications will include: analysis of language-based human-machine and human-human interaction (for written, spoken, iconic, and gestural languages); linguistic and prosodic production with prediction; pattern search and recognition; complexity analysis for both texts and generic communication events; definition of user profiles, including preferences about expression modalities.

Course structure and topics

Introduction

  • Mind models and linguistic, expressive, and interactive competencies:
      o Development of expressive competencies by means of verbal (both written and spoken), iconic, and gestural languages.
      o Linguistic competencies and the act of thinking.
      o Language, pragmatics, and interaction.
  • Natural language representation, its levels, and their complexity; computational linguistics as a representation of human linguistic competencies, as a model, and as a solution to specific, well-defined problems.
  • Roles of symbolic and stochastic models in: morphological, syntactic, semantic, and pragmatic analysis; spoken language, phonological, and prosodic analysis; linguistic prediction; complexity evaluation; pattern recognition.
  • Trends in research and development: model composition and integration; criteria for selecting and composing or integrating models, given a language representation and a problem to address.

Models and techniques for written natural language processing

  • Morphological analysis and ambiguity resolution: lexicons, corpora, and dictionaries.
  • Syntactic and structural analysis:
      o Symbolic approaches
      o Stochastic approaches
      o Hybrid approaches
  • Semantic and discourse analysis: integrated approaches; analysis of different representation levels.

Models and techniques for spoken natural language processing

  • Components and characteristics of vocal expression and interaction: feature extraction, classification of vocal characteristics, voice profile definition, and models of vocal expression and interaction.
  • Models for describing tone and prosody, timing, forms, interactions, complex dialogues, and expressivity.
  • High-quality text-to-speech (TTS) and automatic speech recognition (ASR). Analysis strategies and models for emotional and affective components in both TTS and ASR.
  • Models and tools supporting an integrated analysis of verbal expressions and the enhancement of linguistic competencies in the contexts of communication, educational or clinical relationships, and artistic expression.

Hands-on and tutored sessions on applications and tools

  • Human-machine and human-human interaction.
  • Analysis and elaboration of linguistic-expressive resources on the net.
  • Supporting the analysis of communication and dialogue.
  • Supporting text authoring with prediction.
  • Supporting text complexity analysis.
  • Defining linguistic user profiles for verbal (both written and spoken), iconic, and gestural languages.

Notes on assessment methods

Students will be asked:

1) to pass a written exam

2) to prepare and discuss a project


Mix of teaching forms
Type of teaching form      Teaching hours
lecture                    30.0
exercise session           20.0
computer laboratory        0.0
experimental laboratory    0.0
project                    0.0
project laboratory         0.0

Information in English to support internationalisation
Course taught in English
Teaching material/slides available in English
Textbooks/bibliography available in English
Exam can be taken in English
Teaching support available in English
28/02/2020