Course content and goals

Computational processing of written and spoken natural language (Natural Language Processing, NLP) refers to the analysis, interpretation, and production of natural language clauses. NLP is a growing, interdisciplinary research field of both theoretical and practical interest. Technical innovation is shifting both theoretical and applied attention towards heterogeneous forms of language: verbal, vocal, iconic, and gestural.

After decades of theoretical and applied research, a vast collection of symbolic and stochastic models exists. Such models enable the development of applications in several fields: human-machine interaction; document analysis, search, and authoring (even in distributed settings); multimodal and multilingual linguistic authoring; machine translation; etc.

The problems, models, and methodologies of NLP are also relevant to the study of communication, expression, and interaction processes among human beings, to conversation and dialogue analysis, to sentiment analysis, and to language rehabilitation, since these topics are common to several disciplines (for example, communication, education, cognitive sciences, psychology, and medicine).

The history of NLP (full of huge efforts, intense discussions, and many failures) shows the complexity of the topic. Symbolic models, initially prevalent in the area, turned out to be unable to capture the intrinsic complexity of natural language. Today, such models are often augmented with stochastic models, especially for morphology, lexicon, syntax, semantics, pragmatics, and prosody. Moreover, stochastic models are also useful for the management, search, and retrieval of knowledge whenever it is expressed in natural language (as verbal, iconic, or gestural representations).
Current research directions in the NLP field, as in modern linguistics, tend to emphasise relationships between the production and interpretation of written and spoken communication.

Objectives

Introduce students to the problems, and to suitable solution methodologies (with their strengths and weaknesses), related to the analysis and production of natural language clauses, both written and spoken.
Present the current role of stochastic models; highlight new opportunities for combining traditional models based on formal analysis with stochastic models, for morphology, syntax, semantics, pragmatics, voice, prosody, discourse and dialogue analysis, and sentiment analysis.
Provide hands-on, tutored practice sessions, where students can test the models and techniques presented during classes. Applications will include: analysis of language-based human-machine and human-human interaction (for written, spoken, iconic, and gestural languages); linguistic and prosodic production and rehabilitation; pattern search and recognition for sentiment analysis in critical interaction; complexity analysis for both texts and generic communication events; definition of user profiles including preferences about expression modalities (in forensic, educational, and clinical contexts).

Course structure and topics

Introduction
Mind models and linguistic / expressive / interactive competencies:
o Development of expressive competencies, by means of verbal (both written and spoken), iconic, and gestural languages.
o Linguistic competencies and the act of thinking.
o Language, pragmatics, and interaction.
Natural language representation: levels and their complexity; computational linguistics as a representation of human linguistic competencies, as a model, and as a solution to specific and well-defined problems.
Roles of symbolic and stochastic models in: morphological, syntactic, semantic, and pragmatic analysis; sentiment analysis; spoken language, phonological, and prosodic analysis; linguistic prediction; complexity evaluation; pattern recognition.
Trends in research and development: model composition and integration; definition of criteria for model selection and composition/integration, given a language representation and a problem to address.

Models and techniques for written natural language processing
Morphological analysis and ambiguity resolution: lexicons, corpora, and dictionaries
Syntactic and structural analysis:
o Symbolic approaches
o Stochastic approaches
o Deep Learning approaches
o Hybrid approaches
Semantic and discourse analysis: using integrated approaches; analysis of different representation levels.

Models and techniques for spoken natural language processing
Components and characteristics of vocal expression and interaction: feature extraction, classification of vocal characteristics, voice profile definition, vocal expression and interaction model.
Models for the description of: tone and prosody, time scheduling, forms, interactions and complex dialogues, expressivity.
High-quality text-to-speech (TTS) and automatic speech recognition (ASR).
Analysis strategies and models for emotional and affective components in both TTS and ASR.
Models and tools supporting an integrated analysis of verbal expressions, and supporting the enhancement of linguistic competencies in the contexts of communication, of forensic, educational, and clinical relationships, and of artistic performance.

Hands-on and tutoring sessions about applications and tools
Human-machine and human-human interaction
Analysis and elaboration of linguistic-expressive resources on the net
Supporting the analysis of communication and dialogue
Supporting text authoring with prediction and summarization
Supporting text complexity analysis
Supporting speech and prosodic analysis
Supporting sentiment analysis in critical interaction
NLP for language rehabilitation
Defining linguistic user profiles for verbal (both written and spoken) languages.