Student Activities and Resources

Publications

2025

Is It Worth Using LLMs for Unfair Clause Detection in Terms of Service?
Marco Panarelli, Andrea Galassi, Francesca Lagioia, Rūta Liepiņa, Marco Lippi, Przemysław Pałka, Giovanni Sartor
20th International Conference on Artificial Intelligence and Law (ICAIL), pp. 139-149, 2025
🏆 Awarded the “Peter Jackson” Award for Best Innovative Application Paper
DOI | PDF

2024

MAMKit: A Comprehensive Multimodal Argument Mining Toolkit.
Eleonora Mancini, Federico Ruggeri, Stefano Colamonaco, Andrea Zecca, Samuele Marro, and Paolo Torroni. 2024.
In Proceedings of the 11th Workshop on Argument Mining (ArgMining 2024), pages 69–82, Bangkok, Thailand. Association for Computational Linguistics.
DOI | PDF

2023

TeamUnibo at SemEval-2023 Task 6: A transformer based approach to Rhetorical Roles prediction and NER in Legal Texts
Yuri Noviello, Enrico Pallotta, Flavio Pinzarrone, and Giuseppe Tanzi.
17th International Workshop on Semantic Evaluation (SemEval-2023), pp. 275–284, 2023.
DOI | PDF

2022

Fast Vocabulary Transfer for Language Model Compression
Leonidas Gee, Andrea Zugarini, Leonardo Rigutini, Paolo Torroni
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP): Industry Track, 2022
DOI | PDF

Combining WordNet and Word Embeddings in Data Augmentation for Legal Texts
Sezen Perçin, Andrea Galassi, Francesca Lagioia, Federico Ruggeri, Piera Santin, Giovanni Sartor, Paolo Torroni.
Workshop on Natural Legal Language Processing (NLLP@EMNLP), pp. 47–52, 2022.
DOI | PDF

A Sentiment and Emotion Annotated Dataset for Bitcoin Price Forecasting Based on Reddit Posts.
Pavlo Seroyizhko, Zhanel Zhexenova, Muhammad Zohaib Shafiq, Fabio Merizzi, Andrea Galassi, and Federico Ruggeri. 2022.
In Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP), pages 203–210, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
DOI | PDF

SubjectivITA: An Italian Corpus for Subjectivity Detection in Newspapers
Francesco Antici, Luca Bolognini, Matteo Antonio Inajetovic, Bogdan Ivasiuk, Andrea Galassi, Federico Ruggeri
12th Conference and Labs of the Evaluation Forum (CLEF), pp. 40–52, 2021
DOI

Master Theses

2024

Enhancing Document Parsing And Question Answering Through Optimized Table Parsing
Marta Stella.
July 2024.

Development of an LLM-based System for the Generation of Multiple-choice English Grammar Exercises
Matteo Periani.
July 2024.

An Artificial Intelligence System for the Autonomous Pagination of Newspapers
Enab Muneer.
July 2024.
PDF

Example Sentence Suggestion for Learners of Japanese as a Second Language Using Pretrained Language Models
Enrico Benedetti.
March 2024.
PDF

Fine-Tuning Neural Codec Language Models from Feedback with Reinforcement Learning
Lorenzo Pratesi.
March 2024.
PDF

Diving into Song Lyrics with Large Language Models: Unveiling Metadata Insights and Fueling Video Lyrics Generation
Simona Scala.
March 2024.
PDF

Automating Test Case Generation for Automotive Industry using Large Language Models
Giuseppe Tanzi.
March 2024.
PDF

Developing and Comparing Machine Reasoning Models to Humans in NLP Tasks
Mohammad Reza Ghasemi Madani.
February 2024.
PDF

2023

Comprehensive study of clinical entity extraction and classification using Large Language Models
Michele Faedi.
December 2023.
PDF

Addressing Misinformation Challenges in War Scenario: Russo-Ukrainian War
Bogdan Ivasiuk.
December 2023.
PDF

Neural-Symbolic Learning: challenges and benchmarks
Vincenzo Collura.
October 2023.
PDF

From text to knowledge: Large Language Models-based methods for knowledge extraction
Gianmarco Pappacoda.
October 2023.
PDF

Automatic Terminology Coding for the Biomedical Domain
Emanuele Bollino.
October 2023.
PDF

On the use of Prompting for Fine-Tuning Neural models for Speech Processing
Stefano Ciapponi.
October 2023.
PDF

Empathic Voice: Enabling Emotional Intelligence in Virtual Assistants
Ildebrando Simeoni.
October 2023.
PDF

Argument Mining into Active Learning Systematic Reviews: unlocking the synergy between MARGOT and ASReview
Elisa Ancarani.
October 2023.
PDF

A Two-Step LLM-Augmented Distillation Method For Passage Reranking
Davide Baldelli.
October 2023.
PDF

Design and Implementation of a Neural Machine Translation Engine for Computer-Assisted Translations
Rooshan Saleem Butt.
October 2023.
PDF

Leveraging Large Language Models for content analysis and generation for podcast transcriptions
Michael Magdy Nasr Zaki Ghaly.
October 2023.
PDF

Design and implementation of a privacy-preserving dialogue system based on argumentation
Lorenzo Borelli.
July 2023.
PDF

Emotion Recognition for Human-Centered Conversational Agents
Luca Bolognini.
March 2023.
PDF

Knowledge graph embedding enhancement using ontological knowledge in the biomedical domain
Lorenzo Niccolai.
March 2023.
PDF

Prompting techniques for Natural Language Generation in the Medical Domain
Martina Rossini.
March 2023.
PDF

Voice conversion with pre-trained representations for audio anonymization
Marco Costante.
February 2023.

2022

SynBA: A contextualized Synonim-Based adversarial Attack for text classification
Giuseppe Murro.
December 2022.
PDF

Voice Cloning: Increasing Expressivity of Italian Text-to-Speech with Phonemization
Martino Mare Lakota Pulici.
December 2022.

Royalty-Management Smart Contracts With Graph Neural Network-Based Artist Recommendations
Nicola Amoriello.
July 2022.

Intermediate Linguistic Task Fine-tuning On Multilingual Models
Luca Rispoli.
July 2022.

Vocabulary Transfer and Knowledge Distillation for Language Model Compression
Gee Jun Hui Leonidas Yunani.
July 2022.
PDF

A Critical Survey Of Text-to-image Synthesis By Generative Adversarial Networks: Concepts, Methods, And Evaluations
Luca Bandini.
March 2022.

Using semantic entities to improve the distillation of transformers
Riccardo Cozzi.
March 2022.
PDF

Graph Neural Networks for Recommender Systems
Oleksandr Olmucci Poddubnyy.
February 2022.
PDF

2021

Disruptive Situations Detection on Public Transports through Speech Emotion Recognition
Eleonora Mancini.
December 2021.
PDF

A Neurosymbolic Framework for Markov Logic Networks
Arcangelo Alberico.
October 2021.
PDF

Small transformers for Bioinformatics tasks
Luca Salvatore Lorello.
July 2021.
PDF

Advanced techniques for cross-language annotation projection in legal texts
Francesco Antici.
July 2021.
PDF

Automatic extraction of scientific articles based on user queries expressed in natural language
Mauro Rondina.
February 2021.
PDF

Bachelor Theses

2020

Development of an argumentative chatbot in python
Federico Spurio.
July 2020.

Resources

International contests, benchmarks, and challenges

Many international challenges are held every year. Developing models and techniques to tackle past or ongoing challenges can make a good proposal for project work.

EVALITA

A suite of tasks in the Italian language. Examples: hate speech detection, sentiment analysis, identification of memes, “la ghigliottina” game.

CLEF Labs

The CLEF conference includes many multilingual and multi-modal activity proposals. Examples: math question answering, prediction of mental health issues, text simplification of scientific topics, retrieval of arguments, fact checking.

Competition on Legal Information Extraction and Entailment (COLIEE)

A set of tasks and challenges regarding NLP and legal documents. The intention is to build a community of practice regarding legal information processing and textual entailment, so that the adoption and adaptation of general methods from a variety of fields is considered, and that participants share their approaches, problems, and results. The tasks change every year.

CLiC-it Italian Conference on Computational Linguistics

The spirit of the conference is inclusive. Recognizing the multifaceted nature of language phenomena and the need for interdisciplinary expertise, CLiC-it aims to bring together researchers from different fields including Computational Linguistics and Natural Language Processing, Linguistics, Cognitive Science, Machine Learning, Computer Science, Knowledge Representation, Information Retrieval, and Digital Humanities. CLiC-it welcomes contributions focusing on all languages, with a particular emphasis on Italian.

SemEval International Workshop on Semantic Evaluation

SemEval is a series of international natural language processing (NLP) research workshops whose mission is to advance the current state of the art in semantic analysis and to help create high-quality annotated datasets in a range of increasingly challenging problems in natural language semantics. Each year’s workshop features a collection of shared tasks in which computational semantic analysis systems designed by different teams are presented and compared.

Workshops

Academic workshops that discuss topics and propose shared tasks of interest to us.

ArgMining

Argument mining (also known as “argumentation mining”) is a well-established research area in computational linguistics that focuses on the automatic identification of argumentative structures, such as premises, conclusions, and inference schemes. Since its beginnings, the focus has been on the development of large-scale argumentation datasets and tasks like argument quality assessment, argument persuasiveness, and the synthesis of argumentative texts, spanning various domains, such as legal, social, medical, political, and scientific settings.

LUHME

The “Language Understanding in the Human-Machine Era” (LUHME) workshop aims to reignite, retrieve, resume, and refocus the enduring debate about the role of understanding in natural language use and related applications. Specifically, it seeks to elucidate the nature of language understanding and ascertain whether it is indispensable for computational natural language tasks such as automated translation and natural language generation. Furthermore, it aims to provide insight into the role played by language professionals (e.g., linguists, professional translators, interpreters, language educators) in computational natural language understanding. It will, therefore, convene researchers interested in the intersection of language understanding and the effective use of language technologies in human-machine interaction.

ASAIL

The ASAIL workshop series and interest group serve as a platform for researchers and practitioners working on natural language processing of legal text. Its goals include: (i) organising regular peer-reviewed workshop events for presentation and discussion of research and practical implementations around legal NLP; (ii) facilitating communication and collaboration among academic researchers as well as practitioners from industry, government, and the public sector, and other interested individuals and organisations; (iii) providing an entry point into the research field and community.

AMELR

The AMELR workshop focuses on Legal Argument Mining (LAM), the use of NLP to automatically detect legal arguments. Recent developments in NLP and LAM have provided legal scholars with a powerful tool for studying reasoning patterns, interpretative theories, and biases across jurisdictions and legal systems. The workshop gathers experts in computer science, AI & Law, legal theory, and empirical legal studies to address key challenges of LAM: creating training datasets, developing reliable models, establishing reproducibility standards, and integrating LAM into legal research. The workshop aims to strengthen the emerging field of LAM and its role in empirical legal studies by sharing the latest implementations, addressing core challenges, and establishing best practices.
