QMUL-SDS @ DIACR-ITA: Evaluating Unsupervised Diachronic Lexical Semantics Classification in Italian
Published in EVALITA Evaluation of NLP and Speech Tools for Italian. Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian Final Workshop, 2020
Recommended citation: Alkhalifa, R., Tsakalidis, A., Zubiaga, A., & Liakata, M. (2020). QMUL-SDS @ DIACR-Ita: Evaluating Unsupervised Diachronic Lexical Semantics Classification in Italian. Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020), Online. CEUR.org. https://www.aaccademia.it/scheda-libro?aaref=1423
In this paper, we present the results and main findings of our system for the DIACR-Ita 2020 Task. Our system explores variations of the training sets and different semantic detection methods. The task involves training and aligning word embeddings on two diachronic Italian corpora, then predicting a word’s vector change between them. We demonstrate that Temporal Word Embeddings with a Compass C-BOW model achieve higher accuracy than the other approaches we compared, including Logistic Regression and a Feed-Forward Neural Network. Our model ranked 3rd with an accuracy of 83.3%.
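The final step described above, deciding whether a word's vector has changed between two aligned embedding spaces and scoring the decisions by accuracy, can be illustrated with a minimal sketch. This is not the system from the paper: the cosine-similarity rule, the `threshold` value, the function names, and the toy three-dimensional vectors are all assumptions made for illustration, and the embeddings are assumed to be already trained and aligned.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def classify_change(emb_t1, emb_t2, targets, threshold=0.5):
    """Label each target word 1 (changed) or 0 (stable).

    emb_t1 and emb_t2 map words to vectors from two time periods and are
    assumed to live in the same, already aligned space. The threshold is
    a hypothetical value, not one taken from the paper.
    """
    labels = {}
    for word in targets:
        similarity = cosine_similarity(emb_t1[word], emb_t2[word])
        labels[word] = 1 if similarity < threshold else 0
    return labels


def accuracy(predicted, gold):
    """Fraction of gold-labelled words whose prediction matches."""
    correct = sum(1 for word in gold if predicted[word] == gold[word])
    return correct / len(gold)


# Toy, made-up vectors for two periods (illustration only).
emb_t1 = {"rete": [1.0, 0.0, 0.0], "casa": [0.0, 1.0, 0.0]}
emb_t2 = {"rete": [0.0, 0.0, 1.0], "casa": [0.0, 0.9, 0.1]}

predicted = classify_change(emb_t1, emb_t2, ["rete", "casa"])
gold = {"rete": 1, "casa": 0}  # hypothetical gold labels
print(predicted, accuracy(predicted, gold))
```

In this sketch a single threshold on cosine similarity produces the binary label; a supervised classifier over the same pair of vectors would be a drop-in alternative for `classify_change`.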