Word-embeddings Italian semantic spaces: A semantic model for psycholinguistic research
University of Milano-Bicocca
distributional semantics, word embeddings, Italian, semantic similarity, psycholinguistic resources
Marelli, M. (in press). Word-embeddings Italian semantic spaces: A semantic model for psycholinguistic research. Psihologija.
- Distributional semantics provides automatic and cognitively plausible estimates of semantic relations between words.
- However, for many languages such resources are not available in an easy-to-use format.
- We release two models for Italian, based on state-of-the-art techniques and accessible through a web interface.
- The model simulations are shown to be in line with psycholinguistic data.
Distributional semantics has long been a source of successful models in psycholinguistics, making it possible to obtain semantic estimates for a large number of words quickly and automatically. However, such resources remain scarce or of limited accessibility for languages other than English. The present paper describes WEISS (Word-Embeddings Italian Semantic Space), a distributional semantic model based on Italian. WEISS includes models of semantic representations trained with state-of-the-art word-embeddings methods, which apply neural networks to induce distributed representations of lexical meanings. The resource is evaluated against two test sets, showing that WEISS outperforms a baseline encoding word associations. Moreover, an extensive qualitative analysis of the WEISS output provides examples of the model's potential for capturing several semantic phenomena. Two variants of WEISS are released and made easily accessible on the web through the SNAUT graphical interface.
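To make the notion of a semantic estimate concrete, the following minimal sketch shows how relatedness between two words is commonly computed from word-embedding vectors. It assumes cosine similarity as the measure, which is a standard choice in distributional semantics but is not specified in this abstract. The three-dimensional vectors and the helper names (`cosine_similarity`, `nearest_neighbours`, `toy_vectors`) are hypothetical and purely illustrative; they are not WEISS output, and real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, invented for illustration only.
toy_vectors = {
    "cane": [0.9, 0.1, 0.3],    # "dog"
    "gatto": [0.8, 0.2, 0.4],   # "cat"
    "tavolo": [0.1, 0.9, 0.2],  # "table"
}


def nearest_neighbours(target, vectors):
    """Rank every other word by decreasing similarity to the target word."""
    scores = [
        (word, cosine_similarity(vectors[target], vec))
        for word, vec in vectors.items()
        if word != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word, score in nearest_neighbours("cane", toy_vectors):
        print(f"{word}\t{score:.3f}")
```

With these invented vectors, "gatto" ranks above "tavolo" as a neighbour of "cane". A graded ranking of this kind is what is typically compared against behavioural measures in psycholinguistic work.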