A Study on Word2Vec on a Historical Swedish Newspaper Corpus

Abstract

Detecting word sense changes can be of great interest in the field of digital humanities. Thus far, most investigations and automatic methods have been developed and carried out on English text, and most recent methods make use of word embeddings. This paper presents a study on using Word2Vec, a neural word embedding method, on a Swedish historical newspaper collection. Our study includes a set of 11 words, and our focus is the quality and stability of the word vectors over time. We investigate whether a word embedding method like Word2Vec can be used effectively on texts where the volume and quality are limited.

Publication
In the Digital Humanities in the Nordic Countries 3rd Conference, DHN2018
Nina Tahmasebi
Associate Professor in Natural Language Processing