Task-dependent Optimal Weight Combinations for Static Embeddings

Nathaniel Robinson; Nathaniel Carlson; David Mortensen; Elizabeth Vargas; Thomas Fackrell; Nancy Fulda

doi:10.3384/nejlt.2000-1533.2022.4438

Authors

Nate Robinson Carnegie Mellon University
Nate Carlson Brigham Young University
David Mortensen Carnegie Mellon University
Elizabeth Vargas Brigham Young University
Thomas Fackrell Brigham Young University
Nancy Fulda Brigham Young University

DOI:

https://doi.org/10.3384/nejlt.2000-1533.2022.4438

Abstract

A variety of NLP applications use word2vec skip-gram, GloVe, and fastText word embeddings. These models learn two sets of embedding vectors, but most practitioners use only one of them, or alternatively an unweighted sum of both. This is the first study to systematically explore a range of linear combinations of the first and second embedding sets. We evaluate these combinations on a set of six NLP benchmarks, including information retrieval (IR), POS tagging, and sentence similarity. We show that the default embedding combinations are often suboptimal and demonstrate improvements of 1.0-8.0%. Notably, GloVe’s default unweighted sum is its least effective combination across tasks. We provide a theoretical basis for weighting one set of embeddings more than the other according to the algorithm and task. We apply our findings to improve accuracy in applications of cross-lingual alignment and navigational knowledge by up to 15.2%.
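
As an illustration of what a linear combination of the two embedding sets could look like, the following is a minimal sketch and not the authors' implementation. It assumes a convex parameterization, alpha * first + (1 - alpha) * second, which may differ from the weighting scheme used in the paper. The function names (`combine`, `cosine`, `sweep`) and the toy two-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def combine(first_vec, second_vec, alpha):
    """Return alpha * first_vec + (1 - alpha) * second_vec (assumed parameterization)."""
    if len(first_vec) != len(second_vec):
        raise ValueError("both embedding sets must share the same dimensionality")
    return [alpha * f + (1.0 - alpha) * s for f, s in zip(first_vec, second_vec)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy, made-up vectors standing in for the first and second embedding sets.
first = {"cat": [1.0, 0.0], "dog": [0.8, 0.2]}
second = {"cat": [0.0, 1.0], "dog": [0.6, 0.4]}


def sweep(word_a, word_b, alphas):
    """Similarity of two words under each weighting alpha."""
    return {
        alpha: cosine(
            combine(first[word_a], second[word_a], alpha),
            combine(first[word_b], second[word_b], alpha),
        )
        for alpha in alphas
    }


# alpha = 1.0 uses only the first set, alpha = 0.0 only the second, and
# alpha = 0.5 matches the unweighted sum up to a constant scale factor.
scores = sweep("cat", "dog", [0.0, 0.25, 0.5, 0.75, 1.0])
for alpha, score in scores.items():
    print(f"alpha={alpha:.2f}  cosine(cat, dog)={score:.4f}")
```

In a task-dependent setting, one would presumably replace the toy similarity with a benchmark score and select the weighting that performs best on held-out data for that task.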
