EMNLP 2021 HYPMIX, Hyperbolic Interpolative Data Augmentation

EMNLP Data Augmentation Riemannian Hyperbolic

We are looking forward to presenting our paper “HYPMIX: Hyperbolic Interpolative Data Augmentation” by Ramit Sawhney, Megh Thakkar, Shivam Agarwal, Di Jin, Diyi Yang, and Lucie Flek at EMNLP 2021.

In this paper, we propose HypMix, a novel model-, data-, and modality-agnostic interpolative data augmentation technique that operates in hyperbolic space, which captures the complex geometry of input and hidden state hierarchies better than its contemporaries.
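To make the idea of interpolating in hyperbolic space concrete, here is a minimal sketch of geodesic interpolation between two points on the Poincaré ball, built from Möbius addition and Möbius scalar multiplication. It assumes the unit ball (curvature -1) and inputs that already lie inside it. The function names (`mobius_add`, `mobius_scalar`, `hyperbolic_mix`) and the parameter `t` are illustrative choices, and this is not necessarily the exact formulation used in the paper.

```python
import math


def dot(x, y):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def mobius_add(x, y):
    """Möbius addition x ⊕ y on the unit Poincaré ball (curvature -1)."""
    xy, xx, yy = dot(x, y), dot(x, x), dot(y, y)
    denom = 1 + 2 * xy + xx * yy
    return [((1 + 2 * xy + yy) * a + (1 - xx) * b) / denom
            for a, b in zip(x, y)]


def mobius_scalar(r, x):
    """Möbius scalar multiplication r ⊗ x on the unit Poincaré ball."""
    norm = math.sqrt(dot(x, x))
    if norm == 0.0:
        return list(x)  # the origin is a fixed point
    scale = math.tanh(r * math.atanh(norm)) / norm
    return [scale * a for a in x]


def hyperbolic_mix(x, y, t):
    """Point at fraction t along the geodesic from x to y.

    t = 0 returns x and t = 1 returns y, mirroring Euclidean
    mixup's x + t * (y - x) with Möbius operations.
    """
    neg_x = [-a for a in x]
    return mobius_add(x, mobius_scalar(t, mobius_add(neg_x, y)))


# Hypothetical example inputs, both inside the unit ball.
x = [0.1, 0.2]
y = [-0.3, 0.4]
mixed = hyperbolic_mix(x, y, 0.3)
print(mixed)
```

In a mixup-style setup, `t` would typically be sampled from a Beta distribution and the same coefficient used to mix the labels. That part is omitted here.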

We devise a novel Möbius Gyromidpoint Label Estimation (MGLE) method to predict soft labels for unlabeled data, and extend HypMix to a hyperbolic semi-supervised learning method. We evaluate HypMix on benchmark and low-resource datasets across speech, text, and vision modalities, including semi-supervised settings for Urdu and Arabic tasks. HypMix outperforms several strong baselines and Euclidean counterparts across these tasks.
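For readers unfamiliar with the Möbius gyromidpoint, the sketch below computes the standard weighted gyromidpoint of points on the unit Poincaré ball, the hyperbolic analogue of a weighted average. It only illustrates the underlying operation. How MGLE applies it to model predictions to produce soft labels is not reproduced here, and the function name `mobius_gyromidpoint` and its `weights` parameter are assumptions made for illustration.

```python
import math


def mobius_scalar(r, x):
    """Möbius scalar multiplication r ⊗ x on the unit Poincaré ball."""
    norm = math.sqrt(sum(a * a for a in x))
    if norm == 0.0:
        return list(x)  # the origin is a fixed point
    scale = math.tanh(r * math.atanh(norm)) / norm
    return [scale * a for a in x]


def mobius_gyromidpoint(points, weights=None):
    """Weighted Möbius gyromidpoint of points inside the unit ball.

    Assumes non-negative weights that are not all zero, and
    curvature -1. With no weights given, all points count equally.
    """
    if weights is None:
        weights = [1.0] * len(points)
    dim = len(points[0])
    numerator = [0.0] * dim
    denominator = 0.0
    for p, w in zip(points, weights):
        # Conformal factor of the Poincaré ball at p.
        lam = 2.0 / (1.0 - sum(a * a for a in p))
        for k in range(dim):
            numerator[k] += w * lam * p[k]
        denominator += w * (lam - 1.0)
    ratio = [v / denominator for v in numerator]
    # Halving with Möbius scalar multiplication maps the ratio
    # back to the midpoint on the ball.
    return mobius_scalar(0.5, ratio)


# Hypothetical example: aggregate three points on the ball.
preds = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]]
center = mobius_gyromidpoint(preds)
print(center)
```

Two sanity checks follow from the formula: the gyromidpoint of a single point is that point, and the gyromidpoint of a point and its negation is the origin.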