CAISA Lab

Lucie Flek

Professor

My research interests lie in machine learning applications in Natural Language Processing (NLP), with core expertise in user modeling and stylistic variation. I have been investigating how individuals and sociodemographic groups differ in their language use, and how this variation can in turn be used in machine learning tasks to predict in-group behavior of interest. This has also led me to a broader interest in the biases that the NLP field is subject to: stereotype exaggeration, ethical issues, the performance of machine learning models on underrepresented groups, and, subsequently, domain adaptation of those models.

My PhD focused on lexical semantics, examining to what extent word ambiguity and context play a role in document classification tasks. When is the provided context sufficient for the task at hand, following the distributional hypothesis, and when do explicit word sense disambiguation, concept graphs, or semantic ontologies become beneficial? Do these findings hold with the rise of deep learning architectures? Does explicitly supplied lexical semantic information still improve classification in scenarios with limited training data?

I have continued to pursue the limited-training-data paradigm in industry, leading projects on multilingual and multitask learning, as well as various other bootstrapping efforts for settings with limited in-domain labeled data. I am a big fan of cross-disciplinary collaborations, and have published together with educational researchers, psychologists, sociologists, physicists, and visual analysts, among others.