Events

Reading Group

Tool Preferences in Agentic LLMs are Unreliable

Akbar Karimi

2025-10-21

Large Language Models — the Future of Fundamental Physics?

Shivam Rawat

2025-10-15

Shangrui Nie

2025-10-01

Deep Think with Confidence

Frederik Labonte

2025-09-17

CINEMETRIC: A Framework for Multi-Perspective Evaluation of Conversational Agents using Human-AI Collaboration

Vahid Sadiri Javadi

2025-09-10

Exploring LLM Priming Strategies for Few-Shot Stance Classification

Wei-Fan Chen

2025-08-27

Multiple LLM Agents Debate for Equitable Cultural Alignment

Ipek Baris

2025-08-13

Normative conflicts and shallow AI alignment

Florian Mai

2025-07-23

Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models

David Kaczér

2025-07-16

Optimising your training data using model-led iterative confidence-based sample selection

Frederik Labonte

2025-07-09

Audio-Based Classification and Geographic Regression of Austrian Dialects

Lea Fischbach

2025-07-02

Are Reasoning Models More Prone to Hallucination?

Shangrui Nie

2025-06-25

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

Akbar Karimi

2025-06-18

Reasoning Models Can Be Effective Without Thinking

Vahid Sadiri Javadi

2025-06-11

Do LLM Evaluators Prefer Themselves for a Reason?

Wei-Fan Chen

2025-06-04

Learning to Reason without External Rewards

Florian Mai

2025-05-28

Locating Information Gaps and Narrative Inconsistencies Across Languages: A Case Study of LGBT People Portrayals on Wikipedia

Ipek Baris

2025-05-21

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

David Kaczér

2025-05-07

Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics

Akbar Karimi

2025-04-23

On Calibration of Speech Classification Models: Insights from Energy-Based Model Investigations

Lea Fischbach

2025-04-16

TokenSkip: Controllable Chain-of-Thought Compression in LLMs

Shangrui Nie

2025-04-09

Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Frederik Labonte

2025-04-02

AGILE: A Novel Reinforcement Learning Framework of LLM Agents

Vahid Sadiri Javadi

2025-03-12

How do Humans and Language Models Reason About Creativity? A Comparative Analysis

Wei-Fan Chen

2025-03-05

Training Large Language Models to Reason in a Continuous Latent Space

Florian Mai

2025-02-26

How Do We Answer Complex Questions: Discourse Structure of Long-form Answers

Ipek Baris

2025-01-29

Deliberation in Latent Space via Differentiable Cache Augmentation

David Kaczér

2025-01-15

Adversarial Attacks on Hyperbolic Networks

Akbar Karimi

2025-01-08

Assessing Social Alignment: Do Personality-Prompted Large Language Models Behave Like Humans?

Christian Nickel

2024-12-11

Are Large Language Models Capable of Generating Human-Level Narratives?

Vahid Sadiri Javadi

2024-12-04

SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding

Frederik Labonte

2024-11-27

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

David Kaczér

2024-11-20

Can Machine Unlearning Reduce Social Bias in Language Models?

Shangrui Nie

2024-11-13

Machine Unlearning of Pre-trained Large Language Models

Akbar Karimi

2024-11-06

KG-Adapter: Enabling Knowledge Graph Integration in Large Language Models through Parameter-Efficient Fine-Tuning

Shaina Ashraf

2024-10-30

Do language models practice what they preach? Examining language ideologies about gendered language reform encoded in LLMs

Wei-Fan Chen

2024-10-23

Learning to Plan for Language Modeling from Unlabeled Data

Florian Mai

2024-10-09

Pregnant Questions: The Importance of Pragmatic Awareness in Maternal Health Question Answering

Ipek Baris

2024-10-02

Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

Tianyi Zhang

2024-09-25

Mixture-of-Agents Enhances Large Language Model Capabilities

David Kaczér

2024-09-18

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Frederik Labonte

2024-09-11

Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models

Shangrui Nie

2024-08-28

Extreme Miscalibration and the Illusion of Adversarial Robustness

Akbar Karimi

2024-08-21

LLM-based NLG Evaluation: Current Status and Challenges

Wei-Fan Chen

2024-08-14

ClaimVer: Explainable Claim-Level Verification and Evidence Attribution of Text Through Knowledge Graphs

Shaina Ashraf

2024-08-07

USER-LLM: Efficient LLM Contextualization with User Embeddings

Mounika Marredy

2024-07-24

WebArena: A Realistic Web Environment for Building Autonomous Agents

Shangrui Nie

2024-07-17

Stop! In the Name of Flaws: Disentangling Personal Names and Sociodemographic Attributes in NLP

Akbar Karimi

2024-07-10

Diffusion-NAT: Self-Prompting Discrete Diffusion for Non-Autoregressive Text Generation

Wei-Fan Chen

2024-07-03

Player-Driven Emergence in LLM-Driven Game Narrative

Vahid Sadiri Javadi

2024-06-26

Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources

Shaina Ashraf

2024-06-19

Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models

Allison Lahnala

2024-05-29

Can Large Language Models Provide Useful Feedback on Research Papers? A Large-Scale Empirical Analysis

Mounika Marredy

2024-05-22

Language Imbalance Can Boost Cross-lingual Generalisation

Shangrui Nie

2024-05-15

Chain-of-Thought Reasoning Without Prompting

Akbar Karimi

2024-04-24

ReAct: Synergizing Reasoning and Acting in Language Models

Wei-Fan Chen

2024-04-17

How Far Can We Extract Diverse Perspectives from Large Language Models?

Joan Plepi

2024-04-10

Knowledge Solver: Teaching LLMs to Search for Domain Knowledge from Knowledge Graphs

Shaina Ashraf

2024-04-03

Evaluating Large Language Models as Generative User Simulators for Conversational Recommendation

Vahid Sadiri Javadi

2024-03-27

Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting

Charlie Welch

2024-03-20

Text Alignment Is An Efficient Unified Model for Massive NLP Tasks

Mounika Marredy

2024-03-06

TRAK: Attributing Model Behavior at Scale

Akbar Karimi

2024-02-28

A Comparative Multidimensional Analysis of Empathetic Systems

Allison Lahnala

2024-02-21

Editing Factual Knowledge in Language Models

Wei-Fan Chen

2024-02-07

Tuning Language Models by Proxy

Joan Plepi

2024-01-31

Assisted Knowledge Graph Authoring: Human-Supervised Knowledge Graph Construction from Natural Language

Shaina Ashraf

2024-01-24

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution

Vahid Sadiri Javadi

2024-01-17

Persona-Guided Planning for Controlling the Protagonist’s Persona in Story Generation

Charlie Welch

2024-01-10

Large Language Models of Code Fail at Completing Code with Potential Bugs

Mounika Marredy

2023-12-20