Program in Applied Mathematics Colloquium

How do transformer networks encode linguistic knowledge?

When

3 p.m., Jan. 31, 2020

Pre-trained transformer networks combine a neural self-attention mechanism with a training objective that can exploit large unlabeled datasets. Models such as BERT and XLNet have followed this approach to produce new state-of-the-art results across a wide range of natural language processing tasks. But exactly why these models are so successful is not well understood. In this talk, I will present work that looks more closely at the internals of these models and investigates how they acquire and represent linguistic knowledge.
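
For readers unfamiliar with the mechanism named in the abstract, the sketch below shows single-head scaled dot-product self-attention in plain NumPy. The function name, shapes, and random weights are illustrative assumptions for exposition, not material from the talk or any specific model.

```python
# Minimal sketch of single-head scaled dot-product self-attention.
# Assumed toy setup: x is a (seq_len, d_model) matrix of token embeddings.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Return attention outputs for a sequence x of shape (seq_len, d_model)."""
    q = x @ w_q                                  # queries, (seq_len, d_k)
    k = x @ w_k                                  # keys,    (seq_len, d_k)
    v = x @ w_v                                  # values,  (seq_len, d_v)
    scores = q @ k.T / np.sqrt(k.shape[-1])      # pairwise attention logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                           # each position mixes all values

# Toy usage: 5 tokens, 8-dimensional embeddings, 4-dimensional projections.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))
w_q, w_k, w_v = (rng.standard_normal((8, 4)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)    # (5, 4)
```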