43 - Reinforced Video Captioning with Entailment Rewards, with Ramakanth and Mohit

Published: Dec. 4, 2017, 8:19 p.m.

EMNLP 2017 paper by Ramakanth Pasunuru and Mohit Bansal

Ram and Mohit join us to talk about their work, which uses reinforcement learning to improve performance on a video captioning task. They directly optimize CIDEr, a popular image/video captioning metric, using policy gradient methods, then use a modified version of CIDEr that penalizes the model when it fails to produce a caption that is _entailed_ by the correct caption. In our discussion, we hit on what video captioning is, what typical models look like for this task, and how the entailment-based reward function is similar to other attempts to be smart about handling paraphrases when evaluating or training language generation models.

Unfortunately, due to some technical issues, the audio recording is a little worse than usual for this episode. Our apologies.

https://www.semanticscholar.org/paper/Reinforced-Video-Captioning-with-Entailment-Reward-Pasunuru-Bansal/0d11977afa1a6ce90dc3b1f26694492c2ab04773
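
For readers who want a concrete picture of the entailment-penalized reward described in the summary above, here is a minimal Python sketch. It is an illustration, not the authors' implementation: the function names, the `threshold` and `penalty` defaults, and the idea of passing in a precomputed CIDEr score and entailment probability are all assumptions made for this example. In practice the CIDEr score would come from a metric implementation and the entailment probability from a separately trained entailment classifier.

```python
def entailment_penalized_reward(cider_score, entailment_prob,
                                threshold=0.5, penalty=0.5):
    """Return the metric score, reduced by a fixed penalty when the
    generated caption is judged not to be entailed by the reference.

    threshold and penalty are hypothetical illustrative values.
    """
    if entailment_prob < threshold:
        return cider_score - penalty
    return cider_score


def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """Policy-gradient (REINFORCE) surrogate loss for one sampled caption.

    token_log_probs: log-probabilities the model assigned to each sampled token.
    Minimizing this loss raises the probability of captions whose reward
    exceeds the baseline and lowers it for captions that fall below.
    """
    return -(reward - baseline) * sum(token_log_probs)


# A caption that scores well on CIDEr but is judged unlikely to be entailed
# receives a lower reward than its raw CIDEr score.
reward = entailment_penalized_reward(cider_score=1.2, entailment_prob=0.1)
loss = reinforce_loss([-0.5, -1.0], reward, baseline=0.5)
```

The point of the design is that a caption can overlap heavily with the reference in n-grams, and so score well on CIDEr, while still saying something the reference does not support; the penalty lets the reward reflect that.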