44. Andreas Stephan - University of Vienna - Weak Supervision in NLP

Published: Dec. 27, 2023, 2:13 p.m.

# Summary

I am sure most of you are familiar with the training paradigms of supervised and unsupervised learning: in supervised learning there is a label for each training data point, while in the unsupervised setting there are no labels at all.

Although there can be exceptions, everyone is well advised to train with supervision whenever possible. But where do you get labels for your training data if traditional labeling strategies, such as manual annotation, are not feasible?

Well, often you might not have perfect labels for your data, but you do have some idea of what those labels might be.

And this, my dear listener, is exactly the area of weak supervision.
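To make the idea concrete, here is a minimal sketch (illustrative, not from the episode) of a common weak supervision setup: several noisy heuristics, often called labeling functions, each emit a label or abstain, and a simple majority vote combines them into training labels. The keyword heuristics below are hypothetical examples for sentiment classification.

```python
# Minimal sketch of weak supervision via labeling functions:
# each heuristic returns a noisy label or abstains, and a majority
# vote aggregates them into a (noisy) training label.
from collections import Counter

ABSTAIN = None

# Hypothetical keyword heuristics for sentiment classification.
def lf_positive(text):
    return "positive" if any(w in text.lower() for w in ("great", "love", "excellent")) else ABSTAIN

def lf_negative(text):
    return "negative" if any(w in text.lower() for w in ("bad", "terrible", "awful")) else ABSTAIN

def majority_vote(text, lfs):
    votes = [lf(text) for lf in lfs if lf(text) is not ABSTAIN]
    if not votes:
        return ABSTAIN  # no heuristic fired, leave the point unlabeled
    return Counter(votes).most_common(1)[0][0]

label = majority_vote("I love this movie, it is great!", [lf_positive, lf_negative])
# label == "positive"
```

In practice the aggregation step is usually more sophisticated than a plain majority vote, since it should account for the differing accuracies and correlations of the labeling sources.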

Today on the show I am talking to Andreas Stephan, who is doing his PhD in Natural Language Processing at the University of Vienna in the Digital Text Sciences group led by Professor Benjamin Roth.

Andreas will explain his recent research in the area of weak supervision, as well as how Large Language Models can be used as weak supervision sources for image classification tasks.
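To picture that last idea, an LLM can simply act as one more noisy labeling source. A hedged sketch, where `ask_llm` is a hypothetical stand-in for any real chat-completion call (it is stubbed here and is not an API from the episode):

```python
# Illustrative only: treating an LLM as a weak labeling source.
# `ask_llm` is a hypothetical stand-in for a real chat-completion API;
# here it is stubbed so the example runs on its own.
def ask_llm(prompt: str) -> str:
    return "cat" if "whiskers" in prompt else "dog"

def llm_labeler(image_caption: str, classes=("cat", "dog")):
    prompt = f"Which of {classes} best describes: '{image_caption}'? Answer with one word."
    answer = ask_llm(prompt).strip().lower()
    # Abstain if the model's answer is not one of the known classes.
    return answer if answer in classes else None

label = llm_labeler("a small animal with long whiskers")
# label == "cat"
```

The resulting labels are noisy, so they would typically be combined with other weak sources or fed into a noise-aware training procedure rather than trusted outright.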


# TOC

00:00:00 Beginning
00:01:38 Weak supervision: a short introduction (by me)
00:04:17 Guest introduction
00:08:48 What is weak supervision?
00:16:02 Paper: SepLL: Separating Latent Class Labels from Weak Supervision Noise
00:26:28 Benefits of priors to guide model training
00:29:38 Data quality & data quantity in training foundation models
00:36:10 Using LLMs for weak supervision
00:46:51 Future of weak supervision research

# Sponsors

- Quantics: Supply Chain Planning for the new normal - the never normal - https://quantics.io/

- Belichberg GmbH: We do digital transformations as your innovation partner - https://belichberg.com/


# References

- Andreas Stephan - https://andst.github.io/

- Stephan et al. "SepLL: Separating Latent Class Labels from Weak Supervision Noise" (2022) - https://arxiv.org/pdf/2210.13898.pdf

- Gunasekar et al. "Textbooks are all you need" (2023) - https://arxiv.org/abs/2306.11644

- Introduction to weak supervision - https://dawn.cs.stanford.edu/2017/07/16/weak-supervision/
