In-Context Pretraining: Language Modeling Beyond Document Boundaries

17/10/2023 18 min


Episode Synopsis

In this paper, the authors propose In-Context Pretraining, in which language models are pretrained on sequences of related documents rather than on randomly concatenated ones, encouraging them to read and reason across document boundaries. They introduce algorithms for finding related documents and for ordering them into coherent input contexts, and show that the approach improves performance on tasks that depend on understanding context, such as in-context learning, reading comprehension, and retrieval augmentation.
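
To make the idea of "finding related documents and constructing coherent input contexts" concrete, here is a toy sketch in Python. It is not the authors' implementation: bag-of-words cosine similarity stands in for a learned retrieval model, a simple greedy walk stands in for the paper's document-ordering algorithm, and the function names (`order_by_similarity`, `pack_contexts`) and the `max_words` budget are hypothetical names chosen for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector; an assumed stand-in for a learned document embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def order_by_similarity(docs):
    """Greedy walk: start at document 0, then always move to the most
    similar document that has not been visited yet."""
    if not docs:
        return []
    vecs = [bow(d) for d in docs]
    order = [0]
    unvisited = set(range(1, len(docs)))
    while unvisited:
        last = vecs[order[-1]]
        # sorted() makes tie-breaking deterministic (lowest index wins)
        nxt = max(sorted(unvisited), key=lambda i: cosine(last, vecs[i]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


def pack_contexts(docs, order, max_words):
    """Concatenate documents in the given order, starting a new context
    whenever adding the next document would exceed the word budget."""
    contexts, current, used = [], [], 0
    for i in order:
        n = len(docs[i].split())
        if current and used + n > max_words:
            contexts.append(current)
            current, used = [], 0
        current.append(i)
        used += n
    if current:
        contexts.append(current)
    return contexts


docs = [
    "cats are small furry pets",
    "stock markets fell sharply today",
    "furry cats and small dogs are pets",
    "markets and stock prices fell again",
]
order = order_by_similarity(docs)
contexts = pack_contexts(docs, order, max_words=12)
```

With these four toy documents, the two cat documents end up in one context and the two market documents in another, instead of being interleaved as they are in the original list. A real pipeline would operate on tokens rather than whitespace-separated words and would need approximate nearest-neighbor search to scale to a pretraining corpus.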

https://arxiv.org/abs/2310.10638

YouTube: https://www.youtube.com/@ArxivPapers

TikTok: https://www.tiktok.com/@arxiv_papers

Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016

Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
