NLP in Practice: From Corpus Linguistics to RAG with Python

Bridge the gap between traditional corpus linguistics and modern Retrieval-Augmented Generation (RAG) systems. In this workshop, researchers and developers will learn how classical NLP techniques—corpus analysis, tokenization, and annotation—can inform and improve RAG implementations. We'll use Python to build a pipeline that takes a text corpus from raw collection through linguistic analysis to a queryable RAG system, demonstrating how academic NLP foundations enhance practical AI applications.
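As a rough illustration of the pipeline described above, here is a minimal sketch using only the Python standard library. It is not the workshop's actual material: the three-sentence corpus, the regex tokenizer, the TF-IDF scoring, and the function names (`tokenize`, `build_index`, `retrieve`, `build_prompt`) are all assumptions made for this example. TF-IDF stands in for the retrieval step, where a real RAG system would more likely use an embedding model and a vector store.

```python
import math
import re
from collections import Counter

# Toy corpus (hypothetical documents, for illustration only).
CORPUS = [
    "Corpus linguistics studies language through large collections of text.",
    "Tokenization splits raw text into words or subword units.",
    "Retrieval-Augmented Generation retrieves passages before generating an answer.",
]


def tokenize(text):
    """Lowercase the text and extract runs of letters/digits (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return one sparse TF-IDF vector per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    # Smoothed IDF, so every term seen in the corpus gets a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, docs, vectors, idf, k=1):
    """Return the k documents most similar to the query."""
    # Terms not seen in the corpus get weight 0 and do not affect ranking.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    ranked = sorted(
        range(len(docs)), key=lambda i: cosine(q, vectors[i]), reverse=True
    )
    return [docs[i] for i in ranked[:k]]


def build_prompt(query, passages):
    """Assemble the retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


VECTORS, IDF = build_index(CORPUS)

if __name__ == "__main__":
    question = "How does tokenization split text?"
    passages = retrieve(question, CORPUS, VECTORS, IDF, k=1)
    print(build_prompt(question, passages))
```

The prompt string produced at the end would be passed to a language model in a full RAG system. That generation step is left out so the sketch stays self-contained. Swapping `tokenize` for a linguistically informed tokenizer or lemmatizer is one place where corpus-linguistics techniques could change what gets retrieved.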

Want to know more?

Join the PyCon Colombia newsletter for a complete overview of our events, speakers, and community participation.