In this hands-on workshop you will train and use your own language model from scratch - with just pen and paper and a bit of dice rolling. Through interactive exercises and guided discussions, you’ll see how language models (even large language models like ChatGPT) are fundamentally information processing machines, turning language inputs into language outputs. The workshop builds from basic principles to more complex applications through an exploration of language modeling as a probabilistic process of predicting “what token comes next” in a sequence, and ends with a poetry slam (for real).
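If you'd like a concrete picture of what "predicting what token comes next" means, here is a minimal sketch (not the workshop's actual exercise, just an illustration): a bigram model that is "trained" by counting which word follows which in a toy corpus, then generates text by sampling in proportion to those counts, exactly the step the dice rolls stand in for.

```python
import random
from collections import Counter, defaultdict

# Toy corpus; in the workshop, this counting is done by hand on paper.
corpus = "the cat sat on the mat the cat ate the rat".split()

# Train: for each word, count which words follow it (a bigram model).
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word(prev):
    """Sample the next word in proportion to observed counts
    (the dice-rolling step)."""
    options = counts[prev]
    words = list(options)
    weights = [options[w] for w in words]
    return random.choices(words, weights=weights)[0]

# Generate: start from a word and repeatedly predict what comes next,
# stopping if the current word was never followed by anything.
word = "the"
text = [word]
for _ in range(6):
    if not counts[word]:
        break
    word = next_word(word)
    text.append(word)
print(" ".join(text))
```

After "the", the model has seen "cat" twice and "mat" and "rat" once each, so it picks "cat" with probability 2/4; with pen and paper you get the same behaviour by assigning dice faces to outcomes in proportion to the tallies.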
This is one of four Cybernetics for Social Impact KNoTs on offer.
---
No preparation is needed. However, if you'd like to get a head start on some of the ideas we'll be covering (or if you're just the curious type and you like reading interesting stuff), here are a few free resources.
"Andrey Markov & Claude Shannon Counted Letters to Build the First Language-Generation Models" covers some of the historical background: https://spectrum.ieee.org/andrey-markov-and-claude-shannon-built-the-first-language-generation-models

"The Dummy's Guide to Modern LLM Sampling" covers the "sampling" part of the process in detail, but also has some helpful definitions at the top: https://rentry.co/samplers
Brendan O'Connor's slides on N-Gram Language Models are interesting if you want to go further with the maths (although this is much more maths than we'll use - this session really is for everyone).