name: inverse
layout: true
class: center, middle, inverse

---

# What's inside ChatGPT?
And what does it mean for robotics/roboticists?

.footnote[Marek Šuppa
Fablab 2023]

---
layout: false

# `$ whoami`

- "Principal Data Scientist/Engineer" at Slido (now part of Cisco)

--

- Lecturer at Matfyz (ML, NLP)

--

- RoboCupJunior Exec

---
layout: false

# Short history of Neural Network approaches to sequence processing

- *2001*: Neural Language Models
- *2013*: Word Embeddings
- *2014*: Sequence-to-Sequence models
- *2015*: Attention
- *2016*: Neural Machine Translation boom
- *2017*: Transformers
- *2018*: Pretrained Contextualized Word Embeddings (ELMo)
- *2019*: Massive Transformer Models (BERT, GPT-2, ...)
- *2020*: GPT-3
- *2021*: Large Language Models trained on Code (Codex)
- *2022*: ChatGPT?
- *2023+*: Current Frontiers

---
layout: false

# Short history of Neural Network approaches to sequence processing

- *2001*: Neural Language Models
- *2013*: Word Embeddings
- *2014*: Sequence-to-Sequence models
- *2015*: Attention .red[*]
- *2016*: Neural Machine Translation boom
- *2017*: Transformers .red[*]
- *2018*: Pretrained Contextualized Word Embeddings (ELMo)
- *2019*: Massive Transformer Models (BERT, GPT-2, ...)
- *2020*: GPT-3 .red[*]
- *2021*: Large Language Models trained on Code (Codex)
- *2022*: ChatGPT? .red[*]
- *2023+*: Current Frontiers

.footnote[.red[*] Today's (loose) agenda]

---
class: center, middle, inverse

## Attention and Transformers

---

## History of Deep Learning Milestones

![:scale 70%](images/timeline.png)

.footnote[From [Deep Learning State of the Art (2020)](https://www.youtube.com/watch?v=0VH1Lim8gL8) by Lex Fridman at MIT]

---
class: middle

## The perils of seq2seq modeling
--

Aren't we throwing out a bit too much?

.footnote[.font-small[Videos from https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/]]

---
class: middle

## The fix

Let's use the full encoder output!
--

But how do we combine all the hidden states together?

---
class: middle

## The mechanics of Attention
---
class: middle

## The mechanics of Attention II
---
class: middle

## The mechanics of Attention III
---
class: middle

## Getting alignment with attention
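---

## Attention in a few lines of numpy

The score → softmax → weighted-sum steps from the previous slides fit in a handful of lines. A minimal sketch for illustration only (the function and variable names are made up here, and real implementations batch and scale this):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend(decoder_state, encoder_states):
    """Dot-product attention: score every encoder hidden state against
    the current decoder state, normalize the scores, and return the
    weighted sum of encoder states (the "context vector")."""
    scores = encoder_states @ decoder_state   # (seq_len,)
    weights = softmax(scores)                 # attention weights, sum to 1
    context = weights @ encoder_states        # (hidden_dim,)
    return context, weights

# Toy example: 4 encoder states with hidden size 3
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(4, 3))
decoder_state = rng.normal(size=3)
context, weights = attend(decoder_state, encoder_states)
```

The `weights` vector is exactly the alignment visualized on the next slide.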
---

## Attention visualized

.center[![:scale 60%](images/attention_sentence.png)]

See the nice demo at https://distill.pub/2016/augmented-rnns/

---
class: middle

# What if we only used attention?

---
class: middle

.center[![:scale 100%](images/attention_is_all_you_need.png)]

.center[.small[[Attention is All You Need (2017)](https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf)]]

---
class: middle

## The Transformer architecture

.center[![:scale 90%](images/The_transformer_encoder_decoder_stack.png)]

.footnote[.font-small[Images from https://jalammar.github.io/illustrated-transformer/]]

---
class: middle

## The Transformer's Encoder

.center[![:scale 100%](images/encoder_with_tensors_2.png)]

---

## What's Self-Attention?

.center[*The animal didn't cross the street because it was too tired.*]

What does "it" refer to?

--

.center[![:scale 50%](images/transformer_self-attention_visualization.png)]

---

## Self-Attention mechanics

.center[![:scale 70%](images/self-attention-output.png)]

---

## Multi-headed Self-Attention

.center[![:scale 100%](images/transformer_multi-headed_self-attention-recap.png)]

---

## The full Transformer seq2seq process I

.center[![:scale 100%](images/transformer_decoding_1.gif)]

---

## The full Transformer seq2seq process II

.center[![:scale 100%](images/transformer_decoding_2.gif)]

---

## Intermezzo: implement it yourself

.center[To actually understand what's going on, there is no better approach.]
--

- [A walkthrough of the Transformer architecture](https://github.com/markriedl/transformer-walkthrough)

--

- [The Annotated Transformer](http://nlp.seas.harvard.edu/annotated-transformer/)

--

- [Transformers from scratch](https://peterbloem.nl/blog/transformers) [[video]](https://www.youtube.com/playlist?list=PLIXJ-Sacf8u60G1TwcznBmK6rEL3gmZmV)

---

## Big Transformer Wins: GPT-2

.center[![:scale 100%](images/gpt2-sizes.png)]

Try it yourself at https://transformer.huggingface.co/doc/gpt2-large

---

## Big Transformer Wins: Huggingface `transformers`

- A very nicely done library that allows anyone with some Python knowledge to play with pretrained state-of-the-art models (more in the [docs](https://huggingface.co/transformers/)).

.center[![:scale 80%](images/huggingface_screenshot.png)]

---

## Big Transformer Wins: Huggingface `transformers` II

- A small example: an English-to-Slovak translator in about three lines of Python: .red[*]

```python
from transformers import pipeline

en_sk_translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-sk")
print(en_sk_translator("When will this presentation end?"))
```

.footnote[.font-small[Works with many other languages as well -- the full list is [here](https://huggingface.co/Helsinki-NLP)]]

---

## Attention and Transformers: Recap

--

- Attention was a fix for sequence models that did not really work too well

--

- It turned out it was all that was needed for (bounded) sequence processing

--

- The Transformer is an encoder-decoder architecture that is "all the rage" now

--

- It has no time dependency thanks to self-attention and is therefore easy to parallelize

--

- Well-known models like BERT and GPT-* took the world of NLP by storm

--

- Very helpful in many tasks, and easy to play with thanks to the Huggingface `transformers` library

---
class: center, middle, inverse

## GPT3 and ChatGPT

---

## GPT-2 vs GPT-3

.center[![:scale 100%](images/gpt2-sizes.png)]

--

.center[![:scale 80%](images/gpt2-vs-gpt3.png)]

---

## GPT3

- Basically the same architecture
as GPT-2

--

- The sheer size is astounding (power-law scaling of model/dataset/computation size)

--

- It would take [355 years of Tesla V100 GPU time](https://lambdalabs.com/blog/demystifying-gpt-3) to train
- Training it would cost about [$4.6M at retail prices](https://lambdalabs.com/blog/demystifying-gpt-3)

--

- It was so expensive to train that they didn't even fix the bugs they themselves found:

--

.center[![:scale 100%](images/gpt3-cost-training.png)]

---

## [Language Models Are Few-Shot Learners](https://arxiv.org/abs/2005.14165)

.center[![:scale 80%](images/gpt3-in-context-learning.png)]

---

## Q: Ok, so is ChatGPT simply a wrapper around GPT-3?

- **Very short A**: It depends

--

- **Short A**: It depends on who you ask

--

- **A**: It depends on who you ask. OpenAI's Docs probably wouldn't agree.

--

- **Actual A**: We don't really know. It's behind an API, and we have no real way of proving it one way or the other.

---

.center[![:scale 100%](images/gpt3.5-openai-docs.png)]

---

## The "potential" ChatGPT training procedure

.center[![:scale 100%](images/instructgpt.png)]

.footnote[.font-small[InstructGPT: [Training language models to follow instructions with human feedback (2022)](https://arxiv.org/pdf/2203.02155.pdf)]]

---

## Supervised Fine-Tuning (SFT) Model

.center[![:scale 50%](images/sft.png)]

- Prompts collected from the OpenAI API, together with demonstrations hand-written by labelers, yielded about 13,000 input/output samples for training the supervised model.
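---

## SFT, mechanically

Behind the diagram, the SFT step is ordinary next-token cross-entropy on the prompt/demonstration pairs. A toy sketch for illustration only (names invented; the real setup fine-tunes a GPT-scale model, not random logits):

```python
import numpy as np

def sft_loss(logits, target_ids):
    """Average negative log-likelihood of the labeler-written tokens.
    logits: (seq_len, vocab_size) model scores; target_ids: gold tokens."""
    # Numerically stable log-softmax over the vocabulary
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick out the log-probability of each gold token and average
    return -log_probs[np.arange(len(target_ids)), target_ids].mean()

# Toy example: 5 positions over a 10-token vocabulary
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 10))
targets = rng.integers(0, 10, size=5)
loss = sft_loss(logits, targets)
```

Minimizing this pushes the model to reproduce the demonstrations token by token; the reward-model and RL steps on the following slides then optimize for human preference instead.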
---

## Reward Model Training

.center[![:scale 50%](images/rlm.png)]

---

## Reward Model Training II

.center[![:scale 90%](images/reward-model.png)]

.footnote[.center[.font-small[[Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://huggingface.co/blog/rlhf)]]]

---

## Fine-tuning with RL

.center[![:scale 80%](images/rlhf.png)]

.footnote[.center[.font-small[[Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://huggingface.co/blog/rlhf)]]]

---

## The "potential" ChatGPT training procedure

.center[![:scale 100%](images/instructgpt.png)]

.footnote[.font-small[InstructGPT: [Training language models to follow instructions with human feedback (2022)](https://arxiv.org/pdf/2203.02155.pdf)]]

---

## InstructGPT: Summary

- The outputs generated by a small (**1.3B**) InstructGPT model were preferred to those of GPT-3

--

- The reward model was also "rather small" (**6B**)

--

- We don't know how large the model behind ChatGPT is, but chances are it's this "small"

.center[![:scale 80%](images/rlhf2.png)]

---
class: center, middle, inverse

## Implications

---

.center[![:scale 50%](images/slido-snoop-dog.png)]

---

.center[![:scale 100%](images/chatgpt_history_professor.png)]

.footnote[.center[.font-small[https://old.reddit.com/r/ChatGPT/comments/117gtom/my_friend_is_in_university_and_taking_a_history/]]]

---

.center[![:scale 50%](images/prompt_injection.png)]

.footnote[.center[.font-small[https://mobile.twitter.com/goodside/status/1598253337400717313]]]

---

.center[![:scale 50%](images/flag_chatgpt.png)]

.footnote[.center[.font-small[https://mobile.twitter.com/goodside/status/1599873570431434752]]]

---

.center[![:scale 100%](images/chatgpt-line.jpeg)]

.footnote[.center[.font-small[https://www.linkedin.com/posts/chatgpt-generative-ai_the-art-of-chatgpt-prompting-a-guide-to-activity-7036549105573060608-Xy1O/]]]

---

.center[![:scale 80%](images/jailbreak-chat.png)]

.footnote[.center[.font-small[https://www.jailbreakchat.com/]]]

---

.center[![:scale 50%](images/pi_twitter.png)]

---

.center[![:scale 50%](images/prompt-injection-leak.jpg)]

---

.center[![:scale 80%](images/bing_injection.png)]

.footnote[.center[.font-small[https://greshake.github.io/]]]

---

.center[![:scale 100%](images/openai_outages.jpeg)]

.footnote[.center[.font-small[https://www.linkedin.com/posts/petercotton_timeseries-prediction-activity-7036475253652381696-4u1]]]

---

.center[![:scale 65%](images/chatgpt_robotics.png)]

.footnote[.center[.font-small[https://www.microsoft.com/en-us/research/group/autonomous-systems-group-robotics/articles/chatgpt-for-robotics/]]]

---

.center[![:scale 100%](images/chatgpt_main.jpg)]

--

.center[![:scale 100%](images/steps.png)]

---

## Three rules for using (things like) ChatGPT

1. Let it do things you would manually check anyway
2. Have it draft things you'll rewrite anyway
3. Assume the first response will be far from final

.footnote[.center[.font-small[Inspired by https://vickiboykis.com/2023/02/26/what-should-you-use-chatgpt-for/]]]

---
class: center, middle, inverse

## marek@mareksuppa.com

---

.center[![:scale 100%](images/compute-trends.png)]

https://epochai.org/blog/compute-trends