
2023

Transformer models: an introduction and catalog — 2023 Edition

Link: Transformer models: an introduction and catalog — 2023 Edition: "I have a terrible memory for names. In the past few years we have seen the meteoric appearance of dozens of models of the Transformer family, all of which have funny, but not self-explanatory, names. The goal of this post is to offer a short and simple catalog and classification of the most popular Transformer models. In other words, I needed a Transformers cheat-sheet and couldn't find a good enough one online, so I thought I'd write my own. I hope it can be useful to you too."

What Is ChatGPT Doing … and Why Does It Work?

Link: What Is ChatGPT Doing … and Why Does It Work?: "This is a pretty amazing article. Even though it's 'non-technical' and I read all of it, I think I only understood about 75%: 'Stephen Wolfram explores the broader picture of what's going on inside ChatGPT and why it produces meaningful text. Discusses models, training neural nets, embeddings, tokens, transformers, language syntax.'"

GPT in 60 Lines of NumPy | Jay Mody

Link: GPT in 60 Lines of NumPy | Jay Mody: "Implementing a GPT model from scratch in NumPy. This is a detailed article that includes Python source code. I've only skimmed it, but it does explain a lot. Even without being a neural net, AI, machine learning, GPT, or ChatGPT expert, I think I would learn a lot from it. The article is also very well written and presented."