Table of Contents

I. Transformers

II. How Feedforward Transformers Work Compared to the Brain

III. Predictive Coding in Neuroscience

IV. Vector Context and Tokens

V. Model Reasoning and Basins of Attraction

VI. AI–Neural Correspondence

VII. The Role Emotions Play in Decision Making

VIII. Memory

IX. AI History: The Neural Network

X. Symbolists vs Connectionists

XI. Miscellaneous


TRANSFORMERS

LLM transformer architectures share almost no design heritage with biological neural circuits. The convergences between the two were empirical surprises, not design goals.

A Transformer is a type of neural network architecture designed to process sequences (text, code, audio, DNA, etc.) by letting every element in the sequence look at every other element and decide what matters. The key mechanism, attention, determines which parts of the input matter most for predicting the next output.

Instead of reading information strictly left-to-right like older models, a Transformer can consider the entire context at once and weigh which parts are relevant.
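That whole-context weighting can be sketched as scaled dot-product self-attention. The snippet below is a minimal illustration, not a full Transformer: for simplicity the queries, keys, and values are all the raw token vectors (a real model learns separate projection matrices for each), and the toy `tokens` array is an invented example.

```python
import numpy as np

def self_attention(x):
    """x: (seq_len, d) array of token vectors.
    Simplified: queries = keys = values = x (no learned projections)."""
    d = x.shape[-1]
    # Score every token against every other token, scaled by sqrt(d).
    scores = x @ x.T / np.sqrt(d)
    # Softmax each row so the weights over the sequence sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted mix of the entire sequence.
    return weights @ x

# Three toy 2-dimensional "tokens"; every output row blends
# information from the whole context at once.
tokens = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
out = self_attention(tokens)
print(out.shape)  # (3, 2)
```

Because the attention weights are computed over all positions simultaneously, nothing forces a strict left-to-right reading order; relevance, not position alone, decides what each output draws on.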


Core Concept

A Transformer does two main things: