Leap Nonprofit AI Hub

Tag: transformer architecture

Rotary Position Embeddings and ALiBi: How Modern LLMs Handle Position Without Learned Embeddings

Rotary Position Embeddings and ALiBi are the two leading methods modern LLMs use to handle sequence position without learned embeddings. They enable longer context, better extrapolation, and faster training, largely replacing older positional encoding techniques.

Read More

From Markov Models to Transformers: A Technical History of Generative AI

This article traces the technical evolution of generative AI from early probabilistic models like Markov chains to modern transformer architectures. Learn how breakthroughs in neural networks, GANs, and attention mechanisms shaped today's AI capabilities, and the challenges still ahead.

Read More

Large Language Models: Core Mechanisms and Capabilities Explained

Large language models power today’s AI assistants by using transformer architectures and attention mechanisms to process text. Learn how they work, what they can and can’t do, and why size isn’t everything.

Read More