Words as points in space
The foundation of language AI is a remarkable idea: every word can be represented as a position in an abstract, high-dimensional space. A word embedding takes the entirety of how a word has been used across billions of sentences and distils it into a single point in a space of roughly 300 dimensions. Two words used in similar ways end up close together. "Doctor" and "physician" are neighbours. "Berlin" and "Paris" are neighbours. And in contextual models, which compute a fresh embedding for each occurrence, "run" in the athletic sense and "run" in the business sense occupy different regions. Not a lookup table; not a dictionary. A numerical representation of meaning itself, derived from nothing but patterns of usage. The fact that this works at all is extraordinary. Meaning, something we think of as uniquely human, leaves a measurable trace in how we use language, and that trace can be captured in numbers.
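To make the geometry concrete, here is a minimal sketch with invented four-dimensional vectors standing in for real embeddings (which have hundreds of dimensions and are learned from data, not written by hand). Closeness is typically measured as cosine similarity, the cosine of the angle between two vectors.

```python
import numpy as np

# Toy 4-dimensional "embeddings". Real vectors have roughly 300
# dimensions and are learned from billions of sentences; these
# numbers are invented purely to illustrate the geometry.
embeddings = {
    "doctor":    np.array([0.90, 0.80, 0.10, 0.05]),
    "physician": np.array([0.88, 0.79, 0.12, 0.02]),
    "banana":    np.array([0.10, 0.05, 0.90, 0.70]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["doctor"], embeddings["physician"]))  # ~1.0
print(cosine_similarity(embeddings["doctor"], embeddings["banana"]))     # far lower
```

Neighbourhood, in this space, simply means a high cosine similarity.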
From meaning to generation
A language model builds on word embeddings to do something deceptively simple: given a sequence of words, it predicts the most probable next word. One token at a time, billions of times, to produce text. The result looks and feels like a thinking machine. In a real sense, that is not accidental. Our language carries meaning, structure, logic, argument. A model trained on enough language absorbs those structures. When it generates a coherent paragraph, it draws on patterns that reflect how humans reason, explain, and persuade. The model has no beliefs or intentions; its outputs are shaped by the accumulated meaning embedded in human language. The output is useful because it is built from the same material that human thought is expressed in.
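A toy version of that loop, with a hand-written bigram table standing in for the neural network (the vocabulary and probabilities are invented; a real model conditions on the entire context window, not just the previous word):

```python
import random

# Toy "model": the probability of the next word given only the
# previous word. Invented numbers, purely to show the loop.
NEXT = {
    "<start>": {"the": 1.0},
    "the":     {"cat": 0.7, "mat": 0.3},
    "cat":     {"sat": 0.9, ".": 0.1},
    "sat":     {"on": 1.0},
    "on":      {"the": 1.0},
    "mat":     {".": 1.0},
}

def generate(max_tokens: int = 10) -> str:
    tokens = ["<start>"]
    for _ in range(max_tokens):
        probs = NEXT.get(tokens[-1])
        if probs is None:
            break                         # "." has no continuation: stop
        words, weights = zip(*probs.items())
        tokens.append(random.choices(words, weights=weights)[0])  # sample one token
    return " ".join(tokens[1:])

print(generate())  # e.g. "the cat sat on the mat ."
```

Everything a large model does is a scaled-up version of this loop: a far richer model of what comes next, applied one token at a time.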
Bias is embedded too
Embeddings capture real patterns of usage, which means they also capture patterns we might not endorse. If the training data consistently pairs "CEO" with male pronouns and "assistant" with female ones, the model learns that association. If certain professions or identities are described more positively or more frequently than others, those imbalances become part of the model's representation of the world. Not a bug; a faithful reflection of the data. The same mechanism that makes embeddings powerful (capturing meaning from usage) means they absorb the biases present in that usage. Your job is to recognise where the model's defaults come from and steer against them when they do not match your intent.
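One way such associations become visible is to project words onto a direction in the embedding space, here the direction from "she" to "he" (the vectors are made up; applied to a real embedding trained on biased text, the same projection exposes the learned association):

```python
import numpy as np

# Invented 3-dimensional vectors, for illustration only.
emb = {
    "he":        np.array([0.9, 0.2, 0.4]),
    "she":       np.array([0.1, 0.9, 0.4]),
    "ceo":       np.array([0.8, 0.3, 0.5]),
    "assistant": np.array([0.2, 0.8, 0.5]),
}

axis = emb["he"] - emb["she"]     # direction pointing from "she" towards "he"
axis /= np.linalg.norm(axis)      # unit length, so scores are comparable

for word in ("ceo", "assistant"):
    score = float(emb[word] @ axis)   # positive leans "he", negative leans "she"
    print(f"{word}: {score:+.2f}")    # ceo: +0.40, assistant: -0.38
```

The model did not decide that CEOs are male; the data placed "ceo" nearer to "he", and the geometry preserved it.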
The context window
Every interaction with a model happens within a context window, a fixed-size space that holds everything the model can process at once: your current prompt, the conversation history, any system instructions, and any documents you have provided. The model's output is shaped by all of this simultaneously. If earlier parts of your conversation asked the model to write formally, and you now want casual text, the formal instructions are still influencing the output. The context window is the single most important concept for working effectively with language AI. Everything else, from prompting to retrieval to agent design, is ultimately about controlling what goes into this window and in what form.
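What controlling the window looks like in practice, sketched with a crude word-count budget standing in for a real tokeniser (the budget of 50 is artificially small, and real systems count tokens with the model's own tokeniser):

```python
MAX_TOKENS = 50  # artificially small; real windows hold thousands of tokens

def n_tokens(text: str) -> int:
    return len(text.split())  # crude stand-in for a real tokeniser

def build_context(system: str, history: list[str], prompt: str) -> str:
    budget = MAX_TOKENS - n_tokens(system) - n_tokens(prompt)
    kept: list[str] = []
    for turn in reversed(history):     # most recent turns first
        if n_tokens(turn) > budget:
            break                      # older turns silently fall out
        kept.insert(0, turn)
        budget -= n_tokens(turn)
    return "\n".join([system, *kept, prompt])

history = [f"turn {i}: " + "word " * 10 for i in range(5)]
print(build_context("You are concise.", history, "Now answer casually."))
# Turns 0 and 1 no longer fit: the model never sees them.
```

The dropped turns are not "forgotten" in any graceful sense; they are simply absent, which is why long conversations can drift away from instructions given at the start.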
Confident, not correct
The same mechanism that produces remarkably useful text also produces text that is wrong. Because the model generates probable continuations rather than retrieving verified facts, it can produce a fluent, well-structured summary of a regulation where two of the four cited paragraphs do not exist. It can generate a citation with correct format, a plausible author, and a real journal, but for a paper that was never published. The output is confident because the model is very good at producing text that looks right. Whether it is right depends on the domain, and that judgment remains with you. Every output must be reviewed by a human who knows the subject.
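Part of that review can be mechanised. A minimal sketch, assuming an invented citation format and an invented source document: pull out what the model cited, then check each item against the real text before trusting the summary.

```python
import re

# Invented source and summary, purely to show the checking step.
SOURCE_PARAGRAPHS = {1, 2, 3, 4, 5, 7}   # paragraphs that actually exist
model_summary = "Per § 3 and § 9, operators must register; § 12 adds exemptions."

cited = {int(n) for n in re.findall(r"§\s*(\d+)", model_summary)}
for n in sorted(cited):
    status = "found in source" if n in SOURCE_PARAGRAPHS else "NOT IN SOURCE"
    print(f"§ {n}: {status}")
```

A check like this catches citations that point nowhere; whether the surviving text says what the source says still requires a reader who knows the subject.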
Examples
Bias in generated content
You ask a model to generate portrait images of three professionals aged 25, 43, and 60. The model produces three conventionally attractive people who all look under 35, because the training data over-represents youth and beauty. To get realistic output, you must explicitly steer against this default, specifying "ordinary appearance" or even exaggerating the requested age to push the model away from its learned norm.
Confident fabrication
You ask the model to summarise a specific regulation. It produces a fluent, well-structured summary with paragraph numbers, dates, and legal terminology. But two of the four paragraphs it cites do not exist. The summary reads perfectly because the model is very good at producing text that looks like legal summaries. The content must always be verified against the actual source.