Everything the model sees
When you interact with a language model, the model does not remember previous conversations, does not have access to the internet (unless specifically connected), and does not maintain any state between sessions. All it has is what is currently in the context window: your prompt, the system instructions, and any documents or conversation history included in this session. The quality of the output is determined entirely by what is in this window. If the right information is there, the model can work with it. If it is missing, the model will fill the gap with its best guess, which may be wrong.
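As a sketch of what this means in practice: every request a tool makes is assembled from scratch, and whatever ends up in that payload is all the model sees. The send_to_model call below is a hypothetical stand-in for whichever API or library is actually in use.

```python
def build_context(system_prompt, history, user_message, documents=None):
    """Assemble everything the model will see for one request."""
    messages = [{"role": "system", "content": system_prompt}]
    if documents:
        # Reference material is just more text placed into the window.
        messages.append({"role": "user",
                         "content": "Reference material:\n\n" + "\n\n".join(documents)})
    messages.extend(history)  # previous turns, if any are being carried forward
    messages.append({"role": "user", "content": user_message})
    return messages

# The model receives this payload and nothing else: no memory, no hidden state.
# response = send_to_model(build_context("You are a careful analyst.", [], "Summarise Q3."))
```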
System instructions
Most model interactions include a system prompt that the user does not see: a set of instructions that shapes the model's behaviour, its tone, its priorities, its constraints. When you use a chat interface, the system prompt is set by whoever built that interface. It might say "be helpful, harmless, and honest" or "respond in formal English" or "always include a disclaimer about legal advice". This invisible layer explains behaviours you did not ask for. The model is not deciding to add a disclaimer; it was instructed to. When you build your own tools, you write your own system prompts and control this layer directly.
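A small illustration of that layer, using a generic message structure rather than any particular provider's API: the same user question behaves differently depending on the invisible instruction placed ahead of it.

```python
# The same question under two different system prompts. The message format is
# illustrative; real APIs differ in detail.

user_message = {"role": "user", "content": "Can my landlord raise the rent mid-lease?"}

consumer_chat = [
    {"role": "system",
     "content": "Be helpful and always include a disclaimer that this is not legal advice."},
    user_message,
]

internal_tool = [
    {"role": "system",
     "content": "Answer concisely for a legal research team. No disclaimers."},
    user_message,
]

# The first reply will carry a disclaimer and the second will not, because each
# was instructed differently in a layer the end user never sees.
```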
Temperature and determinism
When a model predicts the next token, it does not always pick the single most probable one. A parameter called temperature controls how much randomness is introduced. At temperature 0, the model always picks the most probable next token, producing deterministic, consistent, but sometimes repetitive output. At higher temperatures, the model samples from a wider distribution, producing more varied and creative output, but with a higher risk of incoherence. For tasks where consistency matters (classification, extraction, scoring), low temperature is appropriate. For tasks where variety matters (brainstorming, creative writing), higher temperature is useful. Understanding this parameter lets you tune the model's behaviour to the task.
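The mechanism can be sketched directly. The logits below are invented, but the calculation is the standard softmax with a temperature divisor: low temperature concentrates probability on the top token, high temperature spreads it out.

```python
import math

def token_probabilities(logits, temperature):
    """Turn raw next-token scores into a probability distribution at a given temperature."""
    scaled = [score / max(temperature, 1e-6) for score in logits]  # avoid division by zero
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 3.5, 1.0]  # made-up scores for three candidate tokens
print(token_probabilities(logits, 0.2))  # ~[0.92, 0.08, 0.00]: effectively always the top token
print(token_probabilities(logits, 1.5))  # ~[0.54, 0.39, 0.07]: the alternatives have a real chance
```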
Conversation history as context
In a multi-turn conversation, every previous message (yours and the model's) is included in the context window. Early instructions continue to influence later responses. The context window also fills up as the conversation progresses. In a long conversation, early messages may be dropped to make room for newer ones; the model loses access to instructions or context you provided at the start. Long conversations drift not because the model forgets but because it literally no longer sees the information. For important instructions, repeat them or use system prompts that persist throughout the session.
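A sketch of how a tool might keep a long conversation inside the window while protecting the instructions that matter, using a rough character budget as a stand-in for a real token count.

```python
def trim_history(system_prompt, history, budget_chars=8000):
    """Drop the oldest turns until the conversation fits a rough size budget.

    The system prompt is always kept; only old user/assistant turns are dropped,
    which is why instructions placed there survive where early chat messages do not.
    """
    kept = list(history)

    def size(messages):
        return len(system_prompt) + sum(len(m["content"]) for m in messages)

    while kept and size(kept) > budget_chars:
        kept.pop(0)  # the oldest turn scrolls out first, as described above

    return [{"role": "system", "content": system_prompt}] + kept
```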
Providing documents as context
You can include documents, data, or reference material directly in the context window alongside your prompt. When you paste a report into a chat and ask the model to summarise it, the report becomes part of the context. The model reads it, processes it, and generates output based on it. This is the simplest form of "giving the model access to your data": you literally put the data in the window. The limitation is size. If the document exceeds the context window, you must use a model with a larger window, split the document, or use retrieval techniques to select only the most relevant parts.
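Splitting is the easiest of those options to sketch. The chunk size here is an arbitrary character budget standing in for a real token limit; the idea is simply to break on paragraph boundaries so each piece fits in the window on its own.

```python
def split_document(text, chunk_chars=12000):
    """Split a long document into pieces that each fit comfortably in the window.

    Breaks on paragraph boundaries where possible; a single oversized paragraph
    still becomes its own (over-budget) chunk in this simple sketch.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        if current and len(current) + len(paragraph) > chunk_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = current + "\n\n" + paragraph if current else paragraph
    if current:
        chunks.append(current)
    return chunks

# Each chunk can then be summarised in its own request, with the partial
# summaries combined in a final pass.
```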
Examples
Drifting conversation
You start a chat by instructing the model to respond in bullet points with no more than three sentences per point. For the first five exchanges, it complies. By exchange twelve, it has reverted to long paragraphs. The original instruction has scrolled out of the context window. The fix is to include the formatting instruction in the system prompt, where it persists regardless of conversation length.
Temperature for different tasks
You use the same model for two tasks: classifying emails and brainstorming product names. For classification, you set temperature to 0 because you want the same email classified the same way every time. For brainstorming, you set temperature to 0.9 because you want varied, unexpected suggestions. Same model, same prompt structure, different temperature, different behaviour.
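As a sketch, the two tasks differ only in the temperature they pass. The complete function here is a hypothetical stand-in for whichever client library is in use; only the temperature values are the point.

```python
def classify_email(complete, email_text):
    # Deterministic: the same email should get the same label on every run.
    return complete(
        prompt="Classify this email as billing, support, or spam:\n" + email_text,
        temperature=0,
    )

def brainstorm_names(complete, product_brief):
    # Varied: repeated runs are expected to produce different suggestions.
    return complete(
        prompt="Suggest ten unexpected names for this product:\n" + product_brief,
        temperature=0.9,
    )
```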