Foundations — what is Claude Code, and why should I care?

Foundations, before your first session

Six foundational explainers for anyone who has never used a tool like Claude Code before: what an LLM actually is, why context windows matter, what makes an agent different from a chatbot, what Claude Code is, how it differs from ChatGPT, and why we picked it over the alternatives. Read this first if any of those questions feels uncertain.

Audience: complete beginner · ~20 min read · 16 curated external resources

Read straight through, or skip to the section that matches the question you’re sitting on right now. The order goes concept → product → choice: what the underlying AI is, what it can and can’t remember, what changes when you give it tools, what the specific product is, how it relates to its siblings, and why we picked it.

What is a large language model (LLM)?

Cut through the word “AI” first. AI is a marketing umbrella. It covers image generation, speech recognition, self-driving cars, recommendation engines, robotics — and the thing we’re actually here to talk about. On its own the word is almost useless.

The specific thing that exploded in late 2022, the thing reshaping how knowledge workers (ours included) do their jobs, is one narrow slice called a Large Language Model, or LLM. Other kinds of AI exist and matter. Throughout this hub, when we say AI, we mean LLM. Everything else is a different conversation.

An LLM is a program. A very large one. It was trained on a huge amount of text — books, websites, code, documentation, forums, manuals. Most of the public internet. During training it learned one thing, and only one thing, very well: given a sequence of words, predict the next word. That’s it. That’s the whole trick.

Now do that prediction billions of times, with hundreds of billions of internal parameters, and what emerges is a system that can answer questions, write code, summarise a document, or draft an email, because all of those, when you look closely, are “what comes next” problems. You ask “what’s the capital of France”: the most statistically likely next words are “The capital of France is Paris.” You ask “write a Python function that sorts a list”: the most likely next words are valid Python. You paste a five-page contract and ask for a summary: the most likely next words are a summary.
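To make “predict the next word” concrete, here is a toy sketch. It is not how a real LLM works internally (real models use neural networks over tokens, not word-pair counts); it only counts which word follows which in a tiny made-up corpus and then always picks the most frequent follower. The function names and the corpus are illustrative assumptions.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def generate(model, start, max_words=5):
    """Repeatedly append the most likely next word."""
    output = [start]
    for _ in range(max_words):
        nxt = predict_next(model, output[-1])
        if nxt is None:
            break
        output.append(nxt)
    return " ".join(output)

# Hypothetical three-sentence "training set".
corpus = (
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of france is paris ."
)
model = train_bigram_model(corpus)

print(predict_next(model, "is"))                 # paris
print(generate(model, "capital", max_words=4))   # capital of france is paris
```

Notice that the toy model says “paris” only because that pairing was most common in its training text, not because it knows anything about France. Scale that idea up enormously and you have the intuition behind an LLM.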

1. It is statistical, not thinking. The LLM has no beliefs, no opinions, no memory between conversations, no model of truth versus falsehood. It is optimising for plausibility — what looks like a reasonable next word — not correctness. Most of the time plausibility and correctness overlap. When you ask “what’s two plus two,” the plausible answer and the correct answer are both “four.” But they can diverge.

2. That’s why it hallucinates. A hallucination is when the model confidently produces something that sounds right and isn’t. A made-up function name. A fictional court case. A wrong date. A library that doesn’t exist. It’s not lying. Lying requires knowing the truth. It’s pattern-matching to what a plausible answer looks like. If you ask for a citation, citations have a shape (author, year, title, journal), and it will produce something with that shape, even if the citation doesn’t exist. The fix is not blind trust. The fix is to verify. Always.

3. A token is the unit. When the LLM predicts “the next word” it’s actually predicting the next token. A token is roughly a word, or a piece of a word. “Hello” is one token. “Hallucination” might be three. Punctuation usually gets its own token. You’ll see token counts everywhere: pricing is per token, context limits are in tokens. A 200,000-token context window is roughly 150,000 words.

4. Prompt in, completion out. Everything that goes in (your question, the files Claude reads, the instructions you give) is the prompt. Everything it produces back is the completion. That’s the whole vocabulary.

5. It has no memory between sessions. Close the session, reopen it tomorrow, and the model has zero recollection of yesterday. Everything Claude “knows” about your project, your conventions, your past conversations is what’s currently in its context window (next step). Anything outside that, it cannot see. The knowledge baked into the model itself was also frozen at some point in the past, its “training cutoff.” Ask it about events after that date and it’ll either say it doesn’t know or, worse, make something up that sounds right.
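Points 3 to 5 fit together, and a small sketch may help. It assumes the rule of thumb from point 3 (200,000 tokens is roughly 150,000 words, so about 0.75 words per token) and uses a plain word count as a stand-in for a real tokenizer, which will give different numbers. The helper names and the tiny 20-token limit are hypothetical, chosen only to keep the example readable.

```python
# Assumption: ~0.75 words per token, taken from the
# "200,000 tokens is roughly 150,000 words" rule of thumb.
WORDS_PER_TOKEN = 0.75

def estimate_tokens(text):
    """Rough token estimate from a word count; real tokenizers differ."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)

def build_prompt(messages, context_limit):
    """Keep the newest messages that fit inside the token budget.

    Whatever is dropped here never reaches the model: it cannot see it.
    """
    kept = []
    used = 0
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > context_limit:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    return kept, used

# Hypothetical conversation history, oldest first.
history = [
    "Our project uses tabs, not spaces.",
    "Please rename the config file to settings.toml.",
    "Now add a test for the loader.",
]

# A deliberately tiny window so the effect is visible.
kept, used = build_prompt(history, context_limit=20)
print(kept)   # the oldest message (the tabs rule) no longer fits
print(used)   # 18

# A brand-new session starts with nothing at all.
print(build_prompt([], context_limit=20))   # ([], 0)
```

The oldest instruction falls out of the window first, which is why a convention stated early in a long session can appear to be “forgotten”. A new session starts from an empty list, which is the whole story behind “no memory between sessions”.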

This is a wildly compressed summary. If you want to actually understand this, the resources at the end of this section are some of the best free explanations ever produced.

Watch / read 3 curated

Video ~10 min

Large Language Models Explained Briefly

3Blue1Brown — The single best 10-minute explainer ever produced. Visual, accurate, no jargon. If you only watch one thing on this page, watch this.

Surface	What it is	When to reach for it	What you pay
Claude.ai	The chatbot at claude.ai (and the desktop/mobile apps). A web tab with a text box.	Quick questions. Drafting text. Brainstorming. “Explain this concept.” Doesn’t touch your files.	Free tier, then ~$20/month (Pro), ~$100-200/month (Max)
Claude Code	A command-line tool. Runs in your terminal. Has filesystem access, runs commands, edits code.	Anything that needs to change files in a project. Refactors. New features. Debugging. CI/CD.	Included in Claude.ai Pro+. Or pay-per-token via the API
Anthropic API	Raw programmatic access: your code calls Claude over HTTP.	You’re building something that uses Claude inside it. Custom or internal tools.	Pay-per-token.
Claude Cowork	A variant of Claude Code aimed at non-developers. Research preview.	Non-technical folks who still want hands-on capability.	Same Claude.ai subscriptions.
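“Pay-per-token” is simple arithmetic, sketched below. Input tokens (the prompt) and output tokens (the completion) are priced separately, per million tokens. The prices in the example are placeholders, not real rates; check the current pricing page before budgeting anything. The function name is hypothetical.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the cost of one request in dollars."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost

# Placeholder prices, NOT real rates: $1 per million input tokens,
# $2 per million output tokens.
cost = estimate_cost(
    input_tokens=50_000,
    output_tokens=2_000,
    input_price_per_million=1.00,
    output_price_per_million=2.00,
)
print(f"${cost:.3f}")   # $0.054
```

The prompt usually dominates the bill, because every file the model reads counts as input tokens.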

Tool	What it is	Why you might pick it	Why you might not
Claude Code	Anthropic’s agent, Claude only.	Deepest reasoning. 200K context delivered reliably. Strong autonomous multi-step work. Terminal-first, so it composes with any editor.	Locked to Claude (no GPT/Gemini switching). Subscription cost can climb on heavy use.
Cursor	VS Code fork with built-in agent.	Visual IDE feel. Can switch between models. Fast inline edits.	Reports of context truncation under the hood. IDE lock-in if you prefer your existing setup.
Cline	VS Code extension, full agent loop.	Lives inside your existing VS Code. Step-by-step approval mode. Open source.	Pay your own model bills. Less “deeply integrated” feel.
Aider	Terminal pair-programmer. Model-agnostic.	Git-native (every edit is a commit). Token-efficient. Works with any model.	No visual interface. Smaller blast radius: less autonomous than Claude Code on big tasks.
Copilot	Inline completions + a separate chat.	Already on most dev machines. Tight VS Code integration.	Completion-shaped, not agent-shaped. Less suited to multi-file refactors.
