Chapter 2 — Project setup with uv
In Chapter 1 we agreed, in words and equations, what we are going to build: a Python package called mygpt that produces probability distributions over tokens, and a training loop that adjusts its parameters. In this chapter we create the package — empty for now — and learn the three commands you will type for the rest of the tutorial.
By the end of this chapter you will:
- have uv installed and verified on your machine,
- have a fresh mygpt package created via uv init,
- have replaced the auto-generated stub with our first real piece of code: the four-token vocabulary from Chapter 1,
- have run two programs end-to-end: one as a package entry-point, one as a standalone experiment script.
There is no maths in this chapter. It is pure setup. We get it out of the way once.
2.1 Why uv?
Real Python projects need three things working in concert: a Python interpreter of a known version, a virtual environment that isolates the project’s libraries from the rest of your system, and a dependency manager that records which libraries the project uses and at which versions. The standard library tools (venv, pip) handle each of these separately, with their own files, commands, and edge cases.
uv is a single tool that does all three. It is fast (written in Rust), it produces a reproducible lockfile, and — most relevantly for a tutorial — it lets you run a project’s code with a single command, uv run, that takes care of installing the right Python version, creating the venv, and syncing dependencies behind the scenes. You will type uv run a lot in the next 17 chapters.
We assume nothing about whether you have used uv before. Three commands are enough for everything in this tutorial:
- uv init: create a new project.
- uv add <package>: add a dependency.
- uv run <command>: run a command inside the project’s environment.
That’s it. There are more commands; we’ll meet them when we need them.
2.2 Install uv
On macOS and Linux, install uv with:
curl -LsSf https://astral.sh/uv/install.sh | sh
On Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Or, if you already use Homebrew:
brew install uv
After installing, open a new terminal so the updated PATH takes effect, then verify the installation:
uv --version
Expected output (your version may be newer):
uv 0.8.0 (0b2357294 2025-07-17)
The exact version number does not matter for this tutorial; anything 0.4 or newer will work. The parenthetical after the version is the git commit and build date of the binary you installed — it changes from build to build and you can ignore it.
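If you would rather check the version requirement mechanically, the comparison can be sketched in a few lines of Python. This is only an illustration: the helper names parse_uv_version and is_new_enough are our own, and the sketch assumes the output always starts with "uv " followed by a dotted version number, as in the sample above.

```python
import re

# Minimum uv version this tutorial relies on, taken from the text above.
MIN_VERSION = (0, 4)


def parse_uv_version(output: str) -> tuple[int, ...]:
    """Extract the numeric version from a line like 'uv 0.8.0 (0b2357294 2025-07-17)'."""
    match = re.match(r"uv (\d+(?:\.\d+)*)", output.strip())
    if match is None:
        raise ValueError(f"unrecognised uv --version output: {output!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_new_enough(output: str) -> bool:
    """True if the reported version is at least MIN_VERSION; the parenthetical is ignored."""
    return parse_uv_version(output) >= MIN_VERSION


print(parse_uv_version("uv 0.8.0 (0b2357294 2025-07-17)"))  # (0, 8, 0)
print(is_new_enough("uv 0.8.0 (0b2357294 2025-07-17)"))     # True
print(is_new_enough("uv 0.3.5"))                             # False
```

Comparing tuples of integers rather than strings avoids the classic trap where "0.10" sorts before "0.4".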
If uv --version errors with command not found, your shell’s PATH does not include ~/.local/bin (or wherever the installer placed uv). Re-open the terminal, or follow the installer’s printed instructions.
2.3 Initialise the mygpt project
Pick a directory you keep code in (~/dev, ~/code, whatever). From a shell, cd into it and run:
uv init mygpt --package
Expected output (the path will differ on your machine):
Initialized project `mygpt` at `/Users/you/dev/mygpt`
This command creates a directory called mygpt and populates it with the standard layout for a Python package. The --package flag tells uv we want a real installable package (not a single-script project); concretely, it lays out the source under src/mygpt/ and adds a [project.scripts] section to pyproject.toml.
cd into the new project. Every command from this point onward is run from inside mygpt/ unless we say otherwise.
cd mygpt
Inspect what was just created:
find . -type f -not -path '*/.venv/*' -not -path '*/.git/*' | sort
Expected output:
./.gitignore
./.python-version
./pyproject.toml
./README.md
./src/mygpt/__init__.py
Five files. Let’s quickly look at the two that matter most.
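The find command above is a Unix tool. On a system without it (for example plain PowerShell), roughly the same listing can be sketched in Python with pathlib. The function name list_project_files is our own, and the sort order may differ slightly from the output above because Python sorts uppercase names before lowercase ones. Note that running this through uv run at this point would create .venv early, so the first-run output in §2.4 would look different; any Python 3 interpreter you already have will do.

```python
from pathlib import Path

# Directories skipped by the find command above.
EXCLUDED = {".venv", ".git"}


def list_project_files(root: Path) -> list[str]:
    """Return sorted relative paths of all files under root, skipping EXCLUDED dirs."""
    files = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if path.is_file() and not EXCLUDED.intersection(relative.parts):
            files.append(relative.as_posix())
    return sorted(files)


# List the current directory, mirroring `find . -type f ... | sort`.
for name in list_project_files(Path(".")):
    print(f"./{name}")
```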
2.3.1 pyproject.toml
cat pyproject.toml
Expected output (the authors field will reflect your local git config):
[project]
name = "mygpt"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
authors = [
    { name = "your-name", email = "you@example.com" }
]
requires-python = ">=3.12"
dependencies = []

[project.scripts]
mygpt = "mygpt:main"

[build-system]
requires = ["uv_build>=0.8.0,<0.9"]
build-backend = "uv_build"
Three blocks worth understanding:
- [project]: metadata. name is what we type to install and import. requires-python = ">=3.12" says we need Python 3.12 or later; if your system Python is older, uv will fetch a compatible interpreter for you. dependencies = [] is the list we’ll grow in Chapter 3.
- [project.scripts]: declares a console entry-point. mygpt = "mygpt:main" means “when the user runs the command mygpt, call mygpt.main()”. We’ll use this immediately.
- [build-system]: how uv should build the package. We never edit this.
2.3.2 src/mygpt/__init__.py
cat src/mygpt/__init__.py
Expected output:
def main() -> None:
    print("Hello from mygpt!")
Two-line stub. We’ll replace it in §2.5.
2.4 Run the auto-generated package
uv already gave us a runnable program. Try it:
uv run mygpt
Expected output (the first time you run this — the build/install lines may differ):
Using CPython 3.12.11
Creating virtual environment at: .venv
Building mygpt @ file:///Users/you/dev/mygpt
Built mygpt @ file:///Users/you/dev/mygpt
Installed 1 package in 4ms
Hello from mygpt!
What just happened? uv run mygpt saw that this project did not yet have a virtual environment, so it:
1. Picked a Python interpreter that satisfies requires-python = ">=3.12" (installing one from upstream if needed).
2. Created a virtual environment at ./.venv/.
3. Built our package and installed it into that venv in editable mode (so future edits to the source take effect without reinstalling).
4. Looked up the script mygpt in [project.scripts], found that it maps to mygpt:main, imported the package, and called main().
If you run it again, it skips steps 1–3 and just calls main():
uv run mygpt
Expected output:
Hello from mygpt!
That is the basic edit-run loop. Edit a file under src/mygpt/, type uv run mygpt, see the change.
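The last step of that first run, turning the string mygpt:main into a function call, can be approximated in plain Python. The sketch below is our own illustration and not the code uv generates. The name resolve_entry_point is hypothetical, and so that it runs anywhere it resolves a stand-in module called demo_pkg instead of the real package.

```python
import importlib
import sys
import types


def resolve_entry_point(target: str):
    """Turn a 'module:function' string into the function object it names."""
    module_name, function_name = target.split(":")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Stand-in for the real package: a hypothetical module called demo_pkg,
# registered by hand so the sketch runs even where mygpt is not installed.
demo_pkg = types.ModuleType("demo_pkg")
demo_pkg.main = lambda: "Hello from demo_pkg!"
sys.modules["demo_pkg"] = demo_pkg

main = resolve_entry_point("demo_pkg:main")
print(main())  # Hello from demo_pkg!
```

Inside the project, resolve_entry_point("mygpt:main") should hand back the same main function that uv run mygpt calls.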
2.5 Make the package about our project
The auto-generated Hello from mygpt! is a placeholder. Let’s replace it with something we actually need: the four-token vocabulary from Chapter 1, plus a main function that prints it.
Replace the contents of 📄 src/mygpt/__init__.py with:
"""mygpt — a tiny GPT-2-like language model, built one chapter at a time.
This file holds the package-level constants used in every chapter.
For now there is only one: the four tokens that form our running example.
"""
VOCAB: tuple[str, ...] = ("I", "love", "AI", "!")
"""The four tokens used as the running example throughout this tutorial."""
def main() -> None:
    print("Vocabulary:", VOCAB)
    print(f"Vocabulary size V = {len(VOCAB)}")
Three things to notice:
- VOCAB is a module-level constant. Any code that does from mygpt import VOCAB will get this exact tuple. We will extend the package with more constants and classes in later chapters; they will all live alongside this one.
- We use a tuple, not a list, because the vocabulary is fixed for the lifetime of the program: tuples make that intent explicit and prevent accidental mutation.
- main() is still the function pyproject.toml points its mygpt script at. We kept the name the same so uv run mygpt still works without editing pyproject.toml.
Run it:
uv run mygpt
Expected output:
Vocabulary: ('I', 'love', 'AI', '!')
Vocabulary size V = 4
You have just shipped your first piece of mygpt.
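The point about tuples preventing accidental mutation is easy to check for yourself. The following sketch uses a local copy of VOCAB so that it runs on its own; vocab_list is a throwaway name introduced only for the comparison.

```python
# Local copy of the constant from src/mygpt/__init__.py, so the sketch runs on its own.
VOCAB: tuple[str, ...] = ("I", "love", "AI", "!")

try:
    VOCAB[0] = "you"  # item assignment is not supported on tuples
except TypeError as error:
    print("tuple refused the change:", error)

# The same operation on a list succeeds silently, which is the accident we want to avoid.
vocab_list = list(VOCAB)
vocab_list[0] = "you"
print(vocab_list)  # ['you', 'love', 'AI', '!']
print(VOCAB)       # ('I', 'love', 'AI', '!')
```

A tuple only guards against in-place changes. Nothing stops code from rebinding the name VOCAB to a different object, so the all-caps name remains a convention rather than an enforced constant.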
2.6 Your first experiment script
The package itself should only contain code that the final user of mygpt would care about. Code that exists only to demonstrate or explore something belongs outside the package, in experiments.
We follow this convention throughout the tutorial. By the end of the book you will have one experiment per chapter under experiments/, all of which import from mygpt but none of which are part of mygpt.
Create the experiments directory and our first script:
mkdir -p experiments
Save the following to 📄 experiments/01_hello_mygpt.py:
"""Experiment 01 — Hello from mygpt.
Confirms the package is installed and importable from a script outside
`src/mygpt/`. Prints the vocabulary along with each token's integer id.
"""
from mygpt import VOCAB
def main() -> None:
    print(f"Vocabulary size V = {len(VOCAB)}")
    for token_id, token in enumerate(VOCAB):
        print(f" id {token_id}: {token!r}")


if __name__ == "__main__":
    main()
Run it with uv run. Note: we now invoke python <script> rather than the mygpt console script, because this is just a regular Python file, not a registered entry-point.
uv run python experiments/01_hello_mygpt.py
Expected output:
Vocabulary size V = 4
 id 0: 'I'
 id 1: 'love'
 id 2: 'AI'
 id 3: '!'
This is the token-id mapping from §1.3 of Chapter 1, now actually running on your machine. The integers 0, 1, 2, 3 are the only form in which these tokens will enter any neural-network operation in the rest of this tutorial.
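To see the mapping work in both directions, here is a small sketch. The names TOKEN_TO_ID, encode, and decode are hypothetical helpers introduced for illustration; they are not part of the mygpt package at this stage.

```python
# Local copy of VOCAB so the sketch runs outside the project; inside it,
# `from mygpt import VOCAB` would give the same tuple.
VOCAB: tuple[str, ...] = ("I", "love", "AI", "!")

# Hypothetical helper names, not part of the mygpt package.
TOKEN_TO_ID = {token: token_id for token_id, token in enumerate(VOCAB)}


def encode(tokens: list[str]) -> list[int]:
    """Map each token to its integer id."""
    return [TOKEN_TO_ID[token] for token in tokens]


def decode(ids: list[int]) -> list[str]:
    """Map each integer id back to its token."""
    return [VOCAB[token_id] for token_id in ids]


ids = encode(["I", "love", "AI", "!"])
print(ids)          # [0, 1, 2, 3]
print(decode(ids))  # ['I', 'love', 'AI', '!']
```

The tuple itself already serves as the id-to-token table, so only the reverse direction needs a dictionary.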
Why does the import work?
from mygpt import VOCAB works because uv run already installed the package into the project’s .venv in editable mode (back in §2.4). Editable mode means the venv contains a pointer to src/mygpt/, so any change you make to the source is picked up immediately by anything that imports mygpt. You will rely on this many times.
2.7 Experiments
Try these. None of them count as “doing it wrong” — they are how you build intuition.
- Add a fifth token. Edit src/mygpt/__init__.py so VOCAB becomes ("I", "love", "AI", "GPT", "!"). Re-run uv run python experiments/01_hello_mygpt.py. Notice that no reinstall was required: editable mode picks up the change immediately. Confirm the new size is 5 and the id of "!" shifts from 3 to 4.
- Break the entry-point on purpose. Rename main to _main in __init__.py. Run uv run mygpt. You will see a Python ImportError saying cannot import name 'main' from 'mygpt', ending with Did you mean: '_main'?. Concretely: uv had installed a tiny shim at .venv/bin/mygpt whose contents include from mygpt import main, and renaming the function broke that one import line. Run cat .venv/bin/mygpt if you want to see the shim. Rename _main back to main before continuing. This is the feedback loop the script declaration is for: we wire up entry points once, then a failing import tells us when we break them.
- Inspect the lockfile. Run cat uv.lock | head -30. You will see a small TOML file that uv has produced to record the exact versions of every dependency. Right now there are no dependencies, so it is short, but uv.lock is what makes the build reproducible across machines. We commit it to git in §2.9.
After each experiment, restore your VOCAB and main to the version in §2.5 before moving on. The next chapter assumes that exact starting state.
2.8 Exercises
- Read the script declaration. Open pyproject.toml and locate [project.scripts]. The line is mygpt = "mygpt:main". In one English sentence, write what each side of the = means.
- Add a second entry-point. Add a line hello-mygpt = "mygpt:main" directly underneath the existing one. Save. Run uv run hello-mygpt. What happens? Why did uv need to rebuild the package? (Hint: pyproject.toml is part of the package’s metadata, so changing it changes what gets installed.)
- Find the venv. Run ls .venv/lib/python*/site-packages/ | head. Among the entries you will see two that belong to our package:
  - a file called mygpt.pth (about 60 bytes): this is the editable-mode pointer. Cat it (cat .venv/lib/python*/site-packages/mygpt.pth) and you will see a single absolute path: the path to your src/ directory. That one line is what makes from mygpt import VOCAB find your code.
  - a folder called mygpt-0.1.0.dist-info/: this holds the install metadata (METADATA, RECORD, entry_points.txt, etc.). It is what uv writes for the package to count as “installed”, even though the source lives elsewhere.
  You will also see entries beginning with _virtualenv and a __pycache__/ folder; those belong to the virtual environment itself, not to our package. The interesting fact is that nothing in site-packages/ contains a copy of src/mygpt/, only a pointer to it. Why does that pointer-only layout make the edit-run loop you used in §2.5 work without any reinstall?
There are no “answers” for these — the goal is to build a feel for what uv actually arranged on your filesystem.
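If you want to watch the .pth mechanism in isolation, without touching your real venv, the following sketch builds a throwaway "site-packages" directory and a throwaway source directory, connects them with a one-line .pth file, and then imports through that pointer. The package name pth_demo and the function name import_via_pth are hypothetical, and site.addsitedir stands in for what Python does with site-packages at startup.

```python
import importlib
import site
import tempfile
from pathlib import Path


def import_via_pth() -> tuple[str, ...]:
    """Build a throwaway source dir plus a .pth pointer to it, then import through the pointer."""
    with tempfile.TemporaryDirectory() as tmp:
        source_dir = Path(tmp) / "src"           # plays the role of your src/
        site_dir = Path(tmp) / "site-packages"   # plays the role of the venv's site-packages/
        package_dir = source_dir / "pth_demo"    # hypothetical package name
        package_dir.mkdir(parents=True)
        site_dir.mkdir()

        (package_dir / "__init__.py").write_text('VOCAB = ("I", "love", "AI", "!")\n')
        # The pointer: one line holding the absolute path of the source directory.
        (site_dir / "pth_demo.pth").write_text(f"{source_dir}\n")

        # Process the .pth files in site_dir, as Python does for site-packages at startup.
        site.addsitedir(str(site_dir))
        importlib.invalidate_caches()

        module = importlib.import_module("pth_demo")
        return module.VOCAB


print(import_via_pth())  # ('I', 'love', 'AI', '!')
```

Nothing was copied into the stand-in site-packages directory except the pointer file, yet the import succeeds.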
2.9 Putting the project under version control
Strictly optional, but recommended. Note that uv init mygpt --package already ran git init for you and wrote a .gitignore — there is no need to run git init yourself. (If you do, git will harmlessly reply Reinitialized existing Git repository and leave your repo unchanged.) Skip straight to staging and committing:
git add .gitignore pyproject.toml README.md src/ experiments/ uv.lock
git commit -m "initial mygpt scaffold"
If git refuses the commit with Author identity unknown, set your name and email globally first:
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
Note we explicitly do not add .venv/ (the auto-generated .gitignore excludes it). We do add uv.lock — it is the file that lets a collaborator reproduce your exact dependency versions later.
2.10 What’s next
We have a package. We have an experiment script. We have a vocabulary of four tokens, mapped to ids 0, 1, 2, 3. We have zero mathematics inside mygpt so far — the package is, in effect, four strings.
In Chapter 3 we add PyTorch. We learn what a tensor is, how PyTorch automatically computes derivatives via autograd, and what an nn.Module is. After Chapter 3, we will be able to talk about the model in terms of vectors, matrices, and gradients — and from Chapter 4 onward, every chapter will add real neural-network components to mygpt.
Looking ahead — what to remember from this chapter
- uv init mygpt --package creates the package skeleton; cd mygpt and stay there.
- uv run mygpt runs the entry point declared in pyproject.toml.
- uv run python <script> runs any script inside the project’s environment.
- The package source lives in src/mygpt/. Experiments live in experiments/ and import from the package.
- Editing files under src/mygpt/ requires no re-install: editable mode already pointed the venv at the source.
On to Chapter 3 — PyTorch in 20 minutes (coming soon).