
PyTorch

Information

PyTorch is an open-source machine learning framework originally developed by Meta AI and now governed by the PyTorch Foundation. It uses dynamic computation graphs (define-by-run): the graph is built as the code executes, so models can be debugged with ordinary Python tools such as print statements and pdb. PyTorch is widely used in deep learning research and increasingly in production.
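To make the define-by-run idea concrete, here is a minimal sketch in which ordinary Python control flow decides which operations autograd records. The function name `forward` and the input values are illustrative assumptions, not part of any PyTorch API.

```python
import torch

def forward(x: torch.Tensor) -> torch.Tensor:
    # A plain Python branch: only the operations that actually run are recorded.
    if x.sum() > 0:
        return (x ** 2).sum()
    return (-x).sum()

a = torch.tensor([1.0, 2.0], requires_grad=True)
forward(a).backward()
grad_pos = a.grad.clone()   # gradient of sum(x^2) is 2x

b = torch.tensor([-1.0, -2.0], requires_grad=True)
forward(b).backward()
grad_neg = b.grad.clone()   # gradient of sum(-x) is -1 for every element
```

Because the graph is rebuilt on every call, the two inputs take different branches and receive different gradients without any special graph-construction syntax.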

Key components:

Installation

Rocky Linux / Fedora — CPU only

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

Rocky Linux / Fedora — GPU (CUDA 12.x)

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Always verify the correct install command for your CUDA version at pytorch.org/get-started.

Debian / Ubuntu — CPU only

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

FreeBSD

PyTorch does not have official FreeBSD packages. Use a Linux VM or container.

Windows

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Verify installation

import torch
print(torch.__version__)
print(torch.cuda.is_available())   # True if GPU is accessible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # GPU name; this call fails when no CUDA device is present

Configuration

Device selection

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Move model and data to device
# (MyModel and inputs are placeholders for your own nn.Module subclass and input tensor)
model = MyModel().to(device)
inputs = inputs.to(device)
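A self-contained variant of the device pattern, as a sketch: `nn.Linear(8, 2)` and the random batch are stand-ins chosen for illustration. It also shows that a tensor can be created directly on the target device instead of being copied there afterwards.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(8, 2).to(device)           # stand-in for a real model
inputs = torch.randn(4, 8, device=device)    # allocated on the device, no extra copy
outputs = model(inputs)

param_device = next(model.parameters()).device
```

Model parameters and inputs must live on the same device; mixing CPU and CUDA tensors in one operation raises a runtime error.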

Reproducibility

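import random
import numpy as np
import torch

torch.manual_seed(42)             # seeds the random number generators on all devices
torch.cuda.manual_seed_all(42)    # explicit for multi-GPU; silently ignored without CUDA
random.seed(42)
np.random.seed(42)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False   # autotuning can select different kernels between runs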
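A quick way to check that seeding behaves as expected is to draw the same random tensor twice. This is a minimal sketch; the helper name `sample` is hypothetical.

```python
import torch

def sample(seed: int) -> torch.Tensor:
    # Re-seeding resets the global generator, so the draw is repeatable.
    torch.manual_seed(seed)
    return torch.rand(3)

first = sample(42)
second = sample(42)
other = sample(43)

same = torch.equal(first, second)        # identical seed, identical values
different = not torch.equal(first, other)
```

Seeding alone does not guarantee bit-identical results across different hardware or PyTorch versions, so a check like this is only meaningful within one environment.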

Usage, tips and tricks

Tensor basics

import torch

# Create tensors
x = torch.tensor([1.0, 2.0, 3.0])
z = torch.zeros(3, 4)
o = torch.ones(2, 2)
r = torch.rand(3, 3)        # uniform [0, 1)
n = torch.randn(3, 3)       # standard normal

# Operations
y = x * 2
dot = torch.dot(x, x)
mm = torch.mm(r, r)         # matrix multiply

Autograd

import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 3
y.backward()
print(x.grad)   # tensor(12.), since dy/dx = 3x^2 = 12 at x = 2
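One detail worth illustrating is that gradients accumulate in `.grad` across `backward()` calls, which is why training loops reset them every step. A minimal sketch reusing the same function y = x^3:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

(x ** 3).backward()
first = x.grad.item()         # 3 * 2^2 = 12

(x ** 3).backward()           # second call adds to the stored gradient
accumulated = x.grad.item()   # 12 + 12 = 24

x.grad.zero_()                # reset in place
(x ** 3).backward()
after_reset = x.grad.item()   # back to 12
```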

Defining a model

import torch.nn as nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 10)
        )

    def forward(self, x):
        return self.layers(x)
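As a usage sketch, the model can be called on a batch to check output shapes. The 28x28 image batch below is an assumption (784 = 28 * 28 suggests MNIST-sized inputs, but the source does not say so); the class is repeated so the block runs on its own.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 10)
        )

    def forward(self, x):
        return self.layers(x)

model = MLP()
batch = torch.randn(32, 1, 28, 28)            # hypothetical batch of 32 grayscale images
logits = model(batch.flatten(start_dim=1))    # flatten each image to 784 features

# Count trainable parameters: weights and biases of both linear layers.
n_params = sum(p.numel() for p in model.parameters())
```

The output has one row per sample and one column per class; `nn.CrossEntropyLoss` expects these raw logits, so no softmax layer is needed in the model.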

Training loop skeleton

model = MLP().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    model.train()
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
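The skeleton above leaves `train_loader` undefined. The following sketch fills it in with synthetic data so the loop can be run end to end; the dataset, the small model, and the hyperparameters are all illustrative assumptions rather than recommendations.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic, linearly separable data (a stand-in for a real dataset).
features = torch.randn(256, 20)
targets = (features[:, 0] > 0).long()
train_loader = DataLoader(TensorDataset(features, targets), batch_size=32, shuffle=True)

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

epoch_losses = []
for epoch in range(20):
    model.train()
    running = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        running += loss.item()
    epoch_losses.append(running / len(train_loader))   # mean batch loss per epoch

# Evaluation: eval() matters for layers such as dropout; no_grad() skips graph construction.
model.eval()
with torch.no_grad():
    preds = model(features.to(device)).argmax(dim=1).cpu()
accuracy = (preds == targets).float().mean().item()
```

In a real project the accuracy would be measured on a held-out validation set rather than on the training data as done here.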

Save and load model

# Save
torch.save(model.state_dict(), "model.pt")

# Load
model = MLP().to(device)
model.load_state_dict(torch.load("model.pt", map_location=device))
model.eval()   # switch layers such as dropout and batch norm to inference mode
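A round-trip check can confirm that a restored model reproduces the original outputs. This sketch uses a small `nn.Linear` stand-in and a temporary directory; it assumes a PyTorch version that supports the `weights_only` argument of `torch.load` (1.13 or later), which restricts unpickling to tensors and plain containers.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)   # stand-in for a real model

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pt")
    torch.save(model.state_dict(), path)

    restored = nn.Linear(4, 2)   # must have the same architecture
    # weights_only=True is an assumption about the installed version (see above).
    state = torch.load(path, map_location="cpu", weights_only=True)
    restored.load_state_dict(state)

restored.eval()

x = torch.randn(3, 4)
with torch.no_grad():
    same = torch.equal(model(x), restored(x))   # identical weights, identical outputs
```

Saving the `state_dict` rather than the whole model object keeps the checkpoint independent of the class's import path, at the cost of having to rebuild the architecture before loading.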

See also