Lesson 2: PyTorch Basics and Neural Networks
Introduction
Welcome to Lesson 2! Today, we're diving into PyTorch, a powerful library for machine learning, and exploring the basics of neural networks. By the end of this lesson, you'll be able to create and train your own simple neural network!
Don't worry if some concepts seem challenging at first. We'll break everything down step-by-step and use practical examples to help you understand.
PyTorch Basics: Tensors
In PyTorch, everything revolves around tensors. You can think of tensors as sophisticated arrays that can operate on GPUs for faster processing. They're like the building blocks of all our neural networks.
Imagine tensors as boxes that can hold numbers. A single number is a 0-dimensional tensor, a list of numbers is a 1-dimensional tensor, a table of numbers is a 2-dimensional tensor, and so on. These "boxes" can do math operations really fast, which is crucial for machine learning.
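For instance, here is a quick sketch (the values are arbitrary) that builds a tensor of each of these dimensions and checks its shape:

import torch

scalar = torch.tensor(7)                   # 0-dimensional: a single number
vector = torch.tensor([1, 2, 3])           # 1-dimensional: a list of numbers
matrix = torch.tensor([[1, 2], [3, 4]])    # 2-dimensional: a table of numbers

print(scalar.dim(), vector.dim(), matrix.dim())  # 0 1 2
print(matrix.shape)                              # torch.Size([2, 2])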
Let's look at some basic tensor operations:
import torch
# Creating tensors
x = torch.tensor([1, 2, 3, 4, 5])
y = torch.tensor([2, 4, 6, 8, 10])
# Element-wise operations
z = x + y
print("x + y =", z)
# Matrix multiplication
A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])
C = torch.matmul(A, B)
print("A * B =", C)
# Gradients
x = torch.tensor([2.0], requires_grad=True)
y = x ** 2
y.backward()
print("Gradient of y with respect to x:", x.grad)
In this code, we create tensors, perform element-wise operations, do matrix multiplication, and even compute gradients, which are essential for training neural networks.
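Since tensors can also live on a GPU, here is a minimal sketch of moving one there when a GPU is available (whether one exists depends on your machine; the code falls back to the CPU otherwise):

import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.tensor([1.0, 2.0, 3.0])
x = x.to(device)    # move the tensor to the chosen device
print(x.device)     # prints cuda:0 on a GPU machine, cpu otherwise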
Building a Simple Neural Network
Now that we understand tensors, let's use them to build a neural network. A neural network is like a digital brain: it takes in information, processes it through layers of "neurons", and produces an output.
Think of a neural network as a series of filtering systems. Each layer filters the information in a specific way, gradually transforming the input into the desired output. For example, in an image recognition task, early layers might detect edges, middle layers might identify shapes, and later layers might recognize complex objects.
Let's create a simple neural network to solve the XOR problem:
import torch
import torch.nn as nn
import torch.optim as optim

# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.layer1 = nn.Linear(2, 5)
        self.layer2 = nn.Linear(5, 1)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = self.layer2(x)
        return x

# Create the model, loss function, and optimizer
model = SimpleNN()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Training data
X = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
y = torch.tensor([[0], [1], [1], [0]], dtype=torch.float32)

# Training loop
for epoch in range(1000):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y)

    # Backward pass and optimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch+1}/1000], Loss: {loss.item():.4f}')

# Test the model
with torch.no_grad():
    predicted = model(X)
    print("Final predictions:")
    print(predicted)
This code defines a simple neural network with two layers, creates training data for the XOR problem, and trains the network to solve it. The XOR problem is a classic example where a simple linear model fails, but a neural network can solve it easily.
XOR stands for "exclusive OR". It's a logical operation that returns true only when the inputs differ (one is true and the other is false). Imagine you have two light switches that control the same light bulb. The light is on only when one switch is up and the other is down. If both switches are up or both are down, the light is off. This is XOR in action!
In machine learning terms:
- Input (0,0) should output 0
- Input (0,1) should output 1
- Input (1,0) should output 1
- Input (1,1) should output 0
A simple linear model (like a single-layer perceptron) can't solve this because XOR isn't linearly separable - you can't draw a straight line to separate the 1s from the 0s. But a neural network with at least one hidden layer can create a more complex decision boundary, allowing it to solve the XOR problem. This demonstrates the power of neural networks to learn and represent complex patterns that simpler models cannot.
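To see this concretely, here is a small sketch (the learning rate and epoch count are illustrative choices, not part of the lesson code) that trains a purely linear model on the same XOR data:

import torch
import torch.nn as nn
import torch.optim as optim

X = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
y = torch.tensor([[0], [1], [1], [0]], dtype=torch.float32)

# A purely linear model: no hidden layer, no non-linearity
linear_model = nn.Linear(2, 1)
criterion = nn.MSELoss()
optimizer = optim.SGD(linear_model.parameters(), lr=0.1)

for epoch in range(5000):
    optimizer.zero_grad()
    loss = criterion(linear_model(X), y)
    loss.backward()
    optimizer.step()

# The best a linear model can do on XOR is predict 0.5 for every input,
# so the loss plateaus around 0.25 instead of approaching 0.
print("Linear model loss:", loss.item())
print("Linear model predictions:", linear_model(X).detach().squeeze())

The hidden-layer network, by contrast, has the capacity to drive this loss much lower, because the ReLU hidden layer lets it bend the decision boundary.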
Gradient Descent and Backpropagation
The magic behind training neural networks lies in two key concepts: gradient descent and backpropagation.
Gradient descent is like walking down a hill to find the lowest point. The "hill" represents the error of our network, and we want to find the lowest error. We take steps in the direction that decreases the error most quickly.
Backpropagation is how we calculate which direction to step. It's like tracing back through the network to see how each part contributed to the error, so we know how to adjust each part.
In our code above, loss.backward() performs backpropagation, and optimizer.step() performs gradient descent.
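To make those two steps concrete, here is a tiny sketch of gradient descent on a single made-up function, f(w) = (w - 3)^2, whose lowest point is at w = 3 (the learning rate and step count are arbitrary):

import torch

# A single parameter we want to optimize, starting at w = 0
w = torch.tensor([0.0], requires_grad=True)
lr = 0.1  # learning rate: how big a step we take downhill

for step in range(20):
    loss = (w - 3) ** 2      # the "hill": its lowest point is at w = 3
    loss.backward()          # backpropagation: compute d(loss)/dw
    with torch.no_grad():
        w -= lr * w.grad     # gradient descent: step against the gradient
    w.grad.zero_()           # clear the gradient before the next step

print(w)  # close to 3 after 20 steps

In a real network there are thousands of parameters instead of one, but loss.backward() and optimizer.step() are doing exactly this for each of them.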
Visualizing the Training Process
Let's visualize how the loss decreases (and accuracy improves) during training. Plotting these curves over the epochs gives a clear picture of a typical training process for a neural network.
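One way to do this yourself is to record the loss at every epoch and plot it. Here is a sketch that reruns the XOR training loop from above, condensed into nn.Sequential for brevity (matplotlib is not part of PyTorch and must be installed separately):

import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

# Same XOR setup as above, written as nn.Sequential
X = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
y = torch.tensor([[0], [1], [1], [0]], dtype=torch.float32)
model = nn.Sequential(nn.Linear(2, 5), nn.ReLU(), nn.Linear(5, 1))
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

losses = []
for epoch in range(1000):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())   # record the loss at every epoch

plt.plot(losses)
plt.xlabel("Epoch")
plt.ylabel("MSE loss")
plt.title("XOR training loss")
plt.show()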
Challenge: Extend the Neural Network
Now it's your turn! Try to extend our simple neural network to solve a more complex problem. Here are some ideas:
- Add more layers to the network
- Increase the number of neurons in each layer
- Try to solve a different problem, like recognizing handwritten digits (you can use the MNIST dataset)
- Experiment with different activation functions (like sigmoid or tanh instead of ReLU)
- Implement a simple convolutional neural network (CNN) for image classification
This challenge will help you get comfortable with building and modifying neural networks in PyTorch.
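As one possible starting point for the first two ideas (the layer sizes, the tanh activation, and the sigmoid output below are illustrative choices, not the only reasonable ones), you could deepen the network like this:

import torch
import torch.nn as nn

class DeeperNN(nn.Module):
    def __init__(self):
        super(DeeperNN, self).__init__()
        self.layer1 = nn.Linear(2, 8)      # wider first hidden layer
        self.layer2 = nn.Linear(8, 8)      # an extra hidden layer
        self.layer3 = nn.Linear(8, 1)

    def forward(self, x):
        x = torch.tanh(self.layer1(x))     # tanh instead of ReLU
        x = torch.tanh(self.layer2(x))
        x = torch.sigmoid(self.layer3(x))  # squash the output into (0, 1)
        return x

model = DeeperNN()
print(model)

The training loop stays exactly the same; only the model definition changes.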