How to build a CNN from scratch in Python?

Marilyn 126 Published: 02/11/2025


Building a Convolutional Neural Network (CNN) from scratch in Python can be a fascinating project! Here's a step-by-step guide on how to do it:

Step 1: Import necessary libraries

You'll need to import the following libraries:

- numpy for numerical computations
- matplotlib and/or seaborn for visualization (if desired)
- random or scipy.stats for generating random data
import numpy as np
from matplotlib import pyplot as plt

Step 2: Define the neural network architecture

For a basic CNN, you'll need to define:

- The number of convolutional layers (the example below uses 2)
- The number of fully connected layers (the example below uses 1)
- The size of the input and of the convolution filters

num_convolutions = 2     # convolutional layers in the example below
num_fully_connected = 1  # fully connected layers
input_length = 32        # length of each 1-D input signal (divisible by 4, for two pooling layers)
kernel_size = 3          # length of each convolution filter

Step 3: Implement the forward pass

Write a function that takes an input and the network's parameters (filters, weights, and biases) and returns the output of the entire network, together with the intermediate activations that the backward pass will need. To keep the code short, this example operates on a 1-D input signal; real image CNNs use 2-D convolutions (e.g., scipy.signal.convolve2d). The forward pass involves:

- Convolutional layers: use np.convolve to convolve the input with learnable filters (weights)
- Pooling layers: apply max or average pooling (here, max pooling over non-overlapping windows of 2)
- Fully connected layers: use np.dot to multiply weights by inputs and add biases
def forward_pass(inputs, filters, weights, biases):
    # Convolutional layer 1 (1-D convolution with 'same' padding)
    conv1 = np.convolve(inputs, filters[0], mode='same')
    relu1 = np.maximum(conv1, 0)

    # Pooling layer 1 (max pooling over non-overlapping windows of 2)
    pool1 = relu1.reshape(-1, 2).max(axis=1)

    # Convolutional layer 2
    conv2 = np.convolve(pool1, filters[1], mode='same')
    relu2 = np.maximum(conv2, 0)

    # Pooling layer 2
    pool2 = relu2.reshape(-1, 2).max(axis=1)

    # Fully connected layer
    fully_connected = np.dot(weights, pool2) + biases

    # Cache the intermediate activations for the backward pass
    cache = (inputs, relu1, pool1, relu2, pool2)
    return fully_connected, cache
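To check that the shapes line up, you can run the forward pass with randomly initialized parameters. This is a minimal sketch assuming the constants from Step 2 (a length-32 input and a single output neuron):

rng = np.random.default_rng(0)
x = rng.standard_normal(input_length)                          # one 1-D input signal
filters = rng.standard_normal((num_convolutions, kernel_size)) * 0.01
weights = rng.standard_normal((1, input_length // 4)) * 0.01   # two poolings shrink the input by 4x
biases = np.zeros(1)

output, cache = forward_pass(x, filters, weights, biases)
print(output.shape)  # (1,)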

Step 4: Implement the backward pass

Write a function that takes the network's prediction, the target value, the cached activations, and the parameters, and returns the gradients of the loss with respect to the filters, weights, and biases. This involves:

Backpropagating the error through each layer, using the chain rule to compute gradients (fully connected layer, then max pooling, ReLU, and convolution, in reverse order).
def conv_filter_grad(x, d_y, kernel_size):
    # Gradient of y = np.convolve(x, w, mode='same') with respect to the filter w
    pad = (kernel_size - 1) // 2
    xp = np.pad(x, pad)
    return np.array([np.dot(d_y, xp[2 * pad - j : 2 * pad - j + x.size])
                     for j in range(kernel_size)])

def backward_pass(prediction, target, cache, filters, weights):
    inputs, relu1, pool1, relu2, pool2 = cache

    # Gradient of the squared-error loss with respect to the prediction
    d_out = 2.0 * (prediction - target)

    # Fully connected layer
    d_weights = np.outer(d_out, pool2)
    d_biases = d_out
    d_pool2 = weights.T @ d_out

    # Pooling layer 2: route the gradient to the max element of each window,
    # then apply the ReLU gradient (zero wherever the activation was zero)
    d_relu2 = np.zeros_like(relu2).reshape(-1, 2)
    d_relu2[np.arange(d_pool2.size), relu2.reshape(-1, 2).argmax(axis=1)] = d_pool2
    d_conv2 = d_relu2.reshape(-1) * (relu2 > 0)

    # Convolutional layer 2: gradient for the filter and for its input
    d_filter2 = conv_filter_grad(pool1, d_conv2, filters[1].size)
    d_pool1 = np.convolve(d_conv2, filters[1][::-1], mode='same')

    # Pooling layer 1 and ReLU 1
    d_relu1 = np.zeros_like(relu1).reshape(-1, 2)
    d_relu1[np.arange(d_pool1.size), relu1.reshape(-1, 2).argmax(axis=1)] = d_pool1
    d_conv1 = d_relu1.reshape(-1) * (relu1 > 0)

    # Convolutional layer 1: gradient for the first filter
    d_filter1 = conv_filter_grad(inputs, d_conv1, filters[0].size)

    return d_filter1, d_filter2, d_weights, d_biases
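A good way to convince yourself that a hand-written backward pass is correct is a finite-difference check: perturb one parameter slightly and compare the numerical change in the loss against the analytic gradient. Continuing from the Step 3 snippet (x, filters, weights, and biases already defined), with an arbitrary target value:

y = 0.5       # arbitrary target for the check
eps = 1e-5

prediction, cache = forward_pass(x, filters, weights, biases)
grads = backward_pass(prediction, y, cache, filters, weights)

# Nudge one entry of the first filter up and down and measure the change in the loss
filters[0][1] += eps
plus, _ = forward_pass(x, filters, weights, biases)
filters[0][1] -= 2 * eps
minus, _ = forward_pass(x, filters, weights, biases)
filters[0][1] += eps  # restore the original value

numeric = (((plus - y) ** 2).item() - ((minus - y) ** 2).item()) / (2 * eps)
print(numeric, grads[0][1])  # the two values should agree closely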

Step 5: Train the network

Use a dataset to train your network. This involves:

- Feeding the input data through the forward pass
- Computing the loss using a loss function (e.g., mean squared error)
- Backpropagating the error using the backward pass
- Updating the filters, weights, and biases based on the gradients and the learning rate
def train_network(X, Y, learning_rate=0.01, epochs=100):
    # Initialize filters, weights, and biases with small random values
    rng = np.random.default_rng(0)
    filters = rng.standard_normal((num_convolutions, kernel_size)) * 0.01
    pooled_length = input_length // 4            # two pooling layers of window 2
    weights = rng.standard_normal((1, pooled_length)) * 0.01
    biases = np.zeros(1)

    for epoch in range(epochs):
        for inputs, target in zip(X, Y):
            # Forward pass
            prediction, cache = forward_pass(inputs, filters, weights, biases)

            # Loss computation (mean squared error) and backward pass
            loss = np.mean((target - prediction) ** 2)  # print occasionally to monitor training
            d_filter1, d_filter2, d_weights, d_biases = backward_pass(
                prediction, target, cache, filters, weights)

            # Gradient descent update of every parameter
            filters[0] -= learning_rate * d_filter1
            filters[1] -= learning_rate * d_filter2
            weights -= learning_rate * d_weights
            biases -= learning_rate * d_biases

    return filters, weights, biases
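As a quick smoke test, you can train the network on a small synthetic dataset. This is a hypothetical example: random 1-D signals whose regression target is simply their mean value.

rng = np.random.default_rng(1)
X = rng.standard_normal((50, input_length))   # 50 random 1-D training signals
Y = X.mean(axis=1)                            # toy regression target
filters, weights, biases = train_network(X, Y)

prediction, _ = forward_pass(X[0], filters, weights, biases)
print(prediction[0], Y[0])                    # predicted vs. actual target for one signal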

This is a basic outline of how to build a CNN from scratch in Python. Note that this code is deliberately simplified (1-D inputs, a single output, plain stochastic gradient descent) and is neither efficient nor robust enough for real use cases. For practical applications, consider using libraries like TensorFlow or Keras, which provide pre-built, optimized functions for building and training neural networks.

Feel free to ask me any follow-up questions!

Python convolutional neural networks pdf

I'm happy to help! Here's a comprehensive overview of Convolutional Neural Networks (CNNs) in Python, structured so that you can save or export it as a PDF if you like:

Introduction

Convolutional Neural Networks (CNNs) are a type of neural network that have gained immense popularity in recent years due to their exceptional performance on various computer vision tasks, such as image classification, object detection, and image segmentation. In this document, we'll delve into the world of CNNs, exploring how they work, their architecture, and some practical examples.

What are Convolutional Neural Networks?

A CNN is a neural network that uses convolutional layers to extract features from images or other data. These layers use filters (also known as kernels) to scan the input data, performing a dot product at each location to produce an output. This process is repeated for multiple locations and filter sizes, allowing the network to capture various patterns and relationships within the data.
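To make "performing a dot product at each location" concrete, here is a minimal NumPy sketch of the sliding-window operation (strictly speaking a cross-correlation, which is what deep learning libraries implement under the name "convolution"); the image and kernel values are made up for illustration:

import numpy as np

def cross_correlate2d(image, kernel):
    # Slide the kernel over the image and take a dot product at each location
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            output[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return output

image = np.arange(25, dtype=float).reshape(5, 5)  # toy 5x5 "image"
kernel = np.array([[1., 0., -1.]] * 3)            # simple vertical-edge filter
print(cross_correlate2d(image, kernel))           # 3x3 feature map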

Architecture

A typical CNN architecture consists of several layers:

- Convolutional Layer: As mentioned earlier, this layer uses filters to scan the input data, extracting features and producing an output.
- Activation Function: This layer applies a non-linear activation function (e.g., ReLU, Sigmoid) to introduce non-linearity in the network.
- Pooling Layer (Optional): This layer reduces the spatial dimensions of the feature maps by performing max-pooling or average-pooling operations.
- Flatten Layer: This layer flattens the output from the pooling layer into a 1D array, allowing it to be fed into fully connected layers.
- Fully Connected Layers (Optional): These layers are used for classification tasks, where the output from the convolutional and pooling layers is fed into one or more fully connected layers for prediction.

Convolutional Layer

A convolutional layer typically consists of:

- Filters: The filters (kernels) used in this layer to scan the input data.
- Stride: The step size used when scanning the input data with the filters.
- Padding: Optional padding added to the input data to ensure that the entire image is processed by all filters.
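Together, the kernel size, stride, and padding determine the size of the output feature map: for an input of size n, kernel size k, padding p, and stride s, the output size is floor((n + 2p - k) / s) + 1. A small helper makes this easy to check (the example values are arbitrary):

def conv_output_size(n, k, stride=1, padding=0):
    # floor((n + 2 * padding - k) / stride) + 1
    return (n + 2 * padding - k) // stride + 1

print(conv_output_size(28, 3, stride=1, padding=1))  # 28 -> spatial size preserved
print(conv_output_size(28, 3, stride=2, padding=0))  # 13 -> roughly halved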

Activation Functions

Common activation functions used in CNNs include:

- ReLU (Rectified Linear Unit): f(x) = max(0, x)
- Sigmoid: f(x) = 1 / (1 + exp(-x))
- Tanh: f(x) = tanh(x)
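Each of these is a one-liner in NumPy, for example:

import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])
print(relu(x), sigmoid(x), tanh(x))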

Pooling Layers

Common pooling layers used in CNNs include:

- Max-Pooling: Selects the maximum value within each pooling window.
- Average-Pooling: Computes the average value within each pooling window.
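Here is a minimal NumPy sketch of 2x2 max-pooling and average-pooling on a single feature map, with the stride equal to the window size (the feature-map values are made up):

import numpy as np

def pool2x2(feature_map, mode="max"):
    # Split the map into non-overlapping 2x2 blocks and reduce each block
    h, w = feature_map.shape
    blocks = feature_map[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

fm = np.arange(16, dtype=float).reshape(4, 4)
print(pool2x2(fm, "max"))  # 2x2 map of block maxima
print(pool2x2(fm, "avg"))  # 2x2 map of block averages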

Python Implementation

CNNs can be implemented in Python using popular libraries such as TensorFlow, Keras, and PyTorch. Here's an example of a basic CNN architecture in TensorFlow:

import tensorflow as tf

# Define the model architecture
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])

# Compile the model (the final Dense layer outputs raw logits, so set from_logits=True)
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
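To actually train this model you would load a dataset such as MNIST (28x28 grayscale digits, which matches the input_shape above), scale the pixel values, add a channel dimension, and call model.fit. A minimal sketch:

# Load MNIST, scale to [0, 1], and add the channel dimension expected by Conv2D
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0
x_test = x_test[..., None] / 255.0

model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
model.evaluate(x_test, y_test)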

Conclusion

In this document, we've explored the world of Convolutional Neural Networks (CNNs), covering their architecture, layers, and activation functions. We've also seen a basic example of implementing a CNN in Python using TensorFlow.

Additional Resources

For further learning, I recommend checking out these resources:

- Convolutional Neural Networks by Ian Goodfellow et al. (PDF)
- Python Convolutional Neural Networks with Keras (TensorFlow tutorial)

I hope this helps!