Hidden Layers in Neural Networks (Geometric View)
A hidden layer does not produce a single value. It produces a vector of values, one per neuron.
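A hidden layer can be written as:
$$h = \sigma(Wx + b)$$
where $W$ is the weight matrix (one row per neuron), $b$ is the bias vector, and $\sigma$ is the activation function applied elementwise.
So instead of mapping:
$$\mathbb{R}^n \to \mathbb{R}$$
it maps:
$$\mathbb{R}^n \to \mathbb{R}^m$$
for $n$ inputs and $m$ neurons.
Each neuron is a separate function:
$$h_i = \sigma(w_i \cdot x + b_i)$$
So: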
- each neuron has its own weights
- each neuron defines its own hyperplane
- each neuron detects a different pattern
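As a minimal sketch of the points above, the following pure-Python snippet computes a layer's output one neuron at a time. The weights, biases, and the choice of ReLU as the activation are illustrative assumptions, not values from these notes.

```python
def relu(z):
    """ReLU activation (an assumed choice; any nonlinearity would do)."""
    return max(0.0, z)

def neuron(weights, bias, x):
    """One neuron: weighted sum of the inputs plus a bias, then the activation."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return relu(z)

def hidden_layer(W, b, x):
    """A layer is a list of neurons; its output has one entry per neuron."""
    return [neuron(w_i, b_i, x) for w_i, b_i in zip(W, b)]

# Hypothetical parameters: 3 neurons, each with its own weights over 2 inputs.
W = [[1.0, -1.0],
     [0.5, 0.5],
     [-1.0, 2.0]]
b = [0.0, -1.0, 0.5]

h = hidden_layer(W, b, [2.0, 1.0])
print(h)  # a vector with one value per neuron
```

Each row of `W` together with its entry in `b` defines one hyperplane, so adding a row adds a neuron and one more output coordinate.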
Example in 2D input space
Let $x = (x_1, x_2) \in \mathbb{R}^2$ and consider a hidden layer with 3 neurons.
Neuron 1
Detects:
- whether the point lies above the line
Neuron 2
Detects:
- which side of the line the point lies on
Neuron 3
Detects:
- another linear boundary in input space
The layer output is:
$$h = (h_1, h_2, h_3)$$
So instead of the original input, we now have a transformed representation.
The network replaces the original coordinates:
$$(x_1, x_2)$$
with new coordinates:
$$(h_1, h_2, h_3)$$
where each coordinate means:
- "how strongly feature detector 1 activates"
- "how strongly feature detector 2 activates"
- "how strongly feature detector 3 activates"
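The change of coordinates can be sketched in a few lines. The three lines below and the sigmoid activation are hypothetical choices made for illustration; the notes do not fix specific weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical lines w1*x1 + w2*x2 + b = 0, one per neuron (not taken from the notes).
detectors = {
    "above x2 = x1":        ((-1.0, 1.0), 0.0),   # positive side: x2 - x1 > 0
    "beyond x1 + x2 = 1":   ((1.0, 1.0), -1.0),   # positive side: x1 + x2 - 1 > 0
    "right of x1 = 0.5":    ((1.0, 0.0), -0.5),   # positive side: x1 - 0.5 > 0
}

def transform(x):
    """Map a 2D point to 3 new coordinates, one activation per detector."""
    h = []
    for (w1, w2), bias in detectors.values():
        z = w1 * x[0] + w2 * x[1] + bias  # signed offset from the line, scaled by |w|
        h.append(sigmoid(z))
    return h

point = (0.0, 2.0)
h = transform(point)
for name, value in zip(detectors, h):
    print(f"{name}: {value:.3f}")
```

An activation above 0.5 means the point lies on the positive side of that neuron's line, and values closer to 0 or 1 mean it lies farther from the line.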
A single neuron:
- produces one linear decision boundary
- can only separate space with one hyperplane
A layer of neurons:
- produces many hyperplanes
- creates a rich partition of space
- builds complex nonlinear representations
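XOR is a standard illustration of this contrast: no single hyperplane separates its two classes, but two hidden neurons produce coordinates in which one hyperplane suffices. The sketch below uses hand-picked weights and a step activation, both assumptions made to keep the arithmetic visible.

```python
def step(z):
    """Hard threshold: 1 on the positive side of the hyperplane, else 0."""
    return 1 if z > 0 else 0

def hidden(x1, x2):
    h1 = step(x1 + x2 - 0.5)  # fires when at least one input is 1 (OR-like)
    h2 = step(x1 + x2 - 1.5)  # fires only when both inputs are 1 (AND-like)
    return h1, h2

def output(h1, h2):
    # A single hyperplane, but drawn in the new (h1, h2) space.
    return step(h1 - h2 - 0.5)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = hidden(x1, x2)
    print((x1, x2), "->", h, "->", output(*h))
```

The two hidden neurons cut the plane with two parallel lines; the output neuron then needs only one boundary in the transformed space.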
Each layer feeds into the next:
- Layer 1: simple patterns (edges, lines)
- Layer 2: combinations of patterns (curves, corners)
- Layer 3: higher-level structures (objects)
Example:
- one neuron detects "edge"
- another neuron detects "curve"
- a third neuron detects "eye-like shape"
Next layer combines them:
- "eye + nose + mouth → face"
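A toy version of this combination step, assuming the previous layer already emits 0/1 detections for each part. The function name, weights, and threshold are hypothetical.

```python
def step(z):
    return 1 if z > 0 else 0

def face_neuron(eye, nose, mouth):
    """Next-layer neuron: weights of 1 on each part detector, bias -2.5.

    The weighted sum exceeds 0 only when all three parts are detected.
    """
    return step(1.0 * eye + 1.0 * nose + 1.0 * mouth - 2.5)

print(face_neuron(1, 1, 1))  # all parts present
print(face_neuron(1, 1, 0))  # mouth missing
```

Lowering the bias magnitude (for example to -1.5) would make the neuron fire when any two parts are present, which shows how the bias controls how strict the combination is.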
Hidden layers produce feature vectors:
$$h \in \mathbb{R}^m$$
with one coordinate per neuron in the layer.
The final layer compresses the features into a prediction:
- classification: $\hat{y} \in [0, 1]^{10}$ (digit probabilities)
- binary classification: $\hat{y} \in [0, 1]$
Example:
- Image classifier:
- Binary Classification:
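The two kinds of output layer can be sketched as follows, assuming a softmax head for multi-class output and a sigmoid head for binary output. The feature vector and all weights are made up for illustration, and three classes are used instead of ten to keep the example short.

```python
import math

def linear(W, b, h):
    """Affine map: one weighted sum of the features per output unit."""
    return [sum(w * hi for w, hi in zip(row, h)) + bi for row, bi in zip(W, b)]

def softmax(logits):
    """Turn scores into probabilities that sum to 1 (max subtracted for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

h = [0.9, 0.1, 0.4]  # hypothetical feature vector from the last hidden layer

# Multi-class head: one output unit per class.
W_cls = [[2.0, 0.0, 0.0],
         [0.0, 2.0, 0.0],
         [0.0, 0.0, 2.0]]
b_cls = [0.0, 0.0, 0.0]
probs = softmax(linear(W_cls, b_cls, h))

# Binary head: a single output unit squashed into (0, 1).
p = sigmoid(linear([[1.0, -1.0, 0.5]], [0.0], h)[0])

print(probs, p)
```

Both heads apply the same kind of hyperplane-based weighted sum as the hidden layers; only the final squashing function differs.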
Hidden layers learn new coordinate systems where each axis corresponds to a learned feature detector defined by a hyperplane in the previous space.
A hidden layer maps inputs into a new feature space where each coordinate represents the activation of a different learned hyperplane-based detector, enabling progressively more abstract representations.
A neural network progressively partitions space into regions and learns increasingly useful coordinate systems (representations) in which the target problem becomes simpler.
Neural networks act as adaptive coordinate systems that partition input space into polyhedral regions and assign each region a simple linear (or smooth) model.
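The polyhedral-region picture can be checked numerically for a small ReLU network. In the sketch below (all weights hypothetical), the set of active neurons identifies the region a point falls in, and the network is affine inside a region, so the value at a midpoint equals the average of the endpoint values only when both endpoints share a region.

```python
def relu(z):
    return max(0.0, z)

# Hypothetical 2 -> 3 -> 1 network.
W1 = [[1.0, -1.0], [1.0, 1.0], [0.0, 1.0]]
b1 = [0.0, -1.0, -0.5]
w2 = [1.0, -2.0, 0.5]
b2 = 0.25

def pre_activations(x):
    return [w[0] * x[0] + w[1] * x[1] + bias for w, bias in zip(W1, b1)]

def pattern(x):
    """Which neurons are active; points sharing a pattern lie in the same region."""
    return tuple(z > 0 for z in pre_activations(x))

def f(x):
    h = [relu(z) for z in pre_activations(x)]
    return sum(w * hi for w, hi in zip(w2, h)) + b2

def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

p, q, r = (2.0, 1.0), (3.0, 1.5), (0.0, 0.0)

# p and q share a region, so f is affine along the segment between them.
same_region_gap = f(midpoint(p, q)) - (f(p) + f(q)) / 2
# p and r lie in different regions, so the segment crosses a kink.
cross_region_gap = f(midpoint(p, r)) - (f(p) + f(r)) / 2

print(pattern(p), pattern(q), pattern(r))
print(same_region_gap, cross_region_gap)
```

With smooth activations such as sigmoid the regions no longer have sharp boundaries, which is the "or smooth" case mentioned above.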
