Hidden Layers in Neural Networks (Geometric View)

A hidden layer does not produce a single value. It produces a vector of values, one per neuron.

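A hidden layer can be written as:

$h = \sigma(Wx + b)$

where $x \in \mathbb{R}^n$ is the input, $W \in \mathbb{R}^{m \times n}$ holds one row of weights per neuron, $b \in \mathbb{R}^m$ is the bias vector, and $\sigma$ is an activation function applied elementwise.

So instead of mapping:

$\mathbb{R}^n \to \mathbb{R}$

it maps:

$\mathbb{R}^n \to \mathbb{R}^m$

Each neuron is a separate function:

$h_i = \sigma(w_i^\top x + b_i)$, where $w_i$ is the $i$-th row of $W$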

So:

  • each neuron has its own weights
  • each neuron defines its own hyperplane
  • each neuron detects a different pattern
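As a minimal sketch of the layer described above, the following plain-Python function computes one activation per neuron. The sigmoid activation and the numeric weights are assumptions chosen for illustration; they do not come from the text.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_layer(x, W, b, activation=sigmoid):
    """Apply one hidden layer: neuron i computes activation(w_i . x + b_i).

    x: list of n input values
    W: list of m weight rows, each of length n (one row per neuron)
    b: list of m biases (one per neuron)
    Returns a list of m activations, one per neuron.
    """
    return [
        activation(sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + b_i)
        for w_i, b_i in zip(W, b)
    ]

# Hypothetical parameters: 2 inputs, 3 neurons (illustrative values only).
W = [[1.0, -1.0],
     [0.5, 0.5],
     [-2.0, 1.0]]
b = [0.0, -1.0, 0.5]

h = hidden_layer([1.0, 2.0], W, b)
print(len(h))  # 3: a 2D input became a 3D vector of activations
```

Each row of `W` is one neuron's weight vector, so adding a neuron means adding a row to `W` and an entry to `b`.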

Example in 2D input space

Let $x = (x_1, x_2) \in \mathbb{R}^2$ and consider a hidden layer with 3 neurons.

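Neuron 1

$h_1 = \sigma(w_1^\top x + b_1)$, where $x$ is the input point, $w_1$ and $b_1$ are this neuron's weights and bias, and $\sigma$ is the activation function

Detects:

  • whether the point lies above the line $w_1^\top x + b_1 = 0$

Neuron 2

$h_2 = \sigma(w_2^\top x + b_2)$

Detects:

  • which side of the line $w_2^\top x + b_2 = 0$ the point lies on

Neuron 3

$h_3 = \sigma(w_3^\top x + b_3)$

Detects:

  • another linear boundary in input space, $w_3^\top x + b_3 = 0$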

The layer output is:

$h = (h_1, h_2, h_3)$, where $h_i$ is the output of neuron $i$

So instead of the original input we now have a transformed representation.

The network replaces the original coordinates:

$(x_1, x_2)$

with new coordinates:

$(h_1, h_2, h_3)$

where each coordinate means:

  • “how strongly feature detector 1 activates”
  • “how strongly feature detector 2 activates”
  • “how strongly feature detector 3 activates”
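To make "how strongly a detector activates" concrete, here is a small sketch that maps 2D points to three new coordinates. The three lines, the unit-length weight vectors, and the ReLU activation are all assumptions made for illustration. With unit-length weights, each activation is the distance of the point past that detector's line, or 0 if the point is on the other side.

```python
def relu(z):
    """Rectified linear unit: passes positive values, clips negatives to 0."""
    return max(0.0, z)

# Hypothetical detectors (not from the text). Each weight vector has unit
# length, so w . x + b is the signed distance from x to the line w . x + b = 0.
DETECTORS = [
    ((0.0, 1.0), -1.0),  # detector 1: fires above the line x2 = 1
    ((1.0, 0.0), 0.0),   # detector 2: fires to the right of the line x1 = 0
    ((0.6, 0.8), -1.0),  # detector 3: fires beyond the line 0.6*x1 + 0.8*x2 = 1
]

def new_coordinates(x):
    """Map a 2D point to 3 coordinates, one activation per detector."""
    return [relu(w[0] * x[0] + w[1] * x[1] + b) for w, b in DETECTORS]

print(new_coordinates((3.0, 2.0)))   # all three detectors fire
print(new_coordinates((-1.0, 0.0)))  # no detector fires
```

The point (3, 2) lies on the positive side of all three lines, so every coordinate is positive (roughly 1, 3, and 2.4). The point (-1, 0) lies on the negative side of all three, so every coordinate is 0.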

A single neuron:

  • produces one linear decision boundary
  • can only separate space with one hyperplane

A layer of neurons:

  • produces many hyperplanes
  • creates a rich partition of space
  • builds nonlinear representations, provided each neuron applies a nonlinear activation
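The difference between one hyperplane and several can be checked numerically. The sketch below records, for each point on a grid, which neurons have a positive pre-activation, then counts the distinct patterns. The three lines (x1 = 0, x2 = 0, x1 + x2 = 1) and the grid are hypothetical choices for illustration.

```python
def activation_pattern(x, W, b):
    """Return a tuple of booleans: True where neuron i's pre-activation is positive."""
    return tuple(
        sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + b_i > 0
        for w_i, b_i in zip(W, b)
    )

# Hypothetical hyperplanes (lines in 2D): x1 = 0, x2 = 0, x1 + x2 = 1.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, -1.0]

# Sample a grid over [-3, 3] x [-3, 3] with step 0.25.
grid = [i * 0.25 for i in range(-12, 13)]
points = [(x1, x2) for x1 in grid for x2 in grid]

# One neuron splits the plane into 2 half-planes.
single = {activation_pattern(p, W[:1], b[:1]) for p in points}
# Three neurons together split it into more regions.
layer = {activation_pattern(p, W, b) for p in points}

print(len(single))  # 2
print(len(layer))   # 7
```

Three lines in general position cut the plane into 7 regions, not 8: the pattern "left of x1 = 0, below x2 = 0, yet beyond x1 + x2 = 1" is geometrically impossible, so it never appears.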

Each layer feeds into the next:

  • Layer 1: simple patterns (edges, lines)
  • Layer 2: combinations of patterns (curves, corners)
  • Layer 3: higher-level structures (objects)

Example:

  • neuron detects “edge”
  • neuron detects “curve”
  • neuron detects “eye-like shape”

Next layer combines them:

  • “eye + nose + mouth → face”
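Image features such as edges and faces are too large to show in a few lines, so the sketch below uses a toy stand-in for the same idea: a first layer of two simple detectors, and a second layer that combines them. The XOR task, the ReLU activation, and the hand-picked weights are assumptions for illustration. No single hyperplane separates XOR, but a combination of two detectors does.

```python
def relu(z):
    """Rectified linear unit: passes positive values, clips negatives to 0."""
    return max(0.0, z)

def layer1(x1, x2):
    """First layer: two simple detectors with hand-picked (hypothetical) weights."""
    h1 = relu(x1 + x2)        # active when at least one input is on
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are on
    return h1, h2

def layer2(h1, h2):
    """Second layer: combines the two detectors into one output."""
    return h1 - 2.0 * h2

inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
outputs = [layer2(*layer1(x1, x2)) for x1, x2 in inputs]
print(outputs)  # [0.0, 1.0, 1.0, 0.0]
```

The second layer never sees the raw inputs. It works only with the detector activations, which is the sense in which each layer builds on the features of the previous one.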

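Hidden layers produce feature vectors:

$h \in \mathbb{R}^m$ (one entry per neuron)

The final layer compresses the features into a prediction:

  • classification: $\hat{y} \in \mathbb{R}^{10}$ (digit probabilities)
  • binary classification: $\hat{y} \in (0, 1)$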

Example:

  • Image classifier: feature vector → one probability per digit class
  • Binary classification: feature vector → a single probability
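A minimal sketch of the two kinds of output layer follows, assuming softmax for multi-class prediction and a sigmoid for binary prediction. These are common choices, but the text does not name them. The feature vector and all weights are hypothetical values for illustration.

```python
import math

def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic function: maps a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def linear(h, W, b):
    """Compute one score per output row: w_k . h + b_k."""
    return [sum(w * x for w, x in zip(w_k, h)) + b_k for w_k, b_k in zip(W, b)]

# Hypothetical feature vector from the last hidden layer (3 features).
h = [0.2, 1.5, 0.7]

# Multi-class head: 10 outputs, one per digit (illustrative weights).
W_multi = [[0.1 * (k - j) for j in range(3)] for k in range(10)]
b_multi = [0.0] * 10
digit_probs = softmax(linear(h, W_multi, b_multi))

# Binary head: a single output (illustrative weights).
W_bin = [[0.5, -0.25, 1.0]]
b_bin = [-0.1]
p_positive = sigmoid(linear(h, W_bin, b_bin)[0])

print(len(digit_probs), round(sum(digit_probs), 6))  # 10 1.0
print(0.0 < p_positive < 1.0)                        # True
```

Both heads read the same feature vector `h`. Only the shape of the final weight matrix and the output nonlinearity differ.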

Hidden layers learn new coordinate systems where each axis corresponds to a learned feature detector defined by a hyperplane in the previous space.

A hidden layer maps inputs into a new feature space where each coordinate represents the activation of a different learned hyperplane-based detector, enabling progressively more abstract representations.

A neural network progressively partitions space into regions and learns increasingly useful coordinate systems (representations) in which the target problem becomes simpler.

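Neural networks act as adaptive coordinate systems: with piecewise-linear activations such as ReLU, they partition input space into polyhedral regions and apply a simple linear model within each region, while smooth activations yield a smooth model with softer region boundaries.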