Hessian matrix


Hessian matrix: Understanding the second-order behavior of functions

The Hessian matrix is a mathematical tool used to analyze the curvature of a function. It helps us determine whether a function is convex, concave, or neither by looking at its second-order derivatives. In optimization, the Hessian is particularly useful: if it is positive definite everywhere, the function is strictly convex and has at most one minimizer, and how well conditioned the Hessian is affects how quickly methods such as gradient descent converge.

convex and concave functions.png|500


What is the Hessian matrix

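The Hessian of a function $f:\mathbb{R}^n \to \mathbb{R}$ is a square matrix containing all the second-order partial derivatives of the function. It is written as:

$$
H_f(x) = \nabla^2 f(x) =
\begin{bmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}
$$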

Each entry of this matrix is the rate of change of one partial derivative with respect to another variable. The diagonal elements describe how the function curves along each coordinate direction, while the off-diagonal elements describe how the slope along one variable changes as another variable varies. When the second partial derivatives are continuous, the mixed partials are equal, so the Hessian is symmetric.
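To make the definition concrete, here is a minimal sketch that approximates each entry of the Hessian with central finite differences. The function name `numerical_hessian`, the step size `h`, and the test function are illustrative assumptions, not part of the original note; in practice an automatic differentiation library would usually be preferred.

```python
def numerical_hessian(f, x, h=1e-4):
    """Approximate the Hessian of f at x (a list of floats) by central differences.

    Hypothetical helper for illustration; h is an assumed step size.
    """
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f with coordinate i moved by di*h and coordinate j by dj*h.
                p = list(x)
                p[i] += di * h
                p[j] += dj * h
                return f(p)
            # Central-difference estimate of d^2 f / (dx_i dx_j).
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


# Example: f(x1, x2) = x1^2 + 3*x1*x2 + 2*x2^2 has the constant Hessian [[2, 3], [3, 4]].
f = lambda p: p[0] ** 2 + 3 * p[0] * p[1] + 2 * p[1] ** 2
H = numerical_hessian(f, [1.0, -2.0])
print(H)  # entries are close to [[2, 3], [3, 4]], up to rounding error
```

The off-diagonal entries come out equal (up to rounding), which reflects the symmetry of the Hessian for smooth functions.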


Why is the Hessian important

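The Hessian matrix is used to:

- Determine whether a function is convex, concave, or neither.
- Classify critical points as minima, maxima, or saddle points.
- Judge how well optimization methods such as gradient descent can be expected to perform.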


Understanding the Hessian in one dimension (1D case)

In one dimension, the Hessian reduces to a single second derivative:

$$H_f(x) = \frac{d^2 f}{dx^2}$$

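Here’s how to interpret it:

- If $\frac{d^2 f}{dx^2} > 0$, the function curves upward at that point (locally convex).
- If $\frac{d^2 f}{dx^2} < 0$, the function curves downward at that point (locally concave).
- If $\frac{d^2 f}{dx^2} = 0$, the test is inconclusive at that point; it may be an inflection point, where the curvature changes sign.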

Convex And Concave Functions And Inflection Points General.png|500

Example 1: quadratic function

Consider the function

$$f(x) = x^2$$

Its first derivative is:

$$\frac{df}{dx} = 2x$$

Its second derivative is:

$$\frac{d^2 f}{dx^2} = 2$$

Since $2 > 0$ for every $x$, the function is convex: its graph forms a parabolic bowl that always curves upwards.

Convex And Concave Functions And Inflection Points Good Example.png|400


Example 2: cubic function

Consider the function

$$f(x) = x^3$$

The first derivative is:

$$\frac{df}{dx} = 3x^2$$

The second derivative is:

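$$\frac{d^2 f}{dx^2} = 6x$$

This function is not convex everywhere because the sign of the second derivative depends on $x$:

- For $x > 0$, $6x > 0$, so the function is convex there.
- For $x < 0$, $6x < 0$, so the function is concave there.
- At $x = 0$, the second derivative is zero and changes sign, so this is an inflection point.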

Convex And Concave Functions And Inflection Points Bad Example.png|400

This shows that convexity cannot be established by checking a single point: the condition must hold everywhere on the domain.
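The two 1D examples can be checked numerically with a short sketch. The helper names `second_derivative` and `curvature_label`, the step size, and the tolerance are assumptions chosen for illustration. The sketch estimates the second derivative at a few points and reports the sign.

```python
def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of f''(x); h is an assumed step size."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def curvature_label(f, x, tol=1e-6):
    """Classify the local curvature of f at x by the sign of f''(x)."""
    d2 = second_derivative(f, x)
    if d2 > tol:
        return "convex"
    if d2 < -tol:
        return "concave"
    return "zero curvature"  # test is inconclusive; possibly an inflection point


square = lambda x: x ** 2
cube = lambda x: x ** 3

for x in (-1.0, 0.0, 1.0):
    # x^2 is labelled convex at every point; x^3 changes label as x crosses 0.
    print(x, curvature_label(square, x), curvature_label(cube, x))
```

Sampling a few points can only disprove convexity, never prove it; the sign condition has to be established for all $x$ analytically.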


Understanding the Hessian in two dimensions (2D case)

For a function of two variables f(x,y), the Hessian is a 2×2 matrix:

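$$
H_f(x,y) =
\begin{bmatrix}
\frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\
\frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2}
\end{bmatrix}
$$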

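To determine convexity, we check whether this matrix is positive semidefinite at every point, using the principal minors test (or, equivalently, by checking that all eigenvalues are nonnegative):

  1. The diagonal elements must be nonnegative:

    $$\frac{\partial^2 f}{\partial x^2} \ge 0, \qquad \frac{\partial^2 f}{\partial y^2} \ge 0$$
  2. The determinant of the Hessian matrix must be nonnegative:

    $$\det(H_f) = \left(\frac{\partial^2 f}{\partial x^2}\right)\left(\frac{\partial^2 f}{\partial y^2}\right) - \left(\frac{\partial^2 f}{\partial x \partial y}\right)^2 \ge 0$$

These two conditions ensure that the function is convex in two dimensions. Checking only the first diagonal element and the determinant (the leading principal minors) is enough when both inequalities are strict, in which case the Hessian is positive definite and the function is strictly convex. In the semidefinite case the second diagonal element must be checked as well.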
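The 2D test can be sketched in a few lines. The function name `is_psd_2x2` and the tolerance are assumptions for illustration. The sketch checks both diagonal entries and the determinant of a symmetric matrix $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$; the second diagonal entry matters in borderline cases such as $a = 0$, $b = 0$, $c = -1$, where the first entry and the determinant are both zero but the matrix is not positive semidefinite.

```python
def is_psd_2x2(a, b, c, tol=1e-12):
    """Return True if the symmetric matrix [[a, b], [b, c]] is positive semidefinite.

    Hypothetical helper: checks all principal minors (a, c, and the determinant).
    """
    det = a * c - b * b
    return a >= -tol and c >= -tol and det >= -tol


print(is_psd_2x2(2, 0, 2))   # True
print(is_psd_2x2(2, 0, -2))  # False: the determinant is negative
print(is_psd_2x2(0, 0, -1))  # False: caught only by the check on c
```

For a Hessian that depends on $(x, y)$, this check would have to hold at every point of the domain, not just at one sample.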


Example 1: convex function in 2D

Consider:

$$f(x,y) = x^2 + y^2$$

First, compute the second derivatives:

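$$\frac{\partial^2 f}{\partial x^2} = 2, \quad \frac{\partial^2 f}{\partial y^2} = 2, \quad \frac{\partial^2 f}{\partial x \partial y} = 0$$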

The Hessian matrix is:

$$
H_f(x,y) =
\begin{bmatrix}
2 & 0 \\
0 & 2
\end{bmatrix}
$$

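Check the conditions:

- The diagonal elements are nonnegative: $\frac{\partial^2 f}{\partial x^2} = 2 \ge 0$ and $\frac{\partial^2 f}{\partial y^2} = 2 \ge 0$.
- The determinant is nonnegative: $\det(H_f) = 2 \cdot 2 - 0^2 = 4 \ge 0$.

Since both conditions hold, the function is convex.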

Paraboloid.png

Geometric intuition:
This function represents a bowl-shaped surface in 3D, confirming convexity.


Example 2: non-convex function in 2D

Consider:

$$f(x,y) = x^2 - y^2$$

The Hessian matrix is:

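$$
H_f(x,y) =
\begin{bmatrix}
2 & 0 \\
0 & -2
\end{bmatrix}
$$

Check the conditions:

- The diagonal elements: $\frac{\partial^2 f}{\partial x^2} = 2 \ge 0$, but $\frac{\partial^2 f}{\partial y^2} = -2 < 0$.
- The determinant: $\det(H_f) = (2)(-2) - 0^2 = -4 < 0$.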

Hyperbolic Paraboloid.png

Since the determinant is negative, the Hessian is not positive semidefinite (its eigenvalues have opposite signs), meaning the function is not convex. The surface has a saddle point at the origin.
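The eigenvalue view of the two 2D examples can be sketched with the closed-form eigenvalues of a symmetric $2 \times 2$ matrix. The helper name `eigenvalues_2x2` is an assumption for illustration; the inputs are the example Hessians, with diagonal entries $(2, 2)$ for $x^2 + y^2$ and $(2, -2)$ for $x^2 - y^2$.

```python
import math


def eigenvalues_2x2(a, b, c):
    """Eigenvalues (smallest first) of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius


print(eigenvalues_2x2(2, 0, 2))   # (2.0, 2.0): both nonnegative, so convex
print(eigenvalues_2x2(2, 0, -2))  # (-2.0, 2.0): mixed signs, so a saddle
```

All eigenvalues nonnegative corresponds to a positive semidefinite Hessian; one positive and one negative eigenvalue means the surface curves up in one direction and down in another.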


Summary of the Hessian Matrix and Convexity

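| Dimension | Hessian form | Convexity condition |
| --- | --- | --- |
| 1D | $\frac{d^2 f}{dx^2}$ | $\frac{d^2 f}{dx^2} \ge 0$ |
| 2D | $\begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix}$ | Determinant test: $D = \det(H_f) \ge 0$ with both diagonal entries $\ge 0$ (strictly convex if $D > 0$ and $\frac{\partial^2 f}{\partial x^2} > 0$) |
| nD | $\nabla^2 f(x)$ | Matrix is positive semidefinite |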

So the idea in the nD case is the same as in the 2D case: we need to prove that the Hessian matrix is positive semidefinite everywhere. Any valid method works, such as the principal minors test or the eigenvalue test.
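As a rough sketch of the minors test in $n$ dimensions, the code below checks that every principal minor of a symmetric matrix is nonnegative. The names `determinant` and `is_psd` and the tolerance are assumptions for illustration. The number of principal minors grows as $2^n - 1$, so this is only practical for small matrices; for larger ones an eigenvalue routine from a numerical library would be the usual choice.

```python
from itertools import combinations


def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return det


def is_psd(m, tol=1e-9):
    """True if every principal minor of the symmetric matrix m is >= 0."""
    n = len(m)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[m[i][j] for j in idx] for i in idx]
            if determinant(sub) < -tol:
                return False
    return True


print(is_psd([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))  # True
print(is_psd([[2, 0, 0], [0, 2, 0], [0, 0, -2]]))     # False
```

Note that the test uses all principal minors, not only the leading ones; the leading minors alone are sufficient only for the strict (positive definite) case.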


Final Takeaways