How to Calculate the Subgradient?


A subgradient is a generalization of the gradient for non-differentiable convex functions. Below are key cases for finding subgradients.


1️⃣ Subgradient of a Weighted Sum

Let $f_1(x), f_2(x), \dots, f_m(x)$ be convex functions on $\mathbb{R}^h$, and let $d_1, d_2, \dots, d_m \in \mathbb{R}_+$.

If we define:

$$f(x) = \sum_{i=1}^{m} d_i f_i(x),$$

then the subdifferential is:

$$\partial f(x) = \sum_{i=1}^{m} d_i\, \partial f_i(x).$$

🔹 Key Insight: The subgradient of a weighted sum of convex functions is the weighted sum of their subgradients.
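As a quick numerical sanity check, the sketch below applies the weighted-sum rule to the illustrative function $f(x) = 2|x| + 3|x-1|$ (this example and all names in it are my own, not from the text): a weighted sum of subgradients of the pieces must satisfy the subgradient inequality $f(y) \geq f(x_0) + g\,(y - x_0)$.

```python
# Illustrative check of the weighted-sum rule for f(x) = 2|x| + 3|x - 1|.
def f(x):
    return 2 * abs(x) + 3 * abs(x - 1)

def subgrad_abs(x, shift=0.0, t=0.0):
    """A subgradient of |x - shift|; t in [-1, 1] selects one at the kink."""
    if x > shift:
        return 1.0
    if x < shift:
        return -1.0
    return t  # at the kink, any value in [-1, 1] is valid

def subgrad_f(x, t1=0.0, t2=0.0):
    # Weighted-sum rule: d1 * g1 + d2 * g2 with d1 = 2, d2 = 3.
    return 2 * subgrad_abs(x, 0.0, t1) + 3 * subgrad_abs(x, 1.0, t2)

# Verify the subgradient inequality f(y) >= f(x0) + g*(y - x0) on a grid,
# at the kink x0 = 0 with an interior choice t1 = 0.5.
x0 = 0.0
g = subgrad_f(x0, t1=0.5)            # g = 2*0.5 + 3*(-1) = -2.0
ok = all(f(y) >= f(x0) + g * (y - x0) - 1e-12
         for y in [i / 10 for i in range(-30, 31)])
print(ok)  # True: -2.0 is a valid subgradient of f at 0
```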


2️⃣ Subgradient of the Maximum Function

Let $f_1(x), f_2(x), \dots, f_m(x)$ be convex functions on $\mathbb{R}^h$, and define:

$$f(x) = \max_i f_i(x).$$

Then, the subdifferential is given by:

$$\partial f(x) = \operatorname{conv}\Bigl(\bigcup_{i \in I(x)} \partial f_i(x)\Bigr), \qquad I(x) = \{\, i : f_i(x) = f(x) \,\},$$

where $\operatorname{conv}$ denotes the convex hull and $I(x)$ is the set of active indices, i.e. the functions that attain the maximum at $x$.

Convex Combination of Subgradients

For subgradients $a_1, a_2, \dots, a_m$, their convex hull is the set of all convex combinations:

$$\lambda_1 a_1 + \lambda_2 a_2 + \cdots + \lambda_m a_m,$$

where $\lambda_i \geq 0$ for all $i$ and $\sum_{i=1}^{m} \lambda_i = 1$.

🔹 Key Insight: The subdifferential of $\max_i f_i(x)$ is the convex hull of the union of the subdifferentials of the active functions.
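The max rule can be sketched in code: pick out the active branches, then return any convex combination of their gradients. The function names and the example $f(x) = \max(x^2,\, 1 - x)$ below are illustrative choices, not from the text.

```python
# Illustrative subgradient of f(x) = max_i f_i(x) via the active set.
def subgrad_max(x, fs, grads, lam=None, tol=1e-12):
    """fs/grads: lists of callables (each f_i and its gradient);
    lam: convex weights over the active branches (uniform if omitted)."""
    vals = [fi(x) for fi in fs]
    m = max(vals)
    active = [i for i, v in enumerate(vals) if abs(v - m) <= tol]
    if lam is None:
        lam = [1.0 / len(active)] * len(active)
    # Convex combination of the active branches' gradients.
    return sum(l * grads[i](x) for l, i in zip(lam, active))

# Example: f(x) = max(x^2, 1 - x).
fs = [lambda x: x * x, lambda x: 1 - x]
grads = [lambda x: 2 * x, lambda x: -1.0]
print(subgrad_max(2.0, fs, grads))   # only x^2 is active, gradient 4.0
print(subgrad_max(0.0, fs, grads))   # only 1 - x is active, gradient -1.0
```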


3️⃣ Subgradient of a Scaled Function

If $f(x)$ is convex on $\mathbb{R}^h$ and we define:

$$g(x) = d\, f(x), \qquad d > 0,$$

then the subdifferential satisfies:

$$\partial\bigl(d\, f(x)\bigr) = d\, \partial f(x).$$

🔹 Key Insight: Scaling a function by d>0 scales its subgradients by the same factor.
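A minimal numerical sketch of the scaling rule, using $f(x) = |x|$ and $d = 3$ (both chosen for illustration): a subgradient $g \in [-1, 1]$ of $|x|$ at $0$ scales to the subgradient $3g \in [-3, 3]$ of $3|x|$.

```python
# Check whether g satisfies the subgradient inequality for func at x0
# over a grid of test points.
def valid_subgrad(g, x0, func, points):
    return all(func(y) >= func(x0) + g * (y - x0) - 1e-12 for y in points)

pts = [i / 10 for i in range(-20, 21)]
f = abs                                # f(x) = |x|
scaled = lambda x: 3 * abs(x)          # g(x) = 3|x|

g = 0.7                                # a subgradient of |x| at x = 0
print(valid_subgrad(g, 0.0, f, pts),
      valid_subgrad(3 * g, 0.0, scaled, pts))   # True True
```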


Summary

| Function | Subdifferential |
|---|---|
| Sum: $f(x) = \sum_i d_i f_i(x)$ | $\partial f(x) = \sum_i d_i\, \partial f_i(x)$ |
| Max: $f(x) = \max_i f_i(x)$ | $\partial f(x) = \operatorname{conv}\bigl(\bigcup_{i:\, f_i(x) = f(x)} \partial f_i(x)\bigr)$ |
| Scaling: $g(x) = d\, f(x)$, $d > 0$ | $\partial\bigl(d\, f(x)\bigr) = d\, \partial f(x)$ |

These rules are essential when dealing with non-smooth convex optimization problems! 🚀

1D Examples

Example 1: Absolute Value Function

Consider the function:

$$f(x) = |x|$$

This function is not differentiable at $x = 0$ but is convex. Its subdifferential is:

$$\partial f(x) = \begin{cases} \{1\}, & x > 0 \\ \{-1\}, & x < 0 \\ [-1, 1], & x = 0 \end{cases}$$

🔹 Interpretation: At $x = 0$, any value in $[-1, 1]$ is a valid subgradient because the function has a "kink" there.
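This is easy to confirm numerically (the helper name below is my own): a slope $g$ is a subgradient of $|x|$ at $0$ exactly when $|y| \geq g\,y$ holds for every $y$, which succeeds for $g \in [-1, 1]$ and fails outside it.

```python
# Test the subgradient inequality |y| >= |0| + g*(y - 0) at x = 0.
def is_subgradient_at_zero(g, points):
    return all(abs(y) >= g * y for y in points)

pts = [i / 100 for i in range(-100, 101)]
print(is_subgradient_at_zero(0.3, pts))    # True: 0.3 lies in [-1, 1]
print(is_subgradient_at_zero(-1.0, pts))   # True: the boundary value works too
print(is_subgradient_at_zero(1.2, pts))    # False: 1.2*y exceeds |y| for y > 0
```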


Example 2: Maximum of Two Linear Functions

Consider:

$$f(x) = \max(2x, x)$$

For $x > 0$ the branch $2x$ is active, and for $x < 0$ the branch $x$ is active, so:

$$\partial f(x) = \begin{cases} \{2\}, & x > 0 \\ \{1\}, & x < 0 \\ [1, 2], & x = 0 \end{cases}$$

🔹 Interpretation: Away from the kink, the subdifferential contains only the gradient of the active branch; at $x = 0$, any convex combination of $1$ and $2$ is a valid subgradient.
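A small sketch of this piecewise rule (illustrative code, not from the text), checking one interior choice at the kink:

```python
# Subgradient of f(x) = max(2x, x): the active branch's slope,
# with any t in [1, 2] allowed at the kink x = 0.
def subgrad(x, t=1.5):
    if x > 0:
        return 2.0   # branch 2x is active
    if x < 0:
        return 1.0   # branch x is active
    return t         # kink: any value in [1, 2]

f = lambda x: max(2 * x, x)
g = subgrad(0.0, t=1.25)
ok = all(f(y) >= f(0.0) + g * y - 1e-12
         for y in [i / 10 for i in range(-20, 21)])
print(ok)  # True: 1.25 is a valid subgradient at the kink
```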


2D Examples

Example 3: Euclidean Norm Function

Consider the function:

$$f(x, y) = \sqrt{x^2 + y^2},$$

which is the Euclidean norm (distance from the origin). For $(x, y) \neq (0, 0)$, the gradient is:

$$\nabla f(x, y) = \left( \frac{x}{\sqrt{x^2 + y^2}},\; \frac{y}{\sqrt{x^2 + y^2}} \right).$$

At the origin $(0, 0)$, the function is not differentiable. The subdifferential is the closed unit ball:

$$\partial f(0, 0) = \{\, (a, b) \mid a^2 + b^2 \leq 1 \,\}.$$

🔹 Interpretation: The subdifferential at the origin consists of every point of the closed unit disk, boundary included.

Why is the subgradient the unit ball?

Since the function is convex, we can use the subgradient definition:
The subdifferential at $(0, 0)$ is the set of all vectors $(a, b)$ satisfying:

$$f(x, y) \geq f(0, 0) + \langle (a, b),\, (x, y) \rangle \quad \forall (x, y).$$

Substituting $f(0, 0) = 0$:

$$\sqrt{x^2 + y^2} \geq ax + by \quad \forall (x, y).$$

By the Cauchy–Schwarz inequality, $ax + by \leq \sqrt{a^2 + b^2}\, \sqrt{x^2 + y^2}$, so the condition holds for all $(x, y)$ if and only if:

$$a^2 + b^2 \leq 1.$$

Thus, the subdifferential at $(0, 0)$ is the closed unit disk:

$$\partial f(0, 0) = \{\, (a, b) \mid a^2 + b^2 \leq 1 \,\}.$$
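The derivation can be checked on a grid (an illustrative sketch, with my own helper name and tolerances): a vector $(a, b)$ on or inside the unit circle passes the subgradient inequality at the origin, while one outside it fails.

```python
import math

# Grid check: is (a, b) a subgradient of sqrt(x^2 + y^2) at the origin?
# True exactly when a^2 + b^2 <= 1.
def is_subgrad_at_origin(a, b, n=20):
    pts = [(i / n, j / n) for i in range(-n, n + 1) for j in range(-n, n + 1)]
    return all(math.hypot(x, y) >= a * x + b * y - 1e-12 for (x, y) in pts)

print(is_subgrad_at_origin(0.6, 0.8))    # True: on the unit circle
print(is_subgrad_at_origin(0.9, 0.9))    # False: 0.9^2 + 0.9^2 > 1
```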

Example 4: Maximum of Two Planes

Consider:

$$f(x, y) = \max(x, y)$$

For $x > y$ the gradient is $(1, 0)$, and for $y > x$ it is $(0, 1)$. At points where $x = y$, the subdifferential is the set of points:

$$\lambda (1, 0) + (1 - \lambda)(0, 1), \qquad 0 \leq \lambda \leq 1.$$

🔹 Interpretation: At points where x=y, any convex combination of (1,0) and (0,1) is a valid subgradient.
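As a last sketch (illustrative helper, not from the text), we can verify on a grid that every convex combination $\lambda (1,0) + (1-\lambda)(0,1)$ is a subgradient of $\max(x, y)$ at a point with $x = y$:

```python
# Check the subgradient inequality for max(x, y) at (x0, y0).
def is_subgrad(gx, gy, x0, y0, pts):
    f = lambda x, y: max(x, y)
    return all(f(x, y) >= f(x0, y0) + gx * (x - x0) + gy * (y - y0) - 1e-12
               for (x, y) in pts)

pts = [(i / 5, j / 5) for i in range(-10, 11) for j in range(-10, 11)]
lam = 0.3
print(is_subgrad(lam, 1 - lam, 1.0, 1.0, pts))   # True for any lam in [0, 1]
```

Note that $(1, 1)$ fails the check: the weights of a convex combination must be nonnegative and sum to 1.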


Summary of Key Results

| Function | Subgradient $\partial f(x)$ |
|---|---|
| $f(x) = \lvert x \rvert$ | $\{-1\}$ for $x < 0$; $\{1\}$ for $x > 0$; $[-1, 1]$ at $x = 0$ |
| $f(x) = \max(2x, x)$ | $\{1\}$ for $x < 0$; $\{2\}$ for $x > 0$; $[1, 2]$ at $x = 0$ |
| $f(x, y) = \sqrt{x^2 + y^2}$ | $\left( \frac{x}{\sqrt{x^2+y^2}}, \frac{y}{\sqrt{x^2+y^2}} \right)$ for $(x, y) \neq (0, 0)$; unit ball at $(0, 0)$ |
| $f(x, y) = \max(x, y)$ | $\{(1, 0)\}$ if $x > y$; $\{(0, 1)\}$ if $y > x$; $\operatorname{conv}\{(1, 0), (0, 1)\}$ if $x = y$ |

These examples illustrate how subgradients generalize gradients for non-differentiable convex functions. 🚀