How to Calculate the Subgradient?


A subgradient is a generalization of the gradient for non-differentiable convex functions. Below are key cases for finding subgradients.


1️⃣ Subgradient of a Weighted Sum

Let $f_1(x), f_2(x), \dots, f_m(x)$ be convex functions on $\mathbb{R}^h$, and let $d_1, d_2, \dots, d_m \in \mathbb{R}_+$.

If we define:

$$f(x) = \sum_{i=1}^{m} d_i f_i(x),$$

then the subdifferential is:

$$\partial f(x) = \sum_{i=1}^{m} d_i\, \partial f_i(x).$$

🔹 Key Insight: The subgradient of a weighted sum of convex functions is the weighted sum of their subgradients.
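As a quick numerical sanity check, the sketch below applies the weighted-sum rule to the illustrative function $f(x) = 2|x| + 3|x-1|$ (this example and all names in it are my own, not from the text): a weighted sum of subgradients of the pieces must satisfy the subgradient inequality $f(y) \geq f(x_0) + g\,(y - x_0)$.

```python
# Illustrative check of the weighted-sum rule for f(x) = 2|x| + 3|x - 1|.
def f(x):
    return 2 * abs(x) + 3 * abs(x - 1)

def subgrad_abs(x, shift=0.0, t=0.0):
    """A subgradient of |x - shift|; t in [-1, 1] selects one at the kink."""
    if x > shift:
        return 1.0
    if x < shift:
        return -1.0
    return t  # at the kink, any value in [-1, 1] is valid

def subgrad_f(x, t1=0.0, t2=0.0):
    # Weighted-sum rule: d1 * g1 + d2 * g2 with d1 = 2, d2 = 3.
    return 2 * subgrad_abs(x, 0.0, t1) + 3 * subgrad_abs(x, 1.0, t2)

# Verify the subgradient inequality f(y) >= f(x0) + g*(y - x0) on a grid,
# at the kink x0 = 0 with an interior choice t1 = 0.5.
x0 = 0.0
g = subgrad_f(x0, t1=0.5)            # g = 2*0.5 + 3*(-1) = -2.0
ok = all(f(y) >= f(x0) + g * (y - x0) - 1e-12
         for y in [i / 10 for i in range(-30, 31)])
print(ok)  # True: -2.0 is a valid subgradient of f at 0
```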


2️⃣ Subgradient of the Maximum Function

Let $f_1(x), f_2(x), \dots, f_m(x)$ be convex functions on $\mathbb{R}^h$, and define:

$$f(x) = \max_i f_i(x).$$

Then, the subdifferential is given by:

$$\partial f(x) = \operatorname{conv}\Bigl(\bigcup_{i \in I(x)} \partial f_i(x)\Bigr), \qquad I(x) = \{\, i : f_i(x) = f(x) \,\},$$

where $\operatorname{conv}$ denotes the convex hull and $I(x)$ is the set of active indices, i.e. the functions that attain the maximum at $x$.

Convex Combination of Subgradients

For subgradients $a_1, a_2, \dots, a_m$, their convex hull is the set of all convex combinations:

$$\lambda_1 a_1 + \lambda_2 a_2 + \cdots + \lambda_m a_m,$$

where $\lambda_i \geq 0$ for all $i$ and $\sum_{i=1}^{m} \lambda_i = 1$.

🔹 Key Insight: The subdifferential of $\max_i f_i(x)$ is the convex hull of the union of the subdifferentials of the active functions.
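The max rule can be sketched in code: pick out the active branches, then return any convex combination of their gradients. The function names and the example $f(x) = \max(x^2,\, 1 - x)$ below are illustrative choices, not from the text.

```python
# Illustrative subgradient of f(x) = max_i f_i(x) via the active set.
def subgrad_max(x, fs, grads, lam=None, tol=1e-12):
    """fs/grads: lists of callables (each f_i and its gradient);
    lam: convex weights over the active branches (uniform if omitted)."""
    vals = [fi(x) for fi in fs]
    m = max(vals)
    active = [i for i, v in enumerate(vals) if abs(v - m) <= tol]
    if lam is None:
        lam = [1.0 / len(active)] * len(active)
    # Convex combination of the active branches' gradients.
    return sum(l * grads[i](x) for l, i in zip(lam, active))

# Example: f(x) = max(x^2, 1 - x).
fs = [lambda x: x * x, lambda x: 1 - x]
grads = [lambda x: 2 * x, lambda x: -1.0]
print(subgrad_max(2.0, fs, grads))   # only x^2 is active, gradient 4.0
print(subgrad_max(0.0, fs, grads))   # only 1 - x is active, gradient -1.0
```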


3️⃣ Subgradient of a Scaled Function

If $f(x)$ is convex on $\mathbb{R}^h$ and we define:

$$g(x) = d\, f(x), \qquad d > 0,$$

then the subdifferential satisfies:

$$\partial\bigl(d\, f(x)\bigr) = d\, \partial f(x).$$

🔹 Key Insight: Scaling a function by d>0 scales its subgradients by the same factor.
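A minimal numerical sketch of the scaling rule, using $f(x) = |x|$ and $d = 3$ (both chosen for illustration): a subgradient $g \in [-1, 1]$ of $|x|$ at $0$ scales to the subgradient $3g \in [-3, 3]$ of $3|x|$.

```python
# Check whether g satisfies the subgradient inequality for func at x0
# over a grid of test points.
def valid_subgrad(g, x0, func, points):
    return all(func(y) >= func(x0) + g * (y - x0) - 1e-12 for y in points)

pts = [i / 10 for i in range(-20, 21)]
f = abs                                # f(x) = |x|
scaled = lambda x: 3 * abs(x)          # g(x) = 3|x|

g = 0.7                                # a subgradient of |x| at x = 0
print(valid_subgrad(g, 0.0, f, pts),
      valid_subgrad(3 * g, 0.0, scaled, pts))   # True True
```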


Summary

| Function | Subdifferential |
|---|---|
| Sum: $f(x) = \sum_i d_i f_i(x)$ | $\partial f(x) = \sum_i d_i\, \partial f_i(x)$ |
| Max: $f(x) = \max_i f_i(x)$ | $\partial f(x) = \operatorname{conv}\bigl(\bigcup_{i:\, f_i(x) = f(x)} \partial f_i(x)\bigr)$ |
| Scaling: $g(x) = d\, f(x)$, $d > 0$ | $\partial\bigl(d\, f(x)\bigr) = d\, \partial f(x)$ |

These rules are essential when dealing with non-smooth convex optimization problems! 🚀

1D Examples

Example 1: Absolute Value Function

Consider the function:

$$f(x) = |x|$$

This function is not differentiable at $x = 0$ but is convex. Its subdifferential is:

$$\partial f(x) = \begin{cases} \{1\}, & x > 0 \\ \{-1\}, & x < 0 \\ [-1, 1], & x = 0 \end{cases}$$

🔹 Interpretation: At $x = 0$, any value in $[-1, 1]$ is a valid subgradient because the function has a "kink" there.
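This is easy to confirm numerically (the helper name below is my own): a slope $g$ is a subgradient of $|x|$ at $0$ exactly when $|y| \geq g\,y$ holds for every $y$, which succeeds for $g \in [-1, 1]$ and fails outside it.

```python
# Test the subgradient inequality |y| >= |0| + g*(y - 0) at x = 0.
def is_subgradient_at_zero(g, points):
    return all(abs(y) >= g * y for y in points)

pts = [i / 100 for i in range(-100, 101)]
print(is_subgradient_at_zero(0.3, pts))    # True: 0.3 lies in [-1, 1]
print(is_subgradient_at_zero(-1.0, pts))   # True: the boundary value works too
print(is_subgradient_at_zero(1.2, pts))    # False: 1.2*y exceeds |y| for y > 0
```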


Example 2: Maximum of Two Linear Functions

Consider:

$$f(x) = \max(2x, x)$$

For $x > 0$ the branch $2x$ is active, and for $x < 0$ the branch $x$ is active, so:

$$\partial f(x) = \begin{cases} \{2\}, & x > 0 \\ \{1\}, & x < 0 \\ [1, 2], & x = 0 \end{cases}$$

🔹 Interpretation: Away from the kink, the subdifferential contains only the gradient of the active branch; at $x = 0$, any convex combination of $1$ and $2$ is a valid subgradient.
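A small sketch of this piecewise rule (illustrative code, not from the text), checking one interior choice at the kink:

```python
# Subgradient of f(x) = max(2x, x): the active branch's slope,
# with any t in [1, 2] allowed at the kink x = 0.
def subgrad(x, t=1.5):
    if x > 0:
        return 2.0   # branch 2x is active
    if x < 0:
        return 1.0   # branch x is active
    return t         # kink: any value in [1, 2]

f = lambda x: max(2 * x, x)
g = subgrad(0.0, t=1.25)
ok = all(f(y) >= f(0.0) + g * y - 1e-12
         for y in [i / 10 for i in range(-20, 21)])
print(ok)  # True: 1.25 is a valid subgradient at the kink
```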


2D Examples

Example 3: Euclidean Norm Function

Consider the function:

$$f(x, y) = \sqrt{x^2 + y^2},$$

which is the Euclidean norm (distance from the origin). For $(x, y) \neq (0, 0)$, the gradient is:

$$\nabla f(x, y) = \left( \frac{x}{\sqrt{x^2 + y^2}},\; \frac{y}{\sqrt{x^2 + y^2}} \right).$$

At the origin $(0, 0)$, the function is not differentiable. The subdifferential is the closed unit ball:

$$\partial f(0, 0) = \{\, (a, b) \mid a^2 + b^2 \leq 1 \,\}.$$

🔹 Interpretation: The subdifferential at the origin consists of every point of the closed unit disk, boundary included.

Why is the subgradient the unit ball?

Since the function is convex, we can use the subgradient definition:
The subdifferential at $(0, 0)$ is the set of all vectors $(a, b)$ satisfying:

$$f(x, y) \geq f(0, 0) + \langle (a, b),\, (x, y) \rangle \quad \forall (x, y).$$

Substituting $f(0, 0) = 0$:

$$\sqrt{x^2 + y^2} \geq ax + by \quad \forall (x, y).$$

By the Cauchy–Schwarz inequality, $ax + by \leq \sqrt{a^2 + b^2}\, \sqrt{x^2 + y^2}$, so the condition holds for all $(x, y)$ if and only if:

$$a^2 + b^2 \leq 1.$$

Thus, the subdifferential at $(0, 0)$ is the closed unit disk:

$$\partial f(0, 0) = \{\, (a, b) \mid a^2 + b^2 \leq 1 \,\}.$$
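The derivation can be checked on a grid (an illustrative sketch, with my own helper name and tolerances): a vector $(a, b)$ on or inside the unit circle passes the subgradient inequality at the origin, while one outside it fails.

```python
import math

# Grid check: is (a, b) a subgradient of sqrt(x^2 + y^2) at the origin?
# True exactly when a^2 + b^2 <= 1.
def is_subgrad_at_origin(a, b, n=20):
    pts = [(i / n, j / n) for i in range(-n, n + 1) for j in range(-n, n + 1)]
    return all(math.hypot(x, y) >= a * x + b * y - 1e-12 for (x, y) in pts)

print(is_subgrad_at_origin(0.6, 0.8))    # True: on the unit circle
print(is_subgrad_at_origin(0.9, 0.9))    # False: 0.9^2 + 0.9^2 > 1
```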

Example 4: Maximum of Two Planes

Consider:

$$f(x, y) = \max(x, y)$$

For $x > y$ the gradient is $(1, 0)$, and for $y > x$ it is $(0, 1)$. At points where $x = y$, the subdifferential is the set of points:

$$\lambda (1, 0) + (1 - \lambda)(0, 1), \qquad 0 \leq \lambda \leq 1.$$

🔹 Interpretation: At points where x=y, any convex combination of (1,0) and (0,1) is a valid subgradient.
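As a last sketch (illustrative helper, not from the text), we can verify on a grid that every convex combination $\lambda (1,0) + (1-\lambda)(0,1)$ is a subgradient of $\max(x, y)$ at a point with $x = y$:

```python
# Check the subgradient inequality for max(x, y) at (x0, y0).
def is_subgrad(gx, gy, x0, y0, pts):
    f = lambda x, y: max(x, y)
    return all(f(x, y) >= f(x0, y0) + gx * (x - x0) + gy * (y - y0) - 1e-12
               for (x, y) in pts)

pts = [(i / 5, j / 5) for i in range(-10, 11) for j in range(-10, 11)]
lam = 0.3
print(is_subgrad(lam, 1 - lam, 1.0, 1.0, pts))   # True for any lam in [0, 1]
```

Note that $(1, 1)$ fails the check: the weights of a convex combination must be nonnegative and sum to 1.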


Summary of Key Results

| Function | Subgradient $\partial f(x)$ |
|---|---|
| $f(x) = \lvert x \rvert$ | $\{-1\}$ for $x < 0$; $\{1\}$ for $x > 0$; $[-1, 1]$ at $x = 0$ |
| $f(x) = \max(2x, x)$ | $\{1\}$ for $x < 0$; $\{2\}$ for $x > 0$; $[1, 2]$ at $x = 0$ |
| $f(x, y) = \sqrt{x^2 + y^2}$ | $\left( \frac{x}{\sqrt{x^2+y^2}}, \frac{y}{\sqrt{x^2+y^2}} \right)$ for $(x, y) \neq (0, 0)$; unit ball at $(0, 0)$ |
| $f(x, y) = \max(x, y)$ | $\{(1, 0)\}$ if $x > y$; $\{(0, 1)\}$ if $y > x$; $\operatorname{conv}\{(1, 0), (0, 1)\}$ if $x = y$ |

These examples illustrate how subgradients generalize gradients for non-differentiable convex functions. 🚀