1. Mathematical optimization

Mathematical Optimization Problem


A mathematical optimization problem is formulated as follows:

minimize f0 (x)
subject to the constraints:
fi (x) ≤ bi , i = 1, . . . , m.
Here:
• x = (x1 , . . . , xn ) is the optimization variable.
• The function f0 : Rn → R is the objective function.
• The functions fi : Rn → R, i = 1, . . . , m are the inequality constraint functions.
• The constants b1 , . . . , bm are the bounds for the constraints.
A vector x⋆ is called optimal, or a solution of the problem, if it satisfies the constraints and
minimizes the objective function. That is, for any vector z satisfying:

f1 (z) ≤ b1 , . . . , fm (z) ≤ bm ,
we have:

f0 (z) ≥ f0 (x⋆ ).
We often categorize optimization problems by the nature of the objective and constraint
functions. A key example is the linear program, in which the functions f0 , . . . , fm are linear,
meaning they satisfy:

fi (αx + β y) = α fi (x) + β fi (y)


for all x, y ∈ Rn and all α, β ∈ R. If the functions are not linear, the problem is called a nonlinear
program.

Applications of Mathematical Optimization


Mathematical optimization is a powerful tool used across various fields to make optimal decisions
subject to given constraints. One prominent application is portfolio optimization, where we aim to
allocate capital among n assets to minimize risk while meeting constraints such as budget limits
and minimum expected return. The variable xi represents the investment in the i-th asset, and the
objective is to find the allocation that minimizes the portfolio’s overall risk or variance.
Another key area is device sizing in electronic design, where optimization helps determine the
widths and lengths of components in electronic circuits. Constraints in this context include limits
imposed by the manufacturing process, timing requirements, and the total area available for the
circuit. The objective here is typically to minimize power consumption while ensuring the design
meets all technical specifications.
In data fitting, the goal is to find the best model from a family of potential models that
fits observed data. The variables are the model parameters, and constraints may represent prior
information or physical limits (e.g., nonnegativity of parameters). The objective function often
measures the prediction error or the misfit between the observed data and the model’s output.
Optimization helps in choosing the parameter values that minimize this error.
Optimization in engineering is widely used in designing civil, mechanical, chemical, and
aerospace systems. Engineers use optimization to meet performance, safety, or efficiency goals,
such as minimizing fuel consumption in aircraft design or maximizing the load-carrying capacity
of a bridge.
Finally, embedded optimization is gaining traction in real-time systems like robotics and
autonomous vehicles, where optimization algorithms make decisions and execute actions with
minimal human intervention. This requires optimization methods that are highly reliable and solve
problems within a predictable time frame.

1.1 Solution methods


1. Optimization Algorithms: Solution methods are algorithms designed to compute solu-
tions for optimization problems to a given accuracy. These methods have been extensively
developed since the 1940s, with significant progress in algorithms and software.
2. Effectiveness Depends on Problem Structure: The effectiveness of optimization algorithms
depends on the forms of objective and constraint functions, the number of variables and
constraints, and any special structure, such as sparsity, where each constraint depends on
only a few variables.
3. Challenges of General Optimization Problems: Even when objective and constraint
functions are smooth, solving general optimization problems can be challenging. These
methods often involve compromises such as long computation times or the possibility of not
finding an exact solution.
4. Exceptions to Difficult Optimization Problems: Some problem classes, such as least-
squares problems and linear programs, have efficient algorithms that can solve large problems
with hundreds or thousands of variables and constraints.
5. Convex Optimization: Convex optimization is another important exception. Like least-
squares and linear programming, there are efficient algorithms that can reliably solve large
convex problems.

1.1.1 Least-squares problems


The least-squares problem is a common type of optimization problem where the objective is to
minimize the sum of the squares of the differences between a set of observed values and the values
predicted by a linear model.

A least-squares problem is an optimization problem with no constraints (i.e., m = 0) and an
objective which is a sum of squares of terms of the form a_i^T x − b_i:

minimize f_0(x) = ∥Ax − b∥_2^2                     (1.1)
                = ∑_{i=1}^{k} (a_i^T x − b_i)^2,   (1.2)

where A ∈ R^{k×n} (with k ≥ n), the a_i^T are the rows of A, and the vector x ∈ R^n is the
optimization variable.

Analytical Solution: The Normal Equations


The least-squares problem can be solved analytically by deriving the normal equations. The
solution is obtained by differentiating the objective function with respect to x, setting the derivative
to zero, and solving for x. The resulting equation is:

A^T A x = A^T b
This system of equations is known as the normal equations, and it provides the analytical
solution to the least-squares problem. The solution is given by:

x = (A^T A)^{-1} A^T b

Numerical Example
Consider a system with the following matrix A and vector b:

A = [1 1; 1 2; 1 3],   b = (1, 2, 2),

where the semicolons separate the rows of A. We want to find the least-squares solution x that
minimizes ∥Ax − b∥_2^2.
Step 1: Compute A^T A and A^T b:

A^T A = [3 6; 6 14],   A^T b = (5, 11)

Step 2: Solve the normal equations A^T A x = A^T b:

[3 6; 6 14] (x1, x2) = (5, 11)

Solving this 2×2 system of linear equations, we get:

x1 = 2/3,   x2 = 1/2

Thus, the least-squares solution is x = (2/3, 1/2).

Interpretation of Results

The least-squares solution x = (2/3, 1/2) means that the linear model which best fits the observed
data b = (1, 2, 2) is given by the equation:

y = x1 + x2 t = 2/3 + (1/2) · t

This implies that the best-fitting model is y = 2/3 + t/2. In this case, the slope of the best-fit line is
1/2 and the intercept is 2/3. The model predicts that for every unit increase in the independent
variable t, the dependent variable y increases by 0.5.
The least-squares method has minimized the squared differences between the observed data
points (1, 1), (2, 2), (3, 2) and the predicted values on the line y = 2/3 + t/2. This method provides a
straightforward way to find the best linear approximation to the data.
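As a quick numerical check (an illustrative Python/numpy sketch, not part of the original notes; the
variable names are arbitrary), the normal equations and the worked example above can be reproduced
in a few lines:

import numpy as np

# Data from the worked example above.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Solve the normal equations A^T A x = A^T b directly ...
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# ... or use the library least-squares routine.
x_lstsq, residuals, rank, sing_vals = np.linalg.lstsq(A, b, rcond=None)

print(x_normal)   # approximately [0.6667, 0.5], i.e. x1 = 2/3, x2 = 1/2
print(x_lstsq)    # same solution

Using np.linalg.lstsq is generally preferable to forming A^T A explicitly, since it works through an
orthogonal factorization and is numerically more stable.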

Applications
The least-squares problem is the basis for regression analysis, optimal control, and many parameter
estimation and data fitting methods. It has several statistical interpretations, such as maximum
likelihood estimation of a vector x, given linear measurements corrupted by Gaussian measurement
errors.
Recognizing an optimization problem as a least-squares problem is straightforward; we only
need to verify that the objective is a quadratic function (and then test whether the associated
quadratic form is positive semidefinite). While the basic least-squares problem has a simple fixed
form, several standard techniques are used to increase its flexibility in applications.
Weighted Least-Squares:
In weighted least-squares, the weighted least-squares cost function is:

minimize f_0(x) = ∑_{i=1}^{k} w_i (a_i^T x − b_i)^2,   (1.3)

where w1 , . . . , wk are positive weights. These weights are chosen to reflect differing levels of
concern about the sizes of the terms a_i^T x − b_i, or to influence the solution. In a statistical setting,
weighted least-squares arises in estimation of x, given linear measurements corrupted by errors
with unequal variances.
Regularization:
Another technique in least-squares is regularization, in which extra terms are added to the cost
function. In the simplest case, a positive multiple of the sum of squares of the variables is added:

minimize f_0(x) = ∑_{i=1}^{k} (a_i^T x − b_i)^2 + ρ ∑_{i=1}^{n} x_i^2,   (1.4)

where ρ > 0. The extra terms penalize large values of x, resulting in a sensible solution when
minimizing the first sum alone does not. The parameter ρ is chosen by the user to balance
minimizing the original objective function while keeping the sum of squares of x not too large.
Regularization is used in statistical estimation when x is given a prior distribution.
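Both variations remain small linear-algebra problems. The following is a minimal sketch (assuming
numpy; the helper names regularized_least_squares and weighted_least_squares are hypothetical) of
the closed-form minimizers obtained by setting the gradients of (1.3) and (1.4) to zero:

import numpy as np

def regularized_least_squares(A, b, rho):
    """Minimize ||A x - b||_2^2 + rho * ||x||_2^2 (Tikhonov / ridge regularization)."""
    n = A.shape[1]
    # Setting the gradient to zero gives (A^T A + rho I) x = A^T b.
    return np.linalg.solve(A.T @ A + rho * np.eye(n), A.T @ b)

def weighted_least_squares(A, b, w):
    """Minimize sum_i w_i (a_i^T x - b_i)^2 with positive weights w_i."""
    W = np.diag(w)
    # Setting the gradient to zero gives A^T W A x = A^T W b.
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ b)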

1.1.2 Linear programming


Linear programming is a class of optimization problems where both the objective function and all
constraint functions are linear. The standard form is:

minimize   c^T x                                  (1.5)
subject to a_i^T x ≤ b_i,  i = 1, . . . , m.      (1.6)

Here, c, a_1, . . . , a_m ∈ R^n and b_1, . . . , b_m ∈ R are problem parameters that define the objective
and constraints.
Solving Linear Programs
Linear programming problems are usually solved using specialized algorithms; the most common
approaches, described in the following subsections, are the graphical method (for two-variable
problems), the simplex method, and interior-point methods.

1.1.3 Graphical Method


The graphical method is a visual approach to solving Linear Programming Problems (LPP) with
two decision variables. While limited to two-variable problems, it provides valuable insights into
the problem’s structure and solution.
Steps of the Graphical Method
1. Identify the decision variables (usually x and y).
2. Plot the constraints:
• Convert inequality constraints to equations.
• Plot each constraint line on the coordinate system.
3. Identify the feasible region:
• The area that satisfies all constraints simultaneously.
• Usually a polygon (could be unbounded).
4. Determine the corner points (vertices) of the feasible region.
5. Evaluate the objective function at each corner point.
6. Select the optimal solution:
• For maximization: point with the highest objective function value.
• For minimization: point with the lowest objective function value.
7. Check for special cases:
• Unbounded solution
• No feasible solution
• Multiple optimal solutions
Examples
Example 1: Unique Optimal Solution
Maximize: Z = 3x + 2y
Subject to:

x+y ≤ 4
x≤3
y≤2
x, y ≥ 0

Solution:
1. Decision variables: x and y
2. Plot constraints (see graph)
3. Feasible region: Shaded area bounded by the constraints
4. Corner points: (0,0), (3,0), (3,1), (2,2), (0,2)
5. Evaluate objective function:
• (0,0): Z = 0

• (3,0): Z = 9
• (3,1): Z = 11
• (2,2): Z = 10
• (0,2): Z = 4
6. Optimal solution: (3,1) with Z = 11

[Figure: feasible region for Example 1, bounded by the lines x + y = 4, x = 3, and y = 2, with the
optimal vertex (3,1) marked.]
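Step 5 of the graphical method, evaluating the objective at each corner point, can also be automated.
The following is an illustrative Python sketch for Example 1 (the corner points are taken from the
solution above):

# Corner points of the feasible region in Example 1 and the objective Z = 3x + 2y.
corners = [(0, 0), (3, 0), (3, 1), (2, 2), (0, 2)]
Z = lambda x, y: 3 * x + 2 * y

values = {(x, y): Z(x, y) for x, y in corners}
best = max(values, key=values.get)
print(values)               # {(0,0): 0, (3,0): 9, (3,1): 11, (2,2): 10, (0,2): 4}
print(best, values[best])   # (3, 1) 11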

Example 2: Unbounded Solution


Maximize: Z = x + y
Subject to:

x−y ≥ 1
x+y ≥ 3
x, y ≥ 0

Solution:
1. Decision variables: x and y
2. Plot constraints (see graph)
3. Feasible region: Unbounded area above both constraint lines
4. As we move further in the positive x and y directions, Z increases indefinitely
5. Conclusion: Unbounded solution, no finite optimal solution exists

[Figure: unbounded feasible region for Example 2, lying above the lines x − y = 1 and x + y = 3;
Z increases without bound in the direction of increasing x and y.]

Example 3: No Feasible Solution


Maximize: Z = 2x + 3y
Subject to:

x+y ≤ 5
x+y ≥ 7
x, y ≥ 0

Solution:
1. Decision variables: x and y
2. Plot constraints (see graph)
3. Feasible region: No area satisfies all constraints simultaneously
4. Conclusion: No feasible solution exists
[Figure: the half-planes x + y ≤ 5 and x + y ≥ 7 do not overlap, so Example 3 has no feasible region.]

Example 4: Multiple Optimal Solutions


Maximize: Z = 2x + 2y
Subject to:

x+y ≤ 4
x≤3
y≤3
x, y ≥ 0

Solution:
1. Decision variables: x and y
2. Plot constraints (see graph)
3. Feasible region: Shaded area bounded by the constraints
4. Corner points: (0,0), (3,0), (3,1), (1,3), (0,3)
5. Evaluate objective function:
• (0,0): Z = 0
• (3,0): Z = 6
• (3,1): Z = 8
• (1,3): Z = 8
• (0,3): Z = 6
6. Optimal solutions: Both (3,1) and (1,3) with Z = 8
7. Conclusion: Multiple optimal solutions exist along the line segment connecting (3,1) and
(1,3)

[Figure: feasible region for Example 4; the objective value Z = 8 is attained along the entire edge
x + y = 4 between the vertices (3,1) and (1,3).]

The graphical method provides a visual and intuitive way to solve LPPs with two variables. It
allows us to easily identify special cases such as unbounded solutions, infeasible problems, and
multiple optimal solutions. For problems with more than two variables, other methods like the
Simplex method are required.

1.1.4 Simplex Method


The Simplex Method is a well-established algorithm for solving linear programming problems: it
finds the optimal solution of a linear objective function subject to linear equality and inequality
constraints. The method iteratively improves the solution by moving along the edges of the feasible
region (the set of all points that satisfy the constraints), from one vertex to an adjacent vertex,
always in the direction of improved objective function values, until the optimal solution is reached.

Key Concepts
• Objective Function: The function we are trying to maximize or minimize.
• Constraints: The limitations or requirements of the problem, expressed as linear inequalities
or equations.
• Feasible Region: The set of all points that satisfy all constraints.

• Basic Feasible Solution: A solution that satisfies all constraints and corresponds to a vertex
of the feasible region.
• Slack Variables: Variables added to inequality constraints to transform them into equality
constraints.
• Pivot: The process of changing the basis (set of basic variables) to move to an adjacent
vertex.
Steps of the Simplex Method
1. Convert the problem to standard form. Transform all inequalities into equalities by adding
slack variables.
2. Create the initial simplex tableau. Write the system of equations and the objective function
in a tableau format.
3. Determine the entering variable (pivot column). Select the column with the most negative
coefficient in the objective function row, as it represents the variable that will increase the
objective value.
4. Determine the leaving variable (pivot row). Calculate the ratio of the right-hand side to the
pivot column for each row, and choose the smallest positive ratio.
5. Perform row operations to get the new basic feasible solution. Make the pivot element 1
and all other elements in the pivot column 0 by performing row operations.
6. Repeat steps 3-5 until the optimal solution is found. Continue this process until there are
no more negative coefficients in the objective function row.
Problem Statement
Maximize Z = 12x1 + 16x2 + 0S1 + 0S2
Subject to:

10x1 + 20x2 + S1 = 120


8x1 + 8x2 + S2 = 80
x1 , x2 , S1 , S2 ≥ 0

Initial Simplex Table

CBi   Basic Variable   x1   x2   S1   S2   Solution
 0    S1               10   20    1    0     120
 0    S2                8    8    0    1      80
      Zj                0    0    0    0       0

Step 1: Calculate the Zj − Cj Row

CBi   Basic Variable    x1    x2   S1   S2   Solution
 0    S1                10    20    1    0     120
 0    S2                 8     8    0    1      80
      Zj                 0     0    0    0       0
      Zj − Cj          −12   −16    0    0

Step 2: Determine Entering Variable


The most negative value in Z j −C j row is -16, corresponding to x2 . So x2 enters the basis.

Step 3: Calculate Minimum Ratio


For the S1 row: 120 / 20 = 6
For the S2 row:  80 / 8  = 10
The minimum ratio is 6, corresponding to the S1 row. So S1 leaves the basis.
Step 4: Perform Row Operations
Pivot on the element at the intersection of x2 column and S1 row (value 20).

CBi   Basic Variable    x1    x2    S1     S2   Solution
16    x2               1/2     1    1/20    0       6
 0    S2                 4     0   −2/5     1      32
      Zj                 8    16    4/5     0      96
      Zj − Cj           −4     0    4/5     0

Step 5: Check for Optimality


The Z j −C j row still has a negative value (-4 for x1 ), so we continue the process.
Step 6: Second Iteration
• Entering variable: x1 (most negative in Z j −C j row)
• Minimum ratio: min(6 ÷ (1/2), 32 ÷ 4) = min(12, 8) = 8, corresponding to the S2 row
• Pivot element: 4 (intersection of x1 column and S2 row)
Perform row operations:

CBi   Basic Variable    x1    x2    S1      S2    Solution
16    x2                 0     1    1/10   −1/8       2
12    x1                 1     0   −1/10    1/4       8
      Zj                12    16    2/5      1       128
      Zj − Cj            0     0    2/5      1

Final Solution
All values in the Z j −C j row are non-negative, so we’ve reached the optimal solution:
• x1 = 8
• x2 = 2
• Maximum value of Z = 128
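The same problem can be cross-checked with an off-the-shelf LP solver. The sketch below (assuming
scipy is available; linprog minimizes by convention, so the objective is negated) recovers x1 = 8,
x2 = 2, and Z = 128:

from scipy.optimize import linprog

# Maximize Z = 12 x1 + 16 x2  <=>  minimize -12 x1 - 16 x2.
c = [-12, -16]
A_ub = [[10, 20],   # 10 x1 + 20 x2 <= 120
        [8, 8]]     #  8 x1 +  8 x2 <=  80
b_ub = [120, 80]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x)       # approximately [8, 2]
print(-res.fun)    # approximately 128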

1.1.5 Interior-Point Methods


Interior-Point Methods are newer techniques that work by exploring the interior of the feasible
region rather than just the edges. These methods can be more efficient for certain types of problems,
especially large ones.

1.1.6 Complexity
The exact number of calculations needed to solve a linear program can vary. However, Interior-Point
Methods typically require O(n^2 m) operations, where n is the number of variables and m is the
number of constraints. Despite this, modern computers can handle very large problems efficiently.

Using Linear Programming


Many applications naturally lead to linear programs, but others may require transforming the
original problem into a linear program. For instance, the Chebyshev approximation problem is:

minimize  max_{i=1,...,k} |a_i^T x − b_i|.        (1.7)

Here, x ∈ R^n is the variable, and a_1, . . . , a_k ∈ R^n and b_1, . . . , b_k ∈ R are parameters. The objective
is to minimize the maximum absolute deviation of a_i^T x − b_i. The Chebyshev approximation problem
can be reformulated as a linear program:

minimize   t                                         (1.8)
subject to  a_i^T x − t ≤ b_i,   i = 1, . . . , k    (1.9)
           −a_i^T x − t ≤ −b_i,  i = 1, . . . , k    (1.10)

where x ∈ R^n and t ∈ R are the variables. The objective function in the Chebyshev approximation problem (1.7)
is not differentiable, unlike the quadratic objective in least-squares problems.
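To make the reformulation concrete, the following hedged sketch (the helper name chebyshev_fit and
the random test data are illustrative, not from the text) solves (1.8)-(1.10) with scipy.optimize.linprog,
stacking x and the extra scalar t into one decision vector:

import numpy as np
from scipy.optimize import linprog

def chebyshev_fit(A, b):
    """Minimize max_i |a_i^T x - b_i| via the LP reformulation (1.8)-(1.10)."""
    k, n = A.shape
    # Decision vector z = (x, t); the objective is just t.
    c = np.zeros(n + 1)
    c[-1] = 1.0
    ones = np.ones((k, 1))
    #  A x - t <= b   and   -A x - t <= -b
    A_ub = np.vstack([np.hstack([A, -ones]),
                      np.hstack([-A, -ones])])
    b_ub = np.concatenate([b, -b])
    # x is free; t >= 0 since it bounds an absolute value.
    bounds = [(None, None)] * n + [(0, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:n], res.x[-1]

# Example usage with random data.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 3))
b = rng.normal(size=20)
x, t = chebyshev_fit(A, b)
print(x, t)   # t is the minimized maximum absolute residual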
Recognizing problems that can be reduced to linear programs requires some knowledge of
linear programming techniques, but these skills can be acquired with practice. Some software
systems can automatically recognize and reformulate problems as linear programs.

1.2 Convex optimization


A convex optimization problem is an optimization problem where both the objective function and
the constraint functions are convex. This property guarantees that if a local minimum is found, it is
also the global minimum, which makes solving such problems more straightforward compared to
non-convex optimization problems.

Convex Functions
A convex function f : Rn → R is a function where the line segment between any two points on its
graph lies above or on the graph itself. More formally, a function f is convex if for all x, y ∈ Rn and
for all α, β ≥ 0 such that α + β = 1, the following inequality holds:

f (αx + β y) ≤ α f (x) + β f (y).


Examples of Convex Functions
• Linear Function:
f (x) = aT x + b
Linear functions are both convex and concave.
• Quadratic Function:
f(x) = (1/2) x^T Q x + c^T x + d
The quadratic function is convex if Q is a positive semi-definite matrix.
• Exponential Function:
f(x) = e^x
Exponential functions are convex because their second derivative is always positive.
• Norm Function:
f (x) = ∥x∥2
Norm functions are convex.
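As a quick sanity check (an illustrative sketch, not a proof), the defining inequality can be tested
numerically on random points for any of the examples above, e.g. the Euclidean norm:

import numpy as np

f = lambda x: np.linalg.norm(x, 2)   # the norm function, which is convex

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    theta = rng.uniform(0.0, 1.0)
    lhs = f(theta * x + (1 - theta) * y)
    rhs = theta * f(x) + (1 - theta) * f(y)
    assert lhs <= rhs + 1e-12   # convexity inequality (up to rounding error)
print("convexity inequality held on all sampled points")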

Convex Optimization Problem


A convex optimization problem is an optimization problem where the objective function and all
constraint functions are convex. Specifically, it can be formulated as:

minimize f0 (x)
subject to fi (x) ≤ bi , i = 1, . . . , m,
where f0 , f1 , . . . , fm are convex functions and bi are constants.
Examples of Convex Optimization Problems
• Least-Squares Problem:
minimize ∥Ax − b∥22
This is a convex optimization problem because the objective function is a quadratic function
of x, which is convex.
• Linear Programming Problem:

minimize cT x subject to Ax ≤ b

Linear programming is a special case of convex optimization where both the objective
function and the constraints are linear and hence convex.
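In practice such problems are usually stated in a modeling language rather than solved by hand. The
following is a minimal sketch assuming the third-party cvxpy package, with arbitrary random data; it
expresses a small constrained least-squares problem in exactly the form minimize f0(x) subject to
fi(x) ≤ bi:

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 5))
b = rng.normal(size=30)

x = cp.Variable(5)
objective = cp.Minimize(cp.sum_squares(A @ x - b))   # convex objective
constraints = [x >= 0, cp.sum(x) <= 1]               # convex (in fact linear) constraints

problem = cp.Problem(objective, constraints)
problem.solve()
print(problem.status, x.value)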

1.2.1 Solution, usage


Convex optimization is a versatile and powerful tool for solving various types of problems efficiently.
Here is an overview of its practical benefits and some challenges:

Practical Application and Efficiency


Interior-Point Methods
• Function: Interior-point methods are used to solve convex optimization problems by itera-
tively improving an estimate of the optimal solution.
• Iterations: Typically, these methods require between 10 and 100 iterations.
• Computational Complexity: Each iteration involves operations on the order of

max{n^3, n^2 m, F},

where n is the number of variables, m is the number of constraints, and F is the cost of
evaluating the derivatives of the objective and constraint functions.
• Practical Performance: Interior-point methods can handle problems with hundreds of
variables and thousands of constraints efficiently on modern computers, often within a few
tens of seconds. By exploiting problem structure (such as sparsity), these methods can solve
much larger problems.

State of Technology
• Current Status: Solving general convex optimization problems is not yet as mature as
solving least-squares or linear programming problems. Research is ongoing, and significant
progress is being made.
• Active Research: Research on interior-point methods for general nonlinear convex optimiza-
tion is still very active, and no consensus has emerged regarding the best methods.
• Specialized Problems: For specific subclasses of convex optimization problems, such as
second-order cone programming and geometric programming, interior-point methods are
approaching a mature technology.

Conceptual Understanding and Formulation


• Conceptual Similarities: Using convex optimization is conceptually similar to using least-
squares or linear programming. If a problem can be formulated as a convex optimization
problem, it can be solved efficiently in a manner similar to these other methods.
• Recognition and Formulation: Recognizing a convex function or problem can be difficult.
Formulating a problem to fit the convex optimization framework often requires careful
analysis and understanding of the problem’s structure.
• Tricks and Transformations: There are many techniques for transforming problems into a
convex form, which can be more complex than for linear programming.
• Skill Development: The main skill in convex optimization is recognizing and formulating
problems. Once this skill is developed, many practical problems can be solved using convex
optimization techniques.

1.3 Nonlinear optimization


Nonlinear optimization, or nonlinear programming, refers to optimization problems where the
objective function or constraint functions are nonlinear. These problems are generally more
challenging compared to linear optimization problems. The general nonlinear programming
problem can be expressed as:

Minimize f (x) subject to gi (x) ≤ 0, i = 1, . . . , m (1.11)


where f (x) and gi (x) are nonlinear functions. Solving such problems is complex and no general
methods guarantee an optimal solution efficiently.
Example Consider the following optimization problem:

Minimize f(x) = x_1^2 + x_2^2 − 2 x_1 x_2 + 2 x_1 − 6 x_2      (1.12)

subject to:

x_1^2 + x_2^2 ≤ 1      (1.13)

Here, f(x) is a nonlinear objective function and x_1^2 + x_2^2 ≤ 1 is a nonlinear constraint. This
problem is not linear due to the quadratic terms in the objective function and the constraint.

1.3.1 Local Optimization


Local optimization seeks to find a solution that is optimal within a local neighborhood of the
feasible region, rather than the global optimal solution. The key characteristics of local optimization
include:
• Objective: Find a point that minimizes the objective function among feasible points in a
local region around it.
• Scope: The solution is not guaranteed to be the globally optimal solution.
• Efficiency: Local optimization methods can be fast and handle large-scale problems effec-
tively, given that they only require differentiability of the objective and constraint functions.
Example Consider the following local optimization problem:

Minimize f(x) = x_1^2 + x_2^2      (1.14)


subject to:

x_1^2 − x_2 ≥ 0      (1.15)

Starting from an initial guess of (x1 , x2 ) = (1, 1), a local optimization method might converge
to a solution such as (x1 , x2 ) = (0, 0), which is a local minimum but not necessarily the global
minimum. For this example, (0, 0) is actually the global minimum, but this is not guaranteed for
more complex problems.
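This behaviour can be reproduced with a general-purpose nonlinear solver. The following is an
illustrative sketch using scipy.optimize.minimize with the SLSQP method (the choice of solver and the
starting point follow the example above; nothing else is from the notes):

import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0] ** 2 + x[1] ** 2                                  # objective (1.14)
constraint = {"type": "ineq", "fun": lambda x: x[0] ** 2 - x[1]}     # x1^2 - x2 >= 0, (1.15)

x0 = np.array([1.0, 1.0])                                            # initial guess
res = minimize(f, x0, method="SLSQP", constraints=[constraint])
print(res.x)   # converges to (approximately) [0, 0] from this starting point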
Advantages and Disadvantages
• Advantages: Can be applied to large-scale problems, fast computation.
• Disadvantages: Dependence on initial guesses, sensitive to algorithm parameters.

1.3.2 Global Optimization


Global optimization aims to find the true global optimum of the optimization problem. The key
characteristics include:
• Objective: Find the global minimum of the objective function over all feasible solutions.
• Efficiency: Generally, global optimization methods have exponential complexity concerning
problem size. Practical applications may require significant computation time.
Example Consider the following global optimization problem:

Minimize f(x) = sin(x_1) + cos(x_2)      (1.16)

subject to:

x_1^2 + x_2^2 ≤ 1      (1.17)

In this problem, finding the global minimum within the feasible region (a disk of radius 1 centered
at the origin) requires exploring the entire feasible region to guarantee that no better solution
exists anywhere in it. This approach can be computationally intensive, especially as the feasible
region or the objective function becomes more complex.
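A brute-force illustration of this exhaustive search (a rough sketch; the grid resolution is arbitrary
and only yields an approximate minimizer) evaluates the objective on a fine grid over the disk and
keeps the best feasible point:

import numpy as np

f = lambda x1, x2: np.sin(x1) + np.cos(x2)   # objective (1.16)

# Grid over the bounding box of the unit disk.
ts = np.linspace(-1.0, 1.0, 1001)
X1, X2 = np.meshgrid(ts, ts)
feasible = X1 ** 2 + X2 ** 2 <= 1.0          # constraint (1.17)

vals = np.where(feasible, f(X1, X2), np.inf)  # ignore infeasible grid points
i, j = np.unravel_index(np.argmin(vals), vals.shape)
print(X1[i, j], X2[i, j], vals[i, j])         # approximate global minimizer and value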
Advantages and Disadvantages
• Advantages: Guarantees finding the global optimum.
• Disadvantages: Computationally expensive, time-consuming.
Both local and global optimization methods have their place depending on the problem context.
Local optimization is generally faster and more practical for large-scale problems but does not
guarantee global optimality. Global optimization, while providing guarantees of finding the true
global optimum, can be computationally expensive and may only be feasible for smaller problems.

1.4 Convex sets


In linear programming (LPP), the feasible region is a convex set over which the objective function
attains a single optimum value, either a maximum or a minimum.
A convex set is characterized by two properties: the line segment joining any two points in the set
lies completely within the set, and the boundary has no dents or indentations. The feasible region of
a linear programming problem is convex because it is the intersection of the halfspaces defined by
the linear constraints.
The objective function of an LPP defined over a convex feasible region attains its optimum value
(when one exists) at at least one of the corner points.

1.4.1 Lines and Line Segments


Consider two points x1 and x2 in Rn . A line passing through these points can be described using a
parameter θ ∈ R. Any point y on this line is given by:

y = θ x1 + (1 − θ )x2

where θ determines the location of the point y along the line. Here’s how the value of θ affects
the position of y:
• When θ = 0, the point y is exactly at x2 .
• When θ = 1, the point y is exactly at x1 .
• When 0 ≤ θ ≤ 1, the point y is on the line segment between x1 and x2 .
• If θ > 1, the point y is beyond x1 , extending the line beyond this point.
• If θ < 0, the point y is beyond x2 , extending the line in the opposite direction.


Alternative Expression
We can also express y in another form:

y = x2 + θ (x1 − x2 )

This form has the following interpretation:


• The point y starts at x2 (when θ = 0).
• The vector x1 − x2 is the direction from x2 to x1 .
• The parameter θ determines how far along this direction we travel. As θ increases, y moves
from x2 to x1 .
The accompanying figure illustrates this concept, showing the line passing through x1 and x2 for
different values of θ. The line segment between x1 and x2 is emphasized for 0 ≤ θ ≤ 1.

Explanation of the Figure


• The line is described by the parametric equation θ x1 + (1 − θ )x2 .
• The darker portion of the line corresponds to the line segment, where 0 ≤ θ ≤ 1.
• The point x2 corresponds to θ = 0, and the point x1 corresponds to θ = 1.
• For values of θ outside the range [0, 1], the point y extends beyond the segment.

Summary
• The parametric representation allows us to easily describe points on a line or line segment.
• For 0 ≤ θ ≤ 1, we obtain a line segment between x1 and x2 .
• If θ is outside this range, the point y lies on the infinite extension of the line beyond x1 or x2 .

1.4.2 Affine and convex sets


What is an Affine Set?
An affine set is a set with a special property: if you pick any two points in the set, the entire line
through these two points must also lie within the set. More formally, a set C ⊆ Rn is affine if for
any x1 , x2 ∈ C and θ ∈ R, the point

y = θ x1 + (1 − θ )x2

also belongs to C. This means C contains every point on the line connecting x1 and x2 , for all values
of θ .

Intuitive Explanation:
• If θ = 0, then y = x2 .
• If θ = 1, then y = x1 .
• If 0 < θ < 1, y is somewhere between x1 and x2 .
• If θ > 1, the point y is beyond x1 , extending the line past it.
• If θ < 0, the point y is beyond x2 , extending the line in the opposite direction.

2. Affine Combinations of More Than Two Points


This concept can be extended to more than two points. If you have points x1 , x2 , . . . , xk in the set C,
any affine combination of these points should also be in C. An affine combination of x1 , x2 , . . . , xk
is given by:
y = θ1 x1 + θ2 x2 + · · · + θk xk

where the coefficients θ1 + θ2 + · · · + θk = 1.


If a set contains every affine combination of its points, it is an affine set.

3. Affine Sets as Subspaces Plus an Offset


Every affine set is closely related to subspaces. If C is an affine set and x0 ∈ C, then the set

V = C − x0 = {x − x0 | x ∈ C}

is a subspace, meaning it is closed under addition and scalar multiplication. In other words, V has
the same properties as a vector space.

Affine Set Representation:


We can represent the affine set C as:

C = V + x0 = {v + x0 | v ∈ V }

This means that the affine set C can be thought of as a subspace V shifted by the point x0 .

Dimension of an Affine Set:


The dimension of an affine set C is defined as the dimension of the subspace V = C − x0 , where x0
is any point in C.

4. Affine Hull
The affine hull of a set C ⊆ Rn is the smallest affine set that contains C. The affine hull of C,
denoted aff C, is the set of all affine combinations of points in C:

aff C = {θ1 x1 + · · · + θk xk | x1 , . . . , xk ∈ C, θ1 + · · · + θk = 1} .

The affine hull is the smallest affine set that contains C in the sense that if S is any affine set with
C ⊆ S, then aff C ⊆ S.

5. Example: Solution Set of Linear Equations


Consider the solution set of a system of linear equations:

C = {x | Ax = b}

where A ∈ Rm×n and b ∈ Rm .

Claim:
The solution set C is an affine set.

Proof:
Let x1 , x2 ∈ C, which means:
Ax1 = b and Ax2 = b.
Now consider an affine combination of x1 and x2 :

y = θ x1 + (1 − θ )x2 .

Applying the matrix A to y, we get:

Ay = A(θ x1 + (1 − θ )x2 ) = θ Ax1 + (1 − θ )Ax2 = θ b + (1 − θ )b = b.

This shows that the affine combination y also satisfies Ay = b, so y ∈ C. Therefore, C is an affine
set.

Associated Subspace:
The subspace associated with the affine set C is the nullspace of A, which is the set of solutions to
Ax = 0.
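A short numerical illustration of this argument (an assumed toy system; scipy.linalg.null_space is
used only to exhibit the associated subspace) is:

import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, -1.0],
              [0.0, 1.0,  3.0]])
b = np.array([4.0, 5.0])

# Two particular solutions of A x = b (a least-squares solution plus a nullspace shift).
N = null_space(A)                     # basis of the nullspace {x | A x = 0}
x1 = np.linalg.lstsq(A, b, rcond=None)[0]
x2 = x1 + 2.5 * N[:, 0]

theta = 0.3
y = theta * x1 + (1 - theta) * x2     # affine combination
print(np.allclose(A @ y, b))          # True: y also solves A x = b
print(np.allclose(A @ (x1 - x2), 0))  # True: x1 - x2 lies in the nullspace of A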

Converse:
Every affine set can be expressed as the solution set of a system of linear equations.

Summary
• An affine set is a set where the line between any two points stays inside the set.
• An affine set contains all affine combinations of its points.
• Every affine set is like a subspace that has been shifted by a point.
• The affine hull of a set is the smallest affine set that contains all points of the set.
• The solution set of a system of linear equations is an affine set, with the nullspace being the
associated subspace.

Affine dimension
The affine dimension of a set C ⊆ Rn is defined as the dimension of its affine hull, denoted as
aff(C). The affine dimension is especially useful in the context of convex analysis and optimization.
However, it may not always align with other definitions of dimension.

Example of Affine Dimension


Consider the unit circle in R2 , i.e., the set

C = {x ∈ R^2 | x_1^2 + x_2^2 = 1}.

The affine hull of C is all of R2 , so its affine dimension is 2. However, according to other definitions
of dimension, such as the topological dimension, the unit circle has dimension 1 because it is a
curve.

Relative Interior
If the affine dimension of a set C ⊆ Rn is less than n, the set lies within the affine set aff(C) ⊂ Rn .
In this case, we define the relative interior of C, denoted as relint(C), as its interior relative to its
affine hull:
relint(C) = {x ∈ C | B(x, r) ∩ aff(C) ⊆ C for some r > 0},

where B(x, r) = {y ∈ Rn | ∥y − x∥ ≤ r} is the ball of radius r centered at x (using any norm).

Relative Boundary
We define the relative boundary of a set C as the set of points in the closure of C, but not in its
relative interior:
relbd(C) = C \ relint(C),

where C is the closure of C.

Example: Square in R3
Consider a square in the (x1 , x2 )-plane in R3 , defined as:

C = {x ∈ R3 | −1 ≤ x1 ≤ 1, −1 ≤ x2 ≤ 1, x3 = 0}.

• The affine hull of C is the (x1, x2)-plane, i.e.,

aff(C) = {x ∈ R^3 | x3 = 0}.

• The interior of C is empty, because there are no points of C for which a full neighborhood in R^3
is contained within C.
• However, the relative interior of C is:

relint(C) = {x ∈ R^3 | −1 < x1 < 1, −1 < x2 < 1, x3 = 0}.

• The boundary of C in R^3 is the set C itself, but the relative boundary of C is the wireframe outline
of the square:

relbd(C) = {x ∈ R^3 | max{|x1|, |x2|} = 1, x3 = 0}.

Summary
• The affine dimension of a set is the dimension of its affine hull.
• The relative interior of a set is its interior with respect to its affine hull.
• The relative boundary is the closure of the set minus its relative interior.
The concept of affine dimension and relative interior are important in optimization and convex
analysis, particularly when analyzing sets that may not have an interior in the usual sense, but have
a relative interior in their affine hull.

1.4.3 Convex Sets


A set C is called convex if, for any two points x1 , x2 ∈ C and any θ with 0 ≤ θ ≤ 1, the point

θ x1 + (1 − θ )x2 ∈ C.

This means that the line segment between any two points in C lies within the set C.

Example of Convex and Non-Convex Sets


• Convex Set: A hexagon that includes its boundary is convex because any line segment between
two points inside it lies entirely within the hexagon.
• Non-Convex Set: A kidney-shaped set is not convex because a line segment between some points
may leave the set.

Figure 1.1: Convex and non-convex sets. Left: Hexagon (convex). Middle: Kidney-shaped set
(non-convex). Right: A partially bounded square (non-convex).

Convex Combinations
A point of the form
θ1 x1 + · · · + θk xk
where θ1 + · · · + θk = 1 and θi ≥ 0 for i = 1, . . . , k, is called a convex combination of the points
x1 , . . . , xk .
Convex combinations represent a "weighted average" or a "mixture" of the points, where θi
indicates the proportion of each point in the combination.

Convex Hull
The convex hull of a set C, denoted conv(C), is the set of all convex combinations of points in C:
conv(C) = { θ1 x1 + · · · + θk xk | xi ∈ C, θi ≥ 0, ∑_{i=1}^{k} θi = 1 }.

The convex hull is the smallest convex set that contains C.



Figure 1.2: Convex hulls of two sets in R2 . Left: Convex hull of 15 points forms a pentagon. Right:
Convex hull of a kidney-shaped set.
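For a finite point set in low dimensions, the convex hull can be computed directly. The following is
an illustrative sketch using scipy.spatial.ConvexHull (which wraps the Qhull library); the 15 random
points loosely echo the left panel of Figure 1.2:

import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(2)
points = rng.random((15, 2))          # 15 points in the plane

hull = ConvexHull(points)
print(hull.vertices)                  # indices of the points that are hull vertices
print(points[hull.vertices])          # corner points of conv(C), in counterclockwise order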

Cones
A set C is called a cone (or nonnegative homogeneous) if, for every point x ∈ C and for any θ ≥ 0,
we have:
θ x ∈ C.
In simpler terms, if you take any point in the set and scale it by a non-negative factor, the resulting
point will still lie in the set.
A convex cone is a set that is both convex and a cone. This means that for any x1 , x2 ∈ C and
for any θ1 , θ2 ≥ 0, the point
θ1 x1 + θ2 x2 ∈ C.

Conic Combinations
A point of the form
θ1 x1 + · · · + θk xk where θ1 , . . . , θk ≥ 0
is called a conic combination (or nonnegative linear combination) of the points x1 , . . . , xk . A set is
a convex cone if and only if it contains all conic combinations of its elements.

Conic Hull
The conic hull of a set C is the set of all conic combinations of points in C, i.e.,
{θ1 x1 + · · · + θk xk | xi ∈ C, θi ≥ 0} .
The conic hull is the smallest convex cone that contains C.

Important Examples of Convex Sets


In this section, we describe some important examples of convex sets that we will encounter
throughout the rest of the book. We begin with some simple examples.
• The empty set ∅, any single point (i.e., a singleton) {x0}, and the entire space R^n are affine
(and hence convex) subsets of R^n.
• Any line is affine. If the line passes through the origin (i.e., zero), it is a subspace and also a
convex cone.
• A line segment is convex, but it is not affine (unless the segment reduces to a single point).
• A ray, which has the form
{x0 + θ v | θ ≥ 0},
where v ̸= 0, is convex but not affine. The ray is a convex cone if its base point x0 = 0.
• Any subspace is both affine and a convex cone, and hence it is convex.

Figure 1.3: Conic hull of two sets in R2 .

1.5 Hyperplanes and Halfspaces


Hyperplanes
A hyperplane in Rn is a set of points that satisfy a linear equation of the form

{x ∈ Rn | aT x = b},

where a ∈ Rn , a ̸= 0, and b ∈ R.

Geometric Interpretation
Geometrically, a hyperplane is the set of points that are orthogonal (perpendicular) to the normal
vector a. The constant b shifts the hyperplane from the origin. We can express the hyperplane as:

{x | aT (x − x0 ) = 0},

where x0 is any point on the hyperplane. This shows that the hyperplane consists of all vectors
orthogonal to the normal vector a, starting from x0 .

Figure 1.4: A hyperplane with normal vector a and a point x0 .

Halfspaces
A halfspace is a region of space divided by a hyperplane. A closed halfspace is defined as:

{x ∈ Rn | aT x ≤ b},

where a ̸= 0.

Geometric Interpretation
A halfspace is one of the two regions determined by a hyperplane. The boundary of the halfspace is
the hyperplane {x | aT x = b}. We have:
• {x | aT x ≥ b}: The halfspace extending in the direction of the normal vector a.
• {x | aT x ≤ b}: The halfspace extending in the opposite direction.
The boundary of the halfspace is the hyperplane, while the interior, where aT x < b, is called an
open halfspace.

Figure 1.5: A hyperplane dividing Rn into two halfspaces.

1.6 Separating and supporting hyperplanes


Separating Hyperplane Theorem
If C and D are two non-intersecting convex sets, i.e., C ∩ D = ∅, then there exists a hyperplane that
separates them. This means there exists a vector a ̸= 0 and a scalar b such that:

aT x ≤ b for all x ∈ C, and aT x ≥ b for all x ∈ D.

This hyperplane is called a separating hyperplane. The set {x | aT x = b} represents the hyperplane,
and it divides the space between C and D.

Strict Separation
In some cases, the separating hyperplane can strictly separate the sets C and D. This means that:

aT x < b for all x ∈ C, and aT x > b for all x ∈ D.

This is called strict separation.

Supporting Hyperplanes
A supporting hyperplane is a hyperplane that touches a convex set at a boundary point but does not
cut through it. If C ⊂ Rn is convex, and x0 ∈ C, then there exists a supporting hyperplane at x0 if
there is a vector a ̸= 0 and a scalar b such that:

aT x ≤ b for all x ∈ C and aT x0 = b.



Figure 1.6: A hyperplane separates two disjoint convex sets C and D.

Figure 1.7: Construction of a separating hyperplane between two convex sets C and D. The
hyperplane is perpendicular to the line connecting the closest points c ∈ C and d ∈ D.
