Abstract
In this note, we study the size of the support of integer solutions to linear equations \(Ax=b, ~x\in \mathbb {Z}^n\), where \(A\in \mathbb {Z}^{m\times n}\) and \(b\in \mathbb {Z}^m\). We give an upper bound on the smallest support size as a function of A, taken as a worst case over all b such that the above system has a solution. This bound is asymptotically tight and in fact matches the bound given in [1], while the proof presented here is simpler, relying only on linear algebra.
1 Introduction
The main goal of this work is to establish upper and lower bounds on the smallest support size of integer solutions to a system of linear equations. Let \(A \in \mathbb {Z}^{m \times n}\) be a matrix and denote by \(\mathcal {L}(A):=\{Ax\mid x\in \mathbb {Z}^n\}\) the lattice generated by the columns of A. For any \(b\in \mathcal {L}(A)\), we want to find an integer solution to \(Ax=b\) with smallest support size, i.e.
\[\min ~\{\Vert x\Vert _0 \mid Ax = b,\ x \in \mathbb {Z}^n\}, \tag{1}\]
where \(\Vert x\Vert _0:= |\{j: x_j \not = 0\}|\). We will assume without loss of generality that A has full row rank, i.e. \({\text {rank}}(A) = m\), because, for \(b\in \mathcal {L}(A)\), the system is satisfied if and only if the equations corresponding to a maximal set of linearly independent rows of A are satisfied. In this note, we study the largest value of (1) taken over all points b in the lattice \(\mathcal {L}(A)\). To be precise, for a matrix \(A\in \mathbb {Z}^{m\times n}\), define
\[f(A):=\max _{b\in \mathcal {L}(A)}\ \min ~\{\Vert x\Vert _0 \mid Ax = b,\ x \in \mathbb {Z}^n\}. \tag{2}\]
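To fix ideas, here is a toy example (ours, for illustration). Take \(m=1\) and \(A=\begin{pmatrix}2&3\end{pmatrix}\), so \(\mathcal {L}(A)=\mathbb {Z}\). For \(b=1\), neither \(2x_1=1\) nor \(3x_2=1\) has an integer solution, so every integer solution to \(Ax=1\) has support size 2, for instance \(x=(2,-1)^\top \); since every \(b\in \mathcal {L}(A)\) trivially admits a solution with support size at most \(n=2\), we get \(f(A)=2\).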
This quantity has been the focus of several previous studies. For example, in [1], f(A) is called the integer linear rank of A, denoted by ILR(A). According to [1], determining f(A) is NP-hard. Their proof actually implies that given both A, b as inputs, determining the value of (1) is NP-hard. Nontrivial bounds on problem (2) have been leveraged in many areas of discrete optimization; see, for example, references in Sect. 1 of [1] and in the introduction of [2].
As a result, there have been several studies focused on achieving increasingly strong upper bounds on the quantity (2). Currently, the best available upper bound on f(A) is achieved in [1]; let \(f_{AADLO}(A)\) denote this bound. Let A be an integer matrix with m rows and at least m columns, with largest absolute entry denoted by \(\Vert A\Vert _\infty \). Their bound is parameterized by the following quantities. In particular, \(\gcd (A)\) denotes the greatest common divisor of all \(m\times m\) subdeterminants of A, and \(\Gamma (A)\) denotes the largest absolute value of an \(m\times m\) subdeterminant of A; both notations will be used throughout.
The main contribution of this work is the following theorem, which essentially matches the state-of-the-art bound \(f_{AADLO}(A)\).
Theorem 1
Let A be an integer matrix with m rows and full row rank. Let \(p_i\) be the i-th prime. Then, the following relation between \(\Gamma (A)\), \(\gcd (A)\) and f(A) holds:
\[\frac{\Gamma (A)}{\gcd (A)}\ \ge\ \prod _{i=2}^{\left\lfloor f(A)/m\right\rfloor }p_i^{m}\cdot p_{\left\lfloor f(A)/m\right\rfloor +1}^{\,f(A)-m\left\lfloor f(A)/m\right\rfloor }. \tag{3}\]
Furthermore, the bound is asymptotically tight in the sense that for arbitrarily large integers m and t, there exist an integer matrix A with m rows, full row rank, \(\left\Vert A\right\Vert _\infty >t\), and a vector \(b\in \mathcal {L}(A)\) such that equality holds in (3).
We will see how the bound (3) is comparable to \(f_{AADLO}(A)\); in particular, the proof of Theorem 1 can easily be adapted to obtain the bound \(f_{AADLO}(A)\).
We are interested in obtaining bounds on f(A) parameterized by the number of rows and the largest absolute entry of A. With this in mind, define
\[h(m,t):=\max ~\{f(A) \mid A\in \mathbb {Z}^{m\times n}\ \text {for some }n,\ {\text {rank}}(A)=m,\ \Vert A\Vert _\infty \le t\}. \tag{4}\]
The following bound follows as a corollary.
Theorem 2
Let h(m, t) be defined as in (4). Then
\[h(m,t)=O\left( \frac{m\log (\sqrt{m}~t)}{\log \log (\sqrt{m}~t)}\right) .\]
We intend for this paper to stand out not only for the strength of its result, but also for its simplicity and readability.
Overview. In Sect. 2, we discuss related work and how the results of this paper compare with recent developments in bounding the support size of integer solutions to systems of equations/inequalities. In Sect. 3.1 we introduce the notion of integral independence, which will be useful throughout the proofs of Theorems 1 and 2. In particular, in Sect. 3.2, we achieve a nontrivial lower bound on the largest \(m \times m\) subdeterminant of a matrix A with integrally independent columns, given by Theorem 5. In Sect. 3.3, using the relationship between the largest subdeterminant and the largest entry given by Hadamard's inequality [8], we achieve the upper bound on the number of columns, i.e. h(m, t), given by Theorem 2. Finally, in Sect. 3.4 we exhibit the matrix A referred to in the latter statement of Theorem 1.
Notation. Throughout this paper we use \(p_i\) to denote the i-th prime number. In particular, \(p_1 =2\). For any \(p,q \in \mathbb {Z}\), we use \(p \mid q\) to denote that p divides q, i.e. \(q/p \in \mathbb {Z}\); similarly, we use \(p \not \mid q\) to denote that p does not divide q; we may use this interchangeably with the language q is (respectively, is not) divisible by p. We use the convention that \(\gcd (a,0) = |a|\) for any \(a \in \mathbb {Z}\), including 0. We use the notation \([n]:=\{1,2,...,n\}\), and, for positive integers \(i\le j\), \([i:j]:=\{i,i+1,...,j\}\). For any matrix \(A\in \mathbb {R}^{m\times n}\), and \(I\subseteq [m]\), \(J\subseteq [n]\), we denote by \(A_{I\times J}\) the submatrix of A consisting of rows indexed by I and columns indexed by J. Denote by \(\left\Vert A\right\Vert _\infty \) the largest absolute value of an entry of A. For \(x \in \mathbb {R}\), denote by \(\left\lfloor x\right\rfloor \) the largest integer less than or equal to x, and by \(\left\lceil x\right\rceil \) the smallest integer greater than or equal to x. We use \(\mathbb {Z}^n_{\ge 0}:= \{x \in \mathbb {Z}^n \mid x_j \ge 0 {\text { for all }} j \in [n] \}\), and \(\mathbb {R}^n_{\ge 0}\) analogously.
2 Related work
Problem (2) has garnered interest from multiple research disciplines. For example, an interesting problem is to find nonnegative integer solutions to \(Ax=b\). Define the integer conic hull generated by the columns of A to be the set of vectors that can be represented as nonnegative integer combinations of the columns of A, denoted as \({{\,\mathrm{int\_cone}\,}}(A):=\{Ax\mid x\in \mathbb {Z}_{\ge 0}^n\}\). Let
\[g(m,t):=\max ~\Big \{\min ~\{\Vert x\Vert _0 \mid Ax = b,\ x \in \mathbb {Z}_{\ge 0}^n\}\ \Big |\ A\in \mathbb {Z}^{m\times n}\ \text {for some }n,\ \Vert A\Vert _\infty \le t,\ b\in {{\,\mathrm{int\_cone}\,}}(A)\Big \}.\]
Note that \(h(m,t)\le g(m,t)\). Indeed, for any \(b\in \mathcal {L}(A)\), there exists \(A'\), obtained by negating some columns of A, such that \(b\in {{\,\mathrm{int\_cone}\,}}(A')\). Taking any solution \(x'\) to \(A'x'=b,~x'\in \mathbb {Z}_{\ge 0}^n\) and negating the corresponding coordinates of \(x'\), we obtain a solution to \(Ax=b,~x\in \mathbb {Z}^n\) with the same support as \(x'\). Thus an upper bound on g(m, t) also serves as an upper bound on h(m, t). Eisenbrand and Shmonin [7] establish the first upper bound on g(m, t) using the pigeonhole principle, achieving \(g(m,t)\le 2\,m\log (4mt)\). This bound has been improved by [2] to \(g(m,t)\le 2m\log (2\sqrt{m}t)\) using Siegel's lemma [3]. Using g(m, t) to upper bound h(m, t), one obtains \(h(m,t)\le 2m\log (2\sqrt{m}t)\), which is the best upper bound on the integer linear rank, parameterized by the number of rows and the largest absolute entry of the matrix, available in the literature. Note that the bound of Theorem 2 improves these bounds on h(m, t) by a factor of \(O(\log \log (\sqrt{m}t))\). An asymptotic lower bound for g(m, t) is also established in [2]: for any \(\epsilon >0\), there exist a matrix \(A\in \mathbb {Z}^{m\times n}\) with n/m large enough and a vector \(b\in {{\,\mathrm{int\_cone}\,}}(A)\) such that \(\min ~\{\Vert x\Vert _0 \mid Ax = b, x \in \mathbb {Z}_{\ge 0}^n\}\ge m\log (\left\Vert A\right\Vert _\infty )^{\frac{1}{1+\epsilon }}\).
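As a small illustration of this reduction (our example), take \(A=\begin{pmatrix}2&3\end{pmatrix}\) and \(b=-1=2\cdot 1+3\cdot (-1)\in \mathcal {L}(A)\). Negating the second column gives \(A'=\begin{pmatrix}2&-3\end{pmatrix}\) with the nonnegative solution \(x'=(1,1)^\top \) to \(A'x'=b\), so \(b\in {{\,\mathrm{int\_cone}\,}}(A')\); negating the second coordinate of any nonnegative solution for \(A'\) recovers an integer solution for \(A\) with the same support.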
Another interesting direction is when the columns of the matrix A form a Hilbert basis, i.e. for any b in \({{\,\textrm{cone}\,}}(A)\cap \mathbb {Z}^m\), b is also in \({{\,\mathrm{int\_cone}\,}}(A)\), where \({{\,\textrm{cone}\,}}(A):=\{Ax\mid x\in \mathbb {R}_{\ge 0}^n\}\). Cook et al. [5] show that when \({{\,\textrm{cone}\,}}(A)\) is pointed and the columns of A form an integral Hilbert basis, the smallest support size of solutions to \(Ax=b,~x\in \mathbb {Z}_{\ge 0}^n\) is always upper bounded by \(2m-1\). Sebö [13] improves this bound to \(2m-2\). Note that these bounds are independent of \(\left\Vert A\right\Vert _\infty \).
As far as we know, the most relevant work in the literature is [1], which establishes an upper bound for f(A) and shows that it is optimal. For any \(z\in \mathbb {Z}_{>0}\), consider the prime factorization \(z=q_1^{s_1}\cdots q_k^{s_k}\), where \(q_1,...,q_k\) are pairwise distinct primes. Aliev et al. [1] introduce \(\Omega _m(z):=\sum _{i=1}^{k}\min \{s_i,m\}\), called the truncated prime \(\Omega \)-function. Let \(\left( {\begin{array}{c}[n]\\ m\end{array}}\right) \) be the set of all subsets of [n] of cardinality m and, for \(\tau \in \left( {\begin{array}{c}[n]\\ m\end{array}}\right) \), let \(A_{\tau }\) be the \(m\times m\) submatrix of A with columns indexed by \(\tau \). They show
\[f(A)\ \le\ m+\min _{\tau \in \left( {\begin{array}{c}[n]\\ m\end{array}}\right) ,\, \det (A_\tau )\ne 0}\Omega _m\Big (\frac{|\det (A_\tau )|}{\gcd (A)}\Big ). \tag{7}\]
They also show that this bound is optimal, in the sense that m cannot be replaced by any smaller constant, nor can the function \(\Omega _m\) be replaced by any smaller function.
The proof of bound (7) relies on the connection between the theory of finite Abelian groups and lattice theory. In particular, they use the primary decomposition theorem of finite Abelian groups (see e.g. Chapter 5.2 in [6]) and group representation of lattices (see e.g. Section 4.4 of [12]). In contrast, the approach in this paper is significantly simpler, only using linear algebra. The proof is self-contained and does not require knowledge of group theory or lattice theory. Moreover, we will show how the bound (7) can be obtained as a byproduct of our proof of Theorem 1.
3 Proof of Theorems 1 and 2
3.1 Integral independence
A set of vectors \(\{a_1,...,a_n\}\) is called integrally dependent if there is some \(k \in [n]\) such that \(a_k = \sum _{i \not = k} \mu _i a_i\), where every \(\mu _i \) is an integer, and integrally independent otherwise. Furthermore, if A is a matrix with integrally independent columns, we call the matrix itself integrally independent. Integral independence is closely related to the smallest support size of integer solutions to an equation \(Ax=b\), as we elaborate in the following proposition.
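For illustration (our example), the vectors \(\{2,3\}\subseteq \mathbb {Z}^1\) are integrally independent, since neither \(2=3\mu \) nor \(3=2\mu \) has an integer solution \(\mu \), even though they are linearly dependent over \(\mathbb {Q}\); by contrast, \(\{2,4\}\) is integrally dependent because \(4=2\cdot 2\). In particular, linear independence over \(\mathbb {Q}\) implies integral independence, but not conversely.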
Proposition 3
Let A be a matrix with columns \(a_1,...,a_n \in \mathbb {Z}^m\). Then, the following are equivalent:
1. The vectors \(\{a_1,...,a_n\}\) are integrally independent;
2. There exists a vector \(b \in \mathcal {L}(A)\) such that \(\left\Vert x\right\Vert _0 = n\) for any integer solution x to \(Ax = b\);
3. There is no integer vector in the null space of A with an entry equal to 1, i.e. \(\{x \in \mathbb {Z}^n: Ax = 0,~ \exists j\in [n] {\text { such that }} x_j = 1\} = \emptyset \).
Proof
We start by proving \((1) \implies (2)\). Let \(b = \sum _{i \in [n]} a_i \). Consider an integral vector \(x\in \mathbb {Z}^{n}\) such that \(\sum _{i \in [n]} a_i x_i = \sum _{i \in [n]} a_i\). For the sake of contradiction, assume some component of x is zero; without loss of generality, assume \(x_n = 0\). Then, \(\sum _{i \in [n-1]} a_i x_i = \sum _{i \in [n]} a_i\). It follows that \(a_n = - \sum _{i \in [n-1]} (1 - x_i) a_i\), so \(a_n\) is an integral combination of \(\{a_1,...,a_{n-1}\}\), a contradiction.
We now prove \((2) \implies (1)\). Let \(b\in \mathcal {L}(A)\) be such that every integral x with \(Ax = b\) has \(\left\Vert x\right\Vert _0 = n\). Further let \(x'\) be one such choice of x and write \(b = \sum _{i \in [n]} a_i x'_i \). Suppose for the sake of contradiction that \(\{a_1,...,a_n\}\) is integrally dependent. Then, by definition, there is some \(k \in [n]\) and \(\mu \in \mathbb {Z}^{n-1}\) such that \(a_k = \sum _{i \not = k} \mu _i a_i\). So, we can write
\[b \ =\ \sum _{i \in [n]} a_i x'_i \ =\ \sum _{i \ne k} a_i x'_i + x'_k\sum _{i \ne k} \mu _i a_i \ =\ \sum _{i \ne k} (x'_i + \mu _i x'_k)\, a_i.\]
Since \(x'\) and \(\mu \) are both integral, each \(x'_i + \mu _i x'_k\) is an integer. Then b can be written as an integral combination of \(\{a_1,...,a_n\} \setminus \{a_k\}\), contradicting the fact that \(\left\Vert x\right\Vert _0=n\) for every integral x such that \(Ax = b\).
We now prove \((1) \implies (3)\) by contraposition. Suppose there is some \(x \in \mathbb {Z}^n\) with \(\sum _{i \in [n]} a_i x_i = 0\) and an entry equal to 1; without loss of generality, assume \(x_n = 1\). Then, we can rewrite this as \(a_n = -\sum _{i \in [n-1]}\frac{x_i}{x_n} a_i=\sum _{i \in [n-1]}(-x_i) a_i\). Therefore, \(\{a_1,...,a_n\}\) is integrally dependent.
We now prove \((3) \implies (1)\), again by contraposition. Suppose that \(\{a_1,...,a_n\}\) is integrally dependent. Then, by definition, there exists a \(k \in [n]\) such that \(a_k = \sum _{i \not = k} \mu _i a_i\), where every \(\mu _i\) is an integer. Therefore, the vector \(\begin{pmatrix} -\mu _1&-\mu _2&...&- \mu _{k-1}&1&-\mu _{k+1}&...&-\mu _n \end{pmatrix}\) is in the null space of A and has its k-th entry equal to 1. \(\square \)
3.2 Upper bound on f(A)
Applying unimodular column operations, we can bring A into Hermite normal form \(H=\begin{pmatrix} D&0 \end{pmatrix}\), where \(D\in \mathbb {Z}^{m\times m}\) is a lower triangular matrix. We have the relation \(AU=\begin{pmatrix} D&0 \end{pmatrix}\), where \(U\in \mathbb {Z}^{n\times n}\) is a unimodular matrix corresponding to the unimodular operations (see e.g. Section 1.5.2 in [4]). To proceed, we remind readers of some basic facts about unimodular operations.
Define the GCD of zeros to be zero, and note that the GCD of several numbers is zero if and only if they are all equal to zero. Recall that for \(C\in \mathbb {Z}^{m\times n}\) with \(m\le n\), gcd(C) is the GCD of all \(m\times m\) subdeterminants of C.
Fact 1
Let \(C\in \mathbb {Z}^{m\times n}\) with \(m \le n\) and let \(V \in \mathbb {Z}^{n\times n}\) be a unimodular matrix. Then, \(\gcd (CV)=\gcd (C)\).
Fact 2
\(\gcd (C)=0\) if and only if the rows of C are linearly dependent.
The first fact follows from the GCD of all \(m\times m\) subdeterminants of C being invariant under unimodular column operations (see, e.g., Section 4.4 of [12]). To see the second fact, we can bring C into its Hermite normal form \(\begin{pmatrix} C'&0 \end{pmatrix}\) by applying unimodular column operations V, i.e. \(CV=\begin{pmatrix} C'&0 \end{pmatrix}\). By the first fact, the GCD of all \(m\times m\) subdeterminants of C equals that of CV, which is \(\det (C')\). Thus, the GCD is 0 if and only if \(\det (C')=0\), which means the rows of \(C'\) are linearly dependent. Since V is unimodular and thus nonsingular, the rows of \(C'\) are linearly dependent if and only if the rows of C are linearly dependent. Furthermore, it may be useful to recall that the Hermite normal form of a matrix is unique. Due to these facts and the properties of unimodular matrices, we have the following lemma.
Lemma 4
Let \(A\in \mathbb {Z}^{m\times n}\) be a matrix with integrally independent columns. Let \(H = AU\) be the Hermite normal form of A, where U is a unimodular matrix. Suppose \(U=\begin{pmatrix} U_1&U_2 \end{pmatrix}\), where \(U_1\in \mathbb {Z}^{n\times m}, U_2\in \mathbb {Z}^{n\times (n-m)}\). Then \(U_2\) must satisfy the following properties:
(P1) \(U_2\) has at least one nonsingular \((n-m)\times (n-m)\) submatrix.
(P2) The GCD of all \((n-m)\times (n-m)\) subdeterminants of \(U_2\) is 1.
(P3) For each row i of \(U_2\), there exists some prime \(q_i \ge 2\) such that \(q_i\mid (U_2)_{ij}\) for all \(j \in [n-m]\).
Proof
It follows from \(U^{-1}U = I\) that \(U_2^\top (U^{-1})^\top =\begin{pmatrix} 0&I_{n-m} \end{pmatrix}\). Together with the fact that \((U^{-1})^\top \) is unimodular, we obtain that the Hermite normal form of \(U_2^\top \) is \(\begin{pmatrix} I_{n-m}&0 \end{pmatrix}\). According to Fact 1, \(\gcd (U_2^\top )=\det (I_{n-m})=1\), proving (P2). In particular, it is nonzero, and thus by Fact 2, \(U_2\) has rank \(n-m\) and therefore has a nonsingular \((n-m)\times (n-m)\) submatrix, proving (P1).
It remains to prove property (P3). Since \(A\begin{pmatrix} U_1&U_2 \end{pmatrix}=\begin{pmatrix} D&0 \end{pmatrix}\) for some lower triangular matrix \(D\in \mathbb {Z}^{m\times m}\), the columns of \(U_2\) are in the null space of A. Since the columns of A are integrally independent, by Proposition 3, any integral x in the null space of A has no entry equal to 1. Suppose, for the sake of contradiction, that the entries in some row i of \(U_2\) have GCD 1. Then there exists an integral \(\mu \in \mathbb {Z}^{n-m}\) such that \(\sum _{j=1}^{n-m}\mu _j(U_2)_{ij}=1\), so \(x = U_2 \mu \) is an integral vector in the null space of A with \(x_i = 1\), which contradicts the integral independence of A. Therefore, the entries of each row of \(U_2\) have a common divisor strictly greater than 1, and hence a common prime divisor that is at least 2. \(\square \)
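To illustrate Lemma 4, continue the toy example \(A=\begin{pmatrix}2&3\end{pmatrix}\) (ours, for illustration). One valid choice of unimodular matrix is \(U=\begin{pmatrix}-1&-3\\ 1&2\end{pmatrix}\), with \(\det (U)=1\), giving the Hermite normal form \(AU=\begin{pmatrix}1&0\end{pmatrix}\), so \(D=(1)\) and \(U_2=\begin{pmatrix}-3\\ 2\end{pmatrix}\). One checks (P1): the \(1\times 1\) submatrices \((-3)\) and \((2)\) are nonsingular; (P2): \(\gcd (3,2)=1\); (P3): row 1 is divisible by the prime \(q_1=3\) and row 2 by \(q_2=2\), consistent with the columns of A being integrally independent.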
The following theorem is the key to deriving Theorem 1 of this paper. It establishes a lower bound on the largest \(m\times m\) subdeterminant of a matrix with integrally independent columns.
Theorem 5
Let matrix \(A\in \mathbb {Z}^{m\times n}\) be of full row rank with integrally independent columns. Let \(p_i\) be the i-th prime. Then,
\[\frac{\Gamma (A)}{\gcd (A)}\ \ge\ \prod _{i=2}^{\left\lfloor n/m\right\rfloor }p_i^{m}\cdot p_{\left\lfloor n/m\right\rfloor +1}^{\,n-m\left\lfloor n/m\right\rfloor }.\]
To prove this theorem, we need to introduce Jacobi's formula (see Theorem 2.5.2 of [11]), which relates, by purely linear-algebraic means, the subdeterminants of a matrix to the complementary subdeterminants of its adjoint (and hence of its inverse).
Lemma 6
(Jacobi's formula) Let \(A\in \mathbb {R}^{n\times n}\), \(A=(a_{ij})_1^n\). For any \(I, J\subseteq [n]\), denote by \(A_{I\times J}\) the submatrix of A consisting of rows indexed by I and columns indexed by J. Let \(A^*=(a_{ij}^*)_1^n\) be the classical adjoint of A, where
\[a_{ij}^*=(-1)^{i+j}\det \big(A_{([n]\setminus \{j\})\times ([n]\setminus \{i\})}\big)\]
is the cofactor of element \(a_{ji}\) in A. Let \(\sigma =\begin{pmatrix} i_1 & i_2 & \cdots & i_n \\ j_1 & j_2 & \cdots & j_n \end{pmatrix}\) be an arbitrary permutation and \((-1)^\sigma \) be its sign. Let \(I=\{i_1,...i_p\}\), \(J=\{j_1,...,j_p\}\), \(I'=\{i_{p+1},...,i_n\}\), \(J'=\{j_{p+1},...,j_n\}\) for some \(1\le p\le n-1\). Then,
To prove Theorem 5, we will use the following direct corollary of Lemma 6.
Corollary 7
Let \(U\in \mathbb {R}^{n\times n}\) be a unimodular matrix. Then for any \(I,J\subseteq [n]\) with \(1\le |I|=|J|\le n-1\),
\[\big|\det \big(U_{I\times J}\big)\big| \ =\ \big|\det \big((U^{-1})_{([n]\setminus J)\times ([n]\setminus I)}\big)\big|,\]
where \(U_{I\times J}\) denotes the submatrix of U consisting of rows indexed by I and columns indexed by J.
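As a small sanity check of this identity (our example), take the unimodular matrix \(U=\begin{pmatrix}1&1\\1&2\end{pmatrix}\), so \(U^{-1}=\begin{pmatrix}2&-1\\-1&1\end{pmatrix}\). For \(I=\{1\}\) and \(J=\{2\}\) we get \(|\det (U_{I\times J})|=|U_{12}|=1\), while the complementary subdeterminant of the inverse is \(|(U^{-1})_{([2]\setminus J)\times ([2]\setminus I)}|=|(U^{-1})_{12}|=1\), as predicted.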
Proof of Theorem 5
By applying unimodular column operations, we bring A into Hermite normal form \(H=\begin{pmatrix} D&0 \end{pmatrix}\), where \(D\in \mathbb {Z}^{m\times m}\) is a lower triangular matrix. We obtain the relation \(AU=\begin{pmatrix} D&0 \end{pmatrix}\), where \(U\in \mathbb {Z}^{n\times n}\) is a unimodular matrix. Let \(U=\begin{pmatrix} U_1&U_2 \end{pmatrix}\), where \(U_1\in \mathbb {Z}^{n\times m}, U_2\in \mathbb {Z}^{n\times (n-m)}\).
Recall property (P3) of \(U_2\) in Lemma 4, saying that for each row i of \(U_2\) there exists some prime \(q_i \ge 2\) such that \(q_i\mid (U_2)_{ij}\) for all \(j \in [n-m]\). For any prime q, let \(I_q:=\{i\in [n]: q\mid (U_2)_{ij}, \forall j\in [n-m]\}\) be the set of indices of rows of \(U_2\) all of whose entries are divisible by q. We show that for any prime q, \(U_2\) must have a nonsingular \((n-m)\times (n-m)\) submatrix \((U_2)_{I\times [n-m]}\) with \(I\subseteq [n]{\setminus } I_q\), \(|I|=n-m\). Suppose not. Then every nonsingular \((n-m)\times (n-m)\) submatrix of \(U_2\) must include at least one row whose index is in \(I_q\). Since such a row is divisible by q, the determinant of every nonsingular \((n-m)\times (n-m)\) submatrix is divisible by q, while the singular ones have determinant 0. By property (P1) in Lemma 4, there exists at least one such nonsingular submatrix. Therefore, the GCD of the \(\left( {\begin{array}{c}n\\ n-m\end{array}}\right) \) subdeterminants is at least \(q \ge 2\), contradicting property (P2) in Lemma 4. Hence such an I exists and, since \(I\cap I_q=\emptyset \), we have \(|I_q|\le n-|I|=m\) for any prime q.
Recall \(p_i\) is the i-th prime and \(q_i \ge 2\) is a prime that divides every entry of row i of \(U_2\). The above argument implies that \(U_2\) has a nonsingular \((n-m)\times (n-m)\) submatrix, denoted as V, whose row indices are in \([n]\setminus I_{p_1}\). Assume V consists of the first \(n-m\) rows of \(U_2\) without loss of generality. Then, since row i of V is divisible by \(q_i\), we have
\[\prod _{i=1}^{n-m}q_i \ \Big |\ \det (V).\]
Notice that each prime \(p_i\) appears at most m times among \(\{q_1,...,q_{n-m}\}\), since \(|I_q| \le m\) for any prime q. Also, by the way we pick V, \(p_1\) does not appear in \(\{q_1,...,q_{n-m}\}\). Together with the fact that \(\det (V)\ne 0\), this allows us to bound \(|\det (V)|\) from below as
\[|\det (V)| \ \ge\ \prod _{i=1}^{n-m}q_i \ \ge\ \prod _{i=2}^{\left\lfloor n/m\right\rfloor }p_i^{m}\cdot p_{\left\lfloor n/m\right\rfloor +1}^{\,n-m\left\lfloor n/m\right\rfloor }.\]
The last term in the inequality is the smallest possible product of primes satisfying the conditions above: the first \(\left\lfloor \frac{n-m}{m}\right\rfloor =\left\lfloor \frac{n}{m}\right\rfloor -1\) primes starting from \(p_2\) each appear m times, while the \(\left( \left\lfloor \frac{n}{m}\right\rfloor +1\right) \)-th prime appears the remaining \(n-m\left\lfloor \frac{n}{m}\right\rfloor \) times.
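For instance (our illustration of the counting), with \(m=2\) and \(n=7\) there are \(n-m=5\) primes \(q_1,...,q_5\), each prime occurring at most \(m=2\) times and none equal to \(p_1=2\); the smallest admissible product is \(3\cdot 3\cdot 5\cdot 5\cdot 7=p_2^{2}p_3^{2}p_4^{1}=1575\), matching \(\left\lfloor n/m\right\rfloor -1=2\) primes repeated m times followed by \(p_{\left\lfloor n/m\right\rfloor +1}=p_4\) appearing \(n-m\left\lfloor n/m\right\rfloor =1\) time.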
Next, we use the relation between the determinants of the submatrices of A and of U to bound \(\det (V)\) from above. Notice \(A=\begin{pmatrix} D&0 \end{pmatrix}U^{-1}\) and thus \(A_{[m]\times [n-m+1:n]}=D\cdot (U^{-1})_{[m]\times [n-m+1:n]}\). We have
\[\big|\det \big(A_{[m]\times [n-m+1:n]}\big)\big| \ =\ |\det (D)|\cdot \big|\det \big((U^{-1})_{[m]\times [n-m+1:n]}\big)\big| \ =\ \gcd (A)\cdot \big|\det \big((U^{-1})_{[m]\times [n-m+1:n]}\big)\big| \ =\ \gcd (A)\cdot |\det (V)|,\]
where the second equality follows from Fact 1 and the third equality follows from Corollary 7. Recall \(\Gamma (A)\) is the largest absolute value of an \(m\times m\) subdeterminant of A. Combining these relations, we obtain
\[\Gamma (A)\ \ge\ \big|\det \big(A_{[m]\times [n-m+1:n]}\big)\big|\ =\ \gcd (A)\cdot |\det (V)|\ \ge\ \gcd (A)\prod _{i=2}^{\left\lfloor n/m\right\rfloor }p_i^{m}\cdot p_{\left\lfloor n/m\right\rfloor +1}^{\,n-m\left\lfloor n/m\right\rfloor },\]
as desired. \(\square \)
Recall from Proposition 3 that a matrix A has integrally independent columns if and only if there exists some b such that every integer solution to \(Ax=b\) has full support. Taking advantage of this observation, Theorem 5 can be naturally adapted to derive an upper bound for f(A), i.e. the maximum, taken over \(b\in \mathcal {L}(A)\), of smallest support size of an integer solution to \(Ax=b\) (formally defined in (2)). This yields the proof of one of our main results.
Proof of Theorem 1
Take \(b\in \mathcal {L}(A)\) attaining the maximum in (2), i.e. \(\min ~\{\Vert x\Vert _0 \mid Ax = b, x \in \mathbb {Z}^n\}=f(A)\). Let \(x^*\) be an integral solution to \(Ax=b\) with smallest support, i.e. \(\left\Vert x^*\right\Vert _0=f(A)\). We can assume without loss of generality that \(A\in \mathbb {Z}^{m\times f(A)}\) by deleting the columns j of A with \(x^*_j=0\). This does not increase \(\Gamma (A)\), the largest \(m\times m\) subdeterminant of A, and does not decrease \(\gcd (A)\), the GCD of all \(m\times m\) subdeterminants of A (this is easily seen by the fact that redundant columns can be zeroed out using unimodular column operations). Thus it suffices to prove inequality (3) after deleting the redundant columns of A. By the minimality of the support size of \(x^*\), any integer solution to \(Ax=b\) has full support. By Proposition 3, the columns of A are integrally independent. It follows from Theorem 5 that inequality (3) holds. \(\square \)
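For a small illustration of the deletion step (our example), take \(A=\begin{pmatrix}2&3&4\end{pmatrix}\), \(b=1\) and the minimum-support solution \(x^*=(2,-1,0)^\top \). Deleting the third column leaves \(A'=\begin{pmatrix}2&3\end{pmatrix}\); the largest \(1\times 1\) subdeterminant drops from \(\Gamma (A)=4\) to \(\Gamma (A')=3\), while \(\gcd (A)=\gcd (2,3,4)=1\) and \(\gcd (A')=\gcd (2,3)=1\), so \(\Gamma \) does not increase and \(\gcd \) does not decrease, as claimed.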
Remark 1
We demonstrate how to modify the proof of Theorem 5 to obtain the bound (7) given in [1]. We use the same notation as in the proof of Theorem 5. For any \(m\times m\) submatrix of A whose columns are indexed by J, where \(|J|=m\), we have
\[\big|\det \big(A_{[m]\times J}\big)\big| \ =\ \gcd (A)\cdot \big|\det \big((U^{-1})_{[m]\times J}\big)\big| \ =\ \gcd (A)\cdot \big|\det \big((U_2)_{([n]\setminus J)\times [n-m]}\big)\big|,\]
again by Fact 1 and Corollary 7. We also know that
\[\prod _{i\in [n]\backslash J}q_i \ \Big |\ \det \big((U_2)_{([n]\setminus J)\times [n-m]}\big),\]
and thus
\[\prod _{i\in [n]\backslash J}q_i \ \Big |\ \frac{\big|\det \big(A_{[m]\times J}\big)\big|}{\gcd (A)},\]
where \(q_i, i\in [n]\backslash J\) are prime numbers with the same prime repeating at most m times in \(\{q_i\mid i\in [n]\backslash J\}\). Recall notation \(\Omega _m(z)=\sum _{i=1}^{k}\min \{s_i,m\}\) for the prime factorization of \(z=r_1^{s_1}\cdots r_k^{s_k}\) with multiplicities \(s_1,...,s_k\in \mathbb {Z}_{>0}\). Clearly, when \(x\mid y\), \(\Omega _m(x)\le \Omega _m(y)\). Thus, \(\Omega _m\left( \prod _{i\in [n]\backslash J}q_i\right) \le \Omega _m\left( \frac{|\det (A_{[m]\times J})|}{\gcd (A)}\right) \). Moreover, since the multiplicity of each \(q_i\) in \(\prod _{i\in [n]\backslash J}q_i\) is at most m, we have \(\Omega _m\left( \prod _{i\in [n]\backslash J}q_i\right) =|[n]\backslash J|=n-m\). Therefore, \(n-m\le \Omega _m\left( \frac{|\det (A_{[m]\times J})|}{\gcd (A)}\right) \). Since J is an arbitrary subset of [n] with cardinality m, we obtain \(n\le m+\min _{\tau \in \left( {\begin{array}{c}[n]\\ m\end{array}}\right) , \det (A_\tau )\ne 0}\Omega _m\Big (\frac{|\det (A_\tau )|}{\gcd (A)}\Big )\). Applying the same argument as in the proof of Theorem 1 above, we obtain \(f(A)\le m+\min _{\tau \in \left( {\begin{array}{c}[n]\\ m\end{array}}\right) , \det (A_\tau )\ne 0}\Omega _m\Big (\frac{|\det (A_\tau )|}{\gcd (A)}\Big )\).
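As a quick numeric check of this bound (our example), take \(m=1\) and \(A=\begin{pmatrix}2&3&5\end{pmatrix}\), so \(\gcd (A)=1\) and the nonzero \(1\times 1\) subdeterminants are 2, 3 and 5. The minimum of \(\Omega _1\) over them is \(\Omega _1(2)=1\), so the bound gives \(f(A)\le 1+1=2\), which is attained (e.g. \(b=1\) requires two nonzero coordinates) and is strictly smaller than the trivial bound \(n=3\).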
3.3 Upper bound on h(m, t)
Recall h(m, t) is the maximum, taken over all integer matrices A with m rows and largest absolute entry at most t, of the smallest support size of an integer solution to \(Ax=b\) for some \(b\in \mathcal {L}(A)\) (formally defined in (4)). Using the relation between the largest absolute entry of a matrix and the size of its subdeterminants, we can apply the results of Sect. 3.2 to obtain an upper bound on h(m, t). We will use the well-known Hadamard inequality [8], which gives an upper bound on the determinant of a matrix: for any matrix \(B\in \mathbb {R}^{m\times m}\),
\[|\det (B)| \ \le\ \left( \sqrt{m}\,\Vert B\Vert _\infty \right) ^{m}.\]
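For instance, \(B=\begin{pmatrix}1&1\\1&-1\end{pmatrix}\) has \(|\det (B)|=2=(\sqrt{2}\cdot \Vert B\Vert _\infty )^2\), so this bound can be attained; in general it follows by bounding the Euclidean norm of each column of B by \(\sqrt{m}\,\Vert B\Vert _\infty \).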
Applying this to inequality (3), we obtain the following corollary.
Corollary 8
h(m, t) satisfies
\[\left( \sqrt{m}\,t\right) ^{m} \ \ge\ \prod _{i=2}^{\left\lfloor h(m,t)/m\right\rfloor }p_i^{m}\cdot p_{\left\lfloor h(m,t)/m\right\rfloor +1}^{\,h(m,t)-m\left\lfloor h(m,t)/m\right\rfloor }. \tag{9}\]
Proof
Let A be a matrix with m rows, full row rank, and \(\left\Vert A\right\Vert _\infty \le t\) such that \(\min \{\Vert x\Vert _0 \mid Ax = b, x \in \mathbb {Z}^{n}\}=h(m,t)\) for some \(b\in \mathcal {L}(A)\). Then inequality (9) follows directly from Theorem 1 and Hadamard’s inequality. \(\square \)
Furthermore, we can prove the upper bound on h(m, t) of Theorem 2 by applying the prime number theorem (see e.g. [9], Theorem 6) to estimate the product of the first k primes \(\prod \nolimits _{i=1}^{k}p_i\).
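As a numerical illustration of this growth rate (ours), the product of the first 10 primes is \(2\cdot 3\cdot 5\cdots 29=6{,}469{,}693{,}230\approx 6.5\times 10^{9}\) (sequence A002110 of [10]), already of the same order as \(e^{10\log 10}=10^{10}\).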
Proof of Theorem 2
According to the prime number theorem, the product of the first k primes satisfies \(\prod \nolimits _{i=1}^{k}p_i= e^{(1+o(1))k\log k}\), where \(\log (\cdot )\) is the natural logarithm (see Theorem 6 of [9] and sequence A002110 of [10]). Let \(n=h(m,t)\). Relaxing (9) by dropping the last factor, we have
\[\left( \sqrt{m}\,t\right) ^{m} \ \ge\ \prod _{i=2}^{\left\lfloor n/m\right\rfloor }p_i^{m}.\]
Taking logarithms and applying \(\prod \nolimits _{i=1}^{k}p_i= e^{(1+o(1))k\log k}\), we have
\[\log (\sqrt{m}~t) \ \ge\ C\left\lfloor \frac{n}{m}\right\rfloor \log \left\lfloor \frac{n}{m}\right\rfloor \]
for some constant \(C>0\).
We claim that \(\left\lfloor \frac{n}{m}\right\rfloor =O\left( \frac{\log (\sqrt{m}~t)}{\log \log (\sqrt{m}~t)}\right) \). Suppose not; then \(\left\lfloor \frac{n}{m}\right\rfloor >\frac{2\log (\sqrt{m}~t)}{C\log \log (\sqrt{m}~t)}\) for some arbitrarily large m. Thus,
This will give us
which cannot hold when m is arbitrarily large, a contradiction. Thus, we have \(\left\lfloor \frac{n}{m}\right\rfloor =O\left( \frac{\log (\sqrt{m}~t)}{\log \log (\sqrt{m}~t)}\right) \). It follows that \(n=O\left( \frac{m\log (\sqrt{m}~t)}{\log \log (\sqrt{m}~t)}\right) \). \(\square \)
3.4 Lower bound on f(A)
Finally, we prove the latter statement of Theorem 1 by exhibiting an integer matrix A with an arbitrarily large number of rows m and arbitrarily large \(\left\Vert A\right\Vert _\infty \), showing that the upper bound on f(A) is asymptotically tight. A similar construction has appeared in [1, 2] to prove lower bounds on the support size of integer solutions.
Proposition 9
For arbitrarily large (m, t), there exists A with m rows, full row rank, and \(\left\Vert A\right\Vert _\infty >t\) such that equality holds in (3). In other words, the bound in Theorem 1 is asymptotically tight.
Proof
Let \(p_i\) be the i-th prime. There exists k such that \(p_2p_3\cdots p_k>t\). Let \(n=km\). Define \(A\in \mathbb {Z}^{m\times n}\) as the block-diagonal matrix
\[A:=\begin{pmatrix} v & & \\ & \ddots & \\ & & v \end{pmatrix},\qquad v:=\begin{pmatrix} \frac{p_1p_2\cdots p_k}{p_1}&\frac{p_1p_2\cdots p_k}{p_2}&\cdots &\frac{p_1p_2\cdots p_k}{p_k} \end{pmatrix}\in \mathbb {Z}^{1\times k},\]
where row i of A contains the vector v in columns \((i-1)k+1,\ldots ,ik\) and zeros elsewhere,
and \(b=A\textbf{1}\). Then, \(\left\Vert A\right\Vert _\infty = p_2p_3\cdots p_k>t\). For any \(i\in [m]\), \(b_i=\sum _{r=1}^{k}p_1p_2\cdots p_k/p_r\). Observe that \(p_j\not \mid b_i\) for any \(i=1,...,m\), \(j=1,...,k\). Any \(x\in \mathbb {Z}^n\) with \(\left\Vert x\right\Vert _0<n\) would have \(x_j=0\) for some \(j=(i-1)k+r\) with \(1\le r\le k\). Thus, \((Ax)_i=\sum _{s=1,s\ne r}^{k}(p_1p_2\cdots p_k/p_s) x_{(i-1)k+s}\). Since \(p_r\mid (p_1p_2\cdots p_k/p_s)\) for all \(s\ne r\), we have \(p_r\mid (Ax)_i\). Thus, \(Ax\ne b\), which means no integer x with \(\left\Vert x\right\Vert _0<n\) is a solution to \(Ax=b\). Therefore, the smallest support size of an integer solution to \(Ax=b\) is \(n=km\). Due to the block structure of A, the Hermite normal form of A is \( \begin{pmatrix} I_m&0 \end{pmatrix}\), since \( \gcd (\{A_{ij}:j=(i-1)k+r,\ 1\le r\le k\})=1\) for all \(i \in [m]\). Therefore, \( \gcd (A)=\det (I_m)=1\). Moreover, \(\Gamma (A)=\max \limits _{B} ~\{|\det (B)|\mid B{\text { is an }}m\times m{\text { submatrix of }}A\}=p_2^mp_3^m\cdots p_k^m\), matching the right-hand side in (3). \(\square \)
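For a concrete instance of the construction (our illustration), take \(m=1\), \(k=3\) and any \(t<15\): then \(A=\begin{pmatrix}p_2p_3& p_1p_3& p_1p_2\end{pmatrix}=\begin{pmatrix}15&10&6\end{pmatrix}\) and \(b=A\textbf{1}=31\). Since \(2\mid 10,6\) but \(2\not \mid 31\), \(3\mid 15,6\) but \(3\not \mid 31\), and \(5\mid 15,10\) but \(5\not \mid 31\), every integer solution to \(Ax=31\) has full support, so \(f(A)=3\); moreover \(\Gamma (A)=15\), \(\gcd (A)=1\), and the right-hand side of (3) equals \(p_2p_3=15\), so equality holds.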
References
Aliev, I., Averkov, G., De Loera, J.A., Oertel, T.: Sparse representation of vectors in lattices and semigroups. Math. Program. 192(1–2), 519–546 (2022)
Aliev, I., De Loera, J.A., Eisenbrand, F., Oertel, T., Weismantel, R.: The support of integer optimal solutions. SIAM J. Optim. 28(3), 2152–2157 (2018)
Bombieri, E., Vaaler, J.: On Siegel's lemma. Invent. Math. 73(1), 11–32 (1983)
Conforti, M., Cornuéjols, G., Zambelli, G.: Integer programming, vol. 271. Springer (2014)
Cook, W., Fonlupt, J., Schrijver, A.: An integer analogue of Carathéodory's theorem. J. Comb. Theory, Ser. B 40(1), 63–70 (1986)
Dummit, D.S., Foote, R.M.: Abstract algebra, vol. 3. Wiley, Hoboken (2004)
Eisenbrand, F., Shmonin, G.: Carathéodory bounds for integer cones. Oper. Res. Lett. 34(5), 564–568 (2006)
Hadamard, J.: Résolution d'une question relative aux déterminants. Bull. des Sciences Math. 2, 240–246 (1893)
Hardy, G.H., Wright, E.M.: An introduction to the theory of numbers. Oxford University Press (1979)
OEIS Foundation Inc. The On-Line Encyclopedia of Integer Sequences. Published electronically at https://oeis.org
Prasolov, V.V.: Problems and theorems in linear algebra, vol. 134. American Mathematical Society (1994)
Schrijver, A.: Theory of linear and integer programming. John Wiley & Sons (1998)
Sebö, A.: Hilbert bases, Carathéodory's theorem and combinatorial optimization. In: Proceedings of the 1st Integer Programming and Combinatorial Optimization Conference, pp. 431–455 (1990)
Acknowledgements
This work was supported by ONR grant N00014-22-1-2528 and a Balas PhD Fellowship. We acknowledge the invaluable discussions with Ahmad Abdi, Gérard Cornuéjols, Levent Tunçel, and Anthony Karahalios without whom this project would not have been possible. We thank Ahmad Abdi and Timm Oertel for feedback that very much improved the presentation of these results. We also thank two anonymous reviewers for carefully reading the paper and giving useful suggestions.
Funding
Open Access funding provided by Carnegie Mellon University.