
Jeremy Barbay

Given a set P of n points in R^d, where each point p of P is associated with a weight w(p) (positive or negative), the Maximum-Weight Box problem consists in finding an axis-aligned box B maximizing ∑_{p∈B∩P} w(p). We describe algorithms for this problem in two dimensions that run in the worst case in O(n²) time, and in much less time on more specific classes of instances. In particular, these results imply similar ones for the Maximum Bichromatic Discrepancy Box problem. They improve by a factor of Θ(lg n) on the best worst-case complexity previously known for these problems, O(n² lg n) [Cortés et al., J. Alg., 2009; Dobkin et al., J. Comput. Syst. Sci., 1996]. Although the O(n²) result can be deduced from new results on Klee’s Measure problem [Chan, 2013], our solution is more direct and simpler (yet nontrivial), and further provides smaller running times on specific classes of instances.
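To make the problem statement concrete, the one-dimensional analogue (the maximum-weight interval on a line) reduces to a Kadane-style scan over the points sorted by coordinate; the following Python sketch (not taken from the paper, and assuming an empty box of weight 0 is always allowed) illustrates that 1D subroutine:

def max_weight_interval(points):
    # points: iterable of (coordinate, weight) pairs, weights possibly negative
    pts = sorted(points)                 # scan points by increasing coordinate
    best = 0.0                           # the empty interval has weight 0
    running = 0.0                        # best sum of a run ending at the current point
    for _, w in pts:
        running = max(w, running + w)    # either start a new interval here or extend the current one
        best = max(best, running)
    return best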
Karp et al. (1988) described Deferred Data Structures for Multisets as "lazy" data structures which partially sort data to support online rank and select queries, with the minimum amount of work in the worst case over instances of fixed size $n$ and fixed number of queries $q$ (i.e., the query size). Barbay et al. (2016) refined this approach to take advantage of the gaps between the positions hit by the queries (i.e., the structure in the queries). We develop new techniques in order to further refine this approach and to take advantage, all at once, of the structure (i.e., the multiplicities of the elements), the local order (i.e., the number and sizes of runs) and the global order (i.e., the number and positions of existing pivots) in the input; and of the structure and order in the sequence of queries. Our main result is a synergistic deferred data structure which performs much better on large classes of instances, while always performing asymptotically as well as previous solu...
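For intuition, a minimal Python sketch (illustrative only, not the synergistic structure of the paper) of the lazy-quickselect idea behind such deferred data structures: each select query partially sorts the array, leaving pivots in their final position, so that later queries only work inside the gap of pivots that contains them.

import bisect, random

class LazySelect:
    # Earlier queries leave pivots in their final sorted position; later
    # queries only partition within the gap of pivots that contains them.
    def __init__(self, data):
        self.a = list(data)
        self.pivots = [-1, len(self.a)]    # sentinel positions considered "in place"

    def select(self, r):
        # element of rank r (0-based), as in the fully sorted array
        i = bisect.bisect_left(self.pivots, r)
        if self.pivots[i] == r:            # r is already a pivot, hence in place
            return self.a[r]
        lo, hi = self.pivots[i - 1] + 1, self.pivots[i] - 1
        while lo < hi:
            p = self._partition(lo, hi, random.randint(lo, hi))
            bisect.insort(self.pivots, p)  # remember the new pivot for later queries
            if p == r:
                break
            if r < p:
                hi = p - 1
            else:
                lo = p + 1
        return self.a[r]

    def _partition(self, lo, hi, k):
        # Lomuto partition around a[k]; returns the pivot's final position
        self.a[k], self.a[hi] = self.a[hi], self.a[k]
        store = lo
        for i in range(lo, hi):
            if self.a[i] < self.a[hi]:
                self.a[i], self.a[store] = self.a[store], self.a[i]
                store += 1
        self.a[store], self.a[hi] = self.a[hi], self.a[store]
        return store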
Karp et al. (1988) described Deferred Data Structures for Multisets as "lazy" data structures which partially sort data to support online rank and select queries, with the minimum amount of work in the worst case over instances of fixed size n and fixed number of queries q. Barbay et al. (2016) refined this approach to take advantage of the gaps between the positions hit by the queries (i.e., the structure in the queries). We develop new techniques in order to further refine this approach and take advantage, all at once, of the structure (i.e., the multiplicities of the elements), some notions of local order (i.e., the number and sizes of runs) and global order (i.e., the number and positions of existing pivots) in the input; and of the structure and order in the sequence of queries. Our main result is a synergistic deferred data structure which outperforms all solutions in the comparison model that take advantage of only a subset of these features. As intermediate results, we des...
From the 8th of July 2018 to the 13th of July 2018, a Dagstuhl Seminar took place with the topic “Synergies between Adaptive Analysis of Algorithms, Parameterized Complexity, Compressed Data Structures and Compressed Indices”. There, 40 participants from as many as 14 distinct countries and four distinct research areas, dealing with running time analysis and space usage analysis of algorithms and data structures, gathered to discuss results and techniques to “go beyond the worst-case” for classes of structurally restricted inputs, both for (fast) algorithms and (compressed) data structures. The seminar consisted of (1) a first session of personal introductions, each participant presenting their expertise and themes of interest in two slides; (2) a series of four technical talks; and (3) a larger series of presentations of open problems, with ample time left for the participants to gather and work on such open problems. Seminar July 8–13, 2018 – http://www.dagstuhl.de/18281 2012 ACM S...
We describe an algorithm computing an optimal prefix free code for n unsorted positive weights in time within O(n(1 + lg α)) ⊆ O(n lg n), where the alternation α ∈ [1..n−1] approximates the minimal amount of sorting required by the computation. This asymptotic complexity is within a constant factor of the optimal in the algebraic decision tree computational model, in the worst case over all instances of size n and alternation α. Such results refine the state of the art complexity of Θ(n lg n) in the worst case over instances of size n in the same computational model, a landmark in compression and coding since 1952. Besides the new analysis technique, such improvement is obtained by combining a new algorithm, inspired by van Leeuwen’s algorithm to compute optimal prefix free codes from sorted weights (known since 1976), with a relatively minor extension of Karp et al.’s deferred data structure to partially sort a multiset according to the queries performed on ...
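As background, van Leeuwen's construction from sorted weights can be sketched with two queues; the following minimal Python sketch (not the adaptive algorithm of the paper) returns only the total cost ∑ w_i · depth_i of an optimal prefix free code, for weights given in nondecreasing order:

from collections import deque

def optimal_code_cost_sorted(weights):
    # weights must be nondecreasing; merge the two smallest available trees
    # at each step, as in Huffman's algorithm, but the sorted input keeps
    # both queues sorted without needing a priority queue.
    leaves = deque(weights)
    internal = deque()
    def pop_min():
        if not internal or (leaves and leaves[0] <= internal[0]):
            return leaves.popleft()
        return internal.popleft()
    total = 0
    while len(leaves) + len(internal) > 1:
        a = pop_min()
        b = pop_min()
        internal.append(a + b)
        total += a + b
    return total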
We describe and analyze the first adaptive algorithm for merging k convex hulls in the plane. This merging algorithm in turn yields a synergistic algorithm to compute the convex hull of a set of planar points, taking advantage both of the positions of the points and their order in the input. This synergistic algorithm asymptotically outperforms all previous solutions for computing the convex hull in the plane.
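As a point of comparison, merging two upper hulls (each given as a list of points sorted by x) can be done non-adaptively with a single Graham-scan pass over the merged x-order; a minimal Python sketch of that baseline (not the adaptive merging algorithm of the paper):

def merge_upper_hulls(h1, h2):
    # h1, h2: upper hulls as lists of (x, y) points sorted by x
    def cross(o, a, b):
        # > 0 for a counter-clockwise turn o -> a -> b, < 0 for clockwise
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    pts = sorted(h1 + h2)   # could instead be merged in linear time, both inputs being sorted
    hull = []
    for p in pts:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()      # drop points that are not on the upper hull
        hull.append(p)
    return hull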
Divide-and-Conquer is a central paradigm for the design of algorithms, through which fundamental computational problems like sorting arrays and computing convex hulls are solved in optimal time within $\Theta(n\log{n})$ in the worst case over instances of size $n$. A finer analysis of those problems yields complexities within $O(n(1 + \mathcal{H}(n_1, \dots, n_k))) \subseteq O(n(1{+}\log{k})) \subseteq O(n\log{n})$ in the worst case over all instances of size $n$ composed of $k$ "easy" fragments of respective sizes $n_1, \dots, n_k$ summing to $n$, where the entropy function $\mathcal{H}(n_1, \dots, n_k) = \sum_{i=1}^k{\frac{n_i}{n}}\log{\frac{n}{n_i}}$ measures the difficulty of the instance. We consider whether such refined analysis can be applied to other solutions based on Divide-and-Conquer. We describe two optimal examples of such refinements, one for the computation of planar convex hulls adaptive to the decomposition of the input into simple chains, and one for the...
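For a sense of scale, the entropy bound can be evaluated directly; a small Python sketch (illustrative only) computing $\mathcal{H}(n_1, \dots, n_k)$ and the refined bound:

import math

def fragment_entropy(sizes):
    # H(n_1, ..., n_k) = sum_i (n_i / n) * log2(n / n_i)
    n = sum(sizes)
    return sum(ni / n * math.log2(n / ni) for ni in sizes)

# Example: an instance of size 1024 made of 4 easy fragments of 256 elements each
# has H = 2, so the refined bound n(1 + H) is about 3n, against n lg n = 10n.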
Abstract: Given a set $P$ of $n$ planar points, two axes and a real-valued score function $f()$ on subsets of $P$, the Optimal Planar Box problem consists in finding a box (i.e., axis-aligned rectangle) $H$ maximizing $f(H \cap P)$. We consider the case where $f()$ is monotone decomposable, i.e., there exists a composition function $g()$ monotone in its two arguments such that $f(A) = g(f(A_1), f(A_2))$ for every subset $A \subseteq P$ and every partition $\{A_1, A_2\}$ of $A$. In this context we propose a solution for the ...
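For intuition, a minimal Python illustration (not taken from the paper) of a monotone decomposable score function: the total weight of the selected points decomposes, for any partition, under the composition function g(x, y) = x + y, which is monotone in both arguments.

def f(points):
    # score of a subset: total weight of its points (x, y, w)
    return sum(w for (_, _, w) in points)

def g(x, y):
    # composition function, monotone in each argument
    return x + y

# For every subset A and every partition {A1, A2} of A: f(A) == g(f(A1), f(A2)).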
Karp et al. (1988) described Deferred Data Structures for Multisets as "lazy" data structures which partially sort data so as to support online rank and select queries, with the minimum amount of work in the worst case over instances of fixed size $n$ and fixed query number $q$ (i.e., the query size). Barbay et al. (2016) refined this approach to take advantage of the gaps between the positions hit by the queries (i.e., the structure in the queries). We further refine this approach to take advantage of the structure (i.e., the frequency of the elements), local order (i.e., the number and sizes of runs) and global order (i.e., the number of existing pivot positions and sizes of the gaps separating them) in the input, and of the structure and order in the sequence of queries. Our main result is a synergistic deferred data structure which is much better on some large classes of instances, while always asymptotically as good as previous solutions. As an interesting set of interme...
Abstract: We present a data structure that stores a sequence $s[1..n]$ over alphabet $[1..\sigma]$ in $n H_0(s) + o(n)(H_0(s) + 1)$ bits, where $H_0(s)$ is the zero-order entropy of $s$. This structure supports the queries access, rank and select, which are fundamental building blocks for many other compressed data structures, in worst-case time $O(\lg \lg \sigma)$ and average time $O(\lg H_0(s))$.
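For reference, the semantics of the three queries on a sequence s[1..n], as a naive linear-time Python sketch (the structure above answers them in compressed space):

def access(s, i):
    # the i-th symbol of s (1-based, as in s[1..n])
    return s[i - 1]

def rank(s, c, i):
    # number of occurrences of symbol c in s[1..i]
    return s[:i].count(c)

def select(s, c, j):
    # position of the j-th occurrence of symbol c in s, or None if there is none
    seen = 0
    for pos, x in enumerate(s, start=1):
        if x == c:
            seen += 1
            if seen == j:
                return pos
    return None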
Abstract: We describe a new technique to compute an optimal prefix-free code over $\sigma$ symbols from their frequencies $\{f_1, \ldots, f_\sigma\}$. This technique yields an algorithm running in linear time in the $\Omega(\lg \sigma)$-word RAM model when each frequency fits into $O(1)$ words, hence improving on the $O(\sigma \lg \lg \sigma)$ solution based on sorting in the word RAM model.
Abstract: Binary relations are an important abstraction arising in many data representation problems. The data structures proposed so far to represent them support just a few basic operations required to fit one particular application. We identify many of those operations arising in applications and generalize them into a wide set of desirable queries for a binary relation representation. We also identify reductions among those operations.
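To fix ideas, a naive Python sketch (illustrative only, not a compressed representation) of a binary relation between objects and labels, with a few of the queries such representations are expected to support:

class BinaryRelation:
    def __init__(self, pairs):
        # pairs: iterable of (object, label) pairs
        self.pairs = set(pairs)

    def labels_of(self, obj):
        # all labels related to a given object
        return {l for (o, l) in self.pairs if o == obj}

    def objects_of(self, label):
        # all objects related to a given label
        return {o for (o, l) in self.pairs if l == label}

    def related(self, obj, label):
        # membership test for a single pair
        return (obj, label) in self.pairs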
Abstract: We show that the complexities of the online and offline versions of the multiselect problem are at most within a constant factor of each other, in both internal and external memory, by describing and analyzing O(1)-competitive online algorithms in internal and external memory, and showing a new lower bound on the complexity of the problem in external memory.
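A minimal Python sketch of the offline version of multiselect (a plain recursive partitioning, not the O(1)-competitive online algorithms of the abstract): each partitioning step is shared by all pending ranks on each side.

import random

def multiselect(a, ranks):
    # returns the elements of the given 0-based ranks, as in the sorted array
    a = list(a)
    answers = {}

    def partition(lo, hi):
        # Lomuto partition around a random pivot; returns its final position
        k = random.randint(lo, hi)
        a[k], a[hi] = a[hi], a[k]
        store = lo
        for i in range(lo, hi):
            if a[i] < a[hi]:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]
        return store

    def solve(lo, hi, rs):
        if not rs:
            return
        if lo == hi:
            for r in rs:
                answers[r] = a[lo]
            return
        p = partition(lo, hi)
        if p in rs:
            answers[p] = a[p]
        solve(lo, p - 1, [r for r in rs if r < p])
        solve(p + 1, hi, [r for r in rs if r > p])

    solve(0, len(a) - 1, sorted(set(ranks)))
    return [answers[r] for r in ranks]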
Abstract: Given a set $P$ of $n$ planar points, two axes and a real-valued score function $f()$ on subsets of $P$, the Optimal Planar Box problem consists in finding a box (i.e., axis-aligned rectangle) $H$ maximizing $f(H \cap P)$. We consider the case where $f()$ is monotone decomposable, i.e., there exists a composition function $g()$ monotone in its two arguments such that $f(A) = g(f(A_1), f(A_2))$ for every subset $A \subseteq P$ and every partition $\{A_1, A_2\}$ of $A$.
