openproblems

# Themes to work on

SIMCO Lorentz Workshop 2-6 September, 2013. On the final day of the SIMCO workshop the following recommendation on the 10 most important themes of SIMCO for further development and research were resulting from the discussion.

1. Local optimality of sets in set-based optimization with respect to unary and n-ary set indicators. (Sébastien Verel et al)

2. Preference articulation on sets of solutions (including several different non-standard problems: a. Robustness b. Diversity c. Constraints d. Noise e. Interchangeability f. Diversity versus representedness) Performance indicators for non-standard problems (see also 5.)

3. Performance of stochastic optimizers of sets: statistics on sets, multiple runs.

4. Subset selection algorithms including approximation algorithms, exact algorithms, limit behavior and complexity bounds

5. Indicators for non-standard problems; computational efficiency , optimal sets. (See also 2. )

6. Multi-deme versus standard approach. Terminology: a. Population-level or Set-level versus b. Individual or Element-level :

i. Selection

ii. Evolution

iii. Variation

iv. Comparison

“competition between groups”: lift this biological phenomenon to set-based MOO. In mathematical community: set-level/element-level in bio-inspired computing population-level/individual-level.

7. Interaction of/with the user: tools (visualization in higher dimension), interaction schemes, workflows, pipelines, multicriteria decision support

9. Run-time analysis, performance guarantees

10. (Computational) geometry aspects (or relation to) of (to) MOO solution sets.

# Open Problems

This side features a list of questions and open problems

1. as a “nice to have” list, which can be
1. considered “if time allows”
2. given to (PhD) students
2. worked on, enlarges, shortened etc. at our workshop

#### Type of problems

There are different types/styles of problem descriptions that can be discussed in SIMCO and in the following we shall summarize a few:

##### Generic open problems

Generic open problems are formulated in a conscise way, but parts of the problem definition are kept as a generic variable. As an example one can see typical questions asked for performance indicators on Pareto fronts, such as their computational complexity, optimal $\mu$-distribution, etc.. In this example the class of Pareto fronts and the indicator may be seen as generic parts of the problem definition.

##### Fully instantiated problem formulations

A part of this working document is devoted to well-stated questions on mathematical objects such as algorithms, complexity bounds, optimal sets of points, etc.. An inspiration an open problem project in computational geometry:

Erik D. Demaine, Joseph S. B. Mitchell, and Joseph O'Rourke: The Open Problems Project, http://cs.smith.edu/~orourke/TOPP/

Problems formulated in this style should be formulated in a way that concrete mathematical results can be obtained that provide the answer for the problem or a subproblem. It will be essential for such an endeavour to draw the demarcation line between what is known, and what is unknown. Due to a large amount of literature and due to the fact that often results on salient issues are published as side-results in papers with a different general scope this is a challenging task. The workshop may help to come to an up-to-date picture, precise statements, and a more complete list of the state of the art and open problems.

## Short List - 10 Generic Questions on Unary Set Indicators

By unary set indicators we understand functions that assign a scalar value to a set of objective function vectors in $\mathbb{R}^d$.

1. optimal $\mu$-distribution: Given a unary set indicator (e.g. R2, hypervolume indicator, $\alpha$) and given a Pareto front. What are the geometrical properties of bounded-size sets that maximize this indicator?
2. computational complexity of indicator: What is the time and space complexity to compute an indicator for $k$ points in $d$ dimensions?
3. complexity of $k$-subset selection: What is the computational complexity of selecting the optimal $k$-size subset from a finite set of $n$ objective function vectors with respect to a given indicator? Are accurate approximation algorithms available?
4. incremental updates: What is the time and space complexity of incrementally updating the set indicator (adding and deleting points)?
5. set-gradient: What are the 1st order and 2nd order derivatives of indicators for a given set (interpreted as a $kd$ dimensional vector). How can it be computed? What are its properties?
6. $k$-replacement landscape: Given a unary set indicator and a class of Pareto fronts. Is it sufficient to iteratively replace only $k$ points for accessing the global optimum from any given point? What are the neighborhoods created by $k$ replacement strategies?
7. parameter sensitivity: How do parameters (dominance cone-angle, reference point/set, weights) influence the optimal $\mu$-distribution and computational complexity of the set-indicator?
8. dynamic parameters: Can dynamic updates of parameters such as reference points and weighting schemes be used to adapt to the problem or have undesirable effects, such as causing divergence or hypervolume regression?
9. relationship between indicators: Given a Pareto front and two set indicators, say $A$ and $B$. Does the optimization of set indicator $A$ lead to approximately optimal sets for indicator $B$?
10. extension to non-Pareto set orders: How can existing unary indicators for set orderings based on the Pareto order be generalized to Non-Pareto orders (e.g. cone-orders, interval orders, probabilistic orders)?

Fonseca, C. M., Knowles, J. D., Thiele, L., & Zitzler, E. (2005). A tutorial on the performance assessment of stochastic multiobjective optimizers. In Third International Conference on Evolutionary Multi-Criterion Optimization (EMO 2005) (Vol. 216).

#### Revision History

M. Emmerich (March 8, 2013)

## Statistical properties of set-indicators

This section discusses problems related to statistical and order-theoretical properties of performance indicators that measure the merit of an approximation set.

### Problem: optimal $\mu$-distribution

#### Problem description

Let $A_\mu^d \subset \mathbb{R}^d$ denote the set of finite point sets in $\mathbb{R}^d$ of cardinality $\mu$, let $F^* \subset \mathbb{R}^d$ denote a given Pareto front, and let $I$ be a given unary quality indicator (e.g. the hypervolume indicator with $r \in \mathbb{R}^d$ denoting its reference point). What are the sets $A \in A_\mu^d$ of $\mu$ points that maximize the indicator among all sets of size $\mu$? First described in [ABB+09] for the hypervolume indicator—calling the sets of maximal hypervolume indicator “optimal $\mu$-distributions”.

#### Relevance

This problem is relevant for understanding the bias of performance assessment metrics of multicriteria optimizers. It is also relevant for knowing the limiting distribution achieved by multicriteria archivers and optimization algorithms that work with a bounded size of archives/populations. Knowing the optimal $\mu$-distributions also allows to interpret the outcomes of bounded-population based multi-criteria optimization algorithms absolutely by comparing the achieved indicator value with the achievable one.

#### Results

Most of the known results relate to the hypervolume indicator. There are exact results on the dependency of the optimal distribution on the local curvature of a Pareto front given as a differentiable curve in 2-D. The optimal distribution of points is most dense at parts of the Pareto curve where the angle of the tangent with the $x$-axis is $-45^\circ$ (“knees”), it decreases to zero for angles close to $90^\circ$ and $0^\circ$. The precise formula that relates the density to the tangential angle is given in [ABB+09]. Other results relate the optimal distributions for the hypervolume indicator with the $\alpha$-approximation, see for example [FHN09]. Also first results for the R2 indicator are known [BWT12], including numerical ones.

#### Open Questions

The dependency of the density on the curvature of the Pareto front is widely unclear for $d \geq 3$.

#### Revision History

D. Brockhoff, March 8, 2013
M. Emmerich, 27 Feb 2013

#### Bibliography

[ABB+09]Anne Auger, Johannes Bader, Dimo Brockhoff, and Eckart Zitzler. 2009. Theory of the hypervolume indicator: optimal $\mu$-distributions and the choice of the reference point. FOGA '09. ACM, New York, NY, USA, 87-102. http://doi.acm.org/10.1145/1527125.1527138

[ABB10] Anne Auger, Johannes Bader, Dimo Brockhoff: Theoretically Investigating Optimal $\mu$-Distributions for the Hypervolume Indicator: First Results for Three Objectives. PPSN 2010:586-596

[BWT12] D. Brockhoff, T. Wagner, and H. Trautmann: On the Properties of the R2 Indicator. GECCO 2012, pages 465–472. ACM, 2012

[FHN09] T. Friedrich, C. Horoba, F. Neumann: Multiplicative Approximations and the Hypervolume Indicator. GECCO 2009, pages 571-578, ACM, 2009.

## Computational Geometry Problems

### Computation of the Attainment Surface

#### Problem Statement

Given a finite set $A$ in $\mathbb{R}^d$ the attainment surface is the boundary of the subspace where points are dominated by at least one solution in $A$ in $\mathbb{R}^d$. The problem is, how to compute the attainment surface efficiently for a given dimension $d$.

#### Relevance

This problem is relevant for the visualization of results in Pareto optimization.

#### Relation to Other Problems

Empirical Attainment Function (EAF).

#### Results

The empirical attainment function can be obtained by sorting the point set in 2-D. The best known algorithms for more than three dimensions are described in [FGL+11].

#### Open Questions

The computational complexity of the problem to compute all corner points and faces of the attainment function in more than two dimensions is unknown.

#### Entry Revision History

M. Emmerich, 27 Feb 2013

#### Bibliography

[FGL+11] Fonseca, C., Guerreiro, A., López-Ibánez, M., & Paquete, L. (2011). On the computation of the empirical attainment function. In Evolutionary Multi-Criterion Optimization (pp. 106-120). Springer Berlin/Heidelberg.

### Computation of the Empirical Attainment Function

Given $k$ Pareto non-dominated sets, the $x$% attainment surface is defined by the boundary of the set of points that are dominated by at least $x$% of the $k$ non-dominated sets. A point is dominated by a set, iff at least one point in the set dominates it.

The $x$% attainment surface was first defined in [GFF+01].

#### Relevance

The x% attainment surface is used in Performance assessment for averaging Pareto front approximations.

#### Relation to Other Problems

The 0% attainment surface for a single non-dominated set is the attainment surface of that set.

#### Results

The problem of computing the empirical attainment function is formalised, and upper and lower bounds on the corresponding number of output points are presented in [FGL11].

#### Open Questions

The complexity of the EAF in more than three dimensions remains unknown.

#### Entry Revision History

M. Emmerich, 27 Feb 2013

#### Bibliography

[GFF+01] Grunert da Fonseca, V., Fonseca, C., & Hall, A. (2001). Inferential performance assessment of stochastic optimisers and the attainment function. In Evolutionary Multi-Criterion Optimization (pp. 213-225). Springer Berlin/Heidelberg.

[FGL11] Fonseca, C., Guerreiro, A., López-Ibánez, M., & Paquete, L. (2011). On the computation of the empirical attainment function. In Evolutionary Multi-Criterion Optimization (pp. 106-120). Springer Berlin/Heidelberg.

### Computing Sizes of the Facets of the Dominated Hypervolume Polytope

Given a set $A$ of $n$ non-dominated points, let $DomSet(A)$ denote the subspace of points in $R^d$ that are dominated by at least one element in $A$ and $r$ denote a reference point. Then, in case of minimization, $P = DomSet(A) \cap [-\infty, r]$ is, in regular cases, an ortho-convex $d$-dimensional polytope, bounded by $O(d n)$ axis-parallel ($d-1$)-facets ( hyperplanes). The problem is to compute the size of each $d-1$-facet of this polytope.

#### Relevance

This is relevant for the computation of Gradient Components of the Hypervolume [ED13]: Given a set $A$ of $n$ non-dominated points, the change rate of the hypervolume indicator when moving the $i$-th point differentially in the $j$-th coordinate direction correspond to the facet adjacent to the $j$-th point, spanned by the dimensions $1, \dots, \check{i}, \dots, m$.

#### Relation to Other Problems

The Lebesgue measure of the considered ortho-convex polytope is the hypervolume indicator of $A$ for reference point $r$. The problem is closely related to the computation of hypervolume contributions. In fact, the visible boundary of the hypervolume dominated only by the $i$-th point consists of the facets adjascent to this point.

#### Results

Asymptotically optimal algorithm with complexity in $\Theta(n \log n)$ for computing the size of all facets in a single run are known for $d=2$ (reduction to sorting) and $d=3$. [ED14]

#### Open Problems

Efficient computation for $d > 3$.

#### Entry Revision History

M. Emmerich, A. Deutz 31 Aug 2013

#### Bibliography

Emmerich, M., & Deutz, A. (2014). Time Complexity and Zeros of the Hypervolume Indicator Gradient Field. In EVOLVE-A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation III (pp. 169-193). Springer International Publishing. (DOI10.1007/978-3-319-01460-9_8).

### Computation of the Hypervolume Indicator

#### Problem Statement

The hypervolume indicator is the Lebesgue measure of a union of a set of boxes in $\mathbb{R}^d$ that all share a common corner point. Without loss of generality, we choose the upper corner of all boxes as the common point, and it is called the reference point. In this case we have a collection of interval boxes $C = \{[l_i, u_i] \subset \mathbb{R}^d | i = 1, ..., n\}$ and the hypervolume indicator is given by

$$\mbox{HI}(C) = \mbox{Vol}(\bigcup_{[l,u] \in C} [l,u])$$

where Vol denotes the Lebesgue measure.

#### Relevance

The hypervolume indicator was proposed in [ZDT98] for performance assessment in evolutionary multicriterion optimization. The lower corners of the boxes correspond with points from a finite approximation set to the Pareto front. And the shared upper corner is a reference point that is chosen by the user.

A fast computation of the hypervolume indicator would speed up algorithms and performance assessment, especially for $n>3$.

#### Results

The computation of the hypervolume indicator in asymptotically optimal time $O(n \log n)$ for 2-D and 3-D is described in [BFL+09]. An alternative asymptotically optimal algorithm using divide and conquer is described in [Gue11]. The problem was shown to be a special case of Klee's measure problem (KMP) as pointed out in [BFL+09].

For more than three dimensions the algorithm in [Cha13] is the fastest algorithm with time complexity in $O(n^{d/3} \mbox{polylog} n)$.

Other algorithms with slightly worse complexity:

$d = 4: O(n^{1.5} log n)$ [YS12]

$d = 5: O(n^2 log n)$ [YS12]

$d = 6: O(n^{2.5} log n)$ [YS12]

$d > 6: O(n^{((d+2)/3))}$ [Bri12] by reduction to unit cube as described in [Bri13].

Box-Decomposition algorithms with a time complexity $O(n^{\lfloor\frac{d-1}{2}\rfloor+1})$ are proposed in [LKF15].

A lower bound for $n\geq 2$ of $\Omega(n \log n)$ was proven in [BFL+09].

In $d$ the problem scales exponentially as shown in.

Incremental updating of the hypervolume indicator has amortized time complexity in $\Theta(log n)$ in 2-D and amortized time complexity in $O(n)$ in 3-D. As pointed out in [HE13], this is achieved by using subroutines of dimension sweep algorithms for computing the hypervolume indicator in 3-D with time complexity in $O(n \log n)$ described in [BFL+09], and, respectively, the hypervolume indicator in 4-D with time complexity in $O(n^2)$ [GEF12].

An extensive webpage dedicated to results on the hypervolume indicator is given at: www.hypervolume.org

#### Open problems

For fixed $d$ the time complexity of computing the hypervolume indicator is still unknown for $n>3$ [Bri13]. Also algorithms with a provable small constant (hidden in the big Oh notation) would be desirable, even in low dimensions.

#### Entry Revision History

M. Emmerich, February 27, 2013 M. Emmerich, August 30, 2013 M. Emmerich, Oktober 1, 2014 (Added Chan's paper) M. Emmerich, February, 2015 (Added Approximation Algorithm by Guerreiro et al. 2015) M. Emmerich, 15 February, 2016 (Added Lacour et al. 2015)

#### Bibliography

[ZT98] Zitzler, E., & Thiele, L. (1998). Multiobjective optimization using evolutionary algorithms—A comparative case study. In Parallel Problem Solving from Nature—PPSN V (pp. 292-301). Springer Berlin/Heidelberg.

[BFL+09] Beume, N., Fonseca, C. M., López-Ibáñez, M., Paquete, L., & Vahrenhold, J. (2009). On the complexity of computing the hypervolume indicator. Evolutionary Computation, IEEE Transactions on, 13(5), 1075-1082.

[Cha13] Chan, T. M. (2013, October). Klee's measure problem made easy. In Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on (pp. 410-419). IEEE.

[Gue11] Guerreiro, Andreia Próspero. Efficient Algorithms for the Assessment of Stochastic Multiobjective Optimizers. Diss. Master’s thesis, IST, Technical University of Lisbon, Portugal, 2011.

[Beu09] Beume, Nicola, et al. “On the complexity of computing the hypervolume indicator.” Evolutionary Computation, IEEE Transactions on 13.5 (2009): 1075-1082.

[YS12] Yildiz, Hakan, and Subhash Suri. “On Klee's measure problem for grounded boxes.” Proceedings of the 2012 symposuim on Computational Geometry. ACM, 2012.

[Bri12] Bringmann, Karl: An improved algorithm for Klee’s measure problem on fat boxes. Computational Geometry: Theory and Applications, 45:225–233, 2012.

[Bri13] Bringmann, Karl. “Bringing Order to Special Cases of Klee's Measure Problem.” arXiv preprint arXiv:1301.7154 (2013).

[BF10] Bringmann, Karl, and Tobias Friedrich. “Approximating the volume of unions and intersections of high-dimensional geometric objects.” Computational Geometry 43.6 (2010): 601-610.

[HE13] Hupkens, Iris and Emmerich, Michael. (2013). Logarithmic-Time Updates in SMS-EMOA and Hypervolume-Based Archiving. In EVOLVE-A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation IV (pp. 155-169). Springer International Publishing.

[GFE12] Guerreiro, Andreia P., Fonseca, Carlos M., and Emmerich, Michael T. (2012). A Fast Dimension-Sweep Algorithm for the Hypervolume Indicator in Four Dimensions. In CCCG (pp. 77-82).

[LKF15] Lacour, R., Kamroth, K., & Fonseca, C. M. (2015). A Box Decomposition Algorithm to Compute the Hypervolume Indicator. arXiv preprint arXiv:1510.01963.

### Hypervolume-based subset selection problem

#### Problem Statement

Given a set $A \subset \mathbb{R}^d$ of $n$ non-dominated points in $\mathbb{R}^d$ and a constant $1 \leq k \leq n$, find a set $B \subset A$ of size $k$ that maximizes the hypervolume among all sets of size $k$.

The problem is discussed in [ABB+09] and [Bader09].

#### Relevance

This problem is relevant for finding a small subset (size $k$) from a database of $n$ solutions that maximizes the probability that the decision maker finds at least one solution in the subset acceptable [EDY15].

This problems is also relevant for archiving of non-dominated subsets, when the archive has bounded size. Fast subset selection makes powerful selection schemes in evolutionary multiobjective optimization possible. The same problem of finding a set of fixed size with the best indicator value among all subsets of this size occurs also for other indicators, see for example [VPP13] for a result on the $\epsilon$-indicator.

#### Results

An algorithm based on dynamic programming with time complexity $O(k n^2)$ for $d=2$ was proposed in [ABB+09]. Recently, improved algorithms for the case $d=2$ reduced the time complexity to $O(n(k + \log n))$ [BFK14][KFP+14]. In case $k=n-1$, the problem is equivalent to computing the minimal hypervolume contributor. Approximation algorithms for the $3$ objective case, which exploit submodularity of the hypervolume indicator, are provided in [GFP15]. In [RBB+16] it was announced that in general the problem is NP hard for $3$ and more dimensions. In [BCE17] a detailed proof is given and fixed parameter approximation algorithms.

#### Open problems

It is unknown, whether there are more efficient algorithms for $d=2$. For $d \geq 3$ it is unknown whether it is W[1] complete.

#### Entry Revision History

M. Emmerich, April 4, 2015 (remark on relevance)
M. Emmerich, March 2, 2013
D. Brockhoff, June 20, 2013 (update)
D. Brockhoff, June 19, 2014 (update)

#### Bibliography

[ABB+09] Auger, A., Bader, J., Brockhoff, D., & Zitzler, E. (2009, July). Investigating and exploiting the bias of the weighted hypervolume to articulate user preferences. In Genetic and Evolutionary Computation Conference (GECCO 2009) (pp. 563-570).

[Bader09] J. Bader. Hypervolume-Based Search For Multiobjective Optimization: Theory and Methods. PhD thesis, ETH Zurich, 2009

[BFK14] K. Bringmann, T. Friedrich, and P. Klitzke. Two-dimensional Subset Selection for Hypervolume and Epsilon-Indicator. In Proceedings of the ACM Genetic and Evolutionary Computation Conference (GECCO 2014). ACM press (to appear), 2014.

[KFP+14] Tobias Kuhn, Carlos M. Fonseca, Luís Paquete, Stefan Ruzika, José Rui Figueira. Hypervolume Subset Selection in Two Dimensions: Formulations and Algorithms. Preprint available at https://kluedo.ub.uni-kl.de/frontdoor/index/index/docId/3798, 2014.

[GFP15] Guerreiro, A. P., Fonseca, C. M., & Paquete, L. (2015, July). Greedy hypervolume subset selection in the three-objective case. In Proceedings of the 2015 on Genetic and Evolutionary Computation Conference (pp. 671-678). ACM.

[VPP13] D. Vaz, L. Paquete, A. Ponte. A note on the ε-indicator subset selection. Theoretical Computer Science, 2013. In Press.

[EDY15] Emmerich, M. T., Deutz, A. H., & Yevseyeva, I. (2015). A Bayesian Approach to Portfolio Selection in Multicriteria Group Decision Making. Procedia Computer Science, 64, 993-1000.

[RBB+16] Günter Rote, Kevin Buchin, Karl Bringmann, Sergio Cabello, and Michael Emmerich. Selecting K Points that Maximize the Convex Hull Volume. JCDCG3 2016; The 19th Japan Conference on Discrete and Computational Geometry, Graphs, and Games. September 2 - 4, 2016, Tokyo University of Science. Page 58-60; http://www.jcdcgg.u-tokai.ac.jp/JCDCG3_abstracts.pdf

[BCE17] Karl Bringmann, Sergio Cabello, Michael T. M. Emmerich: Maximum Volume Subset Selection for Anchored Boxes. Symposium on Computational Geometry 2017: 22:1-22:15 (http://drops.dagstuhl.de/opus/volltexte/2017/7201/)

### Computing Hypervolume Contributions

#### Problem Statement

Given a set $A$ of $n$ non-dominated points in $\mathbb{R}^d$. Compute for (all) $x \in A$ how much the hypervolume indicator decreases, when $x$ is removed from $A$.

#### Relevance

This problem is relevant for maintaining bounded size archives [KCF03] and for selection schemes in evolutionary multiobjective optimization [EBN05].

#### Results

The complexity of computing all hypervolume contributions in 2-D and 3-D is $O(n \log n)$ [EF11]. The lower complexity bound is $\Omega(n \log n)$. Moreover,in the four-dimensional case all contributions are computed in $O(n^2)$ time [GF17]. In four and more dimensions algorithms by [BF12] have polynomial computation time when $d$ is fixed, but the computation time increases exponentially in $d$. The computation of a single hypervolume contribution in dimension $d$ has at least the complexity of the computation of the hypervolume indicator in $d-1$. [EF11]. The asymptotically fastest known algorithms for $d \geq 5$ are stated in [BF12] [IHR07].

Updating all contributions in an archive of $n$ mutually non-dominated points – after adding or deleting a single point – can be achieved with an amortized time complexity in $\Theta(log n)$ per added/deleted point in the 2-D case [HE13]. Algorithms to update all hypervolume contributions under single-point changes for 3- and 4-objective problems are known with (amortized) runtimes linear and quadratically in $n$ respectively [GF17].

#### Open Problems

The time-complexity of computing hypervolume contributions in more than $3$ dimensions is unknown.

Computing the hypervolume contributions exactly is #P-hard, approximating them is NP-hard; the same holds for the minimal contribution [BF09]

#### Entry Revision history

M. Emmerich, October 24, 2017 D. Brockhoff, September 11, 2017
M. Emmerich, August 31, 2013
D. Brockhoff, March 8, 2013
M. Emmerich, March 2 2013

#### Bibliography

[BF09] K. Bringmann, T. Friedrich: Approximating the Least Hypervolume Contributor: NP-hard in General, but Fast in Practice. EMO 2009, pages 6-20, Springer, 2009.

[KCF03] Knowles, J. D., Corne, D. W., & Fleischer, M. (2003, December). Bounded archiving using the Lebesgue measure. In Evolutionary Computation, 2003. CEC'03. The 2003 Congress on (Vol. 4, pp. 2490-2497). IEEE.

[EF11] Emmerich, M., & Fonseca, C. (2011). Computing hypervolume contributions in low dimensions: asymptotically optimal algorithm and complexity results. In Evolutionary Multi-Criterion Optimization (pp. 121-135). Springer Berlin/Heidelberg.

[EBN05] Emmerich, M., Beume, N., & Naujoks, B. (2005). An EMO algorithm using the hypervolume measure as selection criterion. In Evolutionary Multi-Criterion Optimization (pp. 62-76). Springer Berlin/Heidelberg.

[IHR07] Igel, C., Hansen, N., & Roth, S. (2007). Covariance matrix adaptation for multi-objective optimization. Evolutionary Computation, 15(1), 1-28.

[BF12] Bringmann, K., & Friedrich, T. (2012). Approximating the least hypervolume contributor: NP-hard in general, but fast in practice. Theoretical Computer Science, 425, 104-116.

[HE13] Hupkens, Iris, & Emmerich, Michael. Logarithmic-Time Updates in SMS-EMOA and Hypervolume-Based Archiving. EVOLVE-A Bridge between Probability, Set Oriented Numerics, and Evolutionary Computation IV. Springer International Publishing, 155-169. 2013.

[GF17] Guerreiro, A. P. & Fonseca, C. M. (2017). Computing and Updating Hypervolume Contributions in Up to Four Dimensions. CISUC TECHNICAL REPORT TR-2017-001, Jan. 9, 2017

### Computing all ($m$-)nearest neighbors

Given a set of $n$ points in $R^d$, compute for each point the nearest neighbor of that point in a given Minkowski distance. The All-$m$-nearest neighbor problem is the problem of computing for each point the $m$ nearest neighbors.

#### Relevance

This problem is relevant for instance in the computation of diversity measures and diversity-based selection schemes in multiobjective optimization and diversity-oriented optimization and performance assessment. Moreover, acceleration of kernel based metamodelling methods.

#### Results

An algorithm with worst case time complexity $\Theta((cd)^d n \log n)$ for some positive constant $c$ and space complexity $O(n)$ in the algebraic decision tree model is provided in [Vai89]. It is asymptotically optimal in $n$. A fast randomized algorithm with average case time complexity $O( c^d n \log n)$ is provided in [Cla83]. The all $m$ nearest neighbor problem can be solved with an algorithm that has complexity $O(m n \log n)$ (here omitting the influence of $d$).

Incremental updates of the all nearest neighbors and diversity computation. Computation of the (averaged) Hausdorff distance between two finite sets.

#### Open Problems

The growth of the constant with $d$ is considerable.

#### Bibliography

[Vai89] Vaidya, P. M. (1989). AnO (n logn) algorithm for the all-nearest-neighbors problem. Discrete & Computational Geometry, 4(1), 101-115.

[Cla83]Clarkson, K. L. (1983, November). Fast algorithms for the all nearest neighbors problem. In FOCS (Vol. 83, pp. 226-232).

#### Revision History

Michael Emmerich, 31. August 2013

# Sorting and Searching

### Fast non-dominated sorting

#### Problem description

In non-dominated sorting a finite set $A \subset \mathbb{R}^d$ is partitioned into layers (disjoint subsets) of equal rank. The first ranked layer $R_1 \subseteq A$ is the non-dominated subset of $A$. The second layer is the non dominated subset of $A \setminus R_1$, the third layer is the non dominated subset of $A \setminus R_2$, and so forth. Non dominated sorting was described first by David Goldberg in [Gol89].

#### Relevance

Non-dominated sorting is used in various multiobjective optimization algorithms for establishing ranks among solutions. The most prominent example is the NSGA-II algorithm [DAP+00].

#### Results

An algorithm using iterated KLP algorithm requires computational effort of $O(n^2 \log{n})$ for $d=2$, and $O(n^2 \log^{d-2}{n})$ for $d>2$. Fast non-dominated sorting schemes with time complexity $O(d n^2)$ were proposed in [DAP+00]. Clearly a lower bound is given by $\Omega(n \log n)$, because a side result of non-dominated sorting is the non-dominated set for which this lower complexity bound is known [KLP75]. The fastest algorithm with proven worst case complexity is published here [BS14] and it is a variation of Jensen's algorithm [J203], which allowed no duplicate coordinate values. The time complexity is $O(n log^{d-1} n)$. For 3 objective functions a faster algorithm is known as layers of maxima algorithm [BG02] and has time complexity $O(n \log n \log \log n)$ and by Nekrich [Nek11] which has a worst case time complexity of $O(n (\log \log n)^3$. There is a number of algorithms with reportedly good average case complexity [HY13].

#### Open problems

It is unknown whether or not the known upper complexity bounds can be improved.

#### Entry Revision History

M. Emmerich, March 2, 2013; M. Emmerich, October 1, 2014 M. Emmerich, January 14, 2015 (thanks to Luís Paquete for bringing to my attention [BG02]), Emmerich March 2017.

#### Bibliography

[BS14] Buzdalov, M., & Shalyto, A. (2014). A Provably Asymptotically Fast Version of the Generalized Jensen Algorithm for Non-dominated Sorting. In Parallel Problem Solving from Nature–PPSN XIII (pp. 528-537). Springer International Publishing.

[DAP+00] Deb, K., Agrawal, S., Pratap, A., & Meyarivan, T. (2000). A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II. Lecture notes in computer science, 1917, 849-858.

[J203]Jensen, M.T.: Reducing the Run-time Complexity of Multiobjective EAs: The NSGA-II and Other Algorithms. Transactions on Evolutionary Computation 7(5), 503–515 (2003)

[Gol89] David E. Goldberg. 1989. Genetic Algorithms in Search, Optimization and Machine Learning (1st ed.). Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.

[KLP75]Kung, H. T., Luccio, F., & Preparata, F. P. (1975). On finding the maxima of a set of vectors. Journal of the ACM, 22(4), 469-476.

[MT05] Mostaghim, S., & Teich, J. (2005). Quad-trees: a data structure for storing Pareto sets in multiobjective evolutionary algorithms with elitism. Evolutionary Multiobjective Optimization, 81-104.

[BG02] A. L. Buchsbaum and M. T. Goodrich: Three-Dimensional Layers of Maxima, In: R. Möhring and R. Raman (Eds.): ESA 2002, LNCS 2461, pp. 257–269, 2002.

[HY13] H. Wang and X. Yao: Corner Sort for Pareto-Based Many-Objective Optimization. IEEE transactions on cybernetics 03/2013; 44(1). DOI: 10.1109/TCYB.2013.2247594

[Nek11] Yakov Nekrich. 2011. A Fast Algorithm for Three-Dimensional Layers of Maxima Problem. In Algorithms and Data Structures. Number 6844 in Lecture Notes in Computer Science. 607–618.

### Finding the Pareto non-dominated set of a finite set of vectors

#### Problem description

Find the non-dominated vectors with respect to the Pareto dominance for a finite set $A \subset \mathbb{R}^d$ of size $n$.

#### Relevance

There are countless examples in multiobjective optimization, database selection, as well as in computational geometry (e.g. visibility problems) where this problem occurs.

#### Results

The KLP algorithm requires computational effort of $\Theta(n \log{n})$ for $d=2,3$, and $O(n \log^{d-2}{n})$ for $d>3$. A general lower bound is $\Omega(n \log{n})$. An dimension sweet algorithm is used for $d=2,3$, and for $d>3$ a divide-and-conquer algorithm. All results above are shown in [KLP75].

#### Open problems

It is unknown whether or not the known upper complexity bounds for $n>3$ can be further improved.

The problem is also studied for special cases and allowing preprocessing in the context of so-called skyline queries.

#### Entry Revision History

M. Emmerich, January 14, 2015;

#### Bibliography

[KLP75] Kung, H. T., Luccio, F., & Preparata, F. P. (1975). On finding the maxima of a set of vectors. Journal of the ACM, 22(4), 469-476.

### Finding the set of minima of a partially ordered set

Given a finite partially ordered set of size $n$, what is the number of comparisons required to identify its minimal elements. A variation of this problem, which is called bounded width posets, is also discussed here. Width is the longest anti-chain in the set.

#### Relevance

This problem provides an upper bound for all partially ordered sets, such as the practically relevant cone orders and subset inclusion.

#### Results

In the general case the time complexity to solve this problem is $\Theta(n^2)$. The lower bound of this time complexity stems from the time complexity of deciding whether or not a set is an anti-chain. If the length of the longest anti-chain in the partially ordered set is $k$ the parameterized complexity is given by $\Theta(kn)$. All results can be found in [DKM+11].

#### Literature

[DKM+11] Daskalakis, C., Karp, R. M., Mossel, E., Riesenfeld, S. J., & Verbin, E. (2011). Sorting and selection in posets. SIAM Journal on Computing, 40(3), 597-622..

M. Emmerich, January 14, 2015;