Refactor the fix for MyPy errors due to NumPy v2.2.0 #5853
Conversation
```diff
@@ -103,7 +103,7 @@ def __call__(self, study: Study, population: list[FrozenTrial]) -> list[FrozenTrial]:

 def _generate_default_reference_point(
     n_objectives: int, dividing_parameter: int = 3
-) -> np.ndarray[tuple[int, int], np.dtype[np.float64]]:
```
Technically speaking, this typing is stricter and correct, but I removed it for consistency with the rest of the codebase.
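For reference, a minimal sketch of the trade-off (the function names and bodies here are hypothetical; only the annotations mirror the diff, and shape-typed `np.ndarray` subscription assumes the NumPy 2.x stubs):

```python
import numpy as np

# Stricter: pins the return to a 2-D float64 array via NumPy's shape typing.
def reference_points_strict(n: int) -> np.ndarray[tuple[int, int], np.dtype[np.float64]]:
    return np.zeros((n, n), dtype=np.float64)  # hypothetical body

# Looser, but consistent with annotations elsewhere in the codebase.
def reference_points_loose(n: int) -> np.ndarray:
    return np.zeros((n, n), dtype=np.float64)
```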
```diff
@@ -329,14 +329,15 @@ def _warn_and_convert_inf(

 def _get_constraint_vals_and_feasibility(
     study: Study, trials: list[FrozenTrial]
-) -> tuple[np.ndarray, np.bool | np.ndarray]:
```
The second return value is definitely not a scalar `np.bool`, so I removed this annotation to avoid any confusion.
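To illustrate the point, here is a minimal sketch (the function and its body are hypothetical, not the Optuna implementation): feasibility is computed per trial, so the second element is always an array of booleans, never a scalar.

```python
import numpy as np

def constraints_and_feasibility_sketch(
    constraint_vals: np.ndarray,  # shape (n_trials, n_constraints)
) -> tuple[np.ndarray, np.ndarray]:
    # One boolean per trial: feasible iff every constraint value is <= 0.
    is_feasible = np.all(constraint_vals <= 0, axis=1)
    return constraint_vals, is_feasible

vals, feasible = constraints_and_feasibility_sketch(np.array([[-1.0, 0.0], [0.5, -2.0]]))
print(feasible)  # [ True False]
```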
```python
contribs = np.prod(diff_of_loss_vals_and_ref_point, axis=-1)
selected_indices = np.zeros(subset_size, dtype=int)
selected_vecs = np.empty((subset_size, n_objectives))
indices: np.ndarray[tuple[int, ...], np.dtype[Any]] = np.arange(
    rank_i_loss_vals.shape[0], dtype=int
)
```
This typing is also not correct: `indices` is always a 1-D array, but the annotation says it can be an N-D array, so I removed it.
```diff
@@ -208,7 +208,8 @@ def local_search_mixed(
     # TODO(kAIto47802): Think of a better way to handle this.
     lengthscales = 1 / np.sqrt(inverse_squared_lengthscales[continuous_indices])

-    discrete_indices = np.where(steps > 0)[0]
+    # NOTE(nabenabe): MyPy Redefinition for NumPy v2.2.0. (Cast signed int to int)
+    discrete_indices = np.where(steps > 0)[0].astype(int)
```
By casting from NumPy's signed integer type to int, we can avoid the issue, although this loosens the typing.
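A small runnable illustration of the cast (the values are made up):

```python
import numpy as np

steps = np.array([0.0, 0.5, 0.0, 2.0])
discrete_indices = np.where(steps > 0)[0]        # dtype np.intp, a signed integer
# Re-assigning through .astype(int) keeps the runtime values identical while
# giving mypy a dtype it accepts across the redefinition.
discrete_indices = discrete_indices.astype(int)
print(discrete_indices)  # [1 3]
```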
optuna/study/_multi_objective.py (outdated)
```python
nondominated_indices: np.ndarray[tuple[int, ...], np.dtype[np.signedinteger[Any]]] = np.arange(
    n_trials
)
```
optuna/study/_multi_objective.py (outdated)
```python
# NOTE(nabenabe0928): Ignore typing for a temporal solution to NumPy v2.2.0 weird typing.
nondominated_indices = nondominated_indices[
    nondominated_and_not_top
]  # type: ignore[assignment]
```
I could not find any good way to tell NumPy that the result of this indexing is going to be a 1-D integer array, so I ignore the typing here for now.
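Concretely, the runtime behavior the stubs could not express (the mask and values here are made up):

```python
import numpy as np

nondominated_indices = np.arange(8)
nondominated_and_not_top = nondominated_indices % 2 == 0   # boolean mask

# Boolean-mask indexing of a 1-D array always yields a 1-D array at runtime,
# but the NumPy 2.2 stubs widen the result, hence the type: ignore above.
nondominated_indices = nondominated_indices[nondominated_and_not_top]
assert nondominated_indices.ndim == 1
print(nondominated_indices)  # [0 2 4 6]
```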
I found this option, so I replaced the code.
@porink0424 @y0z Could you review this PR?
I left a few comments, PTAL.
optuna/_hypervolume/hssp.py (outdated)
```diff
 (n_solutions, n_objectives) = rank_i_loss_vals.shape
+assert isinstance(n_solutions, int), "MyPy Redefinition for NumPy v2.2.0."
 contribs = np.prod(diff_of_loss_vals_and_ref_point, axis=-1)
 selected_indices = np.zeros(subset_size, dtype=int)
 selected_vecs = np.empty((subset_size, n_objectives))
-indices: np.ndarray[tuple[int, ...], np.dtype[Any]] = np.arange(
-    rank_i_loss_vals.shape[0], dtype=int
-)
+indices = np.arange(n_solutions)
 for k in range(subset_size):
     max_index = int(np.argmax(contribs))
     selected_indices[k] = indices[max_index]
     selected_vecs[k] = rank_i_loss_vals[max_index].copy()
-    keep = np.ones(contribs.size, dtype=bool)
+    assert n_solutions - k > 0
+    keep = np.ones(n_solutions - k, dtype=bool)
```
The implementation code itself has changed, and I'm concerned it might result in different behavior.
This seems to go beyond the scope of the PR, which is meant to fix static analysis issues with mypy, so would it be possible to adjust it to stay within the scope?
Done!
optuna/study/_multi_objective.py (outdated)
```python
# TODO(nabenabe): Replace with the following once Python 3.8 is dropped.
# nondominated_indices: np.ndarray[tuple[int], np.dtype[np.signedinteger]] = ...
nondominated_indices = np.arange(n_trials)
```
The original type `np.ndarray[tuple[int, ...], np.dtype[np.signedinteger[Any]]]` is not incorrect, just loosely typed, so I think leaving a TODO comment is better than using a `type: ignore` statement.
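As far as I can tell (my reading, not stated in the thread), the TODO has to wait because builtin generics such as `tuple[int]` only became subscriptable with PEP 585 in Python 3.9, so mypy targeting Python 3.8 rejects the annotation. The post-3.8 form would presumably be:

```python
import numpy as np

# Presumed final form once Python 3.8 support is dropped; until then the
# variable is left unannotated, as in the diff above.
nondominated_indices: np.ndarray[tuple[int], np.dtype[np.signedinteger]] = np.arange(10)
```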
Done!
LGTM
It seems that the refactor goes beyond addressing the errors reported by mypy.
optuna/_hypervolume/hssp.py (outdated)
```python
(n_solutions, n_objectives) = rank_i_loss_vals.shape
n_solutions = int(n_solutions)  # MyPy Redefinition for NumPy v2.2.0.
```
As I mentioned in the previous review, I still have some concerns due to my limited understanding of the logic:
- Is it safe to remove the assertion comparing `subset_size` and `rank_i_indices.size`?
- Are `rank_i_loss_vals.shape[1]` and `reference_point.size` guaranteed to be the same value?
> Is it safe to remove the assertion comparing `subset_size` and `rank_i_indices.size`?

Yes, it was, because we had `assert n_solutions - k > 0` in the for loop; but as we have now removed that line as well, I reverted the change.
The original assertion was `assert subset_size < n_solutions`. What we checked instead was `k < n_solutions`, which over the loop amounts to checking `subset_size - 1 < n_solutions`. The only difference is that we did not check `subset_size < n_solutions` itself, but because of the following lines:

optuna/_hypervolume/hssp.py, lines 90 to 91 in 298653e:
```python
if rank_i_indices.size == subset_size:
    return rank_i_indices
```

we were already checking `subset_size != n_solutions` together with `subset_size - 1 < n_solutions`, which gives `subset_size <= n_solutions and subset_size != n_solutions`, i.e. `subset_size < n_solutions`.
This covers the original assertion completely.
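Put as a condensed skeleton (the structure paraphrases the thread; everything except the two checks is elided):

```python
import numpy as np

def solve_hssp_skeleton(rank_i_indices: np.ndarray, subset_size: int) -> np.ndarray:
    # Early return rules out subset_size == n_solutions.
    if rank_i_indices.size == subset_size:
        return rank_i_indices

    n_solutions = rank_i_indices.size
    for k in range(subset_size):
        # Strongest at k == subset_size - 1:
        # n_solutions - (subset_size - 1) > 0, i.e. subset_size <= n_solutions.
        assert n_solutions - k > 0
        ...  # selection step elided
    # Combined with the early return, subset_size < n_solutions holds, which
    # is exactly the removed original assertion.
    return rank_i_indices[:subset_size]
```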
> Are `rank_i_loss_vals.shape[1]` and `reference_point.size` guaranteed to be the same value?

Yes, it is guaranteed. First of all, at the code level it is guaranteed because `_solve_hssp` is only called here:

optuna/samplers/_tpe/sampler.py, lines 682 to 683 in 298653e:
```python
worst_point = np.max(rank_i_lvals, axis=0)
reference_point = np.maximum(1.1 * worst_point, 0.9 * worst_point)
```

But most importantly, `_solve_hssp` is, roughly speaking, a sorting algorithm for an array with a shape of `(n_solutions, n_objectives)` given a reference point with the shape of `(n_objectives,)`.
optuna/_hypervolume/hssp.py (outdated)
```python
assert n_solutions - k > 0
keep = np.ones(n_solutions - k, dtype=bool)
```
Is it possible to use `contribs.size` as in the original logic?
I tweaked the code and made it work. Interestingly, the NumPy shape typing is now very sensitive to the cast information: for example, if we replace `indices = np.arange(n_solutions)` with `indices = np.arange(int(n_solutions))`, `contribs.size` does not work anymore. Anyway, your concern is resolved.
LGTM
Thank you for the detailed explanation and prompt fix.
LGTM 👍
Motivation
This PR fixes the misleading typing that was introduced as a hotfix to keep the CI checks passing.
Description of the changes
Note: these changes can probably be removed once the following upstream issue is resolved: numpy/numpy#27957