-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
modules: The 'features' module #11307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
63ddf0d
to
a3fa20b
Compare
6f686ff
to
5257b40
Compare
Looks promising, thanks @seiko2plus! One question I have is what CPU features are of interest here aside from SIMD ones? From the discussion at #11033 I thought you were going to try and fit this into the existing For others, @seiko2plus posted this test program to try this out earlier (here): project('test', 'c')
feature = import('feature')
cpu_features = feature.cpu_features()
targets = [['AVX2_FMA3', [cpu_features['AVX2'], cpu_features['FMA3']]],
['AVX512_COMMON', cpu_features['AVX512_COMMON']],
['AVX512_SKX', cpu_features['AVX512_SKX']]]
foreach target : targets
t = feature.test(target[1])
if t[0]
message('target name:', target[0])
data = t[1]
message('Implicit features:', data['implicit'])
message('Compiler arguments:', data['args'])
message('Headers:', data['headers'])
message('Need to detect on runtime:', data['detect'])
message('Supported #definitions:', data['defines'])
message('Unsupported #definitions:', data['undefines'])
message('Compiler support this target by ', data['support'])
message('========================================')
endif
endforeach @seiko2plus this diff fixes the minor issue I had with feature detection here: --- a/mesonbuild/modules/feature/x86_features.py
+++ b/mesonbuild/modules/feature/x86_features.py
@@ -42,7 +42,7 @@ if T.TYPE_CHECKING:
def test_code_main(code: str) -> str:
return dedent(f'''\
- int main(int, char **argv)
+ int main(int unused, char **argv)
{{
char *src = argv[1];
{dedent(code)} |
The current meson SIMD module lacks flexibility and modifies it to support our needs in a way to provide static and dynamic dispatching (baseline/dispatch) that fits "any project needs" will require more effort than I expected. So I just split the work into two modules rather than a single module covering all the operations, the first module is the module 'feature' which provides what is necessary to be coded by python, and the second module 'compiler_opt' will be left for future step basically it will require to convert the following two meson scripts (not completed yet) into python module:
We still can modify the current meson SIMD module to allow only the use of the CPU features in this module without any extra modifications but we're not going to use it in NumPy, for many considerations are hard to explain in one comment.
Do you mean aside from SIMD features? Any CPU features in general, but main concern features that will be required for optimizing certain operations. For example, POPCNT and LZCNT. But if you mean aside from the current meson SIMD module? Flexibility is the main reason, imagine if a project wants to modify or add a new feature or a new architecture. What if there is a necessity to make fast or custom modifications to an existing feature? some examples: So assuming adding feature X implies feature
|
I think that'd be a vendorable package containing only
I meant SIMD features in general. Which is still the main use case here I think. But yes,
Nice examples you have given here already, thanks. |
It certainly would not be the only meson-using project that used a subproject containing sources which are intended strictly for use as a vendorable package, together with the |
qggq |
f5bd2ac
to
bed3012
Compare
650e7e8
to
74da606
Compare
@@ -50,6 +50,7 @@ | |||
from unittests.subprojectscommandtests import SubprojectsCommandTests | |||
from unittests.windowstests import WindowsTests | |||
from unittests.platformagnostictests import PlatformAgnosticTests | |||
from unittests.featurestests import FeaturesTests |
Check notice
Code scanning / CodeQL
Unused import
a0bfd1a
to
29f1d69
Compare
29f1d69
to
44dc7f1
Compare
44dc7f1
to
1f8351f
Compare
@eli-schwartz, Hi Eli, I am curious if there a method to directly generate documentation for the supported methods from the Python scripts themselves. |
This vendors the current state of mesonbuild/meson#11307, which is the "feature module" for the multi-target build support that NumPy needs to build its SIMD support (by Sayed Adel). That PR is, as vendored, based on top of the state of the Meson master branch in commit 9d88d0d5c, which is a dev commit in between the 1.2.0 and 1.2.1 tags from 18 July 2023. No modifications have been made to any of the vendored files.
I don't know that it's import, I can always rename the module in 11885 if we ever have agreement on adding that module, it could be named |
I'd be in favor of renaming it |
@@ -0,0 +1 @@ | |||
../init_features |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
symlinks will be problematic on Windows platform.
Seems like this is pretty close to getting merged. Would be great to get in for 1.6. |
This module aims to provide a more dynamic way to control common features between compilers that would generally require special headers, arguments, or compile-time test cases, quite ideal to manage CPU features and solve compatibility issues.
and brings X86 CPU features
- Removes CPU features implmentation and move them NumPy meson scripts - Removes attr 'header', since it can be handeled within a configuration file - re-implment method test, make it more simple - Implment method `multi_targets` Handels multi_targets via meson script was pretty anoyning from python make it much simpler
- Improve multi_targets() method: * rename multi_target() to multi_targets() use plural to indicates handling more than one CPU target * Split the function into multiple smaller functions for better organization. * Modify the function to return a single library array instead of a list of libraries. * Avoid returning the result of testing the enabled features to keep it * simple. * Enhance debugging capabilities by printing the test results of enabled targets. * Avoid returns static_lib instead wraps it with another object to reduce the number of final generated objects also to sort the all objects based on lowest interest * validate the sort of dispatch features - fix mypy - init the module doc - allow disables the cache - fix method `features.new` name during trigger arguments errors - generate config files inside the builddir instead - add test cases - cache tests within coredata - fix duplicate baseline build - refactor module name from `feature` to `features` - fix python typying
- Attributes without match were never actually added to the list - Only the last conflict actually had an effect, earlier results were discarded
82d41db
to
413ccdb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't done a full review, but I have skimmed this and have some feedback.
Does numpy feel like this is baked enough to merge?
Dealing with a numerous of CPU features within C and C++ compilers poses intricate challenges, | ||
particularly when endeavoring to support an extensive set of CPU features across diverse | ||
architectures and multiple compilers. Furthermore, the conundrum of accommodating | ||
both fundamental features and supplementary features dispatched at runtime | ||
further complicates matters. | ||
|
||
In addition, the task of streamlining generic interface implementations often leads to the necessity | ||
of intricate solutions, which can obscure code readability and hinder debugging and maintenance. | ||
Nested namespaces, recursive sources, complex precompiled macros, or intricate meta templates | ||
are commonly employed but can result in convoluted code structures. | ||
|
||
Enter the proposed module, which offers a simple pragmatic multi-target solution to | ||
facilitate the management of CPU features with heightened ease and reliability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section reads more like an MR proposal, it feels like it's trying to justify the existence of the module. I don't know if that makes sense, and I'm not really sure how to improve it
from typing import ( | ||
Dict, Set, Tuple, List, Callable, Optional, | ||
Union, Any, Iterable, cast, TYPE_CHECKING | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't do this anywhere else in Meson, we do import typing as T
and then use the T
namespace. For consistency I'd prefer if we kept with that.
convertor = lambda values: self.convert( | ||
func_name, opt_name, values | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a lambda here feels wierd, could we just store the attributes against in the calss and then set convertor = self.convert
as a bound class method?
@typed_pos_args('features.new', str, int) | ||
@typed_kwargs('features.new', | ||
KwargInfo( | ||
'implies', | ||
(FeatureObject, ContainerTypeInfo(list, FeatureObject)), | ||
default=[], listify=True | ||
), | ||
KwargInfo( | ||
'group', (str, ContainerTypeInfo(list, str)), | ||
default=[], listify=True | ||
), | ||
KwargConfilctAttr('features.new', 'detect', default=[]), | ||
KwargConfilctAttr('features.new', 'args', default=[]), | ||
KwargInfo('test_code', (str, File), default=''), | ||
KwargInfo( | ||
'extra_tests', (ContainerTypeInfo(dict, (str, File))), | ||
default={} | ||
), | ||
KwargInfo('disable', (str), default=''), | ||
) | ||
def init_attrs(state: 'ModuleState', | ||
args: Tuple[str, int], | ||
kwargs: 'FeatureKwArgs') -> None: | ||
self.name = args[0] | ||
self.interest = args[1] | ||
self.implies = set(kwargs['implies']) | ||
self.group = kwargs['group'] | ||
self.detect = kwargs['detect'] | ||
self.args = kwargs['args'] | ||
self.test_code = kwargs['test_code'] | ||
self.extra_tests = kwargs['extra_tests'] | ||
self.disable: str = kwargs['disable'] | ||
if not self.detect: | ||
if self.group: | ||
self.detect = [ConflictAttr(val=f) for f in self.group] | ||
else: | ||
self.detect = [ConflictAttr(val=self.name)] | ||
|
||
init_attrs(state, args, kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this being done here instead of in the caller?
# Copyright (c) 2023, NumPy Developers. | ||
# All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've moved to using SPDX identifiers for the license, so we should have that there.
Also, this All rights reserved
seems spurious, since to land in Meson it will need to be under the Apache-2.0 or a compatible permissive license like the MIT or BSD-2-clause
# Copyright (c) 2023, NumPy Developers. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see future.__annotations__
used as well, so we don't have to quote all of the argument types.
match: Union[re.Pattern, None] = field( | ||
default=None, hash=False, compare=False | ||
) | ||
mfilter: Union[re.Pattern, None] = field( | ||
default=None, hash=False, compare=False | ||
) | ||
mjoin: str = field(default='', hash=False, compare=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we fit each assignment on one line, w won't exceed 88 chars, so we should do that.
ret[attr] = val | ||
return ret | ||
|
||
class KwargConfilctAttr(KwargInfo): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class KwargConfilctAttr(KwargInfo): | |
class KwargConflictAttr(KwargInfo): |
(with suitable renames later in the file)
raise MesonException( | ||
f'{func_name}: unknown keys {unknown_keys} in ' | ||
f'option {opt_name}' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My feeling is that for ~100 characters it is still acceptable to be on one line when dealing with effectively just very long strings. Failing that, I'd still prefer to see:
msg = 'long message here'
raise MesonException(msg)
if TYPE_CHECKING: | ||
IMPLIED_ATTR = Union[ | ||
None, str, Dict[str, str], List[ | ||
Union[str, Dict[str, str]] | ||
] | ||
] | ||
class FeatureKwArgs(TypedDict): | ||
#implies: Optional[List['FeatureObject']] | ||
implies: NotRequired[List[Any]] | ||
group: NotRequired[List[str]] | ||
detect: NotRequired[List[ConflictAttr]] | ||
args: NotRequired[List[ConflictAttr]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would stick the typing logic up at the top, still.
ConflictAttr could be quoted in an if T.TYPE_CHECKING:
block to cause it to parse as a forward reference, it doesn't need to be defined before.
'extra_tests', ( | ||
NoneType, ContainerTypeInfo(dict, (str, File))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'extra_tests', ( | |
NoneType, ContainerTypeInfo(dict, (str, File))) | |
'extra_tests', | |
(NoneType, ContainerTypeInfo(dict, (str, File)))), |
These are two unrelated type arguments to KwargInfo.
def test_code(state: 'ModuleState', compiler: 'Compiler', | ||
args: List[str], code: 'Union[str, File]' | ||
) -> Tuple[bool, bool, str]: | ||
# TODO: Add option to treat warnings as errors | ||
with compiler.cached_compile( | ||
code, state.environment.coredata, extra_args=args | ||
) as p: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same way that ) -> Tuple[bool, bool, str]:
is aligned with the function open parens, and not dedented to the beginning of the file, we should also have with xxx():
either on the same line, or have it break across lines with an 8-space indent and no dedenting.
str, | ||
( | ||
str | File | CustomTarget | CustomTargetIndex | | ||
GeneratedList | StructuredSources | ExtractedObjects | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are StructuredSources necessary here?
They are only permitted in the first place, in:
- Java jars
- rust targets (cannot be polyglot languages in the same target, you have to have a pure rust target that can be linked with other languages as a static/shared library)
So supporting it here feels premature.
str, File, build.CustomTarget, build.CustomTargetIndex, | ||
build.GeneratedList, build.StructuredSources, build.ExtractedObjects, | ||
build.BuildTarget |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from ...interpreter.type_checking import SourcesVarargsType
gives you what you want here
@typed_pos_args('features.multi_targets', str, min_varargs=1, varargs=( | ||
str, File, build.CustomTarget, build.CustomTargetIndex, | ||
build.GeneratedList, build.StructuredSources, build.ExtractedObjects, | ||
build.BuildTarget | ||
)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here:
from ...interpreter.type_checking import SOURCES_VARARGS
target_name = test_result['target_name'] | ||
if target_name in enabled_targets_names: | ||
skipped_targets.append(( | ||
d, f'Dublicate target name "{target_name}"' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d, f'Dublicate target name "{target_name}"' | |
d, f'Duplicate target name "{target_name}"' |
Hey everyone! Let me share what's going on with this module. As you know, this was mainly built for NumPy during our switch to Meson. Back in early 2023, Jan Wassenberg (@jan-wassenberg) from Google and Christopher Sidebottom (@Mousius) from ARM suggested we use Highway instead of continuing with NumPy's own universal intrinsics, especially since NumPy had already started supporting C++. Highway isn't just generic SIMD intrinsics - it also comes with a CPU Dispatcher that leverages C++ namespaces, metaprogramming, preprocessing, and CPU target attributes. It works on both modern Clang and GCC without any build system dependencies. The approaches reflect different priorities:
Right now, NumPy is moving to Highway as a library interface, and we've put NumPy's universal intrinsics on the shelf. We're temporarily using Meson CPU Dispatcher until we figure out whether to:
If we end up sticking with Meson CPU Dispatcher, I think we should move all that Meson code from numpy/meson_cpu into this module (runtime library included) so other projects can benefit too. Looking at this code again after almost two years, it definitely needs some usability improvements. When I originally wrote it, I was just focused on getting SIMD optimization into NumPy as quickly as possible. To be honest, I don't want to invest more time in this module until we make a final decision, so it probably won't be merged anytime soon. For now, I'm just focusing on fixing bugs and keeping things running through NumPy's vendored Meson version. Sorry for the delayed response to the pull request. I've had some tough personal stuff going on. |
@seiko2plus Given that status, would you be willing to set this module to a draft until you've decided what you want to do long term? |
@dcbaker, Sure, no problems. Thank you. |
This module aims to provide a more dynamic way to control common
features between compilers that would generally require special
headers, arguments, or compile-time test cases, quite ideal
to manage CPU features and solve compatibility issues.
TODO:
[ ] add x86 features[ ] add Arm features[ ] add PPC64 features[ ] add IBMZ features