Support for `uint16`, `uint32`, and `uint64` · Issue #58734 · pytorch/pytorch · GitHub

Support for uint16, uint32, and uint64 #58734

Open
Tracked by #58743
pmeier opened this issue May 21, 2021 · 36 comments
Labels
ezyang's list (Stuff ezyang doesn't want to lose) · feature (A request for a proper, new feature.) · module: python array api (Issues related to the Python Array API) · oncall: pt2 · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@pmeier
Collaborator
pmeier commented May 21, 2021

The array API specification stipulates the data types that we need to support to be compliant. Currently we are missing support for uint16, uint32, and uint64.
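For context, a minimal sketch (not tied to any particular torch version) that probes which of these dtypes a given build actually exposes:

import torch

# Integer dtypes the array API specification requires.
required = ["int8", "int16", "int32", "int64",
            "uint8", "uint16", "uint32", "uint64"]

for name in required:
    # Older builds only expose uint8 among the unsigned types.
    print(name, "available" if hasattr(torch, name) else "missing")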

cc @mruberry @rgommers @asmeurer @leofang @AnirudhDagar @asi1024 @emcastillo @kmaehashi @ezyang @msaroufim @wconstab @bdhirsh @anijain2305 @zou3519 @gchanan @soumith @ngimel

@pmeier pmeier added the module: python array api Issues related to the Python Array API label May 21, 2021
@H-Huang H-Huang added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label May 22, 2021
@rgommers
Collaborator

Someone just asked about other uint types on Slack. And from pytorch/vision#4326 (comment):

This is because PyTorch doesn't (yet) support uint16, and is also a problem when reading PNG images of type uint16.

My guess had been that 16-bit image support makes uint16 the most interesting of the missing dtypes.

There are no plans to work on this issue currently I think, unless more demand materializes.

@NicolasHug
Member
NicolasHug commented Nov 13, 2021

There are no plans to work on this issue currently I think, unless more demand materializes.

I'll add one :)

Another tangible need for uint16 support in torchvision is pytorch/vision#4731

We added support for native 16-bit PNG decoding in torchvision, but we can't make this API public for now, because we output int32 tensors and this wouldn't be compatible with the rest of our transforms.
It'd be great if we could make it public, because Pillow's 16-bit PNG support is fairly limited.
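In the meantime, a hedged workaround sketch (PIL and NumPy assumed; the file name is hypothetical) is to decode outside torch and widen to a signed dtype it already supports:

import numpy as np
import torch
from PIL import Image

# 16-bit grayscale PNGs typically load as mode "I;16" or "I" in PIL.
arr = np.asarray(Image.open("depth_map.png"))

# Widen to int32 so the uint16 values fit losslessly, then wrap as a tensor.
tensor = torch.from_numpy(arr.astype(np.int32))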

@kernelmethod
kernelmethod commented Feb 20, 2022

Bumping this.

My research collaborators and I are working on some cryptographic applications where we could really use uint32 / uint64. Some operations on Z_{2^64} that we'd like to calculate with secure multi-party computation, e.g. comparison, would be a lot more straightforward to implement with unsigned integers as the underlying dtype.
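To make the comparison point concrete, a NumPy-only sketch (not the MPC protocol itself): arithmetic mod 2^64 produces the same bit patterns for signed and unsigned dtypes, but ordering comparisons diverge once values exceed 2^63.

import numpy as np

a = np.array([2**63 + 5, 10], dtype=np.uint64)   # elements of Z_{2^64}

print(a[0] > a[1])            # True: the unsigned comparison we actually want
signed = a.view(np.int64)     # same bits, reinterpreted as int64
print(signed[0] > signed[1])  # False: 2**63 + 5 now reads as a negative number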

@a-gn
a-gn commented Jan 18, 2023

We have a GIS pipeline that uses image transforms from multiple projects, some of which only support unsigned integer types, but uint8 loses too much color information. We could really use uint16 support.

@neelnanda-io

I would appreciate uint16 support! I'm trying to do NLP stuff with a large dataset of tokens between 0 and 51000, and it's annoying to consume double the storage to keep them as int32s (I'm currently storing them as uint16 via HuggingFace, but I need to load them as NumPy and manually convert them)
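A sketch of that workflow (file name and slice size are hypothetical): keep the ids as uint16 on disk and widen only when handing them to torch.

import numpy as np
import torch

# Token ids < 65536 stored compactly as uint16 (half the bytes of int32).
tokens_u16 = np.memmap("tokens.bin", dtype=np.uint16, mode="r")

# Widen to int64 at load time, since indexing/embedding ops expect signed ints.
batch = torch.from_numpy(np.asarray(tokens_u16[:1024], dtype=np.int64))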

@oliver-batchelor

I'm doing work on HDR imaging and we read images from the camera as 16-bit unsigned. It's possible to work around it by using other frameworks but it would be really useful.

@VladShtompel

I'm doing work on HDR imaging and we read images from the camera as 16-bit unsigned. It's possible to work around it by using other frameworks but it would be really useful.

This is exactly the issue my team and I are facing right now.

@StrongChris

I'm doing work with DICOM data that is often 10- or even 14-bit unsigned. A uint16 would be very nice for these! My work is focused on speed, so being able to use the smallest possible datatype would be much appreciated.
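A hedged sketch of that path (pydicom assumed; the file name is hypothetical): DICOM pixel data usually arrives as uint16 even when only 10 or 14 bits are used, so today it has to be widened before torch can hold it.

import numpy as np
import pydicom
import torch

ds = pydicom.dcmread("slice_0001.dcm")
pixels = ds.pixel_array                 # commonly uint16, even for 10/14-bit data

# Widen to int32 so the values fit in a dtype torch fully supports.
tensor = torch.from_numpy(pixels.astype(np.int32))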

@ezyang
Contributor
ezyang commented Apr 22, 2023

We should add these dtypes and then build out support via PT2. We probably aren't going to add kernels for everything, but Triton makes it very easy to JIT-compile these operations.

@NicolasHug
Member

@ezyang would Triton be able to enable CPU support?

@ezyang
Contributor
ezyang commented Apr 24, 2023

Not Triton per se, but we have a CPU inductor backend, so the answer is yes!

@soulitzer
Contributor

From triage review: we still need some limited eager support, e.g. factory functions and conversion functions. Autocast also needs consideration (maybe not too bad?).

@vadimkantorov
Contributor

Also, bit ops are only well-defined/standardized on CPUs for unsigned dtypes, if I understand correctly: #105465

@vadimkantorov
Contributor

uint16 would also be useful for interop with OpenCV (CV_16U dtype)
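A sketch of that interop as it stands (OpenCV assumed; the file name is hypothetical): cv2 hands back a uint16 ndarray for CV_16U images, which has to be widened before wrapping it as a tensor.

import cv2
import numpy as np
import torch

# IMREAD_UNCHANGED keeps the 16-bit depth instead of converting to uint8.
img = cv2.imread("frame.png", cv2.IMREAD_UNCHANGED)   # dtype uint16 (CV_16U)

tensor = torch.from_numpy(img.astype(np.int32))       # widen until native uint16 ops exist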

pytorchmergebot pushed a commit that referenced this issue Jan 7, 2024
The dtypes are very useless right now (not even fill works), but it makes torch.uint16, uint32 and uint64 available as a dtype.

Towards #58734

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: #116594
Approved by: https://github.com/albanD
ghstack dependencies: #116698, #116693
@ionutmodo
ionutmodo commented Mar 6, 2024

Hi guys! I would like to add another use case for uint16 on GPUs: designing efficient adaptive sparse optimizers.

thewtex added a commit to thewtex/ITK-Wasm that referenced this issue Apr 18, 2024
When working with torch, the output is often float32. Torch does not
have good support for conversion to uint types:

  pytorch/pytorch#58734

Support float32 and the signed integer types for convenience.
@rbelew
rbelew commented Sep 10, 2024

Another uint64 use case arises when trying to embed hashes from spaCy's token hashing function. spaCy uses hashing on texts to get unique ids (cf. SO )

>>> import spacy
>>> nlp = spacy.load('en')
>>> text = "here is some test text"
>>> doc = nlp(text)
>>> [token.norm for token in doc]
[411390626470654571, 3411606890003347522, 7000492816108906599, 1618900948208871284, 15099781594404091470]
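A stopgap sketch (treating the hashes purely as opaque ids): keep them bit-exact by reinterpreting the NumPy uint64 buffer as int64 before wrapping it, since int64 is fully supported.

import numpy as np
import torch

hashes = np.array([411390626470654571, 3411606890003347522], dtype=np.uint64)

# Same 64 bits reinterpreted as int64: values above 2**63 - 1 print as negative,
# but equality lookups and storage stay exact.
ids = torch.from_numpy(hashes.view(np.int64))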

@vadimkantorov
Contributor
vadimkantorov commented Sep 11, 2024

And another use case for unsigned dtypes is bitshifts, as a bitshift on signed dtypes (if the sign bit is set) is theoretically implementation-defined or undefined
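A small NumPy illustration of the difference (the expectation being that torch's unsigned dtypes would behave the same way once shift kernels land): shifting a signed value with the sign bit set sign-extends, while the unsigned shift is a plain logical shift.

import numpy as np

s = np.int16(-1)        # bit pattern 0xFFFF, sign bit set
u = np.uint16(0xFFFF)   # same bits, unsigned

print(s >> 1)   # -1: arithmetic shift keeps filling with the sign bit
print(u >> 1)   # 32767 (0x7FFF): logical shift fills with zeros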

@oliver-batchelor

@vadimkantorov @rbelew In case you didn't realise, torch >= 2.3 has all the unsigned int types.
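For reference, a sketch of what the thread reports working against torch >= 2.3 (op coverage beyond creation and a few basics is still added on request):

import torch

x = torch.asarray([1, 2, 3], dtype=torch.uint64)   # creation works
print(torch.iinfo(torch.uint64).max)               # 18446744073709551615

# Many CPU kernels may still be missing, e.g. x + x can raise
# '"add_stub" not implemented' (reported for uint16 later in this thread).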

@vadimkantorov
Contributor

Yeah, I saw the support, but I didn't realize which ops are enabled for them. E.g. is bitshift enabled?

@ezyang
Contributor
ezyang commented Sep 12, 2024

@malfet just added bitshift. Op support is on a per-request basis: we will add ops if requested.

@vadimkantorov
Contributor

just added bitshift.

If you mean #135525, it only added and/or/xor, but maybe bitshifts were added in some other PRs...

@FL33TW00D

Echoing the comment from @neelnanda-io, it would be great to add uint16 support for Embedding indices. A good chunk of vocabs are smaller than 65535 :)
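Until Embedding accepts uint16 indices directly, a workaround sketch (sizes are hypothetical) is to store the ids as uint16 and widen only at lookup time:

import numpy as np
import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=50_000, embedding_dim=64)

ids_u16 = np.array([17, 42_000, 3], dtype=np.uint16)   # compact in-memory/on-disk form

# Embedding currently expects signed integer indices, so widen to int64 here.
out = emb(torch.from_numpy(ids_u16.astype(np.int64)))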

@ezyang
Contributor
ezyang commented Oct 12, 2024

For embedding, I think this involves beefing up index_select.

@vadimkantorov
Contributor
vadimkantorov commented Oct 12, 2024

Related, on natively supporting shorter-bitwidth indexes:

(It might also be useful, for sparse tensors and wherever indexes are used, to support shorter dtypes including unsigned ones, and not only int64.)

Also related, on efficient support of shorter-width dtypes for Embedding (but back then in the signed context):

@vadimkantorov
Contributor

@malfet just added bitshift.

@ezyang it appears it's not there yet:

@malfet
Contributor
malfet commented Oct 28, 2024

@malfet just added bitshift.

@ezyang it appears it's not there yet:

[Edit] You are right, it's not there yet, let me propose a PR that extends bitshifts to those...

@ev-br
Collaborator
ev-br commented Feb 26, 2025

As a data point, we tried enabling unsigned integers in the Array API test suite, in data-apis/array-api-compat#253

That results in 70-odd failures, the majority of which are of the "*_cpu is not implemented for uint16" variety.


@mdhaber
mdhaber commented Apr 15, 2025

I also ran across those *_cpu is not implemented for uint* errors (#58734 (comment)) when testing marray (array API masked arrays) with PyTorch as the backend (mdhaber/marray#110).

I thought I'd also report:

import torch as xp
x = xp.asarray([1, 2, 3], dtype=xp.uint64)
sentinel = xp.iinfo(xp.uint64).max // 2 + 1
x[0] = sentinel - 1  # OK
x[0] = xp.asarray(sentinel, dtype=x.dtype)  # OK
x[0] = sentinel  # RuntimeError: Overflow when unpacking long

Since operations are added by request (#58734 (comment)), I'd like to request the following. These are chosen because they are defined by the standard yet array-api-compat cannot (or does not) compensate for them:

import operator as op
# from array_api_compat import torch as xp
import torch as xp

x = xp.asarray(1, dtype=xp.uint16)
A = xp.asarray([[1]], dtype=xp.uint16)
i = xp.asarray([True])

abs(x) # "abs_cpu" not implemented for 'UInt16'
x + x  # "add_stub" not implemented for 'UInt16'
xp.arange(10, dtype=xp.uint16)  # "arange_cpu" not implemented for 'UInt16'
~x  # "bitwise_not_cpu" not implemented for 'UInt16'
x // x  # "div_floor_cpu" not implemented for 'UInt16'
x[i]  # "index_cpu" not implemented for 'UInt16'
x > x  # "gt_cpu" not implemented for 'UInt16'
x >= x  # "ge_cpu" not implemented for 'UInt16'
x < x  # "lt_cpu" not implemented for 'UInt16'
x <= x  # "le_cpu" not implemented for 'UInt16'
x << x  # "lshift_cpu" not implemented for 'UInt16'
A[i] = 2  # "masked_fill" not implemented for 'UInt16'
xp.maximum(x, x)  # "maximum_cpu" not implemented for 'UInt16'
xp.minimum(x, x)  # "minimum_cpu" not implemented for 'UInt16'
-x  # "neg_cpu" not implemented for 'UInt16'
xp.nextafter(x, x)  # "nextafter_cpu" not implemented for 'UInt16'
x % x  # "remainder_cpu" not implemented for 'UInt16'
x >> x  # "rshift_cpu" not implemented for 'UInt16'
xp.sign(x)  # "sign_cpu" not implemented for 'UInt16'
xp.tril(A)  # "tril" not implemented for 'UInt16'
xp.triu(A)  # "triu" not implemented for 'UInt16'
xp.where(i, x, x)  # "where_cpu" not implemented for 'UInt16'

That's:

  • abs_cpu
  • add_stub
  • arange_cpu
  • bitwise_not_cpu
  • div_floor_cpu
  • index_cpu
  • gt_cpu
  • ge_cpu
  • lt_cpu
  • le_cpu
  • lshift_cpu
  • masked_fill
  • maximum_cpu
  • minimum_cpu
  • neg_cpu
  • nextafter_cpu
  • remainder_cpu
  • rshift_cpu
  • sign_cpu
  • tril_cpu
  • triu_cpu
  • where_cpu
