ENH: Support character string arrays · numpy/numpy@d4e11c7

Commit d4e11c7

pearu authored and HaoZeke committed
ENH: Support character string arrays
TST: added test for issue #18684
ENH: f2py opens files with correct encoding, fixes #635
TST: added test for issue #6308
TST: added test for issue #4519
TST: added test for issue #3425
ENH: Implement user-defined hooks support for post-processing f2py data structure. Implement character BC hook.
ENH: Add support for detecting utf-16 and utf-32 encodings.
1 parent e5fcb9d commit d4e11c7

25 files changed: +1784 -356 lines changed

doc/source/f2py/advanced.rst

Lines changed: 51 additions & 1 deletion
@@ -96,4 +96,54 @@ and the corresponding <C type>. The <C type> can be one of the following::
     complex_long_double
     string
 
-For more information, see the F2Py source code ``numpy/f2py/capi_maps.py``.
+For more information, see F2Py source code ``numpy/f2py/capi_maps.py``.
+
+.. _Character strings:
+
+Character strings
+=================
+
+Assumed length chararacter strings
+-----------------------------------
+
+In Fortran, assumed length character string arguments are declared as
+``character*(*)`` or ``character(len=*)``, that is, the length of such
+arguments are determined by the actual string arguments at runtime.
+For ``intent(in)`` arguments, this lack of length information poses no
+problems for f2py to construct functional wrapper functions. However,
+for ``intent(out)`` arguments, the lack of length information is
+problematic for f2py generated wrappers because there is no size
+information available for creating memory buffers for such arguments
+and F2PY assumes the length is 0. Depending on how the length of
+assumed length character strings are specified, there exist ways to
+workaround this problem, as exemplified below.
+
+If the length of the ``character*(*)`` output argument is determined
+by the state of other input arguments, the required connection can be
+established in a signature file or within a f2py-comment by adding an
+extra declaration for the corresponding argument that specifies the
+length in character selector part. For example, consider a Fortran
+file ``asterisk1.f90``:
+
+.. include:: asterisk1.f90
+  :literal:
+
+Compile it with ``f2py -c asterisk1.f90 -m asterisk1`` and then in Python:
+
+.. include:: asterisk1_session.dat
+  :literal:
+
+Notice that the extra declaration ``character(f2py_len=12) s`` is
+interpreted only by f2py and in the ``f2py_len=`` specification one
+can use C-expressions as a length value.
+
+In the following example:
+
+.. include:: asterisk2.f90
+  :literal:
+
+the lenght of output assumed length string depends on an input
+argument ``n``, after wrapping with F2PY, in Python:
+
+.. include:: asterisk2_session.dat
+  :literal:
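
The documentation added above says that an ``intent(out)`` assumed-length string gets a zero-length buffer unless ``f2py_len=`` supplies a size. The following is a minimal pure-Python model of that behaviour, not f2py code: `fortran_assign`, `foo1_model` and `foo2_model` are hypothetical names, and the model only assumes the standard Fortran rule that assignment to a fixed-length character variable truncates or blank-pads the value.

```python
def fortran_assign(buffer_len: int, value: bytes) -> bytes:
    """Model Fortran assignment into a character variable of fixed length.

    The value is truncated or blank-padded to exactly buffer_len bytes.
    """
    return value[:buffer_len].ljust(buffer_len, b" ")


def foo1_model(f2py_len: int = 0) -> bytes:
    # Hypothetical stand-in for the wrapped foo1: the wrapper is assumed to
    # allocate f2py_len bytes, and the Fortran body assigns into that buffer.
    return fortran_assign(f2py_len, b"123456789A12")


def foo2_model(n: int) -> bytes:
    # Stand-in for foo2 declared with `character(f2py_len=n), depend(n) :: s`:
    # the buffer length follows the input argument n.
    return fortran_assign(n, b"123456789A123456789B"[:n])


print(foo1_model())    # b'' (no length information, buffer of length 0)
print(foo1_model(12))  # b'123456789A12'
print(foo2_model(2))   # b'12'
```

With a length of 0 nothing of the assigned value survives, which is why the extra ``f2py_len`` declaration is needed for such output arguments.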

doc/source/f2py/asterisk1.f90

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+subroutine foo1(s)
+  character*(*), intent(out) :: s
+  !f2py character(f2py_len=12) s
+  s = "123456789A12"
+end subroutine foo1

doc/source/f2py/asterisk1_session.dat

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+>>> import asterisk1
+>>> asterisk1.foo1()
+b'123456789A12'

doc/source/f2py/asterisk2.f90

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+subroutine foo2(s, n)
+  character(len=*), intent(out) :: s
+  integer, intent(in) :: n
+  !f2py character(f2py_len=n), depend(n) :: s
+  s = "123456789A123456789B"(1:n)
+end subroutine foo2

doc/source/f2py/asterisk2_session.dat

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+>>> import asterisk
+>>> asterisk.foo2(2)
+b'12'
+>>> asterisk.foo2(12)
+b'123456789A12'
+>>>

doc/source/f2py/signature-file.rst

Lines changed: 20 additions & 10 deletions
@@ -645,21 +645,20 @@ A C expression may contain:
   according to given dependence relations;
 * the following CPP macros:
 
-  * ``rank(<name>)``
+  ``f2py_rank(<name>)``
     Returns the rank of an array ``<name>``.
-
-  * ``shape(<name>,<n>)``
+  ``f2py_shape(<name>, <n>)``
     Returns the ``<n>``-th dimension of an array ``<name>``.
-
-  * ``len(<name>)``
+  ``f2py_len(<name>)``
     Returns the length of an array ``<name>``.
-
-  * ``size(<name>)``
+  ``f2py_size(<name>)``
     Returns the size of an array ``<name>``.
-
-  * ``slen(<name>)``
+  ``f2py_itemsize(<name>)``
+    Returns the itemsize of an array ``<name>``.
+  ``f2py_slen(<name>)``
     Returns the length of a string ``<name>``.
 
+
 For initializing an array ``<array name>``, F2PY generates a loop over
 all indices and dimensions that executes the following
 pseudo-statement::
@@ -706,4 +705,15 @@ Currently, multi-line blocks can be used in the following constructs:
 
 * as a list of C arrays of the ``pymethoddef`` statement;
 
-* as a documentation string.
++ as documentation string.
+
+Extended char-selector
+-----------------------
+
+F2PY extends char-selector specification, usable within a signature
+file or a F2PY directive, as follows::
+
+  <extended-charselector> := <charselector>
+                          | (f2py_len= <len>)
+
+See :ref:`Character Strings` for usage.

numpy/f2py/auxfuncs.py

Lines changed: 50 additions & 21 deletions
@@ -28,18 +28,21 @@
     'getfortranname', 'getpymethoddef', 'getrestdoc', 'getusercode',
     'getusercode1', 'hasbody', 'hascallstatement', 'hascommon',
     'hasexternals', 'hasinitvalue', 'hasnote', 'hasresultnote',
-    'isallocatable', 'isarray', 'isarrayofstrings', 'iscomplex',
+    'isallocatable', 'isarray', 'isarrayofstrings',
+    'ischaracter', 'ischaracterarray', 'ischaracter_or_characterarray',
+    'iscomplex',
     'iscomplexarray', 'iscomplexfunction', 'iscomplexfunction_warn',
     'isdouble', 'isdummyroutine', 'isexternal', 'isfunction',
-    'isfunction_wrap', 'isint1array', 'isinteger', 'isintent_aux',
+    'isfunction_wrap', 'isint1', 'isint1array', 'isinteger', 'isintent_aux',
     'isintent_c', 'isintent_callback', 'isintent_copy', 'isintent_dict',
     'isintent_hide', 'isintent_in', 'isintent_inout', 'isintent_inplace',
     'isintent_nothide', 'isintent_out', 'isintent_overwrite', 'islogical',
     'islogicalfunction', 'islong_complex', 'islong_double',
     'islong_doublefunction', 'islong_long', 'islong_longfunction',
     'ismodule', 'ismoduleroutine', 'isoptional', 'isprivate', 'isrequired',
     'isroutine', 'isscalar', 'issigned_long_longarray', 'isstring',
-    'isstringarray', 'isstringfunction', 'issubroutine',
+    'isstringarray', 'isstring_or_stringarray', 'isstringfunction',
+    'issubroutine',
     'issubroutine_wrap', 'isthreadsafe', 'isunsigned', 'isunsigned_char',
     'isunsigned_chararray', 'isunsigned_long_long',
     'isunsigned_long_longarray', 'isunsigned_short',
@@ -68,24 +71,41 @@ def debugcapi(var):
     return 'capi' in debugoptions
 
 
+def _ischaracter(var):
+    return 'typespec' in var and var['typespec'] == 'character' and \
+        not isexternal(var)
+
+
 def _isstring(var):
     return 'typespec' in var and var['typespec'] == 'character' and \
         not isexternal(var)
 
 
-def isstring(var):
-    return _isstring(var) and not isarray(var)
+def ischaracter_or_characterarray(var):
+    return _ischaracter(var) and 'charselector' not in var
 
 
 def ischaracter(var):
-    return isstring(var) and 'charselector' not in var
+    return ischaracter_or_characterarray(var) and not isarray(var)
+
+
+def ischaracterarray(var):
+    return ischaracter_or_characterarray(var) and isarray(var)
+
+
+def isstring_or_stringarray(var):
+    return _ischaracter(var) and 'charselector' in var
+
+
+def isstring(var):
+    return isstring_or_stringarray(var) and not isarray(var)
 
 
 def isstringarray(var):
-    return isarray(var) and _isstring(var)
+    return isstring_or_stringarray(var) and isarray(var)
 
 
-def isarrayofstrings(var):
+def isarrayofstrings(var):  # obsolete?
     # leaving out '*' for now so that `character*(*) a(m)` and `character
     # a(m,*)` are treated differently. Luckily `character**` is illegal.
     return isstringarray(var) and var['dimension'][-1] == '(*)'
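
The hunk above splits `character` variables into four disjoint kinds, depending on whether a `charselector` entry and a `dimension` entry are present. The following self-contained sketch illustrates that split. The `isexternal` and `isarray` helpers are simplified stand-ins written for this example, `classify` is a hypothetical helper that is not part of f2py, and the variable dictionaries are hand-written approximations of what the f2py parser produces.

```python
def isexternal(var):
    # Simplified stand-in (assumption): a variable is external when its
    # attrspec list says so.
    return 'external' in var.get('attrspec', [])


def isarray(var):
    # Simplified stand-in (assumption): arrays carry a 'dimension' entry.
    return 'dimension' in var and not isexternal(var)


def _ischaracter(var):
    return var.get('typespec') == 'character' and not isexternal(var)


def ischaracter_or_characterarray(var):
    # `character` with no length or kind selector at all.
    return _ischaracter(var) and 'charselector' not in var


def isstring_or_stringarray(var):
    # `character` with a selector such as `character*5` or `character(len=5)`.
    return _ischaracter(var) and 'charselector' in var


def classify(var):
    """Hypothetical helper: name the kind the new predicates would select."""
    if ischaracter_or_characterarray(var):
        return 'characterarray' if isarray(var) else 'character'
    if isstring_or_stringarray(var):
        return 'stringarray' if isarray(var) else 'string'
    return 'other'


# Hand-written variable descriptions (illustrative only).
c = {'typespec': 'character'}
ca = {'typespec': 'character', 'dimension': ['3']}
s = {'typespec': 'character', 'charselector': {'len': '5'}}
sa = {'typespec': 'character', 'charselector': {'len': '5'},
      'dimension': ['3']}

for var in (c, ca, s, sa, {'typespec': 'integer'}):
    print(classify(var))
```

Note that under the old definitions `ischaracter(var)` implied `isstring(var)`, whereas after this change the two predicates are mutually exclusive.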
@@ -126,6 +146,11 @@ def get_kind(var):
             pass
 
 
+def isint1(var):
+    return var.get('typespec') == 'integer' \
+        and get_kind(var) == '1' and not isarray(var)
+
+
 def islong_long(var):
     if not isscalar(var):
         return 0
@@ -426,6 +451,7 @@ def isintent_hide(var):
             ('out' in var['intent'] and 'in' not in var['intent'] and
                 (not l_or(isintent_inout, isintent_inplace)(var)))))
 
+
 def isintent_nothide(var):
     return not isintent_hide(var)
 
@@ -469,6 +495,7 @@ def isintent_aligned8(var):
 def isintent_aligned16(var):
     return 'aligned16' in var.get('intent', [])
 
+
 isintent_dict = {isintent_in: 'INTENT_IN', isintent_inout: 'INTENT_INOUT',
                  isintent_out: 'INTENT_OUT', isintent_hide: 'INTENT_HIDE',
                  isintent_cache: 'INTENT_CACHE',
@@ -566,19 +593,19 @@ def __call__(self, var):
 
 
 def l_and(*f):
-    l, l2 = 'lambda v', []
+    l1, l2 = 'lambda v', []
     for i in range(len(f)):
-        l = '%s,f%d=f[%d]' % (l, i, i)
+        l1 = '%s,f%d=f[%d]' % (l1, i, i)
         l2.append('f%d(v)' % (i))
-    return eval('%s:%s' % (l, ' and '.join(l2)))
+    return eval('%s:%s' % (l1, ' and '.join(l2)))
 
 
 def l_or(*f):
-    l, l2 = 'lambda v', []
+    l1, l2 = 'lambda v', []
     for i in range(len(f)):
-        l = '%s,f%d=f[%d]' % (l, i, i)
+        l1 = '%s,f%d=f[%d]' % (l1, i, i)
         l2.append('f%d(v)' % (i))
-    return eval('%s:%s' % (l1, ' or '.join(l2)))
+    return eval('%s:%s' % (l1, ' or '.join(l2)))
 
 
 def l_not(f):
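
For readers unfamiliar with `l_and` and `l_or`: for two predicates, the string passed to `eval` is `lambda v,f0=f[0],f1=f[1]:f0(v) and f1(v)`, so the result is a single-argument predicate that short-circuits over its inputs. The sketch below shows a closure-based equivalent for illustration only. It is not what the commit does (the commit only renames `l` to `l1`), and it differs in one detail: `all` and `any` always return a `bool`, while the `eval` version returns whatever the last evaluated predicate returned.

```python
def l_and(*f):
    # Closure-based sketch: true when every predicate accepts v.
    # all() short-circuits on the first falsy result.
    return lambda v: all(g(v) for g in f)


def l_or(*f):
    # Closure-based sketch: true when at least one predicate accepts v.
    return lambda v: any(g(v) for g in f)


def is_positive(v):
    return v > 0


def is_even(v):
    return v % 2 == 0


positive_and_even = l_and(is_positive, is_even)
positive_or_even = l_or(is_positive, is_even)

print(positive_and_even(4), positive_and_even(3))  # True False
print(positive_or_even(-2), positive_or_even(-3))  # True False
```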
@@ -666,7 +693,9 @@ def getcallprotoargument(rout, cb_map={}):
                 pass
             else:
                 ctype = ctype + '*'
-            if isstring(var) or isarrayofstrings(var):
+            if ((isstring(var)
+                 or isarrayofstrings(var)  # obsolete?
+                 or isstringarray(var))):
                 arg_types2.append('size_t')
         arg_types.append(ctype)
 
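
The change above makes string arrays contribute a `size_t` entry to `arg_types2`, as scalar strings already did. The sketch below illustrates the effect on a prototype argument list. It assumes that the two lists are joined with the `size_t` entries last, which this hunk does not show, and `proto_args` together with its `(ctype, is_string_like)` input shape is hypothetical.

```python
def proto_args(args):
    """Build a C prototype argument list from (ctype, is_string_like) pairs.

    Assumption: one size_t length argument per string-like argument is
    appended after all of the explicit arguments.
    """
    arg_types, arg_types2 = [], []
    for ctype, is_string_like in args:
        arg_types.append(ctype)
        if is_string_like:
            # Hidden length argument for a string or string array.
            arg_types2.append('size_t')
    return ','.join(arg_types + arg_types2)


# Illustrative type names only; they are not taken from f2py's type maps.
print(proto_args([('char*', True), ('int*', False), ('char*', True)]))
# char*,int*,char*,size_t,size_t
```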

@@ -731,14 +760,14 @@ def getrestdoc(rout):
 
 
 def gentitle(name):
-    l = (80 - len(name) - 6) // 2
-    return '/*%s %s %s*/' % (l * '*', name, l * '*')
+    ln = (80 - len(name) - 6) // 2
+    return '/*%s %s %s*/' % (ln * '*', name, ln * '*')
 
 
-def flatlist(l):
-    if isinstance(l, list):
-        return reduce(lambda x, y, f=flatlist: x + f(y), l, [])
-    return [l]
+def flatlist(lst):
+    if isinstance(lst, list):
+        return reduce(lambda x, y, f=flatlist: x + f(y), lst, [])
+    return [lst]
 
 
 def stripcomma(s):
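
The last hunk only renames the ambiguous variable `l`, so the behaviour of `gentitle` and `flatlist` should be unchanged. A small standalone check of the renamed versions, copied from the hunk with the `functools.reduce` import they need outside of `auxfuncs.py`:

```python
from functools import reduce


def gentitle(name):
    # Centre `name` in a C comment banner roughly 80 characters wide.
    ln = (80 - len(name) - 6) // 2
    return '/*%s %s %s*/' % (ln * '*', name, ln * '*')


def flatlist(lst):
    # Recursively flatten nested lists; non-list values become [value].
    if isinstance(lst, list):
        return reduce(lambda x, y, f=flatlist: x + f(y), lst, [])
    return [lst]


print(flatlist([1, [2, [3, 4]], 5]))  # [1, 2, 3, 4, 5]
print(len(gentitle('abc')))           # 79
print(len(gentitle('abcd')))          # 80
```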
