First draft of design · markshannon/New-C-API-for-Python@1ce144b · GitHub

Commit 1ce144b

First draft of design
1 parent a9ffb20 commit 1ce144b

2 files changed

+274
-0
lines changed

DesignPrinciples.md

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
1+
2+
3+
## Design principles
4+
5+
The API will adhere, as much as possible, to the following design principles.
6+
The principles are listed in order of importance, with the most important ones first.
7+
8+
### No invalid states
9+
10+
No safe input[1] to an API function can create an invalid state in the virtual machine.
11+
This principle must be applied to all API functions, without exception.
12+
Examples of this are:
13+
14+
* If a function accepts pointers, it must accept `NULL`.
15+
* If a function takes a `signed` integer as an input for a length, it must handle negative numbers.
16+
17+
[1] A safe input is one that does not break obvious invariants.
18+
For example, if an API takes an array and a length, those values must match.
19+
Likewise, inputs must be type-safe.
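
As a minimal sketch of the two examples above, the function below rejects `NULL` pointers and negative lengths with an error code instead of dereferencing them or using them as a loop bound. `toy_sum`, `TOY_OK` and `TOY_ERROR` are hypothetical names invented for this illustration; they are not part of the proposed API.

```C
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes, used by this sketch only. */
enum { TOY_OK = 0, TOY_ERROR = -1 };

/* Sums `len` integers from `items` into `*out`.
 * Every pointer may be NULL and `len` may be negative:
 * both cases produce an error rather than undefined behaviour. */
static int toy_sum(const int *items, intptr_t len, long *out)
{
    if (out == NULL) {
        return TOY_ERROR;
    }
    *out = 0;
    if (len < 0 || (items == NULL && len != 0)) {
        return TOY_ERROR;
    }
    for (intptr_t i = 0; i < len; i++) {
        *out += items[i];
    }
    return TOY_OK;
}
```

Note that the function still trusts `len` to match the real size of `items`; that is the "safe input" invariant from footnote [1], which no C function can check for itself.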
20+
21+
### Minimize the chance of the user supplying invalid input.
22+
23+
This requires judgement, and depends on the use case. Nevertheless, it is an important principle.
24+
25+
### Make it difficult for users to ignore error conditions
26+
27+
Generally this means returning error codes in a way that is difficult to ignore.
28+
29+
### The API should be as efficient as possible
30+
31+
We assume that users of the API are using C, or another low-level language, for performance.
32+
If users avoid the C-API, delving into CPython internals, it undermines the point of having
33+
the C-API.
34+
35+
### Completeness
36+
37+
The C-API should be complete. There should be no need to access VM internals.
38+
This means that the C-API needs to cover all language features, and most of the VM functionality.
39+
40+
### No privileged users
41+
42+
The standard library will only use the same API as third-party code.
43+
This helps to ensure completeness and encourages implementers to make the API efficient.
44+
45+
### Consistency
46+
47+
The API should be consistent in naming and usage. It should be possible to know
48+
what an API function does from its name and arguments, at least once one is familiar with the API.
49+
50+
### API and ABI equivalence
51+
52+
Any code written using the C-API will conform to the ABI.
53+
This is a forwards-compatibility guarantee only.
54+
E.g., code written using the C-API for 3.15 will work unmodified
55+
on 3.16, but the opposite is not true.
56+
57+
This doesn't mean that all API functions are part of the ABI, but that they must call down to the ABI,
58+
and that once in the ABI, they must remain there.
59+
60+
### API stability
61+
62+
Once added to the C-API, a feature will be removed only if there is a very strong reason (read: a security issue)
63+
to do so.
64+
The semantics and interface of a function or struct will never change.
65+
It will either remain unchanged, or be removed, and possibly replaced.
66+
67+
### The API should be portable and future proof
68+
69+
The current design of CPython is constrained by the C-API.
70+
We want to provide a faster Python for all users, and the C-API
71+
should not prevent that. The HPy project shows that this can be done efficiently.
72+
73+
### The API should be pure C.
74+
75+
That means no `#ifdef __cplusplus` or similar constructs.
76+
We do this, not because we don't want people to use C++, but because
77+
we want them to be able to use Rust, Nim, or whatever other low-level
78+
language they want. A C API offers a common basis for wrapping.

DesignRules.md

Lines changed: 196 additions & 0 deletions
@@ -0,0 +1,196 @@
1+
2+
## API design rules
3+
4+
These rules should be applied when adding to the API.
5+
They exist to provide a consid
6+
7+
The overall design of the API must adhere to the design principles.
8+
Here is one possible design
9+
10+
### Return structs
11+
12+
Many functions return a result, but may also raise an exception.
13+
To handle this, API functions should return a `struct` containing both
14+
the error code and result or exception.
15+
16+
```C
17+
typedef struct _py_returnval {
18+
int kind;
19+
PyRef value;
20+
} PyResult;
21+
```
22+
23+
Different functions may return variations on the above, but the `kind`
24+
must obey the following rules:
25+
26+
A value of zero is always a success, and `value` must be the result.
27+
A negative value is always an error, and `value` must be the exception raised.
28+
29+
Positive values can either be failures or additional success codes.
30+
No function may return both failures and additional success codes.
31+
32+
For example, a function to get a value from a dictionary might have the following API:
33+
34+
```C
35+
typedef enum _py_lookup_kind {
36+
ERROR = -1,
37+
FOUND = 0,
38+
MISSING = 1,
39+
} PyLookupKind;
40+
41+
typedef struct _py_lookup {
42+
PyLookupKind kind;
43+
PyRef value;
44+
} PyLookupResult;
45+
46+
PyLookupResult PyAPi_Dict_Get(PyDictRef dict, PyRef key);
47+
```
48+
49+
Even in the case of `MISSING`, `value` should be set to a valid value to
50+
minimize the chance of crashes should `value` be used.
51+
The following use, although incorrect, will not corrupt the VM or memory:
52+
```C
53+
PyRef result = PyAPi_Dict_Get(self, k).value;
54+
```
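
To illustrate why a valid `value` matters even for `MISSING`, here is a minimal, self-contained sketch of a lookup that initializes `value` before doing anything else, so every return path carries a usable reference. All `Toy*` and `toy_*` names are hypothetical stand-ins: `ToyRef` is a plain integer rather than a real `PyRef`, and `TOY_NONE` plays the role of an always-valid placeholder object.

```C
#include <stddef.h>
#include <stdint.h>

typedef intptr_t ToyRef;          /* hypothetical stand-in for PyRef */
#define TOY_NONE ((ToyRef)0)      /* hypothetical always-valid placeholder */

typedef enum { TOY_ERROR = -1, TOY_FOUND = 0, TOY_MISSING = 1 } ToyLookupKind;

typedef struct {
    ToyLookupKind kind;
    ToyRef value;
} ToyLookupResult;

/* A toy dictionary: parallel arrays of keys and values. */
typedef struct {
    const ToyRef *keys;
    const ToyRef *values;
    intptr_t len;
} ToyDict;

static ToyLookupResult toy_dict_get(const ToyDict *dict, ToyRef key)
{
    /* `value` is valid from the start. A real implementation would
     * store the raised exception here in the error case. */
    ToyLookupResult result = { TOY_ERROR, TOY_NONE };
    if (dict == NULL || dict->len < 0) {
        return result;
    }
    if (dict->len > 0 && (dict->keys == NULL || dict->values == NULL)) {
        return result;
    }
    for (intptr_t i = 0; i < dict->len; i++) {
        if (dict->keys[i] == key) {
            result.kind = TOY_FOUND;
            result.value = dict->values[i];
            return result;
        }
    }
    result.kind = TOY_MISSING;    /* value stays TOY_NONE, not garbage */
    return result;
}
```

A caller that ignores `kind` and reads `.value` directly gets `TOY_NONE` for a missing key: a logic error, but not memory corruption.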
55+
56+
### Naming
57+
58+
All API function and struct names should adhere to simple rules.
59+
For example, function names should take the form:
60+
Prefix_NameSpace_Operation[_REF_CONSUMPTION]
61+
E.g.
62+
```C
63+
PyResult PyApi_Tuple_FromArray(uintptr_t len, PyRef *array);
64+
```
65+
66+
### Use standard C99 types, not custom ones.
67+
68+
In other words, use `intptr_t` not `Py_ssize_t`.
69+
70+
71+
### Consumption of argument references
72+
73+
For efficiency, there is a natural consumption of references in some API
74+
functions. For example, appending an item to a list naturally consumes the
75+
reference to the item, but not the list.
76+
We denote borrowed references by `B` and consumed references by `C`.
77+
78+
Consequently we want the low-level API/ABI function to be:
79+
80+
```C
81+
int PyApi_List_Append_BC(PyListRef list, PyRef item);
82+
```
83+
84+
All ABI functions should get a higher-level API function without a suffix.
85+
All non-suffix functions borrow the references to all their arguments.
86+
87+
```C
88+
int PyApi_List_Append(PyListRef list, PyRef item);
89+
```
90+
is equivalent to `PyApi_List_Append_BB`.
91+
92+
Functions taking arrays must consume all the references in the array,
93+
or borrow all references in the array.
94+
95+
The reference behavior must be the same regardless of the return value or
96+
result.
97+
98+
Note that this doesn't impact the portability of the API as the borrow
99+
or consume forms can be mechanically created from the other.
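
The mechanical derivation can be sketched as follows: the borrowing form duplicates the reference and then calls the consuming form. This is only an illustration under stated assumptions. The `Toy*` types and `toy_*` functions are hypothetical, and the `refcount` field merely models "number of live references" so the behaviour can be observed; it says nothing about how a real VM would track them.

```C
#include <stddef.h>

/* Hypothetical toy object: the counter models live references. */
typedef struct { int refcount; } ToyObject;
typedef ToyObject *ToyRef;

static ToyRef toy_dup(ToyRef ref)  { if (ref != NULL) ref->refcount++; return ref; }
static void toy_clear(ToyRef ref)  { if (ref != NULL) ref->refcount--; }

#define TOY_LIST_CAPACITY 8
typedef struct { ToyRef items[TOY_LIST_CAPACITY]; int len; } ToyList;

/* Consuming form (_BC): borrows `list`, consumes `item`.
 * `item` is consumed on failure as well, so the reference
 * behaviour is the same regardless of the return value. */
static int toy_list_append_BC(ToyList *list, ToyRef item)
{
    if (list == NULL || item == NULL || list->len >= TOY_LIST_CAPACITY) {
        toy_clear(item);
        return -1;
    }
    list->items[list->len++] = item;   /* the list now owns the reference */
    return 0;
}

/* Borrowing form (_BB), derived mechanically from the consuming form. */
static int toy_list_append(ToyList *list, ToyRef item)
{
    return toy_list_append_BC(list, toy_dup(item));
}
```

The reverse derivation (a consuming form built from a borrowing one) would call the borrowing function and then clear the argument.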
100+
101+
102+
### Opaque, linear references
103+
104+
The C-API will refer to Python objects through opaque references
105+
which must have exactly one owner. This design has been shown to
106+
be efficient, robust and portable by the HPy project, where the
107+
references are known as "handles".
108+
As each reference has exactly one owner, there will be no
109+
incrementing or decrementing of reference counts. References can
110+
be duplicated with
111+
```C
112+
PyRef PyRef_Dup(PyRef ref);
113+
```
114+
and destroyed by
115+
```C
116+
void PyRef_Clear(PyRef ref);
117+
```
118+
119+
Type-specific variants will be provided for subtypes like `PyListRef`.
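
One way to picture opaque, single-owner references is a handle table, loosely in the spirit of HPy handles. The sketch below is an assumption-laden toy, not the proposed implementation: `ToyRef`, `toy_table` and the `toy_ref_*` functions are all hypothetical. Each duplicate gets its own table slot, so clearing one reference never invalidates another.

```C
#include <stddef.h>
#include <stdint.h>

/* Opaque reference: users pass it around but never look inside. */
typedef struct { uintptr_t index; } ToyRef;

#define TOY_TABLE_SIZE 16
/* Slot 0 is never filled, so index 0 acts as the invalid reference. */
static const void *toy_table[TOY_TABLE_SIZE];

/* Creates a new reference to `object`; returns index 0 on failure. */
static ToyRef toy_ref_new(const void *object)
{
    ToyRef ref = { 0 };
    if (object == NULL) {
        return ref;
    }
    for (uintptr_t i = 1; i < TOY_TABLE_SIZE; i++) {
        if (toy_table[i] == NULL) {
            toy_table[i] = object;
            ref.index = i;
            break;
        }
    }
    return ref;
}

/* Returns the object behind a reference, or NULL if it is not live. */
static const void *toy_ref_target(ToyRef ref)
{
    return ref.index < TOY_TABLE_SIZE ? toy_table[ref.index] : NULL;
}

/* Duplicating yields a distinct reference with its own owner. */
static ToyRef toy_ref_dup(ToyRef ref)
{
    return toy_ref_new(toy_ref_target(ref));
}

/* Destroys exactly one reference; other duplicates stay valid. */
static void toy_ref_clear(ToyRef ref)
{
    if (ref.index > 0 && ref.index < TOY_TABLE_SIZE) {
        toy_table[ref.index] = NULL;
    }
}
```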
120+
121+
122+
### ABI functions should be efficient, API functions easy to use
123+
124+
There is a tension between ease of use and performance.
125+
For example, it is the common case when creating a tuple that
126+
the length is known, yet the function needs to treat length zero
127+
differently, returning the empty tuple singleton.
128+
129+
We handle this tension by providing an efficient, but difficult to use
130+
ABI function:
131+
```C
132+
PyResult PyApi_Tuple_FromNonEmptyArray_nC(uintptr_t len, PyRef *array);
133+
```
134+
and the easier-to-use API function:
135+
```C
136+
PyResult PyApi_Tuple_FromArray(uintptr_t len, PyRef *array);
137+
```
138+
139+
But we can do better. As the API can include macros, we can implement
140+
```C
141+
PyTupleResult PyApi_Tuple_FromFixedArray(array);
142+
```
143+
something like this:
144+
```
145+
#define PyApi_Tuple_FromFixedArray(array) \
146+
((sizeof(array) == 0) ? \
147+
PyApi_NewRefEmptyTuple() \
148+
: \
149+
PyApi_Tuple_FromNonEmptyArray(sizeof(array)/sizeof(PyRef), (array)) \
150+
)
151+
```
152+
Allowing it to be used like this:
153+
```
154+
PyRef args[4] = {
155+
PyNone,
156+
arg1,
157+
arg2,
158+
PyNone
159+
};
160+
PyTupleResult new_tuple = PyApi_Tuple_FromFixedArray(args);
161+
```
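
For readers who want to try the idea, here is a compilable sketch of the same macro technique using toy types. Everything named `Toy*`, `toy_*` or `TOY_*` is hypothetical and exists only for this illustration; the toy "tuple" just records its length and first element.

```C
#include <stddef.h>
#include <stdint.h>

typedef intptr_t ToyRef;   /* hypothetical stand-in for PyRef */

/* Toy result: kind 0 is success, negative is an error. */
typedef struct { int kind; uintptr_t len; ToyRef first; } ToyTupleResult;

static ToyTupleResult toy_tuple_empty(void)
{
    ToyTupleResult r = { 0, 0, 0 };
    return r;
}

/* The "efficient but harder to use" form: len must be non-zero. */
static ToyTupleResult toy_tuple_from_non_empty_array(uintptr_t len,
                                                     const ToyRef *array)
{
    ToyTupleResult r = { -1, 0, 0 };
    if (array == NULL || len == 0) {
        return r;
    }
    r.kind = 0;
    r.len = len;
    r.first = array[0];
    return r;
}

/* The element count is computed at compile time from the array type.
 * The argument must be a true array, not a pointer. */
#define TOY_TUPLE_FROM_FIXED_ARRAY(array) \
    ((sizeof(array) == 0) \
        ? toy_tuple_empty() \
        : toy_tuple_from_non_empty_array( \
              sizeof(array) / sizeof((array)[0]), (array)))
```

One caveat with this kind of macro: if a pointer is passed instead of an array, `sizeof` silently measures the pointer, so the computed length is wrong. A production version would probably want a compile-time check for that.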
162+
163+
### The API should include versions of functions that take result types.
164+
165+
For most* API functions, at least those that take one or two `PyRef` arguments,
166+
there should be a version that takes a `PyResult` as the first argument.
167+
168+
This function gets an `M` suffix.
169+
170+
This allows chaining of calls without being overwhelmed by error handling.
171+
172+
Suppose we want to write a function that returns the name of the class of
173+
the argument.
174+
175+
Using the `M` forms we can implement this as:
176+
```C
177+
PyStrResult pop_and_pair(PyRef o)
178+
{
179+
return Py_Type_GetName_M(PyApi_Object_GetType(o));
180+
}
181+
```
182+
183+
The implementation is straightforward and can be automatically generated:
184+
```
185+
inline PyResult Py_Type_GetName_M(PyResult r)
186+
{
187+
if (r.kind < 0) {
188+
return r;
189+
}
190+
return Py_Type_GetName(r.value);
191+
}
192+
```
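
To see the short-circuiting in action, the following self-contained sketch chains a toy operation through its `M` form. The names `ToyRef`, `ToyResult`, `toy_halve`, `toy_halve_M` and `toy_quarter` are hypothetical, and the "exception" stored on failure is simply the offending input, standing in for a real exception object.

```C
#include <stdint.h>

typedef intptr_t ToyRef;                      /* stand-in for PyRef */
typedef struct { int kind; ToyRef value; } ToyResult;

/* Toy operation: halves an even number, fails on an odd one. */
static ToyResult toy_halve(ToyRef x)
{
    ToyResult r;
    if (x % 2 != 0) {
        r.kind = -1;
        r.value = x;      /* stand-in for the raised exception */
        return r;
    }
    r.kind = 0;
    r.value = x / 2;
    return r;
}

/* The M form: pass an earlier error through untouched,
 * otherwise apply the operation to the carried value. */
static ToyResult toy_halve_M(ToyResult r)
{
    if (r.kind < 0) {
        return r;
    }
    return toy_halve(r.value);
}

/* Two chained calls with no explicit error handling in between. */
static ToyResult toy_quarter(ToyRef x)
{
    return toy_halve_M(toy_halve(x));
}
```

With these definitions, `toy_quarter(12)` succeeds with value 3, while `toy_quarter(5)` fails in the first step and `toy_quarter(6)` fails in the second; in both failing cases the caller sees a single error result.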
193+
194+
For the technically minded, this pattern is known as the "error monad".
195+
196+
*Probably all, as we automatically generate these.
