8000 bpo-31626: Mark ends of the reallocated block in debug build. by serhiy-storchaka · Pull Request #4210 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

bpo-31626: Mark ends of the reallocated block in debug build. #4210

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 7, 2017
Merged
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
89 changes: 72 additions & 17 deletions Objects/obmalloc.c
Original file line number Diff line number Diff line change
Expand Up @@ -1516,45 +1516,100 @@ _PyMem_DebugRawRealloc(void *ctx, void *p, size_t nbytes)
}

debug_alloc_api_t *api = (debug_alloc_api_t *)ctx;
uint8_t *q; /* base address of malloc'epad d block */
uint8_t *data; /* p + 2*SST == pointer to data bytes */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the data variable to make the code easier to follow and understand.

With "q += 2*SST;" in the middle of the function, it's hard to follow which bytes are modified in the buffer.

Using the data variable, you can use my ASCII-art schema in _PyMem_DebugRawAlloc() to see what the memory block contains.

Please keep my data variable ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I restored the using of the q variable for symmetry in saving/restoring code. Do you prefer destroying the symmetry or using the data variable in saving too? Using the data variable will make one line too long, it should be splitted on two lines.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand your symmetry question. I would like to have similar code in _PyMem_DebugRawAlloc() and _PyMem_DebugRawRealloc(). With "q += 2*SST;", the q documentation is wrong:

    uint8_t *q;           /* base address of malloc'epad d block */

With "q += 2*SST;", the meaning of q depends where your are in the function. For me, it's hard to follow :-( The memory block layout is complex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation was wrong before. q is not a base address of malloc'ed d block before calling realloc().

uint8_t *head; /* base address of malloc'ed d block */
uint8_t *data; /* pointer to data bytes */
uint8_t *r;
uint8_t *tail; /* data + nbytes == pointer to tail pad bytes */
size_t total; /* 2 * SST + nbytes + 2 * SST */
size_t original_nbytes;
int i;
size_t serialno;
#define ERASED_SIZE 64
uint8_t save[2*ERASED_SIZE]; /* A copy of erased bytes. */

_PyMem_DebugCheckAddress(api->api_id, p);

q = (uint8_t *)p;
original_nbytes = read_size_t(q - 2*SST);
data = (uint8_t *)p;
head = data - 2*SST;
original_nbytes = read_size_t(head);
if (nbytes > (size_t)PY_SSIZE_T_MAX - 4*SST) {
/* integer overflow: can't represent total as a Py_ssize_t */
return NULL;
}
total = nbytes + 4*SST;

tail = data + original_nbytes;
serialno = read_size_t(tail + SST);
/* Mark the header, the trailer, ERASED_SIZE bytes at the begin and
ERASED_SIZE bytes at the end as dead and save the copy of erased bytes.
*/
if (original_nbytes <= sizeof(save)) {
memcpy(save, data, original_nbytes);
memset(data - 2*SST, DEADBYTE, original_nbytes + 4*SST);
}
else {
memcpy(save, data, ERASED_SIZE);
memset(head, DEADBYTE, ERASED_SIZE + 2*SST);
memcpy(&save[ERASED_SIZE], tail - ERASED_SIZE, ERASED_SIZE);
memset(tail - ERASED_SIZE, DEADBYTE, ERASED_SIZE + 2*SST);
}

/* Resize and add decorations. */
q = (uint8_t *)api->alloc.realloc(api->alloc.ctx, q - 2*SST, total);
if (q == NULL) {
r = (uint8_t *)api->alloc.realloc(api->alloc.ctx, head, total);
if (r == NULL) {
write_size_t(head, original_nbytes);
head[SST] = (uint8_t)api->api_id;
memset(head + SST + 1, FORBIDDENBYTE, SST-1);

memset(tail, FORBIDDENBYTE, SST);
write_size_t(tail + SST, serialno);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. We misunderstood each other. I asked why you would do that again if realloc() failed. I missed the fact that you erase the header and trailer... I understood that you only erased the start and end of the data block.

Now I understand why you didn't want to duplicate so much code, and I agree.


/* Restore saved bytes. */
if (original_nbytes <= sizeof(save)) {
memcpy(data, save, original_nbytes);
}
else {
memcpy(tail - ERASED_SIZE, &save[ERASED_SIZE], ERASED_SIZE);
memcpy(data, save, ERASED_SIZE);
}
_PyMem_DebugCheckAddress(api->api_id, data);
return NULL;
}

bumpserialno();
write_size_t(q, nbytes);
assert(q[SST] == (uint8_t)api->api_id);
for (i = 1; i < SST; ++i) {
assert(q[SST + i] == FORBIDDENBYTE);
else {
head = r;
bumpserialno();
serialno = _PyRuntime.mem.serialno;
}
data = q + 2*SST;

write_size_t(head, nbytes);
head[SST] = (uint8_t)api->api_id;
memset(head + SST + 1, FORBIDDENBYTE, SST-1);
data = head + 2*SST;

tail = data + nbytes;
memset(tail, FORBIDDENBYTE, SST);
write_size_t(tail + SST, _PyRuntime.mem.serialno);
write_size_t(tail + SST, serialno);

/* Restore saved bytes. */
if (original_nbytes <= sizeof(save)) {
memcpy(data, save, Py_MIN(nbytes, original_nbytes));
}
else {
size_t i = original_nbytes - ERASED_SIZE;
if (nbytes > i) {
memcpy(data + i, &save[ERASED_SIZE],
Py_MIN(nbytes - i, ERASED_SIZE));
}
memcpy(data, save, Py_MIN(nbytes, ERASED_SIZE));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind to move this memcpy() before the if, to keep (start, end) order?

}
_PyMem_DebugCheckAddress(api->api_id, data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I'm not sure that it's useful to check the header and trailer since we just wrote them, no? Calling _PyMem_DebugCheckAddress() here can impact performance, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was useful for writing this code. But since it has been written this check is not needed.


if (r == NULL) {
return NULL;
}

if (nbytes > original_nbytes) {
/* growing: mark new extra memory clean */
memset(data + original_nbytes, CLEANBYTE,
nbytes - original_nbytes);
memset(data + original_nbytes, CLEANBYTE, nbytes - original_nbytes);
}

return data;
Expand Down
0