8000 Fix #200 Add support for copying Uri structures by kocsismate · Pull Request #230 · uriparser/uriparser · GitHub
[go: up one dir, main page]

Skip to content

Fix #200 Add support for copying Uri structures #230

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 1, 2025

Conversation

kocsismate
Copy link
Contributor

I also need this functionality for https://wiki.php.net/rfc/url_parsing_api because of multiple reasons:

  • In order to maintain immutability of URIs, the structs have to be copied first before any modification
  • The PHP Uri class exposes both the normalized and non-normalized URI components (therefore it internally stores two Uri structs). When a normalized component is requested, normalization is done on demand, by copying the original struct first

The patch needs a last few touches, including tests, but I wanted to propose it nevertheless to get early feedback.

Copy link
Member
@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate I'm only starting to have a closer look. From what I see so far:

  • the pull request does multiple things in a single commit: e.g. move code around and add new code — that's too much for a single commit and the move itself does not seem strictly necessary to me as exposing needed functions through their related .h files would also make them available for use and then the move is fully optional (and something I would want to do as a potential refactoring myself later to keep this pull request to the point). I would ask to revert/drop all moves and expose through existing headers instead, and to add the new copy code to new files UriCopy.{c,h}, and to then get back to detail level review. If some move is more necessary than I understood, it should be necessary moves only for now and in a dedicated commit please. Thank you!
  • The way that function PreventLeakage is used looks mistaken: the revertMask input would need to grow with each allocation a la doneMask |= URI_NORMALIZE_SCHEME;git grep -F '|=' -- src/UriNormalize.c yields examples of that.

I'll pause review here for now, waiting for feedback.

@kocsismate
Copy link
Contributor Author

I'll pause review here for now, waiting for feedback.

Thanks for the input! I'll try to address your suggestion ASAP, possibly around Wednesday!

@hartwork
Copy link
Member

I'll pause review here for now, waiting for feedback.

Thanks for the input! I'll try to address your suggestion ASAP, possibly around Wednesday!

@kocsismate cool! 👍

@kocsismate kocsismate force-pushed the copy branch 2 times, most recently from 976a2b0 to e9264b2 Compare June 20, 2025 06:26
Copy link
Member
@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate flushing some comments from review and taking a break from review for new, more to come…

URI_CHAR * dup = memory->malloc(memory, lenInBytes);
if (dup == NULL) {
return URI_FALSE; /* Raises malloc error */
if (URI_FUNC(CopyRangeEngine)(range, range, memory) == URI_FALSE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this refactoring "extract function" a dedicated first separate commit. Files UriCommon{c,h} seem like a good target for this. I'll comment on the function name further up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll reorder my commits a little bit later, when the code is about to be finalized. :)

@kocsismate
Copy link
Contributor Author

I think it means that we need a new function preventLeakageAfterCopy that calls out to preventLeakage internally but also handles .ip4 and .ip6 (and .portText)?

Yes, I had a similar plan in mind too! :)

Copy link
Member
8000 @hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate I have yet to review the big loop. Flushing some comments…

Copy link
Member
@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate the loop needs more work. Please see details including a draft for an alternative approach with ideas to re-use or pick from below…

@hartwork
Copy link
Member

@kocsismate PS: I have gone forward merging #231 just now because…

  • it was no longer in draft state and I felt sure enough of it by now
  • it allows you to focus on the copy feature(but post-merge review is still welcome if you find time)
  • it allows rebasing Fix #200 Add support for copying Uri structures #230 here onto latest master with URI_NORMALIZE_PORT already in place
  • it adds no public API other than enum value URI_NORMALIZE_PORT so making adjustments post merge has few constraints

I hope that's good with you?

@kocsismate
Copy link
Contributor Author

PS: I have gone forward merging #231 just now because…

Yes, thank you! I'm happy with the merge. I don't think I'm able to review it meaningfully, but I'll have a bit deeper look at it soon.

8000

@kocsismate
Copy link
Contributor Author

I find this branch of the code overly complex even before being bullet proof

I agree, I needed quite some time to come up with it and understand it. So I'm happy with your suggestion. I'll try to implement it asap, but I'm on vacation, so my response time may be even more hectic than usuallly.

@hartwork
Copy link
Member
hartwork commented Jun 21, 2025

@kocsismate I'm looking forward to the next iteration. No need to hectify your vacation for me. I myself will also have some (on-)off-screen time in the coming 10 days. My worst expected response time will probably be 48 hours.

@kocsismate
Copy link
Contributor Author

As far as I see, all the existing review comments are now addressed. I'm now working on the tests.

Copy link
Member
@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate I'm on the road with small screen, just two quick findings for now, more later:

Copy link
Member
@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate made time for more review now, here's what I found:

@kocsismate kocsismate force-pushed the copy branch 3 times, most recently from d4196ea to eb17493 Compare June 29, 2025 21:02
@kocsismate
Copy link
Contributor Author

@hartwork I should have fixed all your comments by now. I'll wait until you have the chance to review my latest commit, and then I'll reorder my commits by creating a separate commit for the CopyRangeEngine() refactoring.

Copy link
Member
@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate thanks for the adjustments and extensions, here is what I found more:

Copy link
Member
@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate thanks for the adjustments! I'd like to slightly change the memcmp lines as following. Part of why it feels odd as-is is because sizeof(char) is defined to be 1 by C89. That made me consider it more:

Copy link
Member
@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kocsismate thank you! 🙏

@hartwork hartwork merged commit 7686549 into uriparser:master Jul 1, 2025
9 checks passed
@kocsismate kocsismate deleted the copy branch July 1, 2025 19:37
@hartwork hartwork linked an issue Jul 1, 2025 that may be closed by this pull request
hartwork added a commit that referenced this pull request Jul 3, 2025
…-copying

copying: Add missing initial reset to destUri (follow-up to #230)
hartwork added a commit that referenced this pull request Jul 3, 2025
Do not copy `.hostData.ipFuture` unnecessarily (follow-up to #230)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicating URIs?
3 participants
0