8000 Ensure that SseEmitter outputs UTF-8 encoded text by YMNNs · Pull Request #33529 · spring-projects/spring-framework · GitHub
[go: up one dir, main page]

Skip to content

< 8000 bdi class="js-issue-title markdown-title">Ensure that SseEmitter outputs UTF-8 encoded text #33529

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

YMNNs
Copy link
@YMNNs YMNNs commented Sep 12, 2024

The character encoding is not specified in the extendResponse method of SseEmitter, so the character encoding actually received by the browser is ISO_8859_1.

This PR modifies the response header in the extendResponse method to encode it in UTF-8.

Should we add a constructor that accepts other character encodings?

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Sep 12, 2024
@bclozel
Copy link
Member
bclozel commented Sep 13, 2024

I don't believe this is necessary, ISO_8859_1 is the expected encoding for servlet applications.

@bclozel bclozel closed this Sep 13, 2024
@bclozel bclozel added status: declined A suggestion or change that we don't feel we should currently apply and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Sep 13, 2024
@YMNNs
Copy link
Author
YMNNs commented Sep 13, 2024

I don't believe this is necessary, ISO_8859_1 is the expected encoding for servlet applications.

We should consider that this is an international project, and users may send information in other languages ​​through SSE, which will cause the browser to receive garbled characters.

Another reason is that line 47 (private static final MediaType TEXT_PLAIN = new MediaType("text", "plain", StandardCharsets.UTF_8);) has specified the encoding as UTF-8, and using two encodings in the same class will destroy consistency. I believe this situation is caused by omissions in coding.

@YMNNs
Copy link
Author
YMNNs commented Sep 13, 2024

@bclozel What do you think then?

@bclozel
Copy link
Member
bclozel commented Sep 13, 2024

The TEXT_PLAIN media type you're pointing is not sent with the response, but merely used to let the message converters know how they should convert the data using UTF-8.

I think server sent events should always be decoded as UTF-8 content, so there is no need to add a charset parameter to the media type. If the data is garbled, it means that you didn't serialize data properly in the first place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: declined A suggestion or change that we don't feel we should currently apply
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0