Description
Q | A |
---|---|
Bug report? | no |
Feature request? | yes |
BC Break report? | unsure |
RFC? | yes |
Symfony version | all |
It would be good to improve XmlEncoder so that it does not decide on its own to wrap content in a CDATA section, and instead provides some means for the user to direct the encoder to wrap specified content in a CDATA section.
The following function currently determines whether or not to wrap:
    /**
     * Checks if a value contains any characters which would require CDATA wrapping.
     *
     * @param string $val
     *
     * @return bool
     */
    private function needsCdataWrapping($val)
    {
        return 0 < preg_match('/[<>&]/', $val);
    }
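To make the effect of this check concrete, here is a minimal sketch that runs a standalone copy of it against a few sample strings. The free function and the sample values are only for illustration; in XmlEncoder the check is a private method.

```php
<?php
// Standalone copy of the check quoted above, for illustration only.
function needsCdataWrapping(string $val): bool
{
    return 0 < preg_match('/[<>&]/', $val);
}

// Made-up sample values: ordinary prose and an embedded XML fragment.
$samples = [
    'plain text',
    'Tom & Jerry',
    'a < b',
    '<greeting>Hello, world!</greeting>',
];

foreach ($samples as $sample) {
    // Any occurrence of <, > or & triggers wrapping, whatever the content is.
    printf("%-36s => %s\n", $sample, needsCdataWrapping($sample) ? 'CDATA' : 'text');
}
```

Only the first sample is left as plain text. A single ampersand in ordinary prose is treated the same way as an embedded XML fragment.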
The problem with this function is that the presence of "<", ">" or "&" is a poor signal that the content should be wrapped in a CDATA section. The XML specification is reasonably clear about this:
> The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "`&amp;`" and "`&lt;`" respectively.
We should interpret this as meaning that the aforementioned characters, when they appear in the textual content of an element, must be escaped as entity or character references. We should not interpret it as meaning that such content should be wrapped in a CDATA section.
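As a rough illustration of that reading, the sketch below uses PHP's DOM extension to write the same string once as an escaped text node and once as a CDATA section, then parses both back. The helper names (`asText`, `asCdata`, `readBack`) and the `title` element are hypothetical and chosen only for this example.

```php
<?php
// Serialise $value as ordinary element text; the DOM escapes & and < itself.
function asText(string $value): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $element = $doc->createElement('title');
    $element->appendChild($doc->createTextNode($value));
    $doc->appendChild($element);

    return $doc->saveXML($element);
}

// Serialise $value inside a CDATA section instead.
function asCdata(string $value): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $element = $doc->createElement('title');
    $element->appendChild($doc->createCDATASection($value));
    $doc->appendChild($element);

    return $doc->saveXML($element);
}

// Parse a serialised element and return its text content.
function readBack(string $xml): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    return $doc->documentElement->textContent;
}

$value = 'Tom & Jerry <3';

echo asText($value), "\n";   // <title>Tom &amp; Jerry &lt;3</title>
echo asCdata($value), "\n";  // <title><![CDATA[Tom & Jerry <3]]></title>

// Both forms decode to the same string, so escaping alone is sufficient here.
var_dump(readBack(asText($value)) === readBack(asCdata($value)));  // bool(true)
```

Both serialisations are well-formed and carry the same text, so the presence of these characters does not by itself call for CDATA.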
The same document has a good example of when to use a CDATA section:

    <![CDATA[<greeting>Hello, world!</greeting>]]>
There should be a way to serialise content to a CDATA section for a use case like that example, but the encoder should not make this decision by itself in the way it currently does.
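One possible shape for such an opt-in is sketched below. It is not XmlEncoder code and not a proposal for the final API: the function `encodeFlatArray`, its `$cdataKeys` parameter and the `response` root name are all hypothetical, and it handles only a flat array of scalar values whose keys are valid element names.

```php
<?php
/**
 * Hypothetical helper, not part of Symfony: encodes a flat array to XML and
 * uses a CDATA section only for the keys the caller lists in $cdataKeys.
 * Every other value is written as a text node and escaped by the DOM.
 */
function encodeFlatArray(array $data, array $cdataKeys = [], string $rootName = 'response'): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $root = $doc->createElement($rootName);
    $doc->appendChild($root);

    foreach ($data as $key => $value) {
        $element = $doc->createElement($key);

        // The caller, not a character heuristic, decides what becomes CDATA.
        $node = in_array($key, $cdataKeys, true)
            ? $doc->createCDATASection((string) $value)
            : $doc->createTextNode((string) $value);

        $element->appendChild($node);
        $root->appendChild($element);
    }

    return $doc->saveXML($root);
}

echo encodeFlatArray(
    ['title' => 'Tom & Jerry', 'snippet' => '<greeting>Hello, world!</greeting>'],
    ['snippet']
), "\n";
// <response><title>Tom &amp; Jerry</title><snippet><![CDATA[<greeting>Hello, world!</greeting>]]></snippet></response>
```

With this shape, plain text containing "&" is simply escaped, and only the value the caller named is wrapped. A real implementation would also need to deal with content that contains `]]>`, which cannot appear inside a single CDATA section.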