Impossible to select non-DOMElements #16933
Comments
@EdgarPE Filtered nodes are added to a new crawler instance. That's the reason why the patch you referenced mentions "Prevent adding non-DOMElement elements in DomCrawler". We can't implement what you're asking for. We could try to throw an exception if someone tries to use a non-DOMElement in the wrong context (in a form, for example). I haven't looked at how complex this could be, and it might be too late for this change now. See #16058 for the reasons why this change was introduced.
So is it expected behavior that now I can do
Hopefully the pull request will be merged soon. If you need to update to 2.8 or 3.0 now, one workaround is to select the supported node (//div), get the DOMElement object, then use XPath on it to filter the DOMText object. It's not ideal, but it works.
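A minimal sketch of that workaround, using only PHP's built-in DOM extension so it runs without Symfony. The markup and variable names are made up for illustration. With the Crawler, the `DOMElement` would presumably come from something like `$crawler->filterXPath('//div')->getNode(0)` instead of the first query below; treat that call as an assumption and check it against your DomCrawler version.

```php
<?php
// Hypothetical sample markup, for illustration only.
$html = '<html><body><div>Hello <b>bold</b> world</div></body></html>';

$document = new DOMDocument();
$document->loadHTML($html);

$xpath = new DOMXPath($document);

// Step 1: select the supported node type (a DOMElement).
$div = $xpath->query('//div')->item(0);

// Step 2: run a second XPath query relative to that element
// to reach its direct DOMText children.
$texts = [];
foreach ($xpath->query('./text()', $div) as $textNode) {
    $texts[] = $textNode->nodeValue;
}
```

Here `$texts` holds only the text that sits directly inside the `div`; the content of the nested `b` element is skipped because `./text()` matches direct text children only.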
Thanks for your quick response, I will be glad if you also show me how to get the Also does that mean if your PR will be merged
@armpogart
…ion of every DOMNode object (EdgarPE)

This PR was squashed before being merged into the 2.8 branch (closes #17035).

Discussion
----------

[DomCrawler] Revert previous restriction, allow selection of every DOMNode object

| Q             | A
| ------------- | ---
| Bug fix?      | no
| New feature?  | yes, revert to previous behaviour
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #16933
| License       | MIT
| Doc PR        |

This is a backport of PR #17021

Commits
-------

d2872a3 [DomCrawler] Revert previous restriction, allow selection of every DOMNode object
The patch 9f362a1 introduced a serious limitation: one cannot select non-DOMElement nodes. For example, it is now impossible to select only the text content of a specific node, like this:
$node->filterXPath('//text()')
The selected node never gets returned because of this:
I suggest a solution where adding non-DOMElements is prevented but selecting one is still possible. Is the community open to that? If yes, I am willing to work on it.
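To make the distinction concrete, here is a small sketch with plain `DOMXPath` (not the Crawler) showing that `//text()` matches `DOMText` nodes, which are `DOMNode` objects but not `DOMElement` objects. The markup is a made-up example.

```php
<?php
// Hypothetical sample markup, for illustration only.
$document = new DOMDocument();
$document->loadHTML('<html><body><p>Some text</p></body></html>');

$xpath = new DOMXPath($document);

// '//text()' matches text nodes rather than elements.
$node = $xpath->query('//text()')->item(0);

$isText    = $node instanceof DOMText;    // the concrete class of a text node
$isElement = $node instanceof DOMElement; // the type the patch restricts to
$isNode    = $node instanceof DOMNode;    // common base class of both
```

Any filter that accepts only `DOMElement` instances will therefore drop such a node, even though the XPath expression matched it.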