E533 SOLR-16369: avoid xpath in parsing elevate.xml by noblepaul · Pull Request #994 · apache/solr · GitHub
[go: up one dir, main page]

Skip to content

SOLR-16369: avoid xpath in parsing elevate.xml#994

Closed
noblepaul wants to merge 2 commits intoapache:mainfrom
noblepaul:elevation_no_xpath
Closed

SOLR-16369: avoid xpath in parsing elevate.xml#994
noblepaul wants to merge 2 commits intoapache:mainfrom
noblepaul:elevation_no_xpath

Conversation

@noblepaul
Copy link
Contributor

No description provided.

@hxe-m
Copy link
hxe-m commented Sep 2, 2022

@noblepaul - this seems to be already fixed by this PR.

@dsmiley
Copy link
Contributor
dsmiley commented Sep 2, 2022

I'm glad we agree on XPath being not particularly useful for elevate.xml parsing (because it's so simple). @heythm and I did multiple rounds of code review in #962 ; I would prefer to use that PR. It also clarified some code readability and made it not use XmlConfigFile either. @noblepaul could you please review #962, especially just the QueryElevationComponent file.

#962 has evolved to one PR doing two distinct things that perhaps should be separated: QueryElevationComponent changes, and XmlConfigFile changes (to use SafeXMLParsing, involving another overload of parseConfigXML). We could split it. WDYT?

@noblepaul
Copy link
Contributor Author
noblepaul commented Sep 4, 2022

@dsmiley

I think #962 is about safely loading XML. Here, the intent is to standardise XML consumption to a common API. Already solrconfig and schema are not using XPATH and DOM . However, I would prefer to implement another ConfigNode impl that parses and loads data using SAX. SAX is way safer , faster & cheaper. Soon, we must get rid of the XmlConfigFile class altogether. It actually is not serving any purpose

@dsmiley
Copy link
Contributor
dsmiley commented Sep 4, 2022

ConfigNode (which I've never heard of until today) is just an interface; I don't see SAX in there. I do see DOMConfigNode which this PR is using, so DOM is still used (not SAX); right? BTW I have no complains of DOM really; I'm just trying to reconcile what you said vs what your PR here does.

Your PR leaves in place the use of XmlConfigFile for parsing elevate.xml. I like that #962 does away with that and uses SafeXMLParsing to produce the DOM Document. It'd be nice if XmlConfigFile could ultimately just go away. Maybe DOMConfigNode could/should be incorporated into this approach as well?

@dsmiley
Copy link
Contributor
dsmiley commented Sep 4, 2022

Nevermind RE SAX; I didn't read your message carefully enough ;-)

Any way, I like use of SafeXMLParsing in particular and the dis-use of XmlConfigFile so I think the approach in #962 is much further there.

@noblepaul
Copy link
Contributor Author

Both the tickets are complimentary. #962 is about parsing xml and getting rid of XmlConfigFIle . I fully endorse it. This PR is just about getting rid of XPath and use a common API for consuming the content.

@dsmiley
Copy link
Contributor
dsmiley commented Sep 5, 2022

I'm glad you endorse #962 but there is a clear conflict between both PRs on QueryElevationComponent. Maybe #962 can be reduced in scope to only XmlConfigFile changes, and maybe here you can avoid XmlConfigFile altogether?

@hxe-m
Copy link
hxe-m commented Sep 5, 2022

The main idea behind the other work is to update the parsing of elevate.xml. This induced cleaning up some things: removing XmlConfigFile and Xpath from QEC.

Suggestion: We can merge the other PR where we can consider that the xpath work done there is a prework to this one. There won't be many conflicts to resolve in this one anyway. WDYT?

@noblepaul
Copy link
Contributor Author

The main idea behind the other work is to update the parsing of elevate.xml. This induced cleaning up some things: removing XmlConfigFile and XPath from QEC.

Maybe #962 can be reduced in scope to only XmlConfigFile changes,

I like the fact that #962 gets rid of XmlConfigFile . It should not be limited to just elevate.xml , we need to expand its scope to solr.xml, schema.xml & solrconfig.xml. elevate.xml is a very simple XML file and it doesn't really matter how we actually parse it. I just wanted all configuration XMLs to use the standard pattern.

We can merge the other PR where we can consider that the xpath work done there is a prework to this one.

We can do in any order. It's OK. BTW can you please take a look at how the work done in #962 can be applied to the other XML files as well

@madrob
Copy link
Contributor
madrob commented Sep 6, 2022
  1. it's strange to be talking about a different PR on this PR instead of that one
  2. can we keep scope small instead of scope creeping on the PRs

@dsmiley
Copy link
Contributor
dsmiley commented Sep 6, 2022

Let's cancel this PR but keep the JIRA issue. @heythm I propose that you create a new PR for SOLR-16369 "Avoid XPath in parsing elevate.xml" with code from #962. It also avoids XmlConfigFile which is nice. I don't see the point of using the new "ConfigNode" in QEC.

@noblepaul
Copy link
Contributor Author

I don't see the point of using the new "ConfigNode" in QEC.

ConfigNode API is not new . It has been there for a while and it's already used in other places

Today we do it using XPath. So we have to move off of XPath (that is the primary motivation)

@dsmiley
Copy link
Contributor
dsmiley commented Sep 6, 2022

I don't mean to say ConfigNode has no purpose, just that in parsing elevation.xml in particular, there is no point. ConfigNode seems best suited for primary configuration files -- solr.xml, solrconfig.xml, schema.xml as there are substitutable properties and sometimes "overlay" which ConfigNode has some features. For simple data config files (like elevate.xml), plain XML processing is fine without unique Solr abstractions.

@noblepaul
Copy link
Contributor Author

ConfigNode seems best suited for primary configuration files

agree.

For simple data config files (like elevate.xml), plain XML processing is fine without unique Solr abstractions.

I agree with this as well. However, this is a slippery slope. We may end up with another feature which has an XML configuration and we will start debating whether the xml file is complex enough to warrant the use of ConfigNode API.

Anyhow, I'm +1 for anything as long as I don't see XPath

@noblepaul
Copy link
Contributor Author

refer #1002

@noblepaul noblepaul deleted the elevation_no_xpath branch February 14, 2023 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

0