E525 SOLR-16296: Load elevate.xml in a more secure way by hxe-m · Pull Request #962 · apache/solr · GitHub
[go: up one dir, main page]

Skip to content

SOLR-16296: Load elevate.xml in a more secure way#962

Merged
dsmiley merged 34 commits intoapache:mainfrom
hxe-m:hkhiri/elevate
Sep 7, 2022
Merged

SOLR-16296: Load elevate.xml in a more secure way#962
dsmiley merged 34 commits intoapache:mainfrom
hxe-m:hkhiri/elevate

Conversation

@hxe-m
Copy link
@hxe-m hxe-m commented Aug 4, 2022

https://issues.apache.org/jira/browse/SOLR-16296

This removes the use of XmlConfigFile by QueryElevationComponent.

Comment on lines +384 to +385
inputStream = core.getResourceLoader().openResource(configFileName);
xmlDocument = SafeXMLParsing.parseUntrustedXML(log, inputStream);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I completely overlooked something (my bad). I was initially concerned about using XmlConfigFile because it did not use Solr's SafeXMLParsing utility, and I could also see it enabling some XInclude & entity resolver stuff, which concerned me in its use for elevate.xml. But this doesn't mean it's not "safe". Upon further inspection, it appears to be doing some of the same stuff that SafeXMLParsing does in the parseConfigXML method, which is relatively safe as it doesn't escape the resource loader. Do you see @HaythemKh ? If you agree, do you think XmlConfigFile could be modified to call SafeXMLParsing.parseConfigXml so that the safe-ness is clear? If not, we could at least leave a comment there.

At this place in QEC, we could replace these two lines with SafeXMLParsing.parseConfigXml and we would in fact continue to support these somewhat exotic XML features in a "safe" way.

Copy link
Author
@hxe-m hxe-m Aug 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • About modifying XmlConfigFile with SafeXMLParsing#parseConfigXML; It is indeed possible to apply these changes. However, in XmlConfigFile, we authorize the non-availability of SystemId so as not to activate xinclude for this purpose. Unlike SafeXMLParsing#parseConfigXML where we enable it by default. And, XmlConfigFile is used for reading Solr config files (see SolrConfig#readXml). would that have some impact?
  • SafeXMLParsing#parseConfigXML enables xinclude by default, whereas SafeXMLParsing#parseUntrustedXML doesn't. It doesn't seem very "safe", is it? I think if we don't need to activate it we don't have to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uschindler -- you added the SafeXMLParsing library. My instinct is use use the parseConfigXML method for the elevate.xml because it conveniently takes a SolrResourceLoader. It supports x:include too but so-what; right? It's scoped to the SRL so I don't see how it could be a problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, i think that's the way to go. parseConfigXML is for stuff that is part of Solr's config. parseUntrustedXML is for stuff coming over the network like a POSTed document.
As all Solr config files supoort xinclude, it should also be supported for elevantion.xml

Copy link
Contributor
@uschindler uschindler Aug 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • About modifying XmlConfigFile with SafeXMLParsing#parseConfigXML; It is indeed possible to apply these changes. However, in XmlConfigFile, we authorize the non-availability of SystemId so as not to activate xinclude for this purpose. Unlike SafeXMLParsing#parseConfigXML where we enable it by default. And, XmlConfigFile is used for reading Solr config files (see SolrConfig#readXml). would that have some impact?

We have a valid system id because the ResourceLoader provides one. XInclude will then go through SystemIdResolver provided.

  • SafeXMLParsing#parseConfigXML enables xinclude by default, whereas SafeXMLParsing#parseUntrustedXML doesn't. It doesn't seem very "safe", is it? I think if we don't need to activate it we don't have to.

If it is a config file, xinclude should be allowed, as this is the standard for SOlr config files. There's also no security risk in this because the xinclude only allows to load files from resource loader and not somewhere outside (it uses SystemIdResolver to resolve includes, see

throw new IOException(
"Cannot resolve absolute systemIDs / external entities (only relative paths work): "
+ systemId);
).

Copy link
Contributor
@uschindler uschindler Aug 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course if the elevate.xml is coming over the network as part of the request, then it should be parsed with untrusted.

I am not familar with the code, but the rule for config files is:

  • if the config file is XML in ResourceLoader, it should use SafeXMLParsing#parseConfigXML with xinclude enabled for consistency
  • if the elevation file is passed as part of request body, then parse it using the request input stream using SafeXMLParsing#parseUntrustedXML

I'd tune XmlConfigFile to use those APIs. Back at that time it was too much work to me, so I left that one out. I just added the SystemIdResolver.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @uschindler for the detailed context!

uschindler
uschindler previously approved these changes Aug 27, 2022
802E
@uschindler uschindler dismissed their stale review August 29, 2022 17:45

updated code, needs new review.

is(Paths.get("/path/to/solr/home/configsets").toAbsolutePath()));

NodeConfig absConfig =
SolrXmlConfig.fromString(
Copy link
Author
@hxe-m hxe-m Aug 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need a XmlConfigFile constructor taking an inputstream as input

@hxe-m
Copy link
Author
hxe-m commented Aug 29, 2022

I have done some more local refactorings but I'm thinking about either keeping an XmlConfigFile constructor taking an inputstream as input or continuing the refactorings.

Copy link
Contributor
@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done some more local refactorings but I'm thinking about either keeping an XmlConfigFile constructor taking an inputstream as input or continuing the refactorings.

If the PR doesn't get out of hand too much (i.e. increases scope a lot). Your latest change definitely did but maybe it was WIP/draft.

Copy link
Contributor
@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for scaling back the changes to just a few files. Tests pass for me; I think this is ready to merge.

Refactorings to simplify the unwieldily constructor parameters of XmlConfigFile can happen in follow-on work. And it's inconsistent/weird that XmlConfigFile only sets the zkVersion field when it's passed a fileSupplier in the constructor.

@dsmiley
Copy link
Contributor
dsmiley commented Sep 2, 2022

Waiting for some input from @noblepaul to deconflict his recent PRs.

@dsmiley
Copy link
Contributor
dsmiley commented Sep 6, 2022

@noblepaul asked:

BTW can you please take a look at how the work done in #962 can be applied to the other XML files as well

Do you mean, to not use XmlConfigFile at all? It's on a case-by-case basis what work is involved there. I think your XPath removing work could have incorporated the dis-use of XmlConfigFile because the former logic used many methods of XmlConfigFile whereas afterwards it's only used to pass along the XML Document. At that point, there's little need of XmlConfigFile; it could be replaced with some utility method on, say, DOMConfigNode that takes some of the same parameters of XmlConfigFile's constructors. Any way, we could have a new JIRA issue for exactly that.

@hxe-m
Copy link
Author
hxe-m commented Sep 6, 2022

Update: I will limit the scope of this work to only removing the use of XmlConfigFile from QEC following up on this discussion. And once that is merged I will open a new PR for the removal of the use of XPath.

@epugh
Copy link
Contributor
epugh commented Sep 6, 2022

Thanks for your persistence on this @heythm....

@hxe-m
Copy link
Author
hxe-m commented Sep 6, 2022

@noblepaul - I want your opinion on merging this PR? This now only includes not using XmlConfigFile in QEC, but also a small cleanup of XmlConfigFile/SafeXMLParsing for handling untrusted/trusted files.

Even if we want to delete XmlConfigFile in the long run, it wouldn't hurt not to lose those changes.

@dsmiley dsmiley merged commit 66c784f into apache:main Sep 7, 2022
@hxe-m hxe-m deleted the hkhiri/elevate branch September 7, 2022 16:40
dsmiley added a commit that referenced this pull request Sep 7, 2022
…ponent (#962)

Really a refactoring; no change in behavior or actual safety.

Co-authored-by: David Smiley <dsmiley@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants

0