8000 Introduce the security-experimental CodeQL suite and experimental tag by turbo · Pull Request #11702 · github/codeql · GitHub
[go: up one dir, main page]

Skip to content

Introduce the security-experimental CodeQL suite and experimental tag #11702

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jan 11, 2023

Conversation

turbo
Copy link
Member
@turbo turbo commented Dec 14, 2022

Note
👋 all that have been pinged by this. This requires a (hopefully) short look, but a look nonetheless from all language teams. If you have any questions or want to discuss this, please use our internal Slack channels 🙂.

This PR implements the experimental CodeQL suite, by introducing a new suite selector security-experimental and suite implementations for all supported (and beta) languages. For more details and caveats, please see our internal roadmap item.

Goal of the experimental suite

The security-experimental suite captures all queries captured by security-extended, but adds all existing queries that:

  • were present in each language's experimental folder
  • were tagged security

The experimental suite is not going to be advertised for use by code scanning users; not as part of the workflow, docs, neither the setup experience. It is meant as an internal or field testing shortcut. It does not change our current support level for experimental queries (which is "none"), and it does not impose any restrictions on contributions or experimental query writing, with the exception of adding a tag introduced by this PR.

Structure of this change

I've structured this PR in three distinct commits:

  1. 4ec401a introduces the tagging mechanism. Previously, experimental queries were just put into each language's experimental directory, each with a different structure. The standard suite selectors have excluded this directory by path, instead of metadata/tags. This commit adds a new experimental tag to all applicable queries (see Goal section above for criterion used) to allow for a change to select by tags instead
  2. b35a1d4 is a docs change and updates the contribution guidelines to require the experimental tag, as well as the experimental doc stub to indicate the existence of the experimental suite
  3. 5fd5ebc finally creates the security-experimental suite selector, and implementations for all supported languages. Plus Swift, which doesn't have an experimental folder, but gets this change anyway in anticipation.

It's worth expanding on how the logic in 3. works: We still exclude everything in all experimental directories, but then explicitly re-include all queries tagged both security and experimental. This ignores quality, debugging, and other queries that shouldn't be considered for this and allows some amount of freedom of use of the experimental folder without every single query immediately ending up in this suite.

Things to review in detail

While each single change is simple, the combination may cause complex consequences to arise, so please pay close attention to the following, in addition to the general review and comments 🙂

  • None of the changes in this PR should in any way affect the current code-scanning, security-extended, or security-and-quality suites, how users use them, and what results they produce. I'm somewhat confident that's the case, but I want to make sure.
  • Have I tagged something by mistake (as in it's not both an experimental and a security query)?
  • Have I forgotten to tag something?
  • Does this accidentally include a query for which a better/promoted version already exists? Then this should be excluded again
  • The query in question is known to cause issues (timeouts, crashes, blatantly wrong results). Note that these should be issues that massively degrade or block an analysis. It’s not required to apply the same standards that we have for our other suites (see Goal). For example, I've inherited the "slow queries" exclusion in the C++ query suite implementation, but may have missed that for other languages (or one needs to be created).
  • The query doesn’t produce any actual information or help files don’t exist and would be required, then this should be removed for now.

Timeline

I'll be offline until Dec 8000 17 to Jan 9, but please do add your review and questions in the meantime if you are around. I'm aiming to address most of them within January so this can ship in the same timeframe.

cc @NickLiffen

@turbo turbo added the enhancement New feature or request label Dec 14, 2022
@turbo turbo requested a review from adityasharad December 14, 2022 23:15
@turbo turbo self-assigned this Dec 14, 2022
@turbo turbo changed the title Introduce the security-extended CodeQL suite and experimental tag Introduce the security-experimental CodeQL suite and experimental tag Dec 14, 2022
@turbo turbo marked this pull request as ready for review December 15, 2022 12:48
@turbo turbo requested review from a team as code owners December 15, 2022 12:48
@geoffw0
Copy link
Contributor
geoffw0 commented Dec 15, 2022

The query in question is known to cause issues (timeouts, crashes, blatantly wrong results).

I'd be surprised if there aren't one or two CPP experimental queries with performance issues on larger projects, but these issues aren't exactly 'known', we don't have a list of them. Is it worth me putting aside an hour or two to identify them now? Or just deal with things as and when problems are reported?

@turbo
Copy link
Member Author
turbo commented Dec 15, 2022

one or two CPP experimental queries
an hour or two

That sounds reasonable, but if you find yourself spending more time on it, it's OK to just exclude them 🙂

@geoffw0
Copy link
Contributor
geoffw0 commented Dec 16, 2022

After a bit of experimentation: cpp/wrong-use-of-the-umask is the only experimental CPP query giving me consistent performance issues. I suggest we remove that from the experimental suite (and perhaps add a note somewhere that it can be problematic).

CPP and I guess Swift LGTM. I will be away for Christmas so please don't wait for my final 👍.

@erik-krogh
Copy link
Contributor

You haven't added a javascript-security-experimental.qls file in javascript/ql/src/codeql-suites/?

@turbo
Copy link
Member Author
turbo commented Dec 18, 2022

@erik-krogh fixed

@geoffw0 excluded

Copy link
Contributor
@erik-krogh erik-krogh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JS 👍

All the new queries perform OK.
And we will get way more experimental queries once I finish creating some heuristic variants of our ordinary queries.

Copy link
Contributor
@aibaars aibaars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for Ruby

@calumgrant calumgrant requested a review from tausbn January 3, 2023 10:29
Copy link
Contributor
@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 CPP and Swift.

@cgaudette
Copy link

Will Java be an included language for these query pack?

Copy link
Contributor
@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go and Java 👍

@turbo
Copy link
Member Author
turbo commented Jan 9, 2023

@cgaudette Please see the diff - all languages are included.

Thanks for all your stamps. Now waiting for a final review from Python, C#, core (@adityasharad) and a stamp from another PM.

Copy link
Contributor
@tausbn tausbn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yoff and I looked at the performance data for Python, and agree that it doesn't look like there are any huge performance issues lurking around.

@michaelnebel
Copy link
Contributor
michaelnebel commented Jan 11, 2023

C#: The changes look good, but I would like to run the new suite against our existing DCA projects. I will circle back on this.

Also a question: Are there any plans on changing the existing selectors such that they are not explicitly excluding the experimental queries by folder but instead using tags?
With the current structure of the experimental queries folder there might be other queries that are not security/experimental queries (eg. debugging like queries - which are truly internal), so it will require some restructuring. Even with the unsupported elevation of security/experimental queries we could consider to separate these out from the other type of queries that we might have in the experimental folder.

@michaelnebel michaelnebel self-requested a review January 11, 2023 09:04
Copy link
Contributor
@michaelnebel michaelnebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C# LGTM

@michaelnebel
Copy link
Contributor

The DCA experiment to run the new query suite did terminate in time.
@turbo : Do you want me to dig a bit deeper and compare the execution time of the experiment combined with the security extended and pinpoint the slowest queries?

@turbo
Copy link
Member Author
turbo commented Jan 11, 2023

@michaelnebel Thanks for looking into this, I think since the evaluation succeeded, we can move ahead with this and address any lower-level performance issues as they arise later.

On:

Also a question: Are there any plans on changing the existing selectors such that they are not explicitly excluding the experimental queries by folder but instead using tags?

We can definitely do that in a separate batch of work. If I understand your suggestion correctly and we were to implement this right now, that would actually pull in the non-security experimental queries into the existing suite selectors. To prevent that a thorough cleanup of the directory structure and tags would be in order, which is definitely worth doing, but probably not required now. Or did I misunderstand and you actually see this current PR impacting the existing selectors (something I wanted to avoid)?

@michaelnebel
Copy link
Contributor
michaelnebel commented Jan 11, 2023

@michaelnebel Thanks for looking into this, I think since the evaluation succeeded, we can move ahead with this and address any lower-level performance issues as they arise later.

On:

Also a question: Are there any plans on changing the existing selectors such that they are not explicitly excluding the experimental queries by folder but instead using tags?

We can definitely do that in a separate batch of work. If I understand your suggestion correctly and we were to implement this right now, that would actually pull in the non-security experimental queries into the existing suite selectors. To prevent that a thorough cleanup of the directory structure and tags would be in order, which is definitely worth doing, but probably not required now. Or did I misunderstand and you actually see this current PR impacting the existing selectors (something I wanted to avoid)?

I completely agree that it shouldn't be a part of this PR.
With the in 9E88 troduction of the experimental tag and the introduction of these queries in a selector based on the tag, the purpose of the experimental folder has become dual: It both contains experimental security queries that we want to use sometimes and experimental queries that are only for internal use.
That is, we could imagine something like:
Move all experimental queries that are not be exposed into an internal folder. Then we need to change all selectors accordingly.

@turbo turbo merged commit 4e1f772 into main Jan 11, 2023
@turbo turbo deleted the turbo/experimental/combined branch January 11, 2023 19:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants
0