Introduce the security-experimental CodeQL suite and experimental tag #11702

turbo · 2022-12-14T23:15:58Z

Note
👋 all that have been pinged by this. This requires a (hopefully) short look, but a look nonetheless from all language teams. If you have any questions or want to discuss this, please use our internal Slack channels 🙂.

This PR implements the experimental CodeQL suite, by introducing a new suite selector security-experimental and suite implementations for all supported (and beta) languages. For more details and caveats, please see our internal roadmap item.

Goal of the experimental suite

The security-experimental suite captures all queries captured by security-extended, but adds all existing queries that:

were present in each language's experimental folder
were tagged security

The experimental suite is not going to be advertised for use by code scanning users; not as part of the workflow, docs, neither the setup experience. It is meant as an internal or field testing shortcut. It does not change our current support level for experimental queries (which is "none"), and it does not impose any restrictions on contributions or experimental query writing, with the exception of adding a tag introduced by this PR.

Structure of this change

I've structured this PR in three distinct commits:

4ec401a introduces the tagging mechanism. Previously, experimental queries were just put into each language's experimental directory, each with a different structure. The standard suite selectors have excluded this directory by path, instead of metadata/tags. This commit adds a new experimental tag to all applicable queries (see Goal section above for criterion used) to allow for a change to select by tags instead
b35a1d4 is a docs change and updates the contribution guidelines to require the experimental tag, as well as the experimental doc stub to indicate the existence of the experimental suite
5fd5ebc finally creates the security-experimental suite selector, and implementations for all supported languages. Plus Swift, which doesn't have an experimental folder, but gets this change anyway in anticipation.

It's worth expanding on how the logic in 3. works: We still exclude everything in all experimental directories, but then explicitly re-include all queries tagged both security and experimental. This ignores quality, debugging, and other queries that shouldn't be considered for this and allows some amount of freedom of use of the experimental folder without every single query immediately ending up in this suite.

Things to review in detail

While each single change is simple, the combination may cause complex consequences to arise, so please pay close attention to the following, in addition to the general review and comments 🙂

None of the changes in this PR should in any way affect the current code-scanning, security-extended, or security-and-quality suites, how users use them, and what results they produce. I'm somewhat confident that's the case, but I want to make sure.
Have I tagged something by mistake (as in it's not both an experimental and a security query)?
Have I forgotten to tag something?
Does this accidentally include a query for which a better/promoted version already exists? Then this should be excluded again
The query in question is known to cause issues (timeouts, crashes, blatantly wrong results). Note that these should be issues that massively degrade or block an analysis. It’s not required to apply the same standards that we have for our other suites (see Goal). For example, I've inherited the "slow queries" exclusion in the C++ query suite implementation, but may have missed that for other languages (or one needs to be created).
The query doesn’t produce any actual information or help files don’t exist and would be required, then this should be removed for now.

Timeline

I'll be offline until Dec 8000 17 to Jan 9, but please do add your review and questions in the meantime if you are around. I'm aiming to address most of them within January so this can ship in the same timeframe.

cc @NickLiffen

…ries with an experimental tag

…ew tagging system

…ementations

geoffw0 · 2022-12-15T15:12:08Z

The query in question is known to cause issues (timeouts, crashes, blatantly wrong results).

I'd be surprised if there aren't one or two CPP experimental queries with performance issues on larger projects, but these issues aren't exactly 'known', we don't have a list of them. Is it worth me putting aside an hour or two to identify them now? Or just deal with things as and when problems are reported?

turbo · 2022-12-15T15:16:15Z

one or two CPP experimental queries
an hour or two

That sounds reasonable, but if you find yourself spending more time on it, it's OK to just exclude them 🙂

geoffw0 · 2022-12-16T18:54:51Z

After a bit of experimentation: cpp/wrong-use-of-the-umask is the only experimental CPP query giving me consistent performance issues. I suggest we remove that from the experimental suite (and perhaps add a note somewhere that it can be problematic).

CPP and I guess Swift LGTM. I will be away for Christmas so please don't wait for my final 👍.

erik-krogh · 2022-12-18T09:24:23Z

You haven't added a javascript-security-experimental.qls file in javascript/ql/src/codeql-suites/?

…ementations

…o turbo/experimental/combined

turbo · 2022-12-18T14:55:53Z

@erik-krogh fixed

@geoffw0 excluded

erik-krogh

JS 👍

All the new queries perform OK.
And we will get way more experimental queries once I finish creating some heuristic variants of our ordinary queries.

aibaars

👍 for Ruby

geoffw0

👍 CPP and Swift.

cgaudette · 2023-01-04T21:23:15Z

Will Java be an included language for these query pack?

smowton

Go and Java 👍

turbo · 2023-01-09T12:13:47Z

@cgaudette Please see the diff - all languages are included.

Thanks for all your stamps. Now waiting for a final review from Python, C#, core (@adityasharad) and a stamp from another PM.

tausbn

@yoff and I looked at the performance data for Python, and agree that it doesn't look like there are any huge performance issues lurking around.

michaelnebel · 2023-01-11T08:30:04Z

C#: The changes look good, but I would like to run the new suite against our existing DCA projects. I will circle back on this.

Also a question: Are there any plans on changing the existing selectors such that they are not explicitly excluding the experimental queries by folder but instead using tags?
With the current structure of the experimental queries folder there might be other queries that are not security/experimental queries (eg. debugging like queries - which are truly internal), so it will require some restructuring. Even with the unsupported elevation of security/experimental queries we could consider to separate these out from the other type of queries that we might have in the experimental folder.

michaelnebel

C# LGTM

michaelnebel · 2023-01-11T13:02:33Z

The DCA experiment to run the new query suite did terminate in time.
@turbo : Do you want me to dig a bit deeper and compare the execution time of the experiment combined with the security extended and pinpoint the slowest queries?

turbo · 2023-01-11T13:23:18Z

@michaelnebel Thanks for looking into this, I think since the evaluation succeeded, we can move ahead with this and address any lower-level performance issues as they arise later.

On:

Also a question: Are there any plans on changing the existing selectors such that they are not explicitly excluding the experimental queries by folder but instead using tags?

We can definitely do that in a separate batch of work. If I understand your suggestion correctly and we were to implement this right now, that would actually pull in the non-security experimental queries into the existing suite selectors. To prevent that a thorough cleanup of the directory structure and tags would be in order, which is definitely worth doing, but probably not required now. Or did I misunderstand and you actually see this current PR impacting the existing selectors (something I wanted to avoid)?

michaelnebel · 2023-01-11T14:25:37Z

@michaelnebel Thanks for looking into this, I think since the evaluation succeeded, we can move ahead with this and address any lower-level performance issues as they arise later.

On:

Also a question: Are there any plans on changing the existing selectors such that they are not explicitly excluding the experimental queries by folder but instead using tags?

We can definitely do that in a separate batch of work. If I understand your suggestion correctly and we were to implement this right now, that would actually pull in the non-security experimental queries into the existing suite selectors. To prevent that a thorough cleanup of the directory structure and tags would be in order, which is definitely worth doing, but probably not required now. Or did I misunderstand and you actually see this current PR impacting the existing selectors (something I wanted to avoid)?

I completely agree that it shouldn't be a part of this PR.
With the in 9E88 troduction of the experimental tag and the introduction of these queries in a selector based on the tag, the purpose of the experimental folder has become dual: It both contains experimental security queries that we want to use sometimes and experimental queries that are only for internal use.
That is, we could imagine something like:
Move all experimental queries that are not be exposed into an internal folder. Then we need to change all selectors accordingly.

turbo added 3 commits December 14, 2022 17:15

Tag all security queries in supported languages' experimental directo…

4ec401a

…ries with an experimental tag

Adjust docs referring to experimental queries to include details on n…

b35a1d4

…ew tagging system

Create security-experimental suite helper and all language suite impl…

5fd5ebc

…ementations

turbo added the enhancement New feature or request label Dec 14, 2022

turbo requested a review from adityasharad December 14, 2022 23:15

turbo self-assigned this Dec 14, 2022

github-actions bot added C# C++ documentation Go Java JS Python Ruby Swift labels Dec 14, 2022

turbo changed the title ~~Introduce the security-extended CodeQL suite and experimental tag~~ Introduce the security-experimental CodeQL suite and experimental tag Dec 14, 2022

turbo marked this pull request as ready for review December 15, 2022 12:48

turbo requested review from a team as code owners December 15, 2022 12:48

turbo added 3 commits December 18, 2022 15:44

Create security-experimental suite helper and all language suite impl…

1e5426f

…ementations

Merge remote-tracking branch 'origin/turbo/experimental/combined' int…

b7c3373

…o turbo/experimental/combined

Exclude cpp/wrong-use-of-the-umask

d1d4163

erik-krogh approved these changes Dec 19, 2022

View reviewed changes

calumgrant requested review from aibaars and yoff December 19, 2022 09:42

erik-krogh mentioned this pull request Dec 19, 2022

JS: Add experimental variants of common security queries with more sources #11582

Merged

aibaars approved these changes Dec 19, 2022

View reviewed changes

calumgrant requested a review from tausbn January 3, 2023 10:29

geoffw0 approved these changes Jan 3, 2023

View reviewed changes

smowton approved these changes Jan 5, 2023

View reviewed changes

tausbn approved these changes Jan 9, 2023

View reviewed changes

adityasharad approved these changes Jan 9, 2023

View reviewed changes

sj approved these changes Jan 10, 2023

View reviewed changes

michaelnebel self-requested a review January 11, 2023 09:04

michaelnebel approved these changes Jan 11, 2023

View reviewed changes

Merge branch 'main' into turbo/experimental/combined

c3116b3

turbo merged commit 4e1f772 into main Jan 11, 2023

turbo deleted the turbo/experimental/combined branch January 11, 2023 19:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Introduce the security-experimental CodeQL suite and experimental tag #11702

Introduce the security-experimental CodeQL suite and experimental tag #11702

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Introduce the security-experimental CodeQL suite and experimental tag #11702

Introduce the security-experimental CodeQL suite and experimental tag #11702

Uh oh!

Conversation

Goal of the experimental suite

Structure of this change

Things to review in detail

Timeline

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!