10000 Support filtering specific sqllogictests identified by line number by gabotechs · Pull Request #16029 · apache/datafusion · GitHub
[go: up one dir, main page]

Skip to content

Support filtering specific sqllogictests identified by line number #16029

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 23, 2025

Conversation

gabotechs
Copy link
Contributor
@gabotechs gabotechs commented May 12, 2025

Which issue does this PR close?

Rationale for this change

Being able to execute only certain sqllogictest cases identified by line number.

As an example, the following command

cargo test --test sqllogictests -- aggregate:6991

will run the test in

and any other setup code needed for the test to run (CREATE TABLE, INSERT, SELECT * INTO, SET, DROP, etc...)

What changes are included in this PR?

Extend the filtering argument in the sqllogictest binary to support filtering tests identified by line number and file name substring

Are these changes tested?

N/A

Are there any user-facing changes?

Just extensions of the sqllogictest execution CLI for local development

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label May 12, 2025
@gabotechs gabotechs force-pushed the allow-filtering-specific-sqllogictests branch from 7169d21 to 9dbba35 Compare May 12, 2025 14:57
@gabotechs gabotechs marked this pull request as ready for review May 19, 2025 12:43
@gabotechs gabotechs force-pushed the allow-filtering-specific-sqllogictests branch from ae5c3c4 to 51ab0eb Compare May 19, 2025 12:45
file_substring: parts[1].to_string(),
line_number: Some(line),
}),
Err(_) => Err(format!("Cannot parse line number from '{s}'")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use DataFusionError::External(...) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this one we should be fine with plain strings, as this is just used by clap for parsing whatever we provide in CLI arguments, so not really DataFusion related.

// statement might be populating tables that future test cases will use.
if !matches!(
sql_stmt.as_ref(),
SqlStatement::Query(_) | SqlStatement::Explain { .. }
Copy link
Contributor
@LiaCastaneda LiaCastaneda May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also consider set statements (SetVariable) per query like: set datafusion.sql_parser.dialect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "consider" you mean "make them skippable"? As SET statements dictate the behavior of future test cases, I don't think we can skip them

return false;
};

if matches!(statement, Statement::CreateExternalTable(_)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move this logic to another function like statement_is_skippable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice suggestion! I factored it out and it looks much clearer now

Copy link
Contributor
@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @gabotechs and @LiaCastaneda -- this looks very nice to me

I had one small comment which we can do as a follow on PR (or never).

I also wonder about testing the test harness, but I think that may be unecessary given how often these tools are run

@gabotechs
Copy link
Contributor Author

I also wonder about testing the test harness, but I think that may be unecessary given how often these tools are run

My guess is that it's not too bad to not have tests in the new Filter struct, but not just my call though. What I'd do is if somebody runs into an issue, add the tests at that moment.

@alamb
Copy link
Contributor
alamb commented May 23, 2025

I agree -- we can add unit tests if there is an issue or we want to make more sophisticated behavior

@alamb alamb merged commit 17fe504 into apache:main May 23, 2025
27 checks passed
@alamb
Copy link
Contributor
alamb commented May 23, 2025

Thanks again @gabotechs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow filtering specific sqllogictests
3 participants
0