-
Notifications
You must be signed in to change notification settings - Fork 19
Adding Sanitizer interface and implementation #97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
@alexeykudinkin Hi, can you take a look? I have addressed the comments from the previous PR. LMK if you have any others. |
|
looks like unit tests are failing. Working on fixing |
|
@alexeykudinkin Hi, can you take a look? I have fixed all the tests. Thanks in advance. |
|
@alexeykudinkin @andrewmains12 Hello, sorry to bother you guys again. I would really appreciate if you can help review it. |
|
@longquanzheng we would also need to write a benchmark for this one. |
core/src/main/java/com/uber/m3/tally/sanitizers/SanitizerImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/com/uber/m3/tally/sanitizers/SanitizerImpl.java
Outdated
Show resolved
Hide resolved
| * StringSanitizer is to sanitize strings | ||
| * It has a Sanitize method which returns a sanitized version of the input string value. | ||
| */ | ||
| public interface StringSanitizer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need standalone interface for this, Function interface should be reflective
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah agreed. It's too heavy for a function to have interface.
You meant functional interface/lambda, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to Function<String,String>
core/src/main/java/com/uber/m3/tally/sanitizers/SanitizeRange.java
Outdated
Show resolved
Hide resolved
|
|
||
| private SanitizeRange(char low, char high) { | ||
| this.low = low; | ||
| this.high = high; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add precondition asserting that high >= low
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see an example of "Precondition"(maybe I didn't find in a correct way) so I throw a Runtime exception here. LMK if you want to assert in a different way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, precondition reference was mostly notional (please do not import Guava for that)
|
@alexeykudinkin thanks so much for your reviews. I have addressed your comments except bench tests. Could you give me more about how to write/add it? |
| protected String separator = DEFAULT_SEPARATOR; | ||
| protected ImmutableMap<String, String> tags; | ||
| protected Buckets defaultBuckets = DEFAULT_SCOPE_BUCKETS; | ||
| protected ScopeSanitizer sanitizer = new ScopeSanitizerBuilder().build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this opt-in -- by default sanitizer should be no-op
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean having it as null by default in the builder?
This would be a little bit tricky. Then the ScopeImpl will have to do this null check everywhere, is that okay? (It would be easier if this is Kotlin and we can use ? operator, sigh)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NVM, looks like I will apply another comment from you and Andrey in #97 (comment)
|
@longquanzheng LGTM, minor comment. Regarding benchmarks, you can take a look at the benchmarks package for some examples; |
| } | ||
|
|
||
| ImmutableMap.Builder<String, String> builder = new ImmutableMap.Builder<>(); | ||
| if (tags != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this check is redundant. If tags is null it will fail on tags.entrySet()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather re-phrase this: @longquanzheng please move this check at the beginning to make sure there's no NPE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah good point. Thanks
|
|
||
| private CharRange(char low, char high) { | ||
| if (low > high) { | ||
| throw new RuntimeException("invalid CharRange"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/RuntimException/IllegalArgumentException?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
| private final Function<String, String> keySanitizer; | ||
| private final Function<String, String> valueSanitizer; | ||
|
|
||
| SanitizerImpl(Function<String, String> nameSanitizer, Function<String, String> keySanitizer, Function<String, String> valueSanitizer) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd make arguments name more explicit: tagKeySanitizer, tagValueSanitizer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah good catch -- I should have done it, hehe
| */ | ||
| public class ScopeSanitizerBuilder { | ||
|
|
||
| private Function<String, String> nameSanitizer = value -> value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consider Function.identity()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! TIL
|
|
||
| /** | ||
| * The SanitizerBuilder returns a Sanitizer for the name, key and value. By | ||
| 3411 | * default, the name, key and value sanitize functions returns all the input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe create a NoopSanitizer implements ScopeSanitizer and use it as a default one instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I like this idea better.
Otherwise having null in ScopeImpl will require lots of null checking in the code.
| @@ -0,0 +1,89 @@ | |||
| // Copyright (c) 2020 Uber Technologies, Inc. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls update copyright to 2021
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| import static org.junit.Assert.assertEquals; | ||
| import static org.junit.Assert.assertNotNull; | ||
|
|
||
| public class SanitizeRangeTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/SanitizeRangeTest/CharRangeTest
| private static final char HIGH = 'z'; | ||
|
|
||
| @Test | ||
| public void sanitizeRange() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls add a test when HIGH is smaller than LOW and maybe when HIGH == LOW
|
|
||
| import static org.junit.Assert.assertEquals; | ||
|
|
||
| public class SanitizerTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/SanitizerTest/SanitizerImplTest or ScopeSanitizerTest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to ScopeSanitizerTest
| .withKeyCharacters( | ||
| ValidCharacters.of( | ||
| ValidCharacters.ALPHANUMERIC_RANGE, | ||
| ValidCharacters.UNDERSCORE_DASH_CHARACTERS)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iirc dot is also a valid character but not recommended
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to UNDERSCORE_DASH_CHARACTERS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're modifying ScopeImpl default behaviour (even though it's a noop by default) would be nice to see how does it affect the benchmarks.
You can run those benchmark tests using the following command:
./gradlew tally-core:runJmhTests
Could you run it without and with your changes and add both results to the PR?
It would be nice to have separate benchmarks for sanitizer implementation though, but I don't think it has to be a part of this PR. @alexeykudinkin wdyt?
you can find more info/examples on JMH benchmarks here
|
Thanks both for the review. I will address the comments this week. |
|
@alexeykudinkin @SokolAndrey thank you so much again for the reviewing. I have addressed all the comments. Below is the bench results on my laptop: Before the PR(using master) After the PR: My second run on current PR: |
|
@longquanzheng i assume these runs are w/ no-op sanitizer? Can you also paste Benchmarks w/ the sanitizer actually used? |
Yes. Thanks for pointing it out. |
Just ran with the commit of sanitizer: |
|
@longquanzheng i don't think we have any Benchmarks before targeting specifically code paths you're changing. Let's make sure that Benchmarks are stressing code paths that are changing. |
|||
I got what you mean. Just added benchmark code. Can you take another look? |
|
@alexeykudinkin here is the results:
|
| @@ -0,0 +1,4 @@ | |||
| Benchmark Mode Cnt Score Error Units | |||
| ScopeImplBenchmark.recordingBenchmark thrpt 10 74.819 ± 4.573 ops/ms | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check how other benchmarks are reported (we need to capture all data-points offered by JMH)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't really understand how exactly to do it.
Would you mind taking it over? I am not convinced that this should be done by this PR, as the benchmark is already missing. The most problem is I don't have a clear picture of what you really want for benchmarking(with out detailed documentation), and it should be much easier if your team just do it.
LMK if you agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be done by this PR. This is a critical library laying in the hot-path of execution of many services, hence we need to maintain the focus on its performance.
Over the last year a lot of efforts have been put in to optimize, streamline, and make this library robust to serve the needs it was originally built for. Therefore, as a new contribution guideline we require any non-trivial change to adhere to the same basic principles of assuring library's correctness and performance.
Totally appreciate the amount of incremental effort that is required to adhere to this heightened standards from every contribution, but unfortunately there's no other way to guarantee high-level of reliability and performance of the open-source library otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check how other benchmarks are reported
Can you give more details here? I did check but don't know how that works(though I see other reports have the details like gc time).
For example another benchmark is just
public class M3ReporterBenchmark extends AbstractReporterBenchmark. So I don't know what am I missing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SokolAndrey can you help in here?
| .reportEvery(Duration.MAX_VALUE); | ||
| } | ||
|
|
||
| public void recordTestMetrics(final ScopeImpl scope) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is a good routine to check -- this was used as pre-init seq, and i would suggest to keep it as such.
Instead create separate benchmarks and routines to update counter/gauge/histogram separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as my pervious response. I appreciate your time reviewing and commenting it.
But it would be nice if you can help the benchmarking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check my comment above. We're more than happy to help w/ guidance, but benchmarking is now a standard requirement from non-trivial contributions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will revert that one back.
Can you give me details about what you want to benchmark? -- which method/function/routines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As i have called out prior, let's test creation and recording fro each type of metric individually (counter/gauge/histogram).
// Counter
scope.counter("counter").inc(1)
// Timer
scope.timer("timer").record(...)
// Histogram
...
This is replacement of #70