Comparing v1.15.15...v1.16.0 · klauspost/compress · GitHub

base repository: klauspost/compress
base: v1.15.15
head repository: klauspost/compress
compare: v1.16.0
  • 10 commits
  • 44 files changed
  • 2 contributors

Commits on Jan 21, 2023

  1. Update README.md

    klauspost authored Jan 21, 2023
    fe37dc6

Commits on Jan 24, 2023

  1. s2c/s2sx: Use concurrent decoding (#746)

    Use concurrent decoding when recompressing and verifying.
    
    s2sx: Use concurrent decoding with single files (non-tar)
    klauspost authored Jan 24, 2023
    69922df

Commits on Feb 5, 2023

  1. s2: Support ReadAt in ReadSeeker (#747)

    Also simplifies seeking.
    klauspost authored Feb 5, 2023
    c847bde
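To illustrate what `ReadAt` on top of a seekable reader means, here is a minimal, generic sketch using only the standard library. It assumes the new method follows the usual `io.ReaderAt` contract; the commit message does not show the signature, and this is not how the s2 package implements it. The names `readerAtAdapter` and `readRangeDemo` are hypothetical.

```go
package main

import (
	"io"
	"strings"
	"sync"
)

// readerAtAdapter layers io.ReaderAt on top of any io.ReadSeeker.
// Hypothetical helper for illustration; not part of klauspost/compress.
type readerAtAdapter struct {
	mu sync.Mutex // serializes access, since Seek+Read is not atomic
	rs io.ReadSeeker
}

// ReadAt reads len(p) bytes starting at absolute offset off.
func (a *readerAtAdapter) ReadAt(p []byte, off int64) (int, error) {
	a.mu.Lock()
	defer a.mu.Unlock()

	// Remember the current position so ReadAt does not disturb sequential reads.
	cur, err := a.rs.Seek(0, io.SeekCurrent)
	if err != nil {
		return 0, err
	}
	defer a.rs.Seek(cur, io.SeekStart)

	if _, err := a.rs.Seek(off, io.SeekStart); err != nil {
		return 0, err
	}
	n, err := io.ReadFull(a.rs, p)
	if err == io.ErrUnexpectedEOF {
		// io.ReaderAt reports a short read at end of input as io.EOF.
		err = io.EOF
	}
	return n, err
}

// readRangeDemo reads 4 bytes at offset 3 from a 10-byte in-memory source.
func readRangeDemo() (string, error) {
	a := &readerAtAdapter{rs: strings.NewReader("0123456789")}
	buf := make([]byte, 4)
	if _, err := a.ReadAt(buf, 3); err != nil {
		return "", err
	}
	return string(buf), nil
}
```

Calling `readRangeDemo()` returns the bytes at offsets 3 through 6 of the source. A native `ReadAt` can avoid the save-and-restore seeks that this adapter needs.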

Commits on Feb 13, 2023

  1. tests: Upgrade to Go 1.20 (#749)

    * tests: Upgrade to Go 1.20
    * Disable tests for 1.17
    * Upgrade garble in goreleaser as well.
    * Upgrade garble
    klauspost authored Feb 13, 2023
    0793ca1

Commits on Feb 17, 2023

  1. s2: Add LZ4 block converter (#748)

    This allows converting compressed LZ4 blocks to S2 (or snappy) blocks without decompression.
    
    LZ4 -> S2 seems to be the same size on average.
    LZ4 -> Snappy is usually worse.
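Converting without decompression is possible because an LZ4 block is already a list of sequences (a literal run plus a back-reference), and these can be re-emitted in another block format. The sketch below covers only the first step, walking the sequences of a raw LZ4 block as laid out in the LZ4 block format description. It is not the converter added in this commit, and `lz4Seq`, `readLen` and `parseLZ4Block` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// lz4Seq is one LZ4 sequence: a run of literals followed by an optional match.
type lz4Seq struct {
	Literals []byte
	Offset   int // 0 for the final, literal-only sequence
	MatchLen int // 0 for the final, literal-only sequence
}

var errCorrupt = errors.New("lz4: corrupt block")

// readLen extends a 4-bit length field. A value of 15 means more bytes
// follow; each is added to the total until a byte below 255 is seen.
func readLen(src []byte, pos, n int) (int, int, error) {
	if n != 15 {
		return n, pos, nil
	}
	for {
		if pos >= len(src) {
			return 0, 0, errCorrupt
		}
		b := int(src[pos])
		pos++
		n += b
		if b != 255 {
			return n, pos, nil
		}
	}
}

// parseLZ4Block splits a raw LZ4 block into its sequences without
// reconstructing the decompressed data.
func parseLZ4Block(src []byte) ([]lz4Seq, error) {
	var seqs []lz4Seq
	pos := 0
	for pos < len(src) {
		token := src[pos]
		pos++
		// High nibble: literal length.
		litLen, p, err := readLen(src, pos, int(token>>4))
		if err != nil {
			return nil, err
		}
		pos = p
		if litLen > len(src)-pos {
			return nil, errCorrupt
		}
		seq := lz4Seq{Literals: src[pos : pos+litLen]}
		pos += litLen
		if pos == len(src) {
			// The last sequence carries literals only.
			seqs = append(seqs, seq)
			break
		}
		if len(src)-pos < 2 {
			return nil, errCorrupt
		}
		// Two-byte little-endian match offset; 0 is invalid.
		seq.Offset = int(binary.LittleEndian.Uint16(src[pos:]))
		pos += 2
		if seq.Offset == 0 {
			return nil, errCorrupt
		}
		// Low nibble: match length minus the minimum match of 4.
		ml, p, err := readLen(src, pos, int(token&15))
		if err != nil {
			return nil, err
		}
		pos = p
		seq.MatchLen = ml + 4
		seqs = append(seqs, seq)
	}
	return seqs, nil
}
```

A converter would then write each `Literals` run and each `(Offset, MatchLen)` pair using the target format's literal and copy encodings. Differences in how the two formats encode lengths and offsets are one plausible reason the converted size can land on either side of the LZ4 size.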
    
    ## Single threaded performance
    
    Speed (excluding LZ4 encoding):
    
    ```
    BenchmarkLZ4Converter_ConvertBlock/html-32         	   28237	     42827 ns/op	2390.99 MB/s	       559.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/urls-32         	    2138	    541816 ns/op	1295.80 MB/s	     -3943 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/jpg-32          	  514826	      2328 ns/op	52874.24 MB/s	       482.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/jpg_200b-32     	34821668	        33.48 ns/op	5973.00 MB/s	         2.000 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/pdf-32          	  198241	      5975 ns/op	17136.81 MB/s	       136.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/html4-32        	    7002	    173440 ns/op	2361.63 MB/s	      1840 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1-32         	    5940	    196951 ns/op	 772.22 MB/s	       106.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt2-32         	    6656	    177228 ns/op	 706.32 MB/s	     -1427 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt3-32         	    2355	    510435 ns/op	 836.06 MB/s	       384.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt4-32         	    1700	    694444 ns/op	 693.88 MB/s	     -9125 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/pb-32           	   37118	     32141 ns/op	3689.60 MB/s	         1.000 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/gaviota-32      	    6961	    172253 ns/op	1070.05 MB/s	      9303 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1_128b-32    	19923691	        59.82 ns/op	2139.66 MB/s	         0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1_1000b-32   	 3180837	       375.2 ns/op	2665.40 MB/s	        16.00 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1_10000b-32  	  184214	      6350 ns/op	1574.70 MB/s	        90.00 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1_20000b-32  	   74031	     15521 ns/op	1288.54 MB/s	        -5.000 b_saved	       0 B/op	       0 allocs/op
    ```
    
    Assembly speed (amd64):
    ```
    BenchmarkLZ4Converter_ConvertBlock/html-32         	   47457	     24463 ns/op	4185.89 MB/s	       559.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/urls-32         	    3506	    330277 ns/op	2125.75 MB/s	     -3943 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/jpg-32          	  450177	      2718 ns/op	45294.89 MB/s	       482.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/jpg_200b-32     	76887589	        15.52 ns/op	12887.16 MB/s	         2.000 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/pdf-32          	  279540	      4322 ns/op	23694.21 MB/s	       136.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/html4-32        	   10000	    107485 ns/op	3810.75 MB/s	      1840 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1-32         	   10000	    117764 ns/op	1291.47 MB/s	       106.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt2-32         	   10000	    100578 ns/op	1244.60 MB/s	     -1427 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt3-32         	    3793	    313021 ns/op	1363.34 MB/s	       384.0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt4-32         	    2988	    399888 ns/op	1204.99 MB/s	     -9125 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/pb-32           	   57486	     19277 ns/op	6151.76 MB/s	         1.000 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/gaviota-32      	   10000	    115641 ns/op	1593.90 MB/s	      9303 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1_128b-32    	38400122	        31.21 ns/op	4101.10 MB/s	         0 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1_1000b-32   	 6509028	       179.9 ns/op	5559.38 MB/s	        16.00 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1_10000b-32  	  368212	      3244 ns/op	3082.28 MB/s	        90.00 b_saved	       0 B/op	       0 allocs/op
    BenchmarkLZ4Converter_ConvertBlock/txt1_20000b-32  	  141013	      8303 ns/op	2408.69 MB/s	        -5.000 b_saved	       0 B/op	       0 allocs/op
    ```
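To compare the two tables above row by row, the ns/op figures can be turned into a speedup factor, and the MB/s column can be reproduced from the input size. A small sketch, with hypothetical helper names:

```go
package main

// mbPerSec converts bytes processed per operation and ns/op into MB/s,
// using 1 MB = 1e6 bytes as Go's testing package does.
func mbPerSec(bytesPerOp int, nsPerOp float64) float64 {
	// bytes per nanosecond equals GB/s; multiply by 1e3 to get MB/s.
	return float64(bytesPerOp) / nsPerOp * 1e3
}

// speedup reports how many times faster newNs is than baseNs.
// Values below 1 mean the new measurement is slower.
func speedup(baseNs, newNs float64) float64 {
	return baseNs / newNs
}
```

For the `html` row (102400 input bytes), `speedup(42827, 24463)` is roughly 1.75, and `mbPerSec(102400, 42827)` lands within rounding of the reported 2390.99 MB/s. The `jpg` row goes the other way: `speedup(2328, 2718)` is about 0.86, so the assembly version was slower on that input in this run.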
    
    
    Reference compression speed:
    
    ```
    BenchmarkCompressBlockReference/html/default-32                    14070         82449 ns/op    1241.98 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/urls/default-32                     1215        970890 ns/op     723.14 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/jpg/default-32                    193770          5904 ns/op    20849.30 MB/s          0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/jpg_200b/default-32              8297767           144.1 ns/op  1387.77 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/pdf/default-32                     94203         12694 ns/op    8066.76 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/html4/default-32                   12174         97969 ns/op    4180.90 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1/default-32                     3613        333851 ns/op     455.56 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt2/default-32                     4683        260579 ns/op     480.39 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt3/default-32                     1268        947209 ns/op     450.54 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt4/default-32                     1083       1097426 ns/op     439.08 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/pb/default-32                      18357         64771 ns/op    1830.90 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/gaviota/default-32                  3942        295275 ns/op     624.23 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1_128b/default-32            11448295           105.7 ns/op  1210.45 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1_1000b/default-32            1000000          1021 ns/op     979.26 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1_10000b/default-32            116739         10114 ns/op     988.68 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1_20000b/default-32             49216         23409 ns/op     854.39 MB/s           0 B/op          0 allocs/op
    
    BenchmarkCompressBlockReference/html/better-32                      6649        174667 ns/op     586.26 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/urls/better-32                       627       1905706 ns/op     368.41 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/jpg/better-32                      52425         22783 ns/op    5402.88 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/jpg_200b/better-32               2772865           433.3 ns/op   461.61 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/pdf/better-32                       9210        127051 ns/op     805.97 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/html4/better-32                     5835        201146 ns/op    2036.33 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1/better-32                      2034        566702 ns/op     268.38 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt2/better-32                      2386        500580 ns/op     250.07 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt3/better-32                       758       1556541 ns/op     274.17 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt4/better-32                       591       2013515 ns/op     239.31 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/pb/better-32                        7836        155117 ns/op     764.51 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/gaviota/better-32                   2473        484975 ns/op     380.06 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1_128b/better-32              4322678           275.5 ns/op   464.59 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1_1000b/better-32              468687          2533 ns/op     394.76 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1_10000b/better-32              49606         23720 ns/op     421.59 MB/s           0 B/op          0 allocs/op
    BenchmarkCompressBlockReference/txt1_20000b/better-32              14823         81300 ns/op     246.00 MB/s           0 B/op          0 allocs/op
    
    
    ```
    
    Size comparisons (using Go lz4 encoder):
    
    ```
    === RUN   TestLZ4Converter_ConvertBlock/html
        lz4convert_test.go:42: input size: 102400
        lz4convert_test.go:43: lz4 size: 21195
        lz4convert_test.go:60: lz4->snappy size: 21828
        lz4convert_test.go:79: lz4->s2 size: 20636
        lz4convert_test.go:91: s2 (default) size: 20865
        lz4convert_test.go:95: s2 (better) size: 18969
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 559
        lz4convert_test.go:98: lz4 -> snappy bytes saved: -633
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 330
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 2226
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -229
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 1667
        --- PASS: TestLZ4Converter_ConvertBlock/html (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/urls
        lz4convert_test.go:42: input size: 702087
        lz4convert_test.go:43: lz4 size: 292514
        lz4convert_test.go:60: lz4->snappy size: 297926
        lz4convert_test.go:79: lz4->s2 size: 296457
        lz4convert_test.go:91: s2 (default) size: 286538
        lz4convert_test.go:95: s2 (better) size: 248076
        lz4convert_test.go:97: lz4 -> s2 bytes saved: -3943
        lz4convert_test.go:98: lz4 -> snappy bytes saved: -5412
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 5976
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 44438
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: 9919
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 48381
        --- PASS: TestLZ4Converter_ConvertBlock/urls (0.01s)
    === RUN   TestLZ4Converter_ConvertBlock/jpg
        lz4convert_test.go:42: input size: 123093
        lz4convert_test.go:43: lz4 size: 123522
        lz4convert_test.go:60: lz4->snappy size: 123040
        lz4convert_test.go:79: lz4->s2 size: 123040
        lz4convert_test.go:91: s2 (default) size: 123097
        lz4convert_test.go:95: s2 (better) size: 123097
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 482
        lz4convert_test.go:98: lz4 -> snappy bytes saved: 482
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 425
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 425
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -57
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: -57
        --- PASS: TestLZ4Converter_ConvertBlock/jpg (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/jpg_200b
        lz4convert_test.go:42: input size: 200
        lz4convert_test.go:43: lz4 size: 155
        lz4convert_test.go:60: lz4->snappy size: 153
        lz4convert_test.go:79: lz4->s2 size: 153
        lz4convert_test.go:91: s2 (default) size: 153
        lz4convert_test.go:95: s2 (better) size: 147
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 2
        lz4convert_test.go:98: lz4 -> snappy bytes saved: 2
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 2
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 8
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: 0
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 6
        --- PASS: TestLZ4Converter_ConvertBlock/jpg_200b (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/pdf
        lz4convert_test.go:42: input size: 102400
        lz4convert_test.go:43: lz4 size: 83152
        lz4convert_test.go:60: lz4->snappy size: 83428
        lz4convert_test.go:79: lz4->s2 size: 83016
        lz4convert_test.go:91: s2 (default) size: 84199
        lz4convert_test.go:95: s2 (better) size: 82884
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 136
        lz4convert_test.go:98: lz4 -> snappy bytes saved: -276
        lz4convert_test.go:99: data -> s2 (default) bytes saved: -1047
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 268
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -1183
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 132
        --- PASS: TestLZ4Converter_ConvertBlock/pdf (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/html4
        lz4convert_test.go:42: input size: 409600
        lz4convert_test.go:43: lz4 size: 81908
        lz4convert_test.go:60: lz4->snappy size: 84886
        lz4convert_test.go:79: lz4->s2 size: 80068
        lz4convert_test.go:91: s2 (default) size: 20867
        lz4convert_test.go:95: s2 (better) size: 18979
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 1840
        lz4convert_test.go:98: lz4 -> snappy bytes saved: -2978
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 61041
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 62929
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: 59201
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 61089
        --- PASS: TestLZ4Converter_ConvertBlock/html4 (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/txt1
        lz4convert_test.go:42: input size: 152089
        lz4convert_test.go:43: lz4 size: 79672
        lz4convert_test.go:60: lz4->snappy size: 79567
        lz4convert_test.go:79: lz4->s2 size: 79566
        lz4convert_test.go:91: s2 (default) size: 85931
        lz4convert_test.go:95: s2 (better) size: 71608
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 106
        lz4convert_test.go:98: lz4 -> snappy bytes saved: 105
        lz4convert_test.go:99: data -> s2 (default) bytes saved: -6259
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 8064
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -6365
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 7958
        --- PASS: TestLZ4Converter_ConvertBlock/txt1 (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/txt2
        lz4convert_test.go:42: input size: 125179
        lz4convert_test.go:43: lz4 size: 70801
        lz4convert_test.go:60: lz4->snappy size: 72231
        lz4convert_test.go:79: lz4->s2 size: 72228
        lz4convert_test.go:91: s2 (default) size: 79572
        lz4convert_test.go:95: s2 (better) size: 65938
        lz4convert_test.go:97: lz4 -> s2 bytes saved: -1427
        lz4convert_test.go:98: lz4 -> snappy bytes saved: -1430
        lz4convert_test.go:99: data -> s2 (default) bytes saved: -8771
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 4863
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -7344
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 6290
        --- PASS: TestLZ4Converter_ConvertBlock/txt2 (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/txt3
        lz4convert_test.go:42: input size: 426754
        lz4convert_test.go:43: lz4 size: 207038
        lz4convert_test.go:60: lz4->snappy size: 206693
        lz4convert_test.go:79: lz4->s2 size: 206654
        lz4convert_test.go:91: s2 (default) size: 220380
        lz4convert_test.go:95: s2 (better) size: 184936
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 384
        lz4convert_test.go:98: lz4 -> snappy bytes saved: 345
        lz4convert_test.go:99: data -> s2 (default) bytes saved: -13342
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 22102
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -13726
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 21718
        --- PASS: TestLZ4Converter_ConvertBlock/txt3 (0.01s)
    === RUN   TestLZ4Converter_ConvertBlock/txt4
        lz4convert_test.go:42: input size: 481861
        lz4convert_test.go:43: lz4 size: 277731
        lz4convert_test.go:60: lz4->snappy size: 286863
        lz4convert_test.go:79: lz4->s2 size: 286856
        lz4convert_test.go:91: s2 (default) size: 318193
        lz4convert_test.go:95: s2 (better) size: 264987
        lz4convert_test.go:97: lz4 -> s2 bytes saved: -9125
        lz4convert_test.go:98: lz4 -> snappy bytes saved: -9132
        lz4convert_test.go:99: data -> s2 (default) bytes saved: -40462
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 12744
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -31337
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 21869
        --- PASS: TestLZ4Converter_ConvertBlock/txt4 (0.01s)
    === RUN   TestLZ4Converter_ConvertBlock/pb
        lz4convert_test.go:42: input size: 118588
        lz4convert_test.go:43: lz4 size: 19003
        lz4convert_test.go:60: lz4->snappy size: 21130
        lz4convert_test.go:79: lz4->s2 size: 19002
        lz4convert_test.go:91: s2 (default) size: 18603
        lz4convert_test.go:95: s2 (better) size: 17686
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 1
        lz4convert_test.go:98: lz4 -> snappy bytes saved: -2127
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 400
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 1317
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: 399
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 1316
        --- PASS: TestLZ4Converter_ConvertBlock/pb (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/gaviota
        lz4convert_test.go:42: input size: 184320
        lz4convert_test.go:43: lz4 size: 71749
        lz4convert_test.go:60: lz4->snappy size: 63392
        lz4convert_test.go:79: lz4->s2 size: 62446
        lz4convert_test.go:91: s2 (default) size: 65016
        lz4convert_test.go:95: s2 (better) size: 55395
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 9303
        lz4convert_test.go:98: lz4 -> snappy bytes saved: 8357
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 6733
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 16354
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -2570
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 7051
        --- PASS: TestLZ4Converter_ConvertBlock/gaviota (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/txt1_128b
        lz4convert_test.go:42: input size: 128
        lz4convert_test.go:43: lz4 size: 84
        lz4convert_test.go:60: lz4->snappy size: 84
        lz4convert_test.go:79: lz4->s2 size: 84
        lz4convert_test.go:91: s2 (default) size: 80
        lz4convert_test.go:95: s2 (better) size: 76
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 0
        lz4convert_test.go:98: lz4 -> snappy bytes saved: 0
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 4
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 8
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: 4
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 8
        --- PASS: TestLZ4Converter_ConvertBlock/txt1_128b (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/txt1_1000b
        lz4convert_test.go:42: input size: 1000
        lz4convert_test.go:43: lz4 size: 807
        lz4convert_test.go:60: lz4->snappy size: 791
        lz4convert_test.go:79: lz4->s2 size: 791
        lz4convert_test.go:91: s2 (default) size: 772
        lz4convert_test.go:95: s2 (better) size: 744
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 16
        lz4convert_test.go:98: lz4 -> snappy bytes saved: 16
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 35
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 63
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: 19
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 47
        --- PASS: TestLZ4Converter_ConvertBlock/txt1_1000b (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/txt1_10000b
        lz4convert_test.go:42: input size: 10000
        lz4convert_test.go:43: lz4 size: 6969
        lz4convert_test.go:60: lz4->snappy size: 6879
        lz4convert_test.go:79: lz4->s2 size: 6879
        lz4convert_test.go:91: s2 (default) size: 6931
        lz4convert_test.go:95: s2 (better) size: 6216
        lz4convert_test.go:97: lz4 -> s2 bytes saved: 90
        lz4convert_test.go:98: lz4 -> snappy bytes saved: 90
        lz4convert_test.go:99: data -> s2 (default) bytes saved: 38
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 753
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -52
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 663
        --- PASS: TestLZ4Converter_ConvertBlock/txt1_10000b (0.00s)
    === RUN   TestLZ4Converter_ConvertBlock/txt1_20000b
        lz4convert_test.go:42: input size: 20000
        lz4convert_test.go:43: lz4 size: 12750
        lz4convert_test.go:60: lz4->snappy size: 12755
        lz4convert_test.go:79: lz4->s2 size: 12755
        lz4convert_test.go:91: s2 (default) size: 13513
        lz4convert_test.go:95: s2 (better) size: 11489
        lz4convert_test.go:97: lz4 -> s2 bytes saved: -5
        lz4convert_test.go:98: lz4 -> snappy bytes saved: -5
        lz4convert_test.go:99: data -> s2 (default) bytes saved: -763
        lz4convert_test.go:100: data -> s2 (better) bytes saved: 1261
        lz4convert_test.go:101: direct data -> s2 (default) compared to converted from lz4: -758
        lz4convert_test.go:102: direct data -> s2 (better) compared to converted from lz4: 1266
    ```
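The "bytes saved" and "compared to" rows in the log above are plain differences between the reported sizes. Note that the "data -> s2" rows are measured against the LZ4 size, not the input size. A sketch of the arithmetic, with hypothetical type and function names:

```go
package main

// sizes holds the compressed sizes printed for one test case.
type sizes struct {
	LZ4         int // "lz4 size"
	LZ4ToSnappy int // "lz4->snappy size"
	LZ4ToS2     int // "lz4->s2 size"
	S2Default   int // "s2 (default) size"
	S2Better    int // "s2 (better) size"
}

// derived holds the difference rows; positive means smaller output.
type derived struct {
	LZ4ToS2Saved       int // "lz4 -> s2 bytes saved"
	LZ4ToSnappySaved   int // "lz4 -> snappy bytes saved"
	S2DefaultSaved     int // "data -> s2 (default) bytes saved"
	S2BetterSaved      int // "data -> s2 (better) bytes saved"
	DefaultVsConverted int // "direct data -> s2 (default) compared to converted from lz4"
	BetterVsConverted  int // "direct data -> s2 (better) compared to converted from lz4"
}

// derive recomputes the difference rows from the raw sizes.
func derive(s sizes) derived {
	return derived{
		LZ4ToS2Saved:       s.LZ4 - s.LZ4ToS2,
		LZ4ToSnappySaved:   s.LZ4 - s.LZ4ToSnappy,
		S2DefaultSaved:     s.LZ4 - s.S2Default,
		S2BetterSaved:      s.LZ4 - s.S2Better,
		DefaultVsConverted: s.LZ4ToS2 - s.S2Default,
		BetterVsConverted:  s.LZ4ToS2 - s.S2Better,
	}
}
```

Feeding in the `html` sizes (21195, 21828, 20636, 20865, 18969) reproduces the logged values 559, -633, 330, 2226, -229 and 1667.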
    klauspost authored Feb 17, 2023
    7e21226

Commits on Feb 18, 2023

  1. s2: Add compression estimate (#752)

    Add `EstimateBlockSize` that will perform a very fast compression
    without outputting the result and return the compressed output size.
    
    The function returns -1 if no improvement could be achieved.
    
    Actual compression will most often produce a smaller output than the estimate.
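The contract described above (run a fast compressor, keep only the output size, return -1 when nothing is gained) can be illustrated with the standard library. This is an analogy only: it uses `compress/flate` at `BestSpeed` with a counting writer instead of S2, and `countingWriter` and `estimateSize` are hypothetical names, not the s2 API.

```go
package main

import "compress/flate"

// countingWriter discards data and only tracks how many bytes were written.
type countingWriter struct{ n int }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += len(p)
	return len(p), nil
}

// estimateSize returns the compressed size of src without keeping the
// output, or -1 if compression would not make src smaller.
func estimateSize(src []byte) (int, error) {
	var cw countingWriter
	fw, err := flate.NewWriter(&cw, flate.BestSpeed)
	if err != nil {
		return 0, err
	}
	if _, err := fw.Write(src); err != nil {
		return 0, err
	}
	// Close flushes any pending data so the count is complete.
	if err := fw.Close(); err != nil {
		return 0, err
	}
	if cw.n >= len(src) {
		return -1, nil
	}
	return cw.n, nil
}
```

A caller would typically treat -1 as "store uncompressed" and otherwise use the returned size for buffer planning or for deciding whether compression is worth the cost.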
    
    ```
    BenchmarkEncodeS2BlockParallel/0-html/est-size-16         	  207572	      5746 ns/op	17822.54 MB/s	     22123 B	        21.60 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/0-html/block-16            	  190375	      6058 ns/op	16904.40 MB/s	     20868 B	        20.38 pct	      19 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/0-html/block-better-16     	   89342	     13496 ns/op	7587.43 MB/s	     18972 B	        18.53 pct	      40 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/0-html/block-best-16       	    5635	    204202 ns/op	 501.46 MB/s	     17403 B	        17.00 pct	     643 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/1-urls/est-size-16         	   18522	     63202 ns/op	11108.55 MB/s	    313575 B	        44.66 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/1-urls/block-16            	   16382	     72096 ns/op	9738.19 MB/s	    286541 B	        40.81 pct	    1462 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/1-urls/block-better-16     	    8055	    148821 ns/op	4717.66 MB/s	    248079 B	        35.33 pct	    2974 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/1-urls/block-best-16       	     446	   2436297 ns/op	 288.18 MB/s	    229314 B	        32.66 pct	   53724 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/2-jpg/est-size-16          	10415599	       113.6 ns/op	1083574.52 MB/s	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/2-jpg/block-16             	 2596231	       456.3 ns/op	269737.56 MB/s	    123100 B	       100.0 pct	       1 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/2-jpg/block-better-16      	  704770	      1561 ns/op	78870.09 MB/s	    123100 B	       100.0 pct	       6 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/2-jpg/block-best-16        	   14368	     74375 ns/op	1655.04 MB/s	    123025 B	        99.94 pct	     310 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/3-jpg_200b/est-size-16     	99533023	        12.35 ns/op	16198.92 MB/s	       151.0 B	        75.50 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/3-jpg_200b/block-16        	107843694	        11.02 ns/op	18151.08 MB/s	       155.0 B	        77.50 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/3-jpg_200b/block-better-16 	39622100	        29.80 ns/op	6710.82 MB/s	       149.0 B	        74.50 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/3-jpg_200b/block-best-16   	   46939	     24423 ns/op	   8.19 MB/s	       142.0 B	        71.00 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/4-pdf/est-size-16          	 1563739	       761.8 ns/op	134418.16 MB/s	     84867 B	        82.88 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/4-pdf/block-16             	 1000000	      1013 ns/op	101124.34 MB/s	     84202 B	        82.23 pct	       3 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/4-pdf/block-better-16      	  114718	     10198 ns/op	10041.40 MB/s	     82887 B	        80.94 pct	      31 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/4-pdf/block-best-16        	    3022	    397183 ns/op	 257.82 MB/s	     82327 B	        80.40 pct	    1198 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/5-html4/est-size-16        	  167383	      7194 ns/op	56935.67 MB/s	     46130 B	        11.26 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/5-html4/block-16           	  137107	      8084 ns/op	50670.44 MB/s	     20870 B	         5.095 pct	     102 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/5-html4/block-better-16    	   80835	     14473 ns/op	28301.20 MB/s	     18982 B	         4.634 pct	     174 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/5-html4/block-best-16      	    5659	    214283 ns/op	1911.49 MB/s	     17411 B	         4.251 pct	    2487 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/6-txt1/est-size-16         	   54757	     21891 ns/op	6947.42 MB/s	     90661 B	        59.61 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/6-txt1/block-16            	   49647	     24292 ns/op	6260.82 MB/s	     85934 B	        56.50 pct	     106 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/6-txt1/block-better-16     	   27152	     44545 ns/op	3414.30 MB/s	     71611 B	        47.08 pct	     194 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/6-txt1/block-best-16       	    1551	    760707 ns/op	 199.93 MB/s	     66182 B	        43.52 pct	    3413 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/7-txt2/est-size-16         	   69514	     17415 ns/op	7188.19 MB/s	     83328 B	        66.57 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/7-txt2/block-16            	   62397	     19237 ns/op	6507.18 MB/s	     79575 B	        63.57 pct	      71 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/7-txt2/block-better-16     	   30704	     38667 ns/op	3237.35 MB/s	     65941 B	        52.68 pct	     145 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/7-txt2/block-best-16       	    1665	    680136 ns/op	 184.05 MB/s	     61870 B	        49.43 pct	    2677 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/8-txt3/est-size-16         	   19372	     61609 ns/op	6926.83 MB/s	    233737 B	        54.77 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/8-txt3/block-16            	   16821	     69995 ns/op	6096.93 MB/s	    220383 B	        51.64 pct	     877 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/8-txt3/block-better-16     	    9943	    121114 ns/op	3523.56 MB/s	    184939 B	        43.34 pct	    1484 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/8-txt3/block-best-16       	     550	   1956728 ns/op	 218.10 MB/s	    167926 B	        39.35 pct	   26843 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/9-txt4/est-size-16         	   17184	     70208 ns/op	6863.34 MB/s	    341582 B	        70.89 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/9-txt4/block-16            	   14602	     81018 ns/op	5947.60 MB/s	    318196 B	        66.03 pct	    1125 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/9-txt4/block-better-16     	    7519	    160529 ns/op	3001.70 MB/s	    264990 B	        54.99 pct	    2185 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/9-txt4/block-best-16       	     351	   2914298 ns/op	 165.34 MB/s	    242003 B	        50.22 pct	   46822 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/10-pb/est-size-16          	  267162	      4371 ns/op	27130.39 MB/s	     21121 B	        17.81 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/10-pb/block-16             	  255237	      4636 ns/op	25579.35 MB/s	     18606 B	        15.69 pct	      16 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/10-pb/block-better-16      	   97609	     11808 ns/op	10042.76 MB/s	     17689 B	        14.92 pct	      42 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/10-pb/block-best-16        	    5970	    189710 ns/op	 625.10 MB/s	     16011 B	        13.50 pct	     700 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/11-gaviota/est-size-16     	   56253	     21214 ns/op	8688.63 MB/s	     67091 B	        36.40 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/11-gaviota/block-16        	   55222	     21493 ns/op	8575.72 MB/s	     65019 B	        35.28 pct	     116 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/11-gaviota/block-better-16 	   32078	     37589 ns/op	4903.59 MB/s	     55398 B	        30.06 pct	     199 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/11-gaviota/block-best-16   	    2006	    576210 ns/op	 319.88 MB/s	     49728 B	        26.98 pct	    3194 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/12-txt1_128b/est-size-16   	141980487	         8.482 ns/op	15090.24 MB/s	        82.00 B	        64.06 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/12-txt1_128b/block-16      	148818110	         8.218 ns/op	15575.59 MB/s	        82.00 B	        64.06 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/12-txt1_128b/block-better-16         	62333452	        18.34 ns/op	6979.91 MB/s	        78.00 B	        60.94 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/12-txt1_128b/block-best-16           	   46005	     28094 ns/op	   4.56 MB/s	        78.00 B	        60.94 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/13-txt1_1000b/est-size-16            	17325642	        70.00 ns/op	14286.61 MB/s	       794.0 B	        79.40 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/13-txt1_1000b/block-16               	14739968	        79.66 ns/op	12553.29 MB/s	       774.0 B	        77.40 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/13-txt1_1000b/block-better-16        	 6493140	       182.6 ns/op	5475.71 MB/s	       746.0 B	        74.60 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/13-txt1_1000b/block-best-16          	   28593	     42199 ns/op	  23.70 MB/s	       742.0 B	        74.20 pct	       1 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/14-txt1_10000b/est-size-16           	 1650962	       731.2 ns/op	13675.38 MB/s	      7357 B	        73.57 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/14-txt1_10000b/block-16              	 1504424	       794.1 ns/op	12592.21 MB/s	      6933 B	        69.33 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/14-txt1_10000b/block-better-16       	  683931	      1731 ns/op	5778.64 MB/s	      6218 B	        62.18 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/14-txt1_10000b/block-best-16         	    8996	    127511 ns/op	  78.42 MB/s	      6018 B	        60.18 pct	      38 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/15-txt1_20000b/est-size-16           	  746936	      1475 ns/op	13562.68 MB/s	     13736 B	        68.68 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/15-txt1_20000b/block-16              	  683935	      1699 ns/op	11770.06 MB/s	     13516 B	        67.58 pct	       1 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/15-txt1_20000b/block-better-16       	  199888	      6283 ns/op	3183.04 MB/s	     11492 B	        57.46 pct	       3 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/15-txt1_20000b/block-best-16         	    8083	    148602 ns/op	 134.59 MB/s	     11013 B	        55.06 pct	      86 B/op	       0 allocs/op
    ```
    
    Noasm:
    
    ```
    goos: windows
    goarch: amd64
    pkg: github.com/klauspost/compress/s2
    cpu: AMD Ryzen 9 3950X 16-Core Processor
    BenchmarkEncodeS2BlockParallel/0-html/est-size-16         	  206064	      5611 ns/op	18251.05 MB/s	     22123 B	        21.60 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/0-html/block-16            	  193548	      6043 ns/op	16944.16 MB/s	     20868 B	        20.38 pct	      18 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/0-html/block-better-16     	   92278	     13271 ns/op	7716.26 MB/s	     18972 B	        18.53 pct	      39 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/0-html/block-best-16       	    6012	    211476 ns/op	 484.22 MB/s	     17403 B	        17.00 pct	     602 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/1-urls/est-size-16         	   18085	     64505 ns/op	10884.25 MB/s	    313575 B	        44.66 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/1-urls/block-16            	   16641	     72254 ns/op	9716.94 MB/s	    286541 B	        40.81 pct	    1440 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/1-urls/block-better-16     	    8190	    147021 ns/op	4775.42 MB/s	    248079 B	        35.33 pct	    2925 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/1-urls/block-best-16       	     457	   2346591 ns/op	 299.19 MB/s	    229314 B	        32.66 pct	   52431 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/2-jpg/est-size-16          	 9996766	       114.1 ns/op	1078424.08 MB/s	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/2-jpg/block-16             	 2554573	       481.3 ns/op	255727.67 MB/s	    123100 B	       100.0 pct	       1 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/2-jpg/block-better-16      	  725973	      1620 ns/op	75983.90 MB/s	    123100 B	       100.0 pct	       6 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/2-jpg/block-best-16        	   15980	     86360 ns/op	1425.35 MB/s	    123025 B	        99.94 pct	     279 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/3-jpg_200b/est-size-16     	97503404	        12.30 ns/op	16262.94 MB/s	       151.0 B	        75.50 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/3-jpg_200b/block-16        	100000000	        11.17 ns/op	17906.04 MB/s	       155.0 B	        77.50 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/3-jpg_200b/block-better-16 	42084589	        29.70 ns/op	6733.93 MB/s	       149.0 B	        74.50 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/3-jpg_200b/block-best-16   	   46666	     30233 ns/op	   6.62 MB/s	       142.0 B	        71.00 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/4-pdf/est-size-16          	 1603496	       767.1 ns/op	133496.35 MB/s	     84867 B	        82.88 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/4-pdf/block-16             	 1209092	       976.1 ns/op	104904.83 MB/s	     84202 B	        82.23 pct	       2 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/4-pdf/block-better-16      	  120744	      9637 ns/op	10626.04 MB/s	     82887 B	        80.94 pct	      30 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/4-pdf/block-best-16        	    3207	    360495 ns/op	 284.05 MB/s	     82327 B	        80.40 pct	    1129 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/5-html4/est-size-16        	  171427	      6953 ns/op	58913.45 MB/s	     46130 B	        11.26 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/5-html4/block-16           	  147021	      7847 ns/op	52195.52 MB/s	     20870 B	         5.095 pct	      95 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/5-html4/block-better-16    	   77374	     14038 ns/op	29178.20 MB/s	     18982 B	         4.634 pct	     181 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/5-html4/block-best-16      	    6410	    195077 ns/op	2099.68 MB/s	     17411 B	         4.251 pct	    2195 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/6-txt1/est-size-16         	   56328	     21296 ns/op	7141.57 MB/s	     90661 B	        59.61 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/6-txt1/block-16            	   49772	     23576 ns/op	6451.05 MB/s	     85934 B	        56.50 pct	     106 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/6-txt1/block-better-16     	   28126	     42782 ns/op	3554.96 MB/s	     71611 B	        47.08 pct	     188 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/6-txt1/block-best-16       	    1561	    695629 ns/op	 218.64 MB/s	     66182 B	        43.52 pct	    3391 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/7-txt2/est-size-16         	   72219	     16478 ns/op	7596.93 MB/s	     83328 B	        66.57 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/7-txt2/block-16            	   63639	     18683 ns/op	6700.25 MB/s	     79575 B	        63.57 pct	      70 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/7-txt2/block-better-16     	   32029	     37419 ns/op	3345.29 MB/s	     65941 B	        52.68 pct	     139 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/7-txt2/block-best-16       	    1807	    637784 ns/op	 196.27 MB/s	     61870 B	        49.43 pct	    2467 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/8-txt3/est-size-16         	   19794	     60379 ns/op	7067.93 MB/s	    233737 B	        54.77 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/8-txt3/block-16            	   17246	     69432 ns/op	6146.33 MB/s	    220383 B	        51.64 pct	     856 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/8-txt3/block-better-16     	   10083	    119233 ns/op	3579.17 MB/s	    184939 B	        43.34 pct	    1464 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/8-txt3/block-best-16       	     590	   1945892 ns/op	 219.31 MB/s	    167926 B	        39.35 pct	   25023 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/9-txt4/est-size-16         	   17264	     69558 ns/op	6927.52 MB/s	    341582 B	        70.89 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/9-txt4/block-16            	   14724	     80751 ns/op	5967.24 MB/s	    318196 B	        66.03 pct	    1116 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/9-txt4/block-better-16     	    7543	    158070 ns/op	3048.41 MB/s	    264990 B	        54.99 pct	    2178 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/9-txt4/block-best-16       	     384	   2815967 ns/op	 171.12 MB/s	    242003 B	        50.22 pct	   42799 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/10-pb/est-size-16          	  266250	      4315 ns/op	27479.71 MB/s	     21121 B	        17.81 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/10-pb/block-16             	  244342	      4566 ns/op	25973.35 MB/s	     18606 B	        15.69 pct	      17 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/10-pb/block-better-16      	   97837	     11494 ns/op	10317.35 MB/s	     17689 B	        14.92 pct	      42 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/10-pb/block-best-16        	    6297	    185490 ns/op	 639.32 MB/s	     16011 B	        13.50 pct	     663 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/11-gaviota/est-size-16     	   56702	     21138 ns/op	8719.77 MB/s	     67091 B	        36.40 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/11-gaviota/block-16        	   55410	     21386 ns/op	8618.87 MB/s	     65019 B	        35.28 pct	     115 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/11-gaviota/block-better-16 	   31664	     36989 ns/op	4983.05 MB/s	     55398 B	        30.06 pct	     202 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/11-gaviota/block-best-16   	    2055	    562326 ns/op	 327.78 MB/s	     49728 B	        26.98 pct	    3118 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/12-txt1_128b/est-size-16   	142262313	         8.389 ns/op	15257.94 MB/s	        82.00 B	        64.06 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/12-txt1_128b/block-16      	149785786	         8.201 ns/op	15607.45 MB/s	        82.00 B	        64.06 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/12-txt1_128b/block-better-16         	65669124	        18.24 ns/op	7018.39 MB/s	        78.00 B	        60.94 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/12-txt1_128b/block-best-16           	   47752	     23642 ns/op	   5.41 MB/s	        78.00 B	        60.94 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/13-txt1_1000b/est-size-16            	16548277	        66.20 ns/op	15106.26 MB/s	       794.0 B	        79.40 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/13-txt1_1000b/block-16               	15555889	        76.06 ns/op	13148.16 MB/s	       774.0 B	        77.40 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/13-txt1_1000b/block-better-16        	 6623119	       183.0 ns/op	5465.18 MB/s	       746.0 B	        74.60 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/13-txt1_1000b/block-best-16          	   35971	     35347 ns/op	  28.29 MB/s	       742.0 B	        74.20 pct	       1 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/14-txt1_10000b/est-size-16           	 1632084	       723.1 ns/op	13828.64 MB/s	      7357 B	        73.57 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/14-txt1_10000b/block-16              	 1413682	       786.7 ns/op	12711.68 MB/s	      6933 B	        69.33 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/14-txt1_10000b/block-better-16       	  682802	      1710 ns/op	5848.59 MB/s	      6218 B	        62.18 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/14-txt1_10000b/block-best-16         	    9460	    115376 ns/op	  86.67 MB/s	      6018 B	        60.18 pct	      36 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/15-txt1_20000b/est-size-16           	  750022	      1462 ns/op	13677.87 MB/s	     13736 B	        68.68 pct	       0 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/15-txt1_20000b/block-16              	  682224	      1696 ns/op	11791.21 MB/s	     13516 B	        67.58 pct	       1 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/15-txt1_20000b/block-better-16       	  190477	      6060 ns/op	3300.47 MB/s	     11492 B	        57.46 pct	       3 B/op	       0 allocs/op
    BenchmarkEncodeS2BlockParallel/15-txt1_20000b/block-best-16         	    8642	    148052 ns/op	 135.09 MB/s	     11013 B	        55.06 pct	      80 B/op	       0 allocs/op
    
    ```
    klauspost authored Feb 18, 2023
    fdc8ab0

Commits on Feb 24, 2023

  1. s2: Add support for custom stream encoder (#755)

    ```Go
    // WriterCustomEncoder allows to override the encoder for blocks on the stream.
    // The function must compress 'src' into 'dst' and return the bytes used in dst as an integer.
    // Block size (initial varint) should not be added by the encoder.
    // Returning value 0 indicates the block could not be compressed.
    // The function should expect to be called concurrently.
    func WriterCustomEncoder(fn func(dst, src []byte) int) WriterOption
    ```
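
    A minimal sketch of a callback that satisfies this contract, using only the standard library. The `statsEncoder` type and the `neverCompress` stand-in are hypothetical names for illustration, not part of the package. In real use the method value `enc.Encode` would be what is handed to `WriterCustomEncoder`. Since the callback should expect concurrent calls, the counters are updated atomically.

    ```go
    package main

    import (
    	"fmt"
    	"sync"
    	"sync/atomic"
    )

    // statsEncoder (hypothetical) wraps a block encoder and counts what it sees.
    type statsEncoder struct {
    	inner  func(dst, src []byte) int
    	blocks atomic.Int64 // total blocks offered to the encoder
    	stored atomic.Int64 // blocks the inner encoder reported as not compressible
    }

    // Encode has the func(dst, src []byte) int shape expected by WriterCustomEncoder.
    // It must not add the block size varint; it only forwards to the inner encoder.
    func (e *statsEncoder) Encode(dst, src []byte) int {
    	n := e.inner(dst, src)
    	e.blocks.Add(1)
    	if n == 0 {
    		e.stored.Add(1)
    	}
    	return n
    }

    // neverCompress is a stand-in encoder: returning 0 reports that the
    // block could not be compressed.
    func neverCompress(dst, src []byte) int { return 0 }

    func main() {
    	enc := &statsEncoder{inner: neverCompress}

    	// Simulate a writer calling the encoder from several goroutines.
    	var wg sync.WaitGroup
    	for i := 0; i < 8; i++ {
    		wg.Add(1)
    		go func() {
    			defer wg.Done()
    			enc.Encode(make([]byte, 64), []byte("block"))
    		}()
    	}
    	wg.Wait()
    	fmt.Println(enc.blocks.Load(), enc.stored.Load())
    }
    ```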
    klauspost authored Feb 24, 2023
    47158f2

Commits on Feb 25, 2023

  1. build(deps): bump golang.org/x/sys in /s2/_generate (#756)

    Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.0.0-20211030160813-b3129d9d1021 to 0.1.0.
    - [Release notes](https://github.com/golang/sys/releases)
    - [Commits](https://github.com/golang/sys/commits/v0.1.0)
    
    ---
    updated-dependencies:
    - dependency-name: golang.org/x/sys
      dependency-type: indirect
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Feb 25, 2023
    8f278c6
  2. 0faa2d1

Commits on Feb 26, 2023

  1. s2: Add Dictionary support. (#685)

    # Compression Improvement
    
    ## [github_users_sample_set](https://github.com/facebook/zstd/releases/download/v1.1.3/github_users_sample_set.tar.zst)
    
    From https://github.com/facebook/zstd/releases/tag/v1.1.3
    
    With 64K dictionary trained with zstd:
    
    9114 files, 7484607 bytes input:
    
    Default Compression:  3362023 (44.92%) -> 921524 (12.31%) 
    Better:  3083163 (41.19%)  -> 873154 (11.67%) 
    Best: 3057944 (40.86%) -> 785503 bytes (10.49%)
    
    ## Go Sources
    
    8912 files, 51253563 bytes input:
    
    Default: 22955767 (44.79%) -> 19654568 (38.35%)
    Better:  20189613 (39.39%) -> 16289357 (31.78%)
    Best: 19482828 (38.01%) ->  15184589 (29.63%)
    
    # Status:
    
    * [x] Format Specification
    * [x] Reference Decoder
    * [x] Encoders (default, better, best)
    * [x] Roundtrip tests
    * [x] Fuzz tests
    
    ## Non-goals
    
    There will be no assembly for initial release. Also some compression may still be left on the table.
    
    There will be no Snappy implementation, since it will be incompatible anyway.
    
    
    # DOCUMENTATION
    
    *Note: S2 dictionary compression is currently at an early implementation stage, with no assembly for
    either encoding or decoding. Performance improvements can be expected in the future.*
    
    Dictionaries allow providing custom content that serves as a lookup at the beginning of blocks.
    
    The same dictionary *must* be used for both encoding and decoding. 
    S2 does not keep track of whether the same dictionary is used,
    and using the wrong dictionary will most often not result in an error when decompressing.
    
    Blocks encoded *without* dictionaries can be decompressed seamlessly *with* a dictionary.
    This means it is possible to switch from an encoding without dictionaries to an encoding with dictionaries
    and treat the blocks similarly.
    
    The usage scenario for S2 dictionaries is the same as for
    [Zstandard dictionaries](https://github.com/facebook/zstd#the-case-for-small-data-compression):
    
    > Training works if there is some correlation in a family of small data samples. The more data-specific a dictionary is, the more efficient it is (there is no universal dictionary). Hence, deploying one dictionary per type of data will provide the greatest benefits. Dictionary gains are mostly effective in the first few KB. Then, the compression algorithm will gradually use previously decoded content to better compress the rest of the file.
    
    S2 further limits the dictionary to only be enabled on the first 64KB of a block.
    This will remove any negative (speed) impacts of the dictionaries on bigger blocks. 
    
    ### Compression
    
    Using the [github_users_sample_set](https://github.com/facebook/zstd/releases/download/v1.1.3/github_users_sample_set.tar.zst) and a 64KB dictionary trained with Zstandard, the following sizes can be achieved.
    
    |                    | Default          | Better           | Best             |
    |--------------------|------------------|------------------|------------------|
    | Without Dictionary | 3362023 (44.92%) | 3083163 (41.19%) | 3057944 (40.86%) |
    | With Dictionary    | 921524 (12.31%)  | 873154 (11.67%)  | 785503 (10.49%)  |
    
    So for highly repetitive content like this, the dictionary reduces the compressed size by a factor of roughly 3.5x to 3.9x.
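
    As a quick check of the table above, the reduction factor for each level is the size without a dictionary divided by the size with one. A small sketch using the byte counts from the table (the `reduction` helper is a hypothetical name):

    ```go
    package main

    import "fmt"

    // reduction returns how many times smaller the dictionary-compressed output is.
    func reduction(without, with int) float64 {
    	return float64(without) / float64(with)
    }

    func main() {
    	// Sizes in bytes, copied from the table above.
    	levels := []struct {
    		name          string
    		without, with int
    	}{
    		{"Default", 3362023, 921524},
    		{"Better", 3083163, 873154},
    		{"Best", 3057944, 785503},
    	}
    	for _, l := range levels {
    		fmt.Printf("%-8s %.2fx\n", l.name, reduction(l.without, l.with))
    	}
    }
    ```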
    
    For less uniform data we will use the Go source code tree.
    Compressing the first 64KB of all `.go` files in `go/src`, Go 1.19.5, 8912 files, 51253563 bytes input:
    
    |                    | Default           | Better            | Best              |
    |--------------------|-------------------|-------------------|-------------------|
    | Without Dictionary | 22955767 (44.79%) | 20189613 (39.39%) | 19482828 (38.01%) |
    | With Dictionary    | 19654568 (38.35%) | 16289357 (31.78%) | 15184589 (29.63%) |
    | Saving/file        | 362 bytes         | 428 bytes         | 472 bytes         |
    
    
    ### Creating Dictionaries
    
    There are no tools to create dictionaries in S2. 
    However, there are multiple ways to create a useful dictionary:
    
    #### Using a Sample File
    
    If your input is very uniform, you can just use a sample file as the dictionary.
    
    For example, in the `github_users_sample_set` above, the average compressed size only goes up from
    10.49% to 11.48% when the first file is used as the dictionary instead of a dedicated dictionary.
    
    ```Go
        // Read a sample
        sample, err := os.ReadFile("sample.json")
        if err != nil {
            panic(err)
        }
    
        // Create a dictionary.
        dict := s2.MakeDict(sample, nil)
    	
        // b := dict.Bytes() will provide a dictionary that can be saved
        // and reloaded with s2.NewDict(b).
    	
        // To encode:
        encoded := dict.Encode(nil, file)
    
        // To decode, pass the encoded bytes back in:
        decoded, err := dict.Decode(nil, encoded)
    ```
    
    #### Using Zstandard
    
    Zstandard dictionaries can easily be converted to S2 dictionaries.
    
    This can be helpful to generate dictionaries for files that don't have a fixed structure.
    
    
    Example, with training set files placed in `./training-set`:
    
    `λ zstd -r --train-fastcover training-set/* --maxdict=65536 -o name.dict`
    
    This will create a 64KB Zstandard dictionary, which can be converted to an S2 dictionary like this:
    
    ```Go
        // Decode the Zstandard dictionary.
        // zdict holds the raw bytes of the trained dictionary, e.g. the contents of name.dict.
        insp, err := zstd.InspectDictionary(zdict)
        if err != nil {
            panic(err)
        }
    	
        // We are only interested in the contents.
        // Assume that files start with "// Copyright (c) 2023".
        // Search for the longest match for that.
        // This may save a few bytes.
        dict := s2.MakeDict(insp.Content(), []byte("// Copyright (c) 2023"))
    
        // b := dict.Bytes() will provide a dictionary that can be saved
        // and reloaded with s2.NewDict(b).
    
        // We can now encode using this dictionary
        encodedWithDict := dict.Encode(nil, payload)
    
        // To decode content:
        decoded, err := dict.Decode(nil, encodedWithDict)
    ```
    
    It is recommended to save the dictionary returned by `b := dict.Bytes()`, since that will contain only the S2 dictionary.
    
    This dictionary can later be loaded using `s2.NewDict(b)`. The dictionary then no longer requires `zstd` to be initialized.
    
    Also note how `s2.MakeDict` allows you to search for a common starting sequence of your files.
    This can be omitted, at the expense of a few bytes.
    
    # Dictionary Encoding
    
    Dictionaries allow providing custom content that serves as a lookup at the beginning of blocks.
    
    A dictionary provides an initial repeat value that can be used to point to a common header.
    
    Other than that the dictionary contains values that can be used as back-references.
    
    Frequently used data should be placed at the *end* of the dictionary, since back-references with offsets < 2048 bytes take fewer bytes to encode.
    
    ## Format
    
    Dictionary *content* must be at least 16 bytes and at most 64KiB (65536 bytes).
    
    Encoding: `[repeat value (uvarint)][dictionary content...]`
    
    The dictionary content is preceded by an unsigned base-128 (uvarint) encoded value specifying the initial repeat offset.
    This value is an offset into the dictionary content and not a back-reference offset,
    so setting this to 0 will make the repeat value point to the first value of the dictionary.
    
    The value must be less than the dictionary length minus 8.
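
    The layout above can be illustrated with a small standard-library sketch. This is not the library's own parser: `parseDict` and the constants are hypothetical names, and the sketch assumes that "dictionary length" in the repeat-value rule refers to the length of the dictionary content.

    ```go
    package main

    import (
    	"encoding/binary"
    	"errors"
    	"fmt"
    )

    const (
    	minDictContent = 16    // minimum content size in bytes
    	maxDictContent = 65536 // maximum content size in bytes (64KiB)
    )

    // parseDict splits a serialized dictionary into its initial repeat offset
    // and its content, applying the limits described above.
    func parseDict(b []byte) (repeat int, content []byte, err error) {
    	r, n := binary.Uvarint(b)
    	if n <= 0 {
    		return 0, nil, errors.New("invalid repeat value")
    	}
    	content = b[n:]
    	if len(content) < minDictContent || len(content) > maxDictContent {
    		return 0, nil, errors.New("dictionary content must be 16 to 65536 bytes")
    	}
    	// Assumption: the limit is measured against the content length.
    	if r >= uint64(len(content)-8) {
    		return 0, nil, errors.New("repeat value must be less than content length minus 8")
    	}
    	return int(r), content, nil
    }

    func main() {
    	content := []byte("// Copyright (c) 2023 example dictionary")

    	// Serialize: uvarint repeat offset followed by the content.
    	buf := binary.AppendUvarint(nil, 3)
    	buf = append(buf, content...)

    	repeat, got, err := parseDict(buf)
    	fmt.Println(repeat, len(got), err)
    }
    ```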
    
    ## Encoding
    
    From the decoder point of view the dictionary content is seen as preceding the encoded content.
    
    `[dictionary content][decoded output]`
    
    Backreferences to the dictionary are encoded as ordinary backreferences that have an offset before the start of the decoded block.
    
    Matches copying from the dictionary are **not** allowed to cross from the dictionary into the decoded data.
    However, if a copy ends at the end of the dictionary the next repeat will point to the start of the decoded buffer, which is allowed.
    
    The first match can be a repeat value, which will use the repeat offset stored in the dictionary.
    
    Once 64KB (65536 bytes) has been encoded or decoded, the dictionary may no longer be referenced,
    either by a copy or by a repeat operation.
    If the boundary is crossed while copying from the dictionary, the operation should complete,
    but the next instruction is not allowed to reference the dictionary.
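
    The copy rules above can be modelled in a few lines. The following is a simplified sketch, not the reference decoder: `copyBack` and `dictWindow` are hypothetical names, repeat handling is left out, and offsets are measured back from the current end of the decoded output.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    )

    // dictWindow is the number of decoded bytes after which the dictionary
    // may no longer be referenced.
    const dictWindow = 65536

    // copyBack appends length bytes found offset bytes before the end of out.
    // Offsets reaching past the start of out refer to the dictionary, which is
    // treated as directly preceding the decoded output.
    func copyBack(dict, out []byte, offset, length int) ([]byte, error) {
    	if offset <= 0 || length <= 0 {
    		return nil, errors.New("invalid copy")
    	}
    	if offset <= len(out) {
    		// Ordinary back-reference; copy byte by byte so overlaps work.
    		start := len(out) - offset
    		for i := 0; i < length; i++ {
    			out = append(out, out[start+i])
    		}
    		return out, nil
    	}
    	if len(out) >= dictWindow {
    		return nil, errors.New("dictionary referenced after 64KB")
    	}
    	back := offset - len(out) // distance before the start of the block
    	if back > len(dict) {
    		return nil, errors.New("offset before start of dictionary")
    	}
    	start := len(dict) - back
    	if start+length > len(dict) {
    		return nil, errors.New("copy crosses from dictionary into decoded data")
    	}
    	return append(out, dict[start:start+length]...), nil
    }

    func main() {
    	dict := []byte("0123456789abcdef")
    	out := []byte("XY")

    	// Offset 6 with 2 bytes decoded points 4 bytes before the end of dict.
    	out, err := copyBack(dict, out, 6, 4)
    	fmt.Println(string(out), err)

    	// A copy that would run past the end of the dictionary is rejected.
    	_, err = copyBack(dict, out, len(out)+4, 5)
    	fmt.Println(err)
    }
    ```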
    
    Valid blocks encoded *without* a dictionary can be decoded with any dictionary. 
    There are no checks whether the supplied dictionary is the correct one for a block.
    Because of this there is no overhead by using a dictionary.
    
    ## Streams
    
    For streams each block can use the dictionary.
    
    The dictionary is not provided on the stream.
    klauspost authored Feb 26, 2023
    5a210a0