| Age | Commit message | Author |
|
|
|
|
|
|
|
organization
~ 9% speed improvement seen on 10980x10980 uint16 image, T36JTT_20160914T074612_B02.tif
opj_compress time from 17.2s to 15.8s
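The two timings above can be turned into a speedup figure in two common ways. The sketch below is illustrative only; the helper names are hypothetical and not part of OpenJPEG. It suggests the quoted "~9%" matches the throughput ratio rather than the reduction in elapsed time.

```python
# Hypothetical helpers; only the 17.2 s and 15.8 s timings come from the commit message.

def throughput_gain(before_s: float, after_s: float) -> float:
    """Relative gain in work done per second (before/after - 1)."""
    return before_s / after_s - 1.0


def time_reduction(before_s: float, after_s: float) -> float:
    """Relative reduction in elapsed time ((before - after) / before)."""
    return (before_s - after_s) / before_s


if __name__ == "__main__":
    before, after = 17.2, 15.8
    print(f"throughput gain: {throughput_gain(before, after):.1%}")
    print(f"time reduction:  {time_reduction(before, after):.1%}")
```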
|
|
`bench_dwt -I -encode` time goes from 8.6s to 2.1s
|
|
`bench_dwt -encode` time goes from 7.9s to 1.7s
|
|
vertical pass
|
|
|
|
vertical pass
|
|
horizontal pass
|
|
"bench_dwt -I" time goes from 2.2s to 1.5s
|
|
Neither code analysis nor the test suite shows that this margin is
needed.
It dates back to commit dbeebe72b9d35f6ff807c21c7f217b569fa894f6
where vector 9x7 decoding was introduced.
|
|
"bench_dwt -I" time goes from 2.8s to 2.2s
|
|
|
|
test suite
|
|
The previous constant opj_c13318 was mysteriously equal to 2/K, and in
the DWT, we had to divide K and opj_c13318 by 2... The issue was that the
band->stepsize computation in tcd.c didn't take into account the log2gain of
the band.
The effect of this change is expected to be mostly equivalent to the previous
situation, except some difference in rounding. But it leads to a dramatic
reduction of the mean square error and peak error in the irreversible encoding
of issue141.tif!
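To illustrate why the band gain matters, here is a minimal sketch of the standard JPEG 2000 quantization step-size relation, step = 2^(Rb - expn) * (1 + mant / 2^11), where Rb is the component precision plus the log2 gain of the band. The function name and arguments are hypothetical; this is not a copy of the code in tcd.c. The gain values used (0 for LL, 1 for HL/LH, 2 for HH) are the nominal ones from the standard.

```python
# Hypothetical sketch of the standard step-size relation; not the tcd.c code.

def band_stepsize(prec: int, expn: int, mant: int, log2gain: int) -> float:
    """Quantization step for one band.

    prec     -- component bit depth
    expn     -- step-size exponent signalled for the band
    mant     -- 11-bit step-size mantissa signalled for the band
    log2gain -- nominal log2 gain of the band (0: LL, 1: HL/LH, 2: HH)
    """
    rb = prec + log2gain  # nominal dynamic range of the band
    return (1.0 + mant / 2048.0) * 2.0 ** (rb - expn)


if __name__ == "__main__":
    # Ignoring the gain of an HL/LH band changes the step by a factor of 2,
    # the same factor that had to be folded into the DWT constants.
    without_gain = band_stepsize(prec=8, expn=8, mant=0, log2gain=0)
    with_gain = band_stepsize(prec=8, expn=8, mant=0, log2gain=1)
    print(without_gain, with_gain)
```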
|
|
messing up with stepsize (no functional change)
|
|
side
|
|
potential division by zero
|
|
|
|
original image
|
|
|
|
|
|
|
|
functional change)
|
|
Update the bench_dwt utility to have a -decode/-encode switch
Measured performance gains for the DWT encoder on an
Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (4 cores, hyper threaded)
Encoding time:
$ ./bin/bench_dwt -encode -num_threads 1
time for dwt_encode: total = 8.348 s, wallclock = 8.352 s
$ ./bin/bench_dwt -encode -num_threads 2
time for dwt_encode: total = 9.776 s, wallclock = 4.904 s
$ ./bin/bench_dwt -encode -num_threads 4
time for dwt_encode: total = 13.188 s, wallclock = 3.310 s
$ ./bin/bench_dwt -encode -num_threads 8
time for dwt_encode: total = 30.024 s, wallclock = 4.064 s
Scaling is probably limited by memory access patterns that make
memory access the bottleneck.
The slightly worse results with 8 threads than with 4 threads
are probably due to hyperthreading not being appropriate here.
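The scaling behaviour can be quantified from the wallclock figures reported above. The sketch below is only an illustration; the function and variable names are hypothetical. It computes the speedup relative to the single-threaded run and the resulting parallel efficiency.

```python
# Wallclock times (seconds) copied from the bench_dwt output above,
# keyed by the -num_threads value.
WALLCLOCK_S = {1: 8.352, 2: 4.904, 4: 3.310, 8: 4.064}


def scaling_table(wallclock: dict) -> dict:
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run."""
    base = wallclock[1]
    table = {}
    for threads, seconds in sorted(wallclock.items()):
        speedup = base / seconds
        table[threads] = (speedup, speedup / threads)
    return table


if __name__ == "__main__":
    for threads, (speedup, efficiency) in scaling_table(WALLCLOCK_S).items():
        print(f"{threads} thread(s): speedup {speedup:.2f}x, "
              f"efficiency {efficiency:.0%}")
```

With these numbers the efficiency falls steadily as threads are added, and the 8-thread run is slower than the 4-thread one.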
|
|
- API wise, opj_codec_set_threads() can be used on the encoding side
- opj_compress has a -threads switch similar to opj_uncompress
|
|
Add support for generation of PLT markers in encoder
|
|
* -PLT switch added to opj_compress
* Add an opj_encoder_set_extra_options() function that
accepts a PLT=YES option, and could be expanded later
for other uses.
-------
Testing with a Sentinel2 10m band, T36JTT_20160914T074612_B02.jp2,
coming from S2A_MSIL1C_20160914T074612_N0204_R135_T36JTT_20160914T081456.SAFE
Decompress it to TIFF:
```
opj_uncompress -i T36JTT_20160914T074612_B02.jp2 -o T36JTT_20160914T074612_B02.tif
```
Recompress it with similar parameters as original:
```
opj_compress -n 5 -c [256,256],[256,256],[256,256],[256,256],[256,256] -t 1024,1024 -PLT -i T36JTT_20160914T074612_B02.tif -o T36JTT_20160914T074612_B02_PLT.jp2
```
Dump codestream detail with GDAL dump_jp2.py utility (https://github.com/OSGeo/gdal/blob/master/gdal/swig/python/samples/dump_jp2.py)
```
python dump_jp2.py T36JTT_20160914T074612_B02.jp2 > /tmp/dump_sentinel2_ori.txt
python dump_jp2.py T36JTT_20160914T074612_B02_PLT.jp2 > /tmp/dump_sentinel2_openjpeg_plt.txt
```
The diff between the two dumps shows a very similar structure and an identical number of packets in the PLT markers.
Now testing with Kakadu (KDU803_Demo_Apps_for_Linux-x86-64_200210)
Full file decompression:
```
kdu_expand -i T36JTT_20160914T074612_B02_PLT.jp2 -o tmp.tif
Consumed 121 tile-part(s) from a total of 121 tile(s).
Consumed 80,318,806 codestream bytes (excluding any file format) = 5.329697
bits/pel.
Processed using the multi-threaded environment, with
8 parallel threads of execution
```
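The figures printed by kdu_expand can be cross-checked with a little arithmetic. This is a sketch under one assumption: that the band is the 10980x10980 single-component image mentioned elsewhere in this log. The helper names are hypothetical.

```python
import math

# Assumption: image dimensions taken from the 10980x10980 figure quoted
# elsewhere in this log; the tile size comes from the -t 1024,1024 switch.
WIDTH = HEIGHT = 10980
TILE = 1024
CODESTREAM_BYTES = 80_318_806  # from the kdu_expand output


def tile_count(width: int, height: int, tile: int) -> int:
    """Number of tiles needed to cover the image."""
    return math.ceil(width / tile) * math.ceil(height / tile)


def bits_per_pixel(n_bytes: int, width: int, height: int) -> float:
    """Average codestream bits spent per pixel."""
    return n_bytes * 8 / (width * height)


if __name__ == "__main__":
    print(tile_count(WIDTH, HEIGHT, TILE))
    print(f"{bits_per_pixel(CODESTREAM_BYTES, WIDTH, HEIGHT):.6f}")
```

Both results agree with the 121 tiles and 5.329697 bits/pel reported by kdu_expand.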
Partial decompression (presumably using PLT markers):
```
kdu_expand -i T36JTT_20160914T074612_B02.jp2 -o tmp.pgm -region "{0.5,0.5},{0.01,0.01}"
kdu_expand -i T36JTT_20160914T074612_B02_PLT.jp2 -o tmp2.pgm -region "{0.5,0.5},{0.01,0.01}"
diff tmp.pgm tmp2.pgm && echo "same !"
```
-------
Funded by ESA for S2-MPC project
|
|
|
|
Fix warnings about signed/unsigned casts in pi.c
|
|
|
|
These issues were found by cppcheck and coverity.
|
|
|
|
opj_tcd_get_encoder_input_buffer_size()
|
|
opj_decompress: add sanity checks to avoid segfault in case of decoding error
|
|
Prevent crashes like:
opj_decompress -i 0722_5-1_2019.jp2 -o out.ppm -r 4 -t 0
where 0722_5-1_2019.jp2 is
https://drive.google.com/file/d/1ZxOUZg2-FKjYwa257VFLMpTXRWxEoP0a/view?usp=sharing
|
|
|
|
Implement writing of IMF profiles
|
|
Add -IMF switch to opj_compress as well
|
|
|
|
tests: add alternate checksums for libtiff 4.1
|
|
Fixes #1233
libtiff 4.1 slightly modifies the way it generates files. So
add the new expected md5sum.
Admittedly not a super elegant solution.
|
|
opj_tcd_init_tile(): avoid integer overflow
|
|
That could lead to later assertion failures.
Fixes #1231 / CVE-2020-8112
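As an illustration of this class of bug, the sketch below simulates 32-bit signed arithmetic in Python to show how rounding a coordinate up to a precinct boundary can exceed INT32_MAX. It is not the actual patch: the function names are hypothetical, and the wrap-around only models what typically happens in C, where signed overflow is formally undefined behaviour.

```python
INT32_MAX = 2**31 - 1


def ceildivpow2(a: int, b: int) -> int:
    """ceil(a / 2**b) for non-negative a."""
    return (a + (1 << b) - 1) >> b


def wrap_int32(v: int) -> int:
    """Simulate storing v in a 32-bit signed integer (two's complement)."""
    v &= 0xFFFFFFFF
    return v - 2**32 if v >= 2**31 else v


def precinct_end_int32(x1: int, pdx: int) -> int:
    """Round x1 up to a multiple of 2**pdx, as naive int32 code would."""
    return wrap_int32(ceildivpow2(x1, pdx) << pdx)


def precinct_end_checked(x1: int, pdx: int) -> int:
    """Same rounding, but reject results that do not fit in int32."""
    end = ceildivpow2(x1, pdx) << pdx
    if end > INT32_MAX:
        raise OverflowError("precinct end exceeds INT32_MAX")
    return end


if __name__ == "__main__":
    # A coordinate near INT32_MAX rounds up to 2**31, which wraps negative.
    print(precinct_end_int32(INT32_MAX, 15))
    print(precinct_end_checked(1000, 5))
```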
|
|
This was changed some time ago (https://google.github.io/oss-fuzz/getting-started/new-project-guide/) but the build didn't fail as there is a fallback mechanism. The main advantage of the new approach is that for libFuzzer this produces more performant binaries (as `$LIB_FUZZING_ENGINE` expands into `-fsanitize=fuzzer`, which links libFuzzer from the compiler-rt, allowing better optimization tricks).
I'm also experimenting with dataflow (https://github.com/google/oss-fuzz/issues/1632) on your project, and the dataflow config doesn't have a fallback (as it's a new configuration), therefore I'm proposing a change to migrate from `-lFuzzingEngine` to `$LIB_FUZZING_ENGINE`.
|
|
opj_j2k_update_image_dimensions(): reject images whose coordinates are beyond INT_MAX (fixes #1228)
|
|
beyond INT_MAX (fixes #1228)
|
|
pi.c: avoid integer overflow, resulting in later invalid access to memory in opj_t2_decode_packets()
|
|
characters (#1196)
Fixes #1068
|