| Age | Commit message | Author | |
|---|---|---|---|
| 2024-02-18 | opj_dwt_decode_tile(): avoid potential UndefinedBehaviorSanitizer 'applying zero offset to null pointer' (fixes #1505) | Even Rouault | |
| 2022-02-10 | Avoid integer overflows in DWT. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=44544 | Even Rouault | |
| 2021-12-05 | Fix some typos (found by codespell) | Stefan Weil | |
| Signed-off-by: Stefan Weil <sw@weilnetz.de> | |||
| 2021-09-03 | Avoid integer overflows in DWT. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11700 and https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=30646 | Even Rouault | |
| 2020-11-30 | Encoder: avoid global buffer overflow on irreversible conversion when too many decomposition levels are specified (fixes #1286) | Even Rouault | |
| 2020-05-23 | Forward DWT 9-7: major speed up by vectorizing vertical pass | Even Rouault | |
| `bench_dwt -I -encode` times goes from 8.6s to 2.1s | |||
| 2020-05-23 | Forward DWT 5-3: major speed up by vectorizing vertical pass | Even Rouault | |
| `bench_dwt -encode` times goes from 7.9s to 1.7s | |||
| 2020-05-22 | Forward DWT: small code refactoring to allow future improvements for the vertical pass | Even Rouault | |
| 2020-05-22 | dwt.c: remove unused typedef | Even Rouault | |
| 2020-05-22 | Forward DWT 5x3: performance improvements in horizontal pass, and modest in vertical pass | Even Rouault | |
| 2020-05-22 | Forward DWT: small code refactoring to allow future improvements for the horizontal pass | Even Rouault | |
| 2020-05-21 | Speed-up 9x7 IDWT by ~30% with OPJ_NUM_THREADS=2 | Even Rouault | |
| "bench_dwt -I" time goes from 2.2s to 1.5s | |||
| 2020-05-21 | Remove useless + 5U margin in opj_dwt_decode_tile_97() | Even Rouault | |
| Nothing in code analysis nor test suite shows that this margin is needed. It dates back to commit dbeebe72b9d35f6ff807c21c7f217b569fa894f6 where vector 9x7 decoding was introduced. | |||
| 2020-05-21 | Speed-up 9x7 IDWT by ~20% | Even Rouault | |
| "bench_dwt -I" time goes from 2.8s to 2.2s | |||
| 2020-05-20 | Irreversible decoding: partially revert previous commit, to fix failures in test suite | Even Rouault | |
| 2020-05-20 | Irreversible compression/decompression DWT: use 1/K constant as per standard | Even Rouault | |
| The previous constant opj_c13318 was mysteriously equal to 2/K , and in the DWT, we had to divide K and opj_c13318 by 2... The issue was that the band->stepsize computation in tcd.c didn't take into account the log2gain of the band. The effect of this change is expected to be mostly equivalent to the previous situation, except some difference in rounding. But it leads to a dramatic reduction of the mean square error and peak error in the irreversible encoding of issue141.tif ! | |||
| 2020-05-20 | opj_dwt_encode_1_real(): avoid many bound comparisons, similarly to decoding side | Even Rouault | |
| 2020-05-20 | Encoder: use floating-point operations for irreversible transformation | Even Rouault | |
| 2020-05-20 | dwt.c: change sign of constants to match standard and compensate (no functional change) | Even Rouault | |
| 2020-05-20 | Add multithreaded support in the DWT encoder. | Even Rouault | |
| Update the bench_dwt utility to have a -decode/-encode switch | |||
| Measured performance gains for DWT encoder on a Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (4 cores, hyper threaded) | |||
| Encoding time: | |||
| $ ./bin/bench_dwt -encode -num_threads 1 | |||
| time for dwt_encode: total = 8.348 s, wallclock = 8.352 s | |||
| $ ./bin/bench_dwt -encode -num_threads 2 | |||
| time for dwt_encode: total = 9.776 s, wallclock = 4.904 s | |||
| $ ./bin/bench_dwt -encode -num_threads 4 | |||
| time for dwt_encode: total = 13.188 s, wallclock = 3.310 s | |||
| $ ./bin/bench_dwt -encode -num_threads 8 | |||
| time for dwt_encode: total = 30.024 s, wallclock = 4.064 s | |||
| Scaling is probably limited by memory access patterns causing memory access to be the bottleneck. The slightly worse results with threads==8 than with thread==4 is due to hyperthreading being not appropriate here. | |||
| 2018-10-31 | Fix several memory and resource leaks | Nikola Forró | |
| Signed-off-by: Nikola Forró <nforro@redhat.com> | |||
| 2018-09-05 | Fix some typos in code comments and documentation | Stefan Weil | |
| All typos were found by Codespell. Signed-off-by: Stefan Weil <sw@weilnetz.de> | |||
| 2017-09-20 | Avoid index-out-of-bounds access when invoking opj_compress with -n 11 or higher. But not a proper fix itself (refs #493) | Even Rouault | |
| 2017-09-06 | Fix null pointer dereference on partial tile decoding when they are empty. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3297 (master only) | Even Rouault | |
| 2017-09-04 | Replace uses of size_t by OPJ_SIZE_T | Even Rouault | |
| 2017-09-01 | opj_v4dwt_decode_step1_sse(): rework a bit to improve code generation | Even Rouault | |
| 2017-09-01 | opj_v4dwt_decode_step2_sse(): loop unroll | Even Rouault | |
| 2017-09-01 | opj_dwt_decode_partial_97(): simplify/more efficient use of sparse arrays in vertical pass | Even Rouault | |
| 2017-09-01 | opj_dwt_decode_partial_1_parallel(): add SSE2 optimization | Even Rouault | |
| 2017-09-01 | Sub-tile decoding: speed up vertical pass in IDWT5x3 by processing 4 cols at a time | Even Rouault | |
| 2017-09-01 | Optimize opj_dwt_decode_partial_1() when cas == 0 | Even Rouault | |
| 2017-09-01 | Various changes to allow tile buffers of more than 4giga pixels | Even Rouault | |
| Untested though, since that means a tile buffer of at least 16 GB. So there might be places where uint32 overflow on multiplication still occur... | |||
| 2017-09-01 | Fix compiler warning in release mode | Even Rouault | |
| 2017-09-01 | opj_dwt_decode_partial_tile(): avoid undefined behaviour in lifting operation by properly initializing working buffer | Even Rouault | |
| 2017-09-01 | Sub-tile decoding: only allocate tile component buffer of the needed dimension | Even Rouault | |
| Instead of being the full tile size. | |||
| * Use a sparse array mechanism to store code-blocks and intermediate stages of IDWT. | |||
| * IDWT, DC level shift and MCT stages are done just on that smaller array. | |||
| * Improve copy of tile component array to final image, by saving an intermediate buffer. | |||
| * For full-tile decoding at reduced resolution, only allocate the tile buffer to the reduced size, instead of the full-resolution size. | |||
| 2017-09-01 | Fix undefined shift behaviour in opj_dwt_is_whole_tile_decoding(). Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3255. Credit to OSS Fuzz | Even Rouault | |
| 2017-08-29 | Use IDWT whole tile decoding if the area of interest equals to the image bounds, taking into account the reduced resolution factor | Even Rouault | |
| 2017-08-28 | Subtile decoding: fix overflows in subband coordinate computation that cause later buffer overflow. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3115. Credit to OSS Fuzz. master only | Even Rouault | |
| 2017-08-23 | opj_dwt_decode_partial_97(): perf improvement: limit copy of coefficients at end of horizontal pass to actual range of interest | Even Rouault | |
| 2017-08-21 | Add comments for filter_width values | Even Rouault | |
| 2017-08-20 | Subtile decoding: only do 9x7 IDWT computations on relevant areas of tile-component buffer. | Even Rouault | |
| 2017-08-18 | Subtile decoding: only do 5x3 IDWT computations on relevant areas of tile-component buffer. This lowers 'bin/opj_decompress -i ../MAPA.jp2 -o out.tif -d 0,0,256,256' down to 0.860s | Even Rouault | |
| 2017-07-06 | Comment fix | Even Rouault | |
| 2017-06-30 | IDWT 5x3: fix bug in AVX2 implementation (#953, #957) | Even Rouault | |
| 2017-06-21 | IDWT 5x3: generalize SSE2 version for AVX2 | Even Rouault | |
| Thanks to our macros that abstract SSE use, the functions can use AVX2 when available (at compile time) This brings an extra 23% speed improvement on bench_dwt in 64bit builds with AVX2 compared to SSE2. | |||
| 2017-06-21 | dwt.c: small cleanup | Even Rouault | |
| 2017-06-20 | Improve performance of inverse DWT 5x3 (#953) | Even Rouault | |
| * Use single-pass lifting inverse wavelet transform. | |||
| * For vertical pass, use SSE2 when available so as to process 8 columns in parallel. This is the most beneficial improvement, since the vertical pass involves a lot of cache trashing. | |||
| With the bench_dwt utility with default arguments (16383x16383 image), time goes from 4.064 s to 1.212 s. | |||
| 2017-06-17 | Fix astyle issue | Even Rouault | |
| 2017-06-17 | Fix warnings with recent GCC versions | Even Rouault | |
| 2017-05-09 | Reformat whole codebase with astyle.options (#128) | Even Rouault | |
