path: root/src/lib/openjp2/tcd.c
Age | Commit message | Author
2024-02-28 | Fix some typos (found by `codespell` and `typos`) | Stefan Weil
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2023-12-08 | Merge pull request #1496 from rouault/fix_1480 | Even Rouault
opj_tcd_dc_level_shift_decode(): avoid increment nullptr (fixes #1480)
2023-12-08 | opj_tcd_dc_level_shift_decode(): avoid increment nullptr (fixes #1480) | Even Rouault
(likely harmless issue as we don't dereference it)
2023-12-08 | suppress warning during build using clang | Tomoaki Teshima
2022-08-11 | Cleanup code related to quality layer allocation, and add a few safety checks | Even Rouault
2022-08-11 | Significant speed-up of rate allocation by rate/distortion ratio | Even Rouault
- Avoid doing 128 iterations all the time, and stop when the threshold doesn't vary much
- Avoid calling costly opj_t2_encode_packets() repeatedly when bisecting the layer ratio if the truncation points haven't changed since the last iteration.

When used with the GDAL gdal_translate application to convert a 11977 x 8745 raster with data type UInt16 and 8 channels, the conversion time to JPEG2000 with 20 quality layers using disto/rate allocation ( -co "IC=C8" -co "JPEG2000_DRIVER=JP2OPENJPEG" -co "PROFILE=NPJE_NUMERICALLY_LOSSLESS" creation options of the GDAL NITF driver) goes from 5m56 wall clock (8m20s total, 12 vCPUs) down to 1m16 wall clock (3m45 total).
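The two ideas in this commit message (stop bisecting once the threshold barely moves, and skip the costly step when the truncation point is the same as in the previous iteration) can be sketched on a toy model. Everything below is hypothetical and for illustration only: the `toy_*` names, the fixed 100 bytes per pass, and the 0.1% stop criterion are assumptions, not OpenJPEG code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not OpenJPEG code: coding passes sorted by
 * decreasing rate-distortion slope, each costing 100 bytes. */
typedef struct {
    const double *slopes;      /* sorted, largest first */
    size_t count;
    unsigned expensive_calls;  /* how often the costly step ran */
} toy_tile;

/* Truncation point: number of passes whose slope reaches the threshold. */
static size_t toy_truncation_point(const toy_tile *t, double thresh)
{
    size_t n = 0;
    while (n < t->count && t->slopes[n] >= thresh) {
        n++;
    }
    return n;
}

/* Stand-in for the costly packet encoding; returns the encoded size. */
static size_t toy_encode_size(toy_tile *t, size_t kept)
{
    t->expensive_calls++;
    return kept * 100;
}

/* Bisect for the lowest threshold whose output still fits in max_bytes.
 * Assumes the initial hi keeps few enough passes to fit. */
static double toy_find_threshold(toy_tile *t, double lo, double hi,
                                 size_t max_bytes)
{
    double good = hi;
    size_t last_kept = (size_t)-1;  /* no previous iteration yet */
    int last_fits = 0;
    int i;

    assert(lo < hi);
    for (i = 0; i < 128; i++) {
        double mid = (lo + hi) / 2;
        size_t kept;
        int fits;

        /* Early exit: the threshold no longer varies much (assumed 0.1%). */
        if (hi - lo <= 0.001 * hi) {
            break;
        }
        kept = toy_truncation_point(t, mid);
        if (kept == last_kept) {
            fits = last_fits;  /* same truncation point: reuse the result */
        } else {
            fits = toy_encode_size(t, kept) <= max_bytes;
            last_kept = kept;
            last_fits = fits;
        }
        if (fits) {
            good = mid;
            hi = mid;
        } else {
            lo = mid;
        }
    }
    return good;
}
```

With slopes {8, 4, 2, 1}, a search range of 0 to 16 and a 250-byte budget, the search settles on a threshold that keeps two passes, and the costly step runs four times rather than once per iteration.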
2020-12-04 | Encoder: grow again buffer size in opj_tcd_code_block_enc_allocate_data() (fixes #1283) | yuan
2020-11-26 | Free p_tcd_marker_info to avoid memory leak | yuan
2020-11-25 | Encoder: grow again buffer size in opj_tcd_code_block_enc_allocate_data() (fixes #1283) | yuan
2020-11-23 | Encoder: grow again buffer size in opj_tcd_code_block_enc_allocate_data() (fixes #1283) | Even Rouault
2020-11-23 | Encoder: grow buffer size in opj_tcd_code_block_enc_allocate_data() to avoid write heap buffer overflow in opj_mqc_flush (fixes #1283) | Even Rouault
2020-05-20 | Irreversible decoding: partially revert previous commit, to fix failures in test suite | Even Rouault
2020-05-20 | Irreversible compression/decompression DWT: use 1/K constant as per standard | Even Rouault
The previous constant opj_c13318 was mysteriously equal to 2/K , and in the DWT, we had to divide K and opj_c13318 by 2... The issue was that the band->stepsize computation in tcd.c didn't take into account the log2gain of the band. The effect of this change is expected to be mostly equivalent to the previous situation, except some difference in rounding. But it leads to a dramatic reduction of the mean square error and peak error in the irreversible encoding of issue141.tif !
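The remark that the old constant was close to 2/K can be checked numerically. This is a small sketch, assuming the two values below are the ones defined in OpenJPEG's dwt.c (quoted from memory, so treat them as assumptions); the `toy_*` names are hypothetical.

```c
#include <assert.h>

/* Values as they appear, to my recollection, in OpenJPEG's dwt.c;
 * treat them as assumptions. */
static const double toy_K      = 1.230174105;  /* 9-7 filter scaling factor K */
static const double toy_c13318 = 1.625732422;  /* the older constant */

/* Relative gap between c13318 and 2/K, computed as |K * c13318 - 2| / 2. */
static double toy_gap_to_two_over_K(void)
{
    double gap = toy_K * toy_c13318 - 2.0;
    if (gap < 0) {
        gap = -gap;
    }
    return gap / 2.0;
}

/* The constant the commit title describes using instead: 1/K. */
static double toy_inv_K(void)
{
    assert(toy_K > 0.0);
    return 1.0 / toy_K;
}
```

Under those assumed values the relative gap is below 1e-4, which is consistent with the commit's observation that the old constant behaved like 2/K.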
2020-05-20 | Irreversible decoding: align code more closely to the standard by avoid messing up with stepsize (no functional change) | Even Rouault
2020-05-20 | tcd.c: add comment | Even Rouault
2020-05-20 | Encoder: use floating-point operations for irreversible transformation | Even Rouault
2020-05-20 | Add multithreaded support in the DWT encoder. | Even Rouault
Update the bench_dwt utility to have a -decode/-encode switch

Measured performance gains for DWT encoder on a Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (4 cores, hyper threaded)

Encoding time:
$ ./bin/bench_dwt -encode -num_threads 1
time for dwt_encode: total = 8.348 s, wallclock = 8.352 s
$ ./bin/bench_dwt -encode -num_threads 2
time for dwt_encode: total = 9.776 s, wallclock = 4.904 s
$ ./bin/bench_dwt -encode -num_threads 4
time for dwt_encode: total = 13.188 s, wallclock = 3.310 s
$ ./bin/bench_dwt -encode -num_threads 8
time for dwt_encode: total = 30.024 s, wallclock = 4.064 s

Scaling is probably limited by memory access patterns causing memory access to be the bottleneck. The slightly worse results with threads==8 than with thread==4 is due to hyperthreading being not appropriate here.
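To read the scaling off these wall clock figures, speed-up and parallel efficiency can be computed as below. This is a minimal sketch; the helper names are hypothetical and not part of bench_dwt.

```c
#include <assert.h>

/* Speed-up of a run relative to the single-thread wall clock time. */
static double speedup(double base_wallclock, double wallclock)
{
    assert(wallclock > 0.0);
    return base_wallclock / wallclock;
}

/* Parallel efficiency: speed-up divided by the number of threads. */
static double efficiency(double base_wallclock, double wallclock, int threads)
{
    assert(threads > 0);
    return speedup(base_wallclock, wallclock) / threads;
}
```

Applied to the numbers quoted above, with 8.352 s as the baseline, this gives roughly 1.70x for 2 threads, 2.52x for 4 threads and 2.06x for 8 threads, so efficiency drops from about 85% to about 63% to about 26%.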
2020-05-20 | Add multithreading support in the T1 (entropy phase) encoder | Even Rouault
- API wise, opj_codec_set_threads() can be used on the encoding side
- opj_compress has a -threads switch similar to opj_uncompress
2020-04-21 | Add support for generation of PLT markers in encoder | Even Rouault
* -PLT switch added to opj_compress
* Add a opj_encoder_set_extra_options() function that accepts a PLT=YES option, and could be expanded later for other uses.

-------

Testing with a Sentinel2 10m band, T36JTT_20160914T074612_B02.jp2, coming from S2A_MSIL1C_20160914T074612_N0204_R135_T36JTT_20160914T081456.SAFE

Decompress it to TIFF:
```
opj_uncompress -i T36JTT_20160914T074612_B02.jp2 -o T36JTT_20160914T074612_B02.tif
```

Recompress it with similar parameters as original:
```
opj_compress -n 5 -c [256,256],[256,256],[256,256],[256,256],[256,256] -t 1024,1024 -PLT -i T36JTT_20160914T074612_B02.tif -o T36JTT_20160914T074612_B02_PLT.jp2
```

Dump codestream detail with GDAL dump_jp2.py utility (https://github.com/OSGeo/gdal/blob/master/gdal/swig/python/samples/dump_jp2.py)
```
python dump_jp2.py T36JTT_20160914T074612_B02.jp2 > /tmp/dump_sentinel2_ori.txt
python dump_jp2.py T36JTT_20160914T074612_B02_PLT.jp2 > /tmp/dump_sentinel2_openjpeg_plt.txt
```

The diff between both shows very similar structure, and identical number of packets in PLT markers

Now testing with Kakadu (KDU803_Demo_Apps_for_Linux-x86-64_200210)

Full file decompression:
```
kdu_expand -i T36JTT_20160914T074612_B02_PLT.jp2 -o tmp.tif

Consumed 121 tile-part(s) from a total of 121 tile(s).
Consumed 80,318,806 codestream bytes (excluding any file format) = 5.329697 bits/pel.
Processed using the multi-threaded environment, with 8 parallel threads of execution
```

Partial decompression (presumably using PLT markers):
```
kdu_expand -i T36JTT_20160914T074612_B02.jp2 -o tmp.pgm -region "{0.5,0.5},{0.01,0.01}"
kdu_expand -i T36JTT_20160914T074612_B02_PLT.jp2 -o tmp2.pgm -region "{0.5,0.5},{0.01,0.01}"
diff tmp.pgm tmp2.pgm && echo "same !"
```

-------

Funded by ESA for S2-MPC project
2020-04-16 | Rename mis-named function opj_tcd_get_encoded_tile_size() to opj_tcd_get_encoder_input_buffer_size() | Even Rouault
2020-02-12 | Implement writing of IMF profiles | Even Rouault
Add -IMF switch to opj_compress as well
2020-01-30 | opj_tcd_init_tile(): avoid integer overflow | Even Rouault
That could lead to later assertion failures. Fixes #1231 / CVE-2020-8112
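The log does not show how the overflow was avoided. A common pattern for this class of bug is to test a 32-bit multiplication before performing it, sketched here with a hypothetical helper; this is not the actual change made in opj_tcd_init_tile().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a * b in *out and returns 1 when the
 * product fits in 32 bits; returns 0 and leaves *out untouched otherwise. */
static int checked_mul_u32(uint32_t a, uint32_t b, uint32_t *out)
{
    assert(out != NULL);
    if (b != 0 && a > UINT32_MAX / b) {
        return 0;  /* a * b would wrap around */
    }
    *out = a * b;
    return 1;
}
```

A caller would treat a 0 return as an error and abandon the tile instead of carrying on with a wrapped value.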
2019-10-03 | opj_tcd_mct_decode()/opj_mct_decode()/opj_mct_encode_real()/opj_mct_decode_real(): proper deal with a number of samples larger than 4 billion (refs #1151) | Even Rouault
2018-02-11 | Avoid out-of-bounds write overflow due to uint32 overflow computation on images with huge dimensions. | Even Rouault
Credit to Google Autofuzz project for providing test case
2017-09-19 | Add capability to decode only a subset of all components of an image. | Even Rouault
This adds a opj_set_decoded_components(opj_codec_t *p_codec, OPJ_UINT32 numcomps, const OPJ_UINT32* comps_indices) function, and equivalent "opj_decompress -c compno[,compno]*" option. When specified, neither the MCT transform nor JP2 channel transformations will be applied. Tests added for various combinations of whole image vs tiled-based decoding, full or reduced resolution, use of decode area or not.
2017-09-19 | Fix warnings and errors when compiling with a c++ compiler (#1021) | Even Rouault
2017-09-08 | opj_tcd_mct_decode(): avoid heap buffer overflow when components have not the same number of resolutions. | Even Rouault
Also fixes an issue with subtile decoding. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3331. Credit to OSS Fuzz
2017-09-07 | Properly fix cc893a4ebfaf8c42cf1221ac82c83df91e77340b to avoid heap-buffer-overflow when numcomps < 3 | Even Rouault
2017-09-07 | opj_tcd_mct_decode(): fix checks to verify MCT can be done safely. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3305 (master only) | Even Rouault
2017-09-05 | Merge pull request #1010 from rouault/subtile_decoding_stage3 | Even Rouault
Subtile decoding: memory use reduction and perf improvements
2017-09-04 | Replace uses of size_t by OPJ_SIZE_T | Even Rouault
2017-09-01 | opj_j2k_update_image_data(): avoid allocating image buffer if we can just reuse the tile buffer one | Even Rouault
2017-09-01 | Replace error message 'Not enough memory for tile data' by 'Size of tile data exceeds system limits' (refs https://github.com/uclouvain/openjpeg/pull/730#issuecomment-326654188) | Even Rouault
2017-09-01 | opj_tcd_rateallocate(): make sure to use all passes for a lossless layer (#1009) | Even Rouault
And save a useless loop, which should be a tiny bit faster.
2017-09-01 | opj_tcd_dc_level_shift_decode(): optimize lossy case | Even Rouault
2017-09-01 | Tiny perf improvement in T1 stage for subtile decoding | Even Rouault
2017-09-01 | Various changes to allow tile buffers of more than 4giga pixels | Even Rouault
Untested though, since that means a tile buffer of at least 16 GB. So there might be places where uint32 overflow on multiplication still occurs...
2017-09-01 | TCD: allow tile buffer to be greater than 4GB on 64 bit hosts (but number of pixels must remain under 4 billion) | Even Rouault
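The constraint in this entry (pixel count limited to 32 bits, byte size allowed to exceed 4 GB on 64-bit hosts) can be illustrated by counting pixels in 64 bits and bytes in `size_t`. This is a sketch under the assumption of 4-byte samples; `tile_buffer_bytes` is a hypothetical helper, not an OpenJPEG function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes the byte size of a w x h buffer of 4-byte
 * samples. Returns 1 on success, 0 when the pixel count exceeds 32 bits or
 * the byte size does not fit in size_t on this host. */
static int tile_buffer_bytes(uint32_t w, uint32_t h, size_t *bytes)
{
    uint64_t pixels = (uint64_t)w * h;  /* cannot overflow in 64 bits */

    assert(bytes != NULL);
    if (pixels > UINT32_MAX) {
        return 0;  /* number of pixels must stay under 4 billion */
    }
    if (pixels > SIZE_MAX / sizeof(int32_t)) {
        return 0;  /* byte size would not fit, e.g. on a 32-bit host */
    }
    *bytes = (size_t)pixels * sizeof(int32_t);
    return 1;
}
```

On a 64-bit host a 40000 x 40000 buffer passes both checks even though it needs 6.4 GB, while a 70000 x 70000 buffer is rejected on the pixel count alone.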
2017-09-01 | Remove limitation that prevents from opening images bigger than 4 billion pixels | Even Rouault
However the intermediate buffer for decoding must still be smaller than 4 billion pixels, so this is useful for decoding at a lower resolution level, or subtile decoding.
2017-09-01 | opj_tcd_init_tile(): fix typo on overflow detection condition (introduced in previous commit) | Even Rouault
2017-09-01 | Sub-tile decoding: only allocate tile component buffer of the needed dimension | Even Rouault
Instead of being the full tile size.
* Use a sparse array mechanism to store code-blocks and intermediate stages of IDWT.
* IDWT, DC level shift and MCT stages are done just on that smaller array.
* Improve copy of tile component array to final image, by saving an intermediate buffer.
* For full-tile decoding at reduced resolution, only allocate the tile buffer to the reduced size, instead of the full-resolution size.
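The general idea behind a sparse array is to split the logical array into blocks and allocate a block only when something is written to it. The sketch below illustrates that idea only; the `toy_sparse` type, its functions and the square block shape are assumptions and do not reproduce OpenJPEG's own sparse array code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical block-sparse 2D array of int32 samples. */
typedef struct {
    uint32_t width, height;
    uint32_t block;            /* side of a square block, in samples */
    uint32_t blocks_x, blocks_y;
    int32_t **blocks;          /* each entry stays NULL until written */
} toy_sparse;

static toy_sparse *toy_sparse_create(uint32_t width, uint32_t height,
                                     uint32_t block)
{
    toy_sparse *sa;
    assert(width > 0 && height > 0 && block > 0);
    sa = (toy_sparse *)calloc(1, sizeof(*sa));
    if (sa == NULL) {
        return NULL;
    }
    sa->width = width;
    sa->height = height;
    sa->block = block;
    sa->blocks_x = (width - 1) / block + 1;
    sa->blocks_y = (height - 1) / block + 1;
    sa->blocks = (int32_t **)calloc((size_t)sa->blocks_x * sa->blocks_y,
                                    sizeof(*sa->blocks));
    if (sa->blocks == NULL) {
        free(sa);
        return NULL;
    }
    return sa;
}

/* Writes one sample, allocating its zero-filled block on first use. */
static int toy_sparse_write(toy_sparse *sa, uint32_t x, uint32_t y, int32_t v)
{
    size_t idx;
    if (x >= sa->width || y >= sa->height) {
        return 0;
    }
    idx = (size_t)(y / sa->block) * sa->blocks_x + x / sa->block;
    if (sa->blocks[idx] == NULL) {
        sa->blocks[idx] = (int32_t *)calloc((size_t)sa->block * sa->block,
                                            sizeof(int32_t));
        if (sa->blocks[idx] == NULL) {
            return 0;
        }
    }
    sa->blocks[idx][(size_t)(y % sa->block) * sa->block + x % sa->block] = v;
    return 1;
}

/* Reads one sample; never-written or out-of-range positions read as 0. */
static int32_t toy_sparse_read(const toy_sparse *sa, uint32_t x, uint32_t y)
{
    size_t idx;
    if (x >= sa->width || y >= sa->height) {
        return 0;
    }
    idx = (size_t)(y / sa->block) * sa->blocks_x + x / sa->block;
    if (sa->blocks[idx] == NULL) {
        return 0;
    }
    return sa->blocks[idx][(size_t)(y % sa->block) * sa->block + x % sa->block];
}

/* Counts the blocks that have actually been allocated. */
static size_t toy_sparse_allocated(const toy_sparse *sa)
{
    size_t i, n = 0, total = (size_t)sa->blocks_x * sa->blocks_y;
    for (i = 0; i < total; i++) {
        if (sa->blocks[i] != NULL) {
            n++;
        }
    }
    return n;
}

static void toy_sparse_free(toy_sparse *sa)
{
    size_t i, total = (size_t)sa->blocks_x * sa->blocks_y;
    for (i = 0; i < total; i++) {
        free(sa->blocks[i]);
    }
    free(sa->blocks);
    free(sa);
}
```

Memory use then follows the area that is actually touched rather than the full tile size.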
2017-08-28 | Subtile decoding: fix overflows in subband coordinate computation that cause later buffer overflow. | Even Rouault
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3115. Credit to OSS Fuzz. master only
2017-08-21 | Add comments for filter_width values | Even Rouault
2017-08-20 | Subtile decoding: only do 9x7 IDWT computations on relevant areas of tile-component buffer. | Even Rouault
2017-08-18 | Subtile decoding: only do 5x3 IDWT computations on relevant areas of tile-component buffer. | Even Rouault
This lowers 'bin/opj_decompress -i ../MAPA.jp2 -o out.tif -d 0,0,256,256' down to 0.860s
2017-08-17 | Sub-tile decoding: only decode precincts and codeblocks that intersect the window specified in opj_set_decode_area() | Even Rouault
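The selection step named in this entry comes down to a rectangle intersection test. Below is a minimal sketch on a regular grid of square code-blocks, using half-open rectangles; the names are hypothetical, and it deliberately ignores the extra margin a real decoder needs for the wavelet filter support and for resolution levels.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open rectangle: x0 <= x < x1, y0 <= y < y1. */
typedef struct {
    uint32_t x0, y0, x1, y1;
} rect_u32;

/* Returns 1 when the two rectangles share at least one sample. */
static int rects_intersect(rect_u32 a, rect_u32 b)
{
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

/* Counts the code-blocks of a tile_w x tile_h tile, cut into squares of
 * side cb_size, that intersect the requested window. */
static uint32_t count_blocks_in_window(uint32_t tile_w, uint32_t tile_h,
                                       uint32_t cb_size, rect_u32 win)
{
    uint32_t x, y, n = 0;
    assert(cb_size > 0);
    for (y = 0; y < tile_h; y += cb_size) {
        for (x = 0; x < tile_w; x += cb_size) {
            rect_u32 cb;
            cb.x0 = x;
            cb.y0 = y;
            cb.x1 = (tile_w - x < cb_size) ? tile_w : x + cb_size;
            cb.y1 = (tile_h - y < cb_size) ? tile_h : y + cb_size;
            if (rects_intersect(cb, win)) {
                n++;  /* only these blocks would need decoding */
            }
        }
    }
    return n;
}
```

For a 1024 x 1024 tile with 64 x 64 code-blocks, a 256 x 256 window at the origin touches 16 of the 256 blocks.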
2017-08-14 | Encoder: grow buffer size in opj_tcd_code_block_enc_allocate_data() to avoid write heap buffer overflow in opj_mqc_flush (#982) | Even Rouault
2017-08-10 | Propagate event manager down to opj_t2_encode_packet() and use it to emit an error message when the output buffer is too small | Even Rouault
2017-08-07 | Slight improvement in management of code block chunks | Even Rouault
Instead of having the chunk array at the segment level, we can move it down to the codeblock itself since segments are filled in sequential order. Limit the number of memory allocation, and decrease slightly the memory usage.

On MAPA_005.jp2

n4: 1871312549 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 n1: 1610689344 0x4E781E7: opj_aligned_malloc (opj_malloc.c:61)
  n1: 1610689344 0x4E71D1B: opj_alloc_tile_component_data (tcd.c:676)
   n1: 1610689344 0x4E726CF: opj_tcd_init_decode_tile (tcd.c:816)
    n1: 1610689344 0x4E4BE39: opj_j2k_read_tile_header (j2k.c:8617)
     n1: 1610689344 0x4E4C902: opj_j2k_decode_tiles (j2k.c:10348)
      n1: 1610689344 0x4E4E3CE: opj_j2k_decode (j2k.c:7846)
       n1: 1610689344 0x4E53002: opj_jp2_decode (jp2.c:1564)
        n0: 1610689344 0x40374E: main (opj_decompress.c:1459)
 n1: 219232541 0x4E4BC50: opj_j2k_read_tile_header (j2k.c:4683)
  n1: 219232541 0x4E4C902: opj_j2k_decode_tiles (j2k.c:10348)
   n1: 219232541 0x4E4E3CE: opj_j2k_decode (j2k.c:7846)
    n1: 219232541 0x4E53002: opj_jp2_decode (jp2.c:1564)
     n0: 219232541 0x40374E: main (opj_decompress.c:1459)
 n1: 23893200 0x4E72735: opj_tcd_init_decode_tile (tcd.c:1225)
  n1: 23893200 0x4E4BE39: opj_j2k_read_tile_header (j2k.c:8617)
   n1: 23893200 0x4E4C902: opj_j2k_decode_tiles (j2k.c:10348)
    n1: 23893200 0x4E4E3CE: opj_j2k_decode (j2k.c:7846)
     n1: 23893200 0x4E53002: opj_jp2_decode (jp2.c:1564)
      n0: 23893200 0x40374E: main (opj_decompress.c:1459)
 n0: 17497464 in 52 places, all below massif's threshold (1.00%)
2017-08-07 | Decoding: do not allocate memory for the codestream of each codeblock | Even Rouault
Currently we allocate at least 8192 bytes for each codeblock, and copy the relevant parts of the codestream in that per-codeblock buffer as we decode packets. As the whole codestream for the tile is ingested in memory and alive during the decoding, we can directly point to it instead of copying. But to do that, we need an intermediate concept, a 'chunk' of code-stream segment, given that segments may be made of data at different places in the code-stream when quality layers are used.

With that change, the decoding of MAPA_005.jp2 goes down from the previous improvement of 2.7 GB down to 1.9 GB.

New profile:

n4: 1885648469 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 n1: 1610689344 0x4E78287: opj_aligned_malloc (opj_malloc.c:61)
  n1: 1610689344 0x4E71D7B: opj_alloc_tile_component_data (tcd.c:676)
   n1: 1610689344 0x4E7272C: opj_tcd_init_decode_tile (tcd.c:816)
    n1: 1610689344 0x4E4BDD9: opj_j2k_read_tile_header (j2k.c:8618)
     n1: 1610689344 0x4E4C8A2: opj_j2k_decode_tiles (j2k.c:10349)
      n1: 1610689344 0x4E4E36E: opj_j2k_decode (j2k.c:7847)
       n1: 1610689344 0x4E52FA2: opj_jp2_decode (jp2.c:1564)
        n0: 1610689344 0x40374E: main (opj_decompress.c:1459)
 n1: 219232541 0x4E4BBF0: opj_j2k_read_tile_header (j2k.c:4685)
  n1: 219232541 0x4E4C8A2: opj_j2k_decode_tiles (j2k.c:10349)
   n1: 219232541 0x4E4E36E: opj_j2k_decode (j2k.c:7847)
    n1: 219232541 0x4E52FA2: opj_jp2_decode (jp2.c:1564)
     n0: 219232541 0x40374E: main (opj_decompress.c:1459)
 n1: 39822000 0x4E727A9: opj_tcd_init_decode_tile (tcd.c:1219)
  n1: 39822000 0x4E4BDD9: opj_j2k_read_tile_header (j2k.c:8618)
   n1: 39822000 0x4E4C8A2: opj_j2k_decode_tiles (j2k.c:10349)
    n1: 39822000 0x4E4E36E: opj_j2k_decode (j2k.c:7847)
     n1: 39822000 0x4E52FA2: opj_jp2_decode (jp2.c:1564)
      n0: 39822000 0x40374E: main (opj_decompress.c:1459)
 n0: 15904584 in 52 places, all below massif's threshold (1.00%)
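The 'chunk' concept described in this message can be pictured as a list of (pointer, length) pairs that reference the in-memory codestream instead of copying from it. The types and functions below are hypothetical and only illustrate that idea; they are not OpenJPEG's structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only. */
typedef struct {
    const uint8_t *data;  /* points into the tile's codestream, not a copy */
    size_t len;
} toy_chunk;

typedef struct {
    toy_chunk *chunks;
    size_t count;
    size_t capacity;
} toy_codeblock;

/* Records one more piece of the code-block's data. Returns 1 on success. */
static int toy_add_chunk(toy_codeblock *cb, const uint8_t *data, size_t len)
{
    assert(cb != NULL && data != NULL);
    if (cb->count == cb->capacity) {
        size_t new_cap = cb->capacity ? cb->capacity * 2 : 4;
        toy_chunk *grown = (toy_chunk *)realloc(cb->chunks,
                                                new_cap * sizeof(*grown));
        if (grown == NULL) {
            return 0;
        }
        cb->chunks = grown;
        cb->capacity = new_cap;
    }
    cb->chunks[cb->count].data = data;
    cb->chunks[cb->count].len = len;
    cb->count++;
    return 1;
}

/* Total number of bytes referenced by the code-block. */
static size_t toy_total_len(const toy_codeblock *cb)
{
    size_t i, total = 0;
    for (i = 0; i < cb->count; i++) {
        total += cb->chunks[i].len;
    }
    return total;
}

/* Copies the chunks, in order, into dst for the cases where a contiguous
 * view is needed. Returns the bytes written, or 0 if dst is too small. */
static size_t toy_gather(const toy_codeblock *cb, uint8_t *dst, size_t dst_cap)
{
    size_t i, off = 0;
    if (toy_total_len(cb) > dst_cap) {
        return 0;
    }
    for (i = 0; i < cb->count; i++) {
        memcpy(dst + off, cb->chunks[i].data, cb->chunks[i].len);
        off += cb->chunks[i].len;
    }
    return off;
}
```

Only the small chunk array is allocated per code-block; the payload bytes stay where the codestream was read into memory, which requires the codestream buffer to outlive the code-blocks that point into it.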