| Age | Commit message (Collapse) | Author |
|
Signed-off-by: Stefan Weil <sw@weilnetz.de>
|
|
organization
~ 9% speed improvement seen on 10980x10980 uint16 image, T36JTT_20160914T074612_B02.tif
opj_compress time from 17.2s to 15.8s
|
|
- API wise, opj_codec_set_threads() can be used on the encoding side
- opj_compress has a -threads switch similar to opj_uncompress
|
|
window specified in opj_set_decode_area()
|
|
Instead of having the chunk array at the segment level, we can move it down to
the codeblock itself since segments are filled in sequential order.
Limit the number of memory allocation, and decrease slightly the memory usage.
On MAPA_005.jp2
n4: 1871312549 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
n1: 1610689344 0x4E781E7: opj_aligned_malloc (opj_malloc.c:61)
n1: 1610689344 0x4E71D1B: opj_alloc_tile_component_data (tcd.c:676)
n1: 1610689344 0x4E726CF: opj_tcd_init_decode_tile (tcd.c:816)
n1: 1610689344 0x4E4BE39: opj_j2k_read_tile_header (j2k.c:8617)
n1: 1610689344 0x4E4C902: opj_j2k_decode_tiles (j2k.c:10348)
n1: 1610689344 0x4E4E3CE: opj_j2k_decode (j2k.c:7846)
n1: 1610689344 0x4E53002: opj_jp2_decode (jp2.c:1564)
n0: 1610689344 0x40374E: main (opj_decompress.c:1459)
n1: 219232541 0x4E4BC50: opj_j2k_read_tile_header (j2k.c:4683)
n1: 219232541 0x4E4C902: opj_j2k_decode_tiles (j2k.c:10348)
n1: 219232541 0x4E4E3CE: opj_j2k_decode (j2k.c:7846)
n1: 219232541 0x4E53002: opj_jp2_decode (jp2.c:1564)
n0: 219232541 0x40374E: main (opj_decompress.c:1459)
n1: 23893200 0x4E72735: opj_tcd_init_decode_tile (tcd.c:1225)
n1: 23893200 0x4E4BE39: opj_j2k_read_tile_header (j2k.c:8617)
n1: 23893200 0x4E4C902: opj_j2k_decode_tiles (j2k.c:10348)
n1: 23893200 0x4E4E3CE: opj_j2k_decode (j2k.c:7846)
n1: 23893200 0x4E53002: opj_jp2_decode (jp2.c:1564)
n0: 23893200 0x40374E: main (opj_decompress.c:1459)
n0: 17497464 in 52 places, all below massif's threshold (1.00%)
|
|
Currently we allocate at least 8192 bytes for each codeblock, and copy
the relevant parts of the codestream in that per-codeblock buffer as we
decode packets.
As the whole codestream for the tile is ingested in memory and alive
during the decoding, we can directly point to it instead of copying. But
to do that, we need an intermediate concept, a 'chunk' of code-stream segment,
given that segments may be made of data at different places in the code-stream
when quality layers are used.
With that change, the decoding of MAPA_005.jp2 goes down from the previous
improvement of 2.7 GB down to 1.9 GB.
New profile:
n4: 1885648469 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
n1: 1610689344 0x4E78287: opj_aligned_malloc (opj_malloc.c:61)
n1: 1610689344 0x4E71D7B: opj_alloc_tile_component_data (tcd.c:676)
n1: 1610689344 0x4E7272C: opj_tcd_init_decode_tile (tcd.c:816)
n1: 1610689344 0x4E4BDD9: opj_j2k_read_tile_header (j2k.c:8618)
n1: 1610689344 0x4E4C8A2: opj_j2k_decode_tiles (j2k.c:10349)
n1: 1610689344 0x4E4E36E: opj_j2k_decode (j2k.c:7847)
n1: 1610689344 0x4E52FA2: opj_jp2_decode (jp2.c:1564)
n0: 1610689344 0x40374E: main (opj_decompress.c:1459)
n1: 219232541 0x4E4BBF0: opj_j2k_read_tile_header (j2k.c:4685)
n1: 219232541 0x4E4C8A2: opj_j2k_decode_tiles (j2k.c:10349)
n1: 219232541 0x4E4E36E: opj_j2k_decode (j2k.c:7847)
n1: 219232541 0x4E52FA2: opj_jp2_decode (jp2.c:1564)
n0: 219232541 0x40374E: main (opj_decompress.c:1459)
n1: 39822000 0x4E727A9: opj_tcd_init_decode_tile (tcd.c:1219)
n1: 39822000 0x4E4BDD9: opj_j2k_read_tile_header (j2k.c:8618)
n1: 39822000 0x4E4C8A2: opj_j2k_decode_tiles (j2k.c:10349)
n1: 39822000 0x4E4E36E: opj_j2k_decode (j2k.c:7847)
n1: 39822000 0x4E52FA2: opj_jp2_decode (jp2.c:1564)
n0: 39822000 0x40374E: main (opj_decompress.c:1459)
n0: 15904584 in 52 places, all below massif's threshold (1.00%)
|
|
reserved __ (#587)
|
|
|
|
and emit a warning when errors are found
|
|
This saves comparing the current pointer with the end of buffer pointer.
This results at least in tiny speed improvement for raw decoding, and
smaller code size for MQC as well.
This kills the remains of the raw.h/.c files that were only used for
decoding. Encoding using the mqc structure already.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ported from Carl Hetherington work (actually through Matthieu Darbois's port
on top of OpenJPEG 2.1.0)
Can reduce total encoding time by 10-15%
WARNING: VSC mode is not implemented, and so is a temporary regression
that must be fixed.
|
|
|
|
This will remove some conversion warnings
|
|
|
|
Addition flag array such that colflags[1+0] is for state of col=0,row=0..3,
colflags[1+1] for col=1, row=0..3, colflags[1+flags_stride] for col=0,row=4..7, ...
This array avoids too much cache trashing when processing by 4 vertical samples
as done in the various decoding steps.
|
|
We can avoid using a loop-up table with some shift arithmetics.
|
|
|
|
issue 375)
|
|
each file; updated AUTHORS, NEWS
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Update issue 177
|