Asked Thursday, September 16, 2021, 8:36:02

As bzip2 claims to compress best (in size), I decided to use it. The working server offers 24 (virtual) CPUs (4 real X5650 @ 2.67GHz), so I decided to look for parallel variants.

I'm using Debian stable (sorry, but I found the best matches here on askubuntu), and I decided to take a closer look at pbzip2 and lbzip2.

But which one to select? In current stable, pbzip2 is at version 1.1.1-1 and lbzip2 at version 0.23-1. On version numbers alone that might favor pbzip2, but lbzip2 says it is faster even on single-core computers. On the other hand, pbzip2 claims to be completely compatible with bzip2 v1.0.2.

Additionally, I have some timing values from a big local job:

Using lbzip2



Command being timed: "tar -cjf /tmp/mapleTAsicherung.lbzip2.tar /bin /etc /lib /lib32 /opt /sbin /selinux /usr"
User time (seconds): 2134.32
System time (seconds): 39.24
Percent of CPU this job got: 2099%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:43.51
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1509088
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 1054467
Voluntary context switches: 153901
Involuntary context switches: 235285
Swaps: 0
File system inputs: 0
File system outputs: 3460632
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0


Using pbzip2



Command being timed: "tar -cjf /tmp/mapleTAsicherung.pbzip2.tar /bin /etc /lib /lib32 /opt /sbin /selinux /usr"
User time (seconds): 3158.18
System time (seconds): 59.80
Percent of CPU this job got: 2095%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:33.56
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1436320
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 477683
Voluntary context switches: 151326
Involuntary context switches: 339246
Swaps: 0
File system inputs: 0
File system outputs: 3460536
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0


What should one use? What are the major differences? At the moment I tend towards lbzip2.



 Answers

Here's a basic idea of how to evaluate them.



Take a big tarball of the kind you usually work with. Compress it with bzip2, pbzip2, and lbzip2. Measure the (wall clock) times and save all the outputs in different files. This will give you three times and three file sizes.



Then iterate over all three output files (i.e., the compression outputs of bzip2, pbzip2, and lbzip2), and decompress each with all three utilities (bzip2, pbzip2, and lbzip2). This will give you nine further times.
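The two steps above could be scripted roughly as follows. This is only a sketch under some assumptions: the function names are made up for illustration, GNU date (with %N) is available for millisecond timing, and each utility compresses with -c and decompresses with -dc when fed through stdin.

```shell
# time_compress TOOL INPUT OUTPUT
# Prints "compress TOOL <milliseconds> <bytes>" for one compression run.
time_compress() {
  local tool=$1 input=$2 output=$3 start end
  start=$(date +%s%N)                       # nanoseconds since epoch (GNU date)
  "$tool" -c <"$input" >"$output" || return 1
  end=$(date +%s%N)
  printf 'compress %s %d %d\n' "$tool" "$(( (end - start) / 1000000 ))" "$(wc -c <"$output")"
}

# time_decompress TOOL COMPRESSED
# Prints "decompress TOOL COMPRESSED <milliseconds>"; the output is discarded.
time_decompress() {
  local tool=$1 compressed=$2 start end
  start=$(date +%s%N)
  "$tool" -dc <"$compressed" >/dev/null || return 1
  end=$(date +%s%N)
  printf 'decompress %s %s %d\n' "$tool" "$compressed" "$(( (end - start) / 1000000 ))"
}

# run_matrix INPUT WORKDIR TOOL...
# Compresses INPUT once per tool, then decompresses every result with every tool.
run_matrix() {
  local input=$1 workdir=$2 tool other
  shift 2
  for tool in "$@"; do
    time_compress "$tool" "$input" "$workdir/out.$tool.bz2" || return 1
  done
  for tool in "$@"; do
    for other in "$@"; do
      time_decompress "$other" "$workdir/out.$tool.bz2" || return 1
    done
  done
}
```

For example, run_matrix big.tar /tmp/bench bzip2 pbzip2 lbzip2 (with big.tar and /tmp/bench as placeholder paths) would print three compression lines followed by nine decompression lines. A single run per cell is noisy, so repeating the whole thing a few times is probably wise.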



Re-run the twelve tests under some profiler and get the peak memory usage (virtual and RSS) for each. Again, this will yield 12 values. (If your Linux is configured not to overcommit, then you're interested in VSZ; otherwise you care about RSS.)
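One possible way to collect the RSS figure is GNU time, which prints a "Maximum resident set size (kbytes)" line in its -v report (the same report format shown in the question). The sketch below assumes GNU time is installed as /usr/bin/time; the function names are illustrative. It does not cover VSZ, which GNU time does not report.

```shell
# peak_rss_kb: read a GNU "time -v" report on stdin and print the
# "Maximum resident set size (kbytes)" value.
peak_rss_kb() {
  awk -F': ' '/Maximum resident set size \(kbytes\)/ { print $2 }'
}

# measure_peak_rss REPORT INPUT TOOL [ARGS...]
# Runs TOOL with INPUT on stdin under GNU time, saves the report to REPORT,
# discards the tool's output, and prints the peak RSS in kbytes.
measure_peak_rss() {
  local report=$1 input=$2
  shift 2
  /usr/bin/time -v -o "$report" "$@" <"$input" >/dev/null || return 1
  peak_rss_kb <"$report"
}
```

A call might look like measure_peak_rss /tmp/lbzip2.time big.tar lbzip2 -c, where both paths are placeholders.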



Make a table with 12 rows for these data points -- col1: 3 compressed sizes, col2: 3 compression times / 9 decompression times, col3: 12 peak memory values -- and choose what suits you best. You should factor in how often you compress vs. how often you decompress.
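As a rough illustration of the table and of the compress-vs-decompress weighting, here is a small sketch. The row format ("label size_bytes time_ms peak_rss_kb", with "-" where a column does not apply) and the simple linear weighting are assumptions made for this example, not a prescribed method.

```shell
# print_table: read "label size_bytes time_ms peak_rss_kb" rows on stdin
# and print them as aligned columns under a header line.
print_table() {
  local label size ms rss
  printf '%-28s %14s %10s %12s\n' run size_bytes time_ms peak_rss_kb
  while read -r label size ms rss; do
    printf '%-28s %14s %10s %12s\n' "$label" "$size" "$ms" "$rss"
  done
}

# weighted_cost_ms COMPRESS_MS DECOMPRESS_MS N_COMPRESS N_DECOMPRESS
# Total expected time if you compress N_COMPRESS times and
# decompress N_DECOMPRESS times (plain linear weighting).
weighted_cost_ms() {
  echo $(( $1 * $3 + $2 * $4 ))
}
```

With such a weighting, a tool that compresses a bit more slowly can still come out ahead if you decompress the archive far more often than you create it.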



I use lbzip2-0.23, but I wrote it, so it doesn't count.



Finally, no matter which one proves best for you, always save a checksum of the uncompressed tarball, plus verify your saved file before declaring the backup "done".



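FILES=...
OUTDIR=/mnt/archive
BZ2_UTIL=...

# Archive, checksum the uncompressed tar stream, compress, and save.
# Requires bash (for the >(...) process substitution) and pv.
(
  tar -c -- $FILES \
  | tee >(sha256sum >"$OUTDIR"/myfiles.tar.sha256) \
  | pv -c -N plain 2>/dev/tty \
  | "$BZ2_UTIL" \
  | pv -c -N compr 2>/dev/tty \
  >"$OUTDIR"/myfiles.tar.bz2
) 2>"$OUTDIR"/myfiles.err

# Verify: decompress the saved file and check it against the stored checksum.
"$BZ2_UTIL" -dc -- "$OUTDIR"/myfiles.tar.bz2 \
| sha256sum -c -- "$OUTDIR"/myfiles.tar.sha256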

Answered Saturday, September 18, 2021
tudatchful