SPO600 Stage 3

The Culmination of the Course:

It has been a while since my last blog post due to exam week being the same week that all the assignments are due, oppose to have a separate exam week appended onto the end of the semester so myself and many others were even busier than usual.

I have been able to run some tests, analyze some more code and do some more research but have unfortunately not had the time to blog it.

So as a quick refresher from the previous Blog post for Stage 2, I was able to get gprof as well as the display working, however was getting odd results, as displayed here:

spo600stage2

As well as the gprof display benchmark:

gprofspo600

I discussed how the run time, culmination seconds and self sections was 0.00 despite the program having a runtime of roughly 0m20.000s. I also discussed as to why it showed every function at 100%. I looked into why this may be the case and concluded it might be the case for a couple different reasons, one explanation I found online was from a gentleman who stated that: “This is due to the fact  that Linux uses a libc compatibility routine to fake the profile() system call for gprof. This needs the SIGPROF-based ITIMER_PROF interval timer but should generally work.” The other explanation is that the program runs so fast that the 20 odd seconds it takes to finish running the program is due to the disk read speed, an idea that was suggest by my professor and I believe it so be the case with bzip2.

I mentioned the 5 functions that run the most amount of times in the program: BZ2_ indexIntoF(120245), BZ2_bzDecompress(25), BZ2_bzRead(25), myfeof(25) and unRLE_obuf_to_output_SMALL(25). I pasted the smaller functions in my code, and the larger functions I simply described their make up. In my last post I also discussed the one optimization I could find in one of the 5 functions called multiple times. I omitted an extra nested while loop from unRLE_obuf_to_output_SMALL (which runs 5 times), however not only did I get the same benchmarking results from gprof, but upon further analysis I realized that the nested while needs to be in place to allow for data integrity during compression. The section of the function I am referring to is this:

while (True) { 
/* try to finish existing run */ 
 while (True) { 

   if (s->strm->avail_out==0) return False; 
   if (s->state_out_len==0) break; 
   *( (UChar*)(s->strm->next_out) ) = s->state_out_ch; 
   BZ_UPDATE_CRC ( s->calculatedBlockCRC, s->state_out_ch ); 
   s->state_out_len--; s->strm->next_out++; s->strm->avail_out--; 
   s->strm->total_out_lo32++; if (s->strm->total_out_lo32==0) s->strm->total_out_hi32++; 
}

I have come to the conclusion that this program cannot be further optimized with changes to a few lines here and there, you need to make substantial changes to the core compression algorithm. I thought I may have been able to optimize bzip2 as it states on the wikipedia site “bzip2 compresses most files more effectively than the older LZW (.Z) and Deflate (.zip and .gz) compression algorithms, but is considerably slower. LZMA is generally more space-efficient than bzip2 at the expense of even slower compression speed, while having much faster decompression.” I was under the assumption this did mean that I would be able to further optimize the program, however upon further investigation, I would have to change their entire compression algorithm.

I discovered an indepth article that supports my point.  https://www.rootusers.com/gzip-vs-bzip2-vs-xz-performance-comparison/ “Gzip vs Bzip2 vs XZ Performance Comparison” is the article and it does just that, it benchmarks, compares the compression rates, and runtimes of the three compression programs. The compression speed of the three programs is as shown below:

Capture.PNG

It seems that xz is the better choice, however that is the speed of compression at the higher levels, not the time it takes to Compress which is displayed here:Capture1.PNG

While at the 3+ levels of compression xz might be faster, it still takes more time to actually compress, and it is the opposite for gzip, the compression is slower, however it takes less time to run than bzip2. If we now delve into decompression we will see something similar:

Capture2

The decompression time for bzip2 seems to be the worst of the three, however the decompression for bzip seems to perform the best of the 3:

Capture3

The conclusions of the article from its benchmarking and testing are that bzip2 is a good middle ground for compression, bzip is only a little faster while xz may not be worth it at its higher default compression ratio of 6 as it takes ,uch longer to complete for a little extra gain. There are faster compression algorithms than bzip that is for sure, however to effectively optimize the program to make a notable difference I would have to edit the well documented Burrows-Wheeler algorithm. While on the other hand a company like gzip which does notable perform faster utilizes DEFLATE, a completely different compression algorithm that utilizes the LZ77 algorithm and Huffman coding.

Further research reveals that bzip2 is implemented by many programs such as: 7-zip, micro-bzip2, PBZIP2, bzip2smp, smpbzip2, pyflate, bz2, arnaud Bouchez BZip, Ibzip2, mpibzip2, Apache Commons Compress, jbzip2, DotNetZip, DotNetCompression(this is a streaming implementation of bzip2 in managed C# that conforms to the API of System.IO.Compression and includes assemblies for.NET Framework, .NET Compact Framework, Xamarin.iOS, Xamarin.Android, Xamarin.Mac, Windows Phone, Xbox 360, Silverlight, Mono as well as a Portable Class Library). It is safe to say that it is fairly popular, possibly due to the software itself, due to its reliable compression, or the fact that it has a BSD-style license and is patent free.

I researched how to upstream regarding bzip2, and I found the only way is to contact Julian Seward at jseward@bzip.org (from github documentation) or julian@bzip.org (from bzip2 website) who owns the copyright to bzip2 as of September 2010. I have emailed him regarding my benchmarking, research, and results. In hopes to open lines of communication, and hopefully he can point me into a certain direction of the code that is in need of optimization that I could not find. Or perhaps even a section of code that could use some more substantial reworking, see that I have now finished the semester, it would be good practice to keep coding and getting involved in optimizing an open source project would be great for that.

Reflection:

The project process was the simple 3 stages, however I would have liked for Stage 1 to be longer, to allow us more time to do research and find a project not only in need of optimization, but also to allow us the time to properly benchmark the software. I believe falling behind in Stage 1 benchmarking  led me to many mistakes later, seeing as I has to somewhat rush and find an open source software to optimize. That being said, I did spend a lot of time to try and properly benchmark the software using gprof, as well as the display for gprof, and while I was able to get it to work, I was still unsure of many things as I utilized gprof benchmarking with two pieces of software: groonga as well as bzip2.

I did feel this project was a great learning experience though, to get us to delve into patching and developing in open source. Finding software, functions and code for optimization overlooked by other people, get us to benchmarking software to see where there are deficiencies and areas to improve in code. The idea to get students to try and implement optimizations in already widely used code I feel is a great learning tool as in the industry we will have to keep these optimization methods in mind to ensure the efficiency of our code. I am not happy with the fact that I fell a bit behind with the project, and was not able to find areas to optimize as I would have liked to noticeably optimized software and get it to the upstream, however I still feel it was a good learning experience overall.

-Adam C

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s