More Updates regarding Stage 2

In my previous post I attested to the fact that it was showing in the gprof display that every function was running at 100.00%. I forgot to also include the gprof flat file to help textualize things further (also it is a bit easier to read the calls).

spo600stage2.PNG

Now as I stated before it is hard to see how long everything runs when it is all listed at 0. I did run bzip2 correctly, but I have a feeling that bzip2 runs so quickly that the time stamp I included in the previous post has to do with the time it takes to read from the disk, not for the actual functions to operate. However, you can observe how many times functions are called. BZ2_indexInfoF is called 120245 times, BZ2_bzDecompress is called 25 times, same with BZ2_bzread, myfeof, and unRLE_obuf_to_output_SMALL. These are the functions, due to how many times they are run that I should really observe to make changes to.

For the function that is called the most: BZ2_indexInfoF it is a fairly simple function, despite the fact that it is called so many times, I do not believe there is room for any optimization here. I have included the function in case I find a way to further optimize it.

__inline__ Int32 BZ2_indexIntoF ( Int32 indx, Int32 *cftab )
{
 Int32 nb, na, mid;
 nb = 0;
 na = 256;
   do {
     mid = (nb + na) >> 1;
       if (indx >= cftab[mid]) nb = mid; else na = mid;
      }
   while (na - nb !=1);
 return nb;
}

I really do not think it is the case that bzip2 runs so fast, after looking at BZ2_bzDecompress there are close to 50 variables declared in this function, a massive switch statement, close to 20 if statements etc. There are a lot of operations going on, so I think there is some optimization that can be still performed. I may have to simply resort to the time function to measure my optimization.

For bz2_bzRead I could only locate that in the manual.ps file despite the fact there is documentation for it on the bzip2 website, I will have to come back to that.

The myfeof function only consists of four lines, fairly simple, an if statement, a couple declarations and a return statement. I am almost certain there is no room for optimization here, however, I will include it in case a classmate finds a resolution, or I can revisit it at a later date for consideration.

static
Bool myfeof ( FILE* f )
{
   Int32 c = fgetc ( f );
   if (c == EOF) return True;
   ungetc ( c, f );
   return False;
}

ungetc() seems like it does not perform any function as there is a return False statement immediately after, however how the function operates is that if the file (c) is at the EOF it returns true and exits the function, if it does not, it places whatever was put into c(the file), into f(variable pointing to the file).

 

Okay so in the Bool unRLE_obuf_to_output_SMALL ( DState* s ) function which runs 25 times, I was able to find an oddly placed while statement:

while (True) {

/* try to finish existing run */
  while (True) {

   if (s->strm->avail_out==0) return False;
   if (s->state_out_len==0) break;

   *( (UChar*)(s->strm->next_out) ) = s->state_out_ch;

   BZ_UPDATE_CRC ( s->calculatedBlockCRC, s->state_out_ch );

   s->state_out_len--;
   s->strm->next_out++;
   s->strm->avail_out--;
   s->strm->total_out_lo32++;
   if (s->strm->total_out_lo32==0) s->strm->total_out_hi32++;

}

Then there is another closing brace for the first while loop. I thought it was odd to have two loops nested inside one another with the same parameters, so I simply removed the nested while loop, ran make, make install again, then utilized the time.

I decided to run with time, three times compressed, 3 times decompressed (with -d flag). The results are as follows:

For Compression(with my alterations):

run1
real 0m22.131s
user 0m20.853s
sys 0m2.259s

run2
real 0m21.601s
user 0m21.664s
sys 0m0.919s

run3
real 0m22.866s
user 0m21.483s
sys 0m2.380s

For Decompression(with my alterations):

Run1
real 0m7.585s
user 0m5.462s
sys 0m2.117s

Run2
real 0m7.527s
user 0m5.489s
sys 0m2.030s

Run3
real 0m10.740s
user 0m5.379s
sys 0m2.380s

Now to run bipz2 before my alterations. With the time function the results are as follows:

For Compression(without my alterations):

run1
real 0m22.357s
user 0m21.269s
sys 0m1.070s

run2 
real 0m22.024s
user 0m21.047s
sys 0m0.960s

run3
real 0m21.987s
user 0m21.051s
sys 0m0.919s

For Decompression(without my alterations):

run1
real 0m7.515s
user 0m5.180s
sys 0m2.330s

run2
real 0m7.521s
user 0m4.977s
sys 0m2.539s
run3
real 0m7.510s
user 0m5.086s
sys 0m2.419s

I really did not anticipate a difference with decompression, but I did expect a bigger difference with compression, not having to run a while block inside of a while block with the same parameters in the while statement. The file I used again was generated with:

dd if=/dev/zero of=zero bs=1 count=0 seek=1000M

Bzip2 also has a bzcmp to ensure that the files are the same and have not been altered, however since I am using dd I am not even sure if bzcmp will work to ensure the files are remaining intact and my alterations have not incorrectly impacted the code. I have tried the bzcmp with the -l flag which causes the comparison and display to continue to the end. I have also tried with the -s flag which suppresses output and returns a non-zero status if the files differ. However if I run it with either flag, or without any flags I get no output at all so I am going to have to test the validity of the file another way, however I may not even bother because the optimization I have tested is not nearly enough of an optimization to be content with.

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s