001/*
002 * $RCSfile: MQCoder.java,v $
003 * $Revision: 1.1 $
004 * $Date: 2005/02/11 05:02:09 $
005 * $State: Exp $
006 *
007 * Class:                   MQCoder
008 *
009 * Description:             Class that encodes a number of bits using the
010 *                          MQ arithmetic coder
011 *
012 *
013 *                          Diego SANTA CRUZ, Jul-26-1999 (improved speed)
014 *
015 * COPYRIGHT:
016 *
017 * This software module was originally developed by Raphaël Grosbois and
018 * Diego Santa Cruz (Swiss Federal Institute of Technology-EPFL); Joel
019 * Askelöf (Ericsson Radio Systems AB); and Bertrand Berthelot, David
020 * Bouchard, Félix Henry, Gerard Mozelle and Patrice Onno (Canon Research
021 * Centre France S.A) in the course of development of the JPEG2000
022 * standard as specified by ISO/IEC 15444 (JPEG 2000 Standard). This
023 * software module is an implementation of a part of the JPEG 2000
024 * Standard. Swiss Federal Institute of Technology-EPFL, Ericsson Radio
025 * Systems AB and Canon Research Centre France S.A (collectively JJ2000
026 * Partners) agree not to assert against ISO/IEC and users of the JPEG
027 * 2000 Standard (Users) any of their rights under the copyright, not
028 * including other intellectual property rights, for this software module
029 * with respect to the usage by ISO/IEC and Users of this software module
030 * or modifications thereof for use in hardware or software products
031 * claiming conformance to the JPEG 2000 Standard. Those intending to use
032 * this software module in hardware or software products are advised that
033 * their use may infringe existing patents. The original developers of
034 * this software module, JJ2000 Partners and ISO/IEC assume no liability
035 * for use of this software module or modifications thereof. No license
036 * or right to this software module is granted for non JPEG 2000 Standard
037 * conforming products. JJ2000 Partners have full right to use this
038 * software module for his/her own purpose, assign or donate this
039 * software module to any third party and to inhibit third parties from
040 * using this software module for non JPEG 2000 Standard conforming
041 * products. This copyright notice must be included in all copies or
042 * derivative works of this software module.
043 *
044 * Copyright (c) 1999/2000 JJ2000 Partners.
045 * */
046package jj2000.j2k.entropy.encoder;
047
048import jj2000.j2k.entropy.encoder.*;
049import jj2000.j2k.entropy.*;
050import jj2000.j2k.util.*;
051import jj2000.j2k.*;
052
053import java.io.*;
054
055/**
056 * This class implements the MQ arithmetic coder. When initialized a specific
057 * state can be specified for each context, which may be adapted to the
058 * probability distribution that is expected for that context.
059 *
060 * <P>The type of length calculation and termination can be chosen at
061 * construction time.
062 *
063 * ---- Tricks that have been tried to improve speed ----
064 *
065 * 1) Merging Qe and mPS and doubling the lookup tables
066 *
067 * Merge the mPS into Qe, as the sign bit (if Qe>=0 the sense of MPS is 0, if
068 * Qe<0 the sense is 1), and double the lookup tables. The first half of the
069 * lookup tables correspond to Qe>=0 (i.e. the sense of MPS is 0) and the
070 * second half to Qe<0 (i.e. the sense of MPS is 1). The nLPS lookup table is
071 * modified to incorporate the changes in the sense of MPS, by making it jump
072 * from the first to the second half and vice-versa, when a change is
073 * specified by the swicthLM lookup table. See JPEG book, section 13.2, page
074 * 225.
075 *
076 * There is NO speed improvement in doing this, actually there is a slight
077 * decrease, probably due to the fact that often Q has to be negated. Also the
078 * fact that a brach of the type "if (bit==mPS[li])" is replaced by two
079 * simpler braches of the type "if (bit==0)" and "if (q<0)" may contribute to
080 * that.
081 *
082 * 2) Removing cT
083 *
084 * It is possible to remove the cT counter by setting a flag bit in the high
085 * bits of the C register. This bit will be automatically shifted left
086 * whenever a renormalization shift occurs, which is equivalent to decreasing
087 * cT. When the flag bit reaches the sign bit (leftmost bit), which is
088 * equivalenet to cT==0, the byteOut() procedure is called. This test can be
089 * done efficiently with "c<0" since C is a signed quantity. Care must be
090 * taken in byteOut() to reset the bit in order to not interfere with other
091 * bits in the C register. See JPEG book, page 228.
092 *
093 * There is NO speed improvement in doing this. I don't really know why since
094 * the number of operations whenever a renormalization occurs is
095 * decreased. Maybe it is due to the number of extra operations in the
096 * byteOut(), terminate() and getNumCodedBytes() procedures.
097 *
098 *
099 * 3) Change the convention of MPS and LPS.
100 *
101 * Making the LPS interval be above the MPS interval (MQ coder convention is
102 * the opposite) can reduce the number of operations along the MPS path. In
103 * order to generate the same bit stream as with the MQ convention the output
104 * bytes need to be modified accordingly. The basic rule for this is that C =
105 * (C'^0xFF...FF)-A, where C is the codestream for the MQ convention and C' is
106 * the codestream generated by this other convention. Note that this affects
107 * bit-stuffing as well.
108 *
109 * This has not been tested yet.
110 *
111 * 4) Removing normalization while loop on MPS path
112 *
113 * Since in the MPS path Q is guaranteed to be always greater than 0x4000
114 * (decimal 0.375) it is never necessary to do more than 1 renormalization
115 * shift. Therefore the test of the while loop, and the loop itself, can be
116 * removed.
117 *
118 * 5) Simplifying test on A register
119 *
120 * Since A is always less than or equal to 0xFFFF, the test "(a & 0x8000)==0"
121 * can be replaced by the simplete test "a < 0x8000". This test is simpler in
122 * Java since it involves only 1 operation (although the original test can be
123 * converted to only one operation by  smart Just-In-Time compilers)
124 *
125 * This change has been integrated in the decoding procedures.
126 *
127 * 6) Speedup mode
128 *
129 * Implemented a method that uses the speedup mode of the MQ-coder if
130 * possible. This should greately improve performance when coding long runs of
131 * MPS symbols that have high probability. However, to take advantage of this,
132 * the entropy coder implementation has to explicetely use it. The generated
133 * bit stream is the same as if no speedup mode would have been used.
134 *
135 * Implemented but performance not tested yet.
136 *
137 * 7) Multiple-symbol coding
138 *
139 * Since the time spent in a method call is non-negligable, coding several
140 * symbols with one method call reduces the overhead per coded symbol. The
141 * decodeSymbols() method implements this. However, to take advantage of it,
142 * the implementation of the entropy coder has to explicitely use it.
143 *
144 * Implemented but performance not tested yet.
145 *  */
146public class MQCoder {
147
148    /** Identifier for the lazy length calculation. The lazy length
149     * calculation is not optimal but is extremely simple. */
150    public static final int LENGTH_LAZY = 0;
151
152    /** Identifier for a very simple length calculation. This provides better
153     * results than the 'LENGTH_LAZY' computation. This is the old length
154     * calculation that was implemented in this class. */
155    public static final int LENGTH_LAZY_GOOD = 1;
156
157    /** Identifier for the near optimal length calculation. This calculation
158     * is more complex than the lazy one but provides an almost optimal length
159     * calculation. */
160    public static final int LENGTH_NEAR_OPT = 2;
161
162    /** The identifier fort the termination that uses a full flush. This is
163     * the less efficient termination. */
164    public static final int TERM_FULL = 0;
165
166    /** The identifier for the termination that uses the near optimal length
167     * calculation to terminate the arithmetic codewrod */
168    public static final int TERM_NEAR_OPT = 1;
169
170    /** The identifier for the easy termination that is simpler than the
171     * 'TERM_NEAR_OPT' one but slightly less efficient. */
172    public static final int TERM_EASY = 2;
173
174    /** The identifier for the predictable termination policy for error
175     * resilience. This is the same as the 'TERM_EASY' one but an special
176     * sequence of bits is embodied in the spare bits for error resilience
177     * purposes. */
178    public static final int TERM_PRED_ER = 3;
179
180    /** The data structures containing the probabilities for the LPS */
181    final static
182        int qe[]={0x5601, 0x3401, 0x1801, 0x0ac1, 0x0521, 0x0221, 0x5601,
183                  0x5401, 0x4801, 0x3801, 0x3001, 0x2401, 0x1c01, 0x1601,
184                  0x5601, 0x5401, 0x5101, 0x4801, 0x3801, 0x3401, 0x3001,
185                  0x2801, 0x2401, 0x2201, 0x1c01, 0x1801, 0x1601, 0x1401,
186                  0x1201, 0x1101, 0x0ac1, 0x09c1, 0x08a1, 0x0521, 0x0441,
187                  0x02a1, 0x0221, 0x0141, 0x0111, 0x0085, 0x0049, 0x0025,
188                  0x0015, 0x0009, 0x0005, 0x0001, 0x5601 };
189
190    /** The indexes of the next MPS */
191    final static
192        int nMPS[]={ 1 , 2, 3, 4, 5,38, 7, 8, 9,10,11,12,13,29,15,16,17,
193                     18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,
194                     35,36,37,38,39,40,41,42,43,44,45,45,46 };
195
196    /** The indexes of the next LPS */
197    final static
198        int nLPS[]={ 1 , 6, 9,12,29,33, 6,14,14,14,17,18,20,21,14,14,15,
199                     16,17,18,19,19,20,21,22,23,24,25,26,27,28,29,30,31,
200                     32,33,34,35,36,37,38,39,40,41,42,43,46 };
201
202    /** Whether LPS and MPS should be switched */
203    final static        // at indices 0, 6, and 14 we switch
204        int switchLM[]={ 1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,
205                         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 };
206    // Having ints proved to be more efficient than booleans
207
208    /** The ByteOutputBuffer used to write the compressed bit stream. */
209    ByteOutputBuffer out;
210
211    /** The current most probable signal for each context */
212    int[] mPS;
213
214    /** The current index of each context */
215    int[] I;
216
217    /** The current bit code */
218    int c;
219
220    /** The bit code counter */
221    int cT;
222
223    /** The current interval */
224    int a;
225
226    /** The last encoded byte of data */
227    int b;
228
229    /** If a 0xFF byte has been delayed and not yet been written to the output
230     * (in the MQ we can never have more than 1 0xFF byte in a row). */
231    boolean delFF;
232
233    /** The number of written bytes so far, excluding any delayed 0xFF
234     * bytes. Upon initialization it is -1 to indicated that the byte buffer
235     * 'b' is empty as well. */
236    int nrOfWrittenBytes = -1;
237
238    /** The initial state of each context */
239    int initStates[];
240
241    /** The termination type to use. One of 'TERM_FULL', 'TERM_NEAR_OPT',
242     * 'TERM_EASY' or 'TERM_PRED_ER'. */
243    int ttype;
244
245    /** The length calculation type to use. One of 'LENGTH_LAZY',
246     * 'LENGTH_LAZY_GOOD', 'LENGTH_NEAR_OPT'. */
247    int ltype;
248
249    /** Saved values of the C register. Used for the LENGTH_NEAR_OPT length
250     * calculation. */
251    int savedC[];
252
253    /** Saved values of CT counter. Used for the LENGTH_NEAR_OPT length
254     * calculation. */
255    int savedCT[];
256
257    /** Saved values of the A register. Used for the LENGTH_NEAR_OPT length
258     * calculation. */
259    int savedA[];
260
261    /** Saved values of the B byte buffer. Used for the LENGTH_NEAR_OPT length
262     * calculation. */
263    int savedB[];
264
265    /** Saved values of the delFF (i.e. delayed 0xFF) state. Used for the
266     * LENGTH_NEAR_OPT length calculation. */
267    boolean savedDelFF[];
268
269    /** Number of saved states. Used for the LENGTH_NEAR_OPT length
270     * calculation. */
271    int nSaved;
272
273    /** The initial length of the arrays to save sates */
274    final static int SAVED_LEN = 32*StdEntropyCoderOptions.NUM_PASSES;
275
276    /** The increase in length for the arrays to save states */
277    final static int SAVED_INC = 4*StdEntropyCoderOptions.NUM_PASSES;
278
279    /**
280     * Set the length calculation type to the specified type
281     *
282     * @param ltype The type of length calculation to use. One of
283     * 'LENGTH_LAZY', 'LENGTH_LAZY_GOOD' or 'LENGTH_NEAR_OPT'.
284     * */
285    public void setLenCalcType(int ltype){
286        // Verify the ttype and ltype
287        if (ltype != LENGTH_LAZY && ltype != LENGTH_LAZY_GOOD &&
288            ltype != LENGTH_NEAR_OPT) {
289            throw new IllegalArgumentException("Unrecognized length "+
290                                               "calculation type code: "+ltype);
291        }
292
293        if(ltype == LENGTH_NEAR_OPT){
294            if(savedC==null)
295                savedC = new int[SAVED_LEN];
296            if(savedCT==null)
297                savedCT = new int[SAVED_LEN];
298            if(savedA==null)
299                savedA = new int[SAVED_LEN];
300            if(savedB==null)
301                savedB = new int[SAVED_LEN];
302            if(savedDelFF==null)
303                savedDelFF = new boolean[SAVED_LEN];
304        }
305
306        this.ltype = ltype;
307    }
308
309    /**
310     * Set termination type to the specified type
311     *
312     * @param ttype The type of termination to use. One of 'TERM_FULL',
313     * 'TERM_NEAR_OPT', 'TERM_EASY' or 'TERM_PRED_ER'.
314     * */
315    public void setTermType(int ttype){
316        if (ttype != TERM_FULL && ttype != TERM_NEAR_OPT &&
317            ttype != TERM_EASY && ttype != TERM_PRED_ER ) {
318            throw new IllegalArgumentException("Unrecognized termination type "+
319                                               "code: "+ttype);
320        }
321
322        this.ttype = ttype;
323    }
324
325    /**
326     * Instantiates a new MQ-coder, with the specified number of contexts and
327     * initial states. The compressed bytestream is written to the 'oStream'
328     * object.
329     *
330     * @param oStream where to output the compressed data
331     *
332     * @param nrOfContexts The number of contexts used
333     *
334     * @param init The initial state for each context. A reference is kept to
335     * this array to reinitialize the contexts whenever 'reset()' or
336     * 'resetCtxts()' is called.
337     * */
338    public MQCoder(ByteOutputBuffer oStream, int nrOfContexts, int init[]){
339        out = oStream;
340
341        // --- INITENC
342
343        // Default initialization of the statistics bins is MPS=0 and
344        // I=0
345        I=new int[nrOfContexts];
346        mPS=new int[nrOfContexts];
347        initStates = init;
348
349        a=0x8000;
350        c=0;
351        if(b==0xFF)
352            cT=13;
353        else
354            cT=12;
355
356        resetCtxts();
357
358        // End of INITENC ---
359
360        b=0;
361    }
362
363    /**
364     * This method performs the coding of the symbol 'bit', using context
365     * 'ctxt', 'n' times, using the MQ-coder speedup mode if possible.
366     *
367     * <P>If the symbol 'bit' is the current more probable symbol (MPS) and
368     * qe[ctxt]<=0x4000, and (A-0x8000)>=qe[ctxt], speedup mode will be
369     * used. Otherwise the normal mode will be used. The speedup mode can
370     * significantly improve the speed of arithmetic coding when several MPS
371     * symbols, with a high probability distribution, must be coded with the
372     * same context. The generated bit stream is the same as if the normal mode
373     * was used.
374     *
375     * <P>This method is also faster than the 'codeSymbols()' and
376     * 'codeSymbol()' ones, for coding the same symbols with the same context
377     * several times, when speedup mode can not be used, although not
378     * significantly.
379     *
380     * @param bit The symbol do code, 0 or 1.
381     *
382     * @param ctxt The context to us in coding the symbol
383     *
384     * @param n The number of times that the symbol must be coded.
385     * */
386    public final void fastCodeSymbols(int bit, int ctxt, int n) {
387        int q;  // cache for context's Qe
388        int la; // cache for A register
389        int nc; // counter for renormalization shifts
390        int ns; // the maximum length of a speedup mode run
391        int li; // cache for I[ctxt]
392
393        li = I[ctxt]; // cache current index
394        q=qe[li];     // retrieve current LPS prob.
395
396        if ((q <= 0x4000) && (bit == mPS[ctxt])
397            && ((ns = (a-0x8000)/q+1) > 1)) { // Do speed up mode
398            // coding MPS, no conditional exchange can occur and
399            // speedup mode is possible for more than 1 symbol
400            do { // do as many speedup runs as necessary
401                if (n <= ns) { // All symbols in this run
402                    // code 'n' symbols
403                    la = n*q; // accumulated Q
404                    a -= la;
405                    c += la;
406                    if (a >= 0x8000) { // no renormalization
407                        I[ctxt] = li;  // save the current state
408                        return; // done
409                    }
410                    I[ctxt] = nMPS[li]; // goto next state and save it
411                    // -- Renormalization (MPS: no need for while loop)
412                    a <<= 1; // a is doubled
413                    c <<= 1; // c is doubled
414                    cT--;
415                    if(cT==0) {
416                        byteOut();
417                    }
418                    // -- End of renormalization
419                    return; // done
420                }
421                else { // Not all symbols in this run
422                    // code 'ns' symbols
423                    la = ns*q; // accumulated Q
424                    c += la;
425                    a -= la;
426                    // cache li and q for next iteration
427                    li = nMPS[li];
428                    q = qe[li]; // New q is always less than current one
429                    // new I[ctxt] is stored in last run
430                    // Renormalization always occurs since we exceed 'ns'
431                    // -- Renormalization (MPS: no need for while loop)
432                    a <<= 1; // a is doubled
433                    c <<= 1; // c is doubled
434                    cT--;
435                    if(cT==0) {
436                        byteOut();
437                    }
438                    // -- End of renormalization
439                    n -= ns; // symbols left to code
440                    ns = (a-0x8000)/q+1; // max length of next speedup run
441                    continue; // goto next iteration
442                }
443            } while (n>0);
444        } // end speed up mode
445        else { // No speedup mode
446            // Either speedup mode is not possible or not worth doing it
447            // because of probable conditional exchange
448            // Code everything as in normal mode
449            la = a;       // cache A register in local variable
450            do {
451                if (bit == mPS[ctxt]) { // -- code MPS
452                    la -= q; // Interval division associated with MPS coding
453                    if(la>=0x8000){ // Interval big enough
454                        c += q;
455                    }
456                    else { // Interval too short
457                        if(la<q) // Probabilities are inverted
458                            la = q;
459                        else
460                            c += q;
461                        // cache new li and q for next iteration
462                        li = nMPS[li];
463                        q = qe[li];
464                        // new I[ctxt] is stored after end of loop
465                        // -- Renormalization (MPS: no need for while loop)
466                        la <<= 1; // a is doubled
467                        c <<= 1;  // c is doubled
468                        cT--;
469                        if(cT==0) {
470                            byteOut();
471                        }
472                        // -- End of renormalization
473                    }
474                }
475                else { // -- code LPS
476                    la -= q; // Interval division according to LPS coding
477                    if(la<q)
478                        c += q;
479                    else
480                        la = q;
481                    if(switchLM[li]!=0){
482                        mPS[ctxt]=1-mPS[ctxt];
483                    }
484                    // cache new li and q for next iteration
485                    li = nLPS[li];
486                    q = qe[li];
487                    // new I[ctxt] is stored after end of loop
488                    // -- Renormalization
489                    // sligthly better than normal loop
490                    nc = 0;
491                    do {
492                        la <<= 1;
493                        nc++; // count number of necessary shifts
494                    } while (la<0x8000);
495                    if (cT > nc) {
496                        c <<= nc;
497                        cT -= nc;
498                    }
499                    else {
500                        do {
501                            c <<= cT;
502                            nc -= cT;
503                            // cT = 0; // not necessary
504                            byteOut();
505                        } while (cT <= nc);
506                        c <<= nc;
507                        cT -= nc;
508                    }
509                    // -- End of renormalization
510                }
511                n--;
512            } while (n>0);
513            I[ctxt] = li; // store new I[ctxt]
514            a = la; // save cached A register
515        }
516    }
517
518    /**
519     * This function performs the arithmetic encoding of several symbols
520     * together. The function receives an array of symbols that are to be
521     * encoded and an array containing the contexts with which to encode them.
522     *
523     * <P>The advantage of using this function is that the cost of the method
524     * call is amortized by the number of coded symbols per method call.
525     *
526     * <P>Each context has a current MPS and an index describing what the
527     * current probability is for the LPS. Each bit is encoded and if the
528     * probability of the LPS exceeds .5, the MPS and LPS are switched.
529     *
530     * @param bits An array containing the symbols to be encoded. Valid
531     * symbols are 0 and 1.
532     *
533     * @param cX The context for each of the symbols to be encoded
534     *
535     * @param n The number of symbols to encode.
536     * */
537    public final void codeSymbols(int[] bits, int[] cX, int n){
538        int q;
539        int li; // local cache of I[context]
540        int la;
541        int nc;
542        int ctxt; // context of current symbol
543        int i; // counter
544
545        // NOTE: here we could use symbol aggregation to speed things up.
546        // It remains to be studied.
547
548        la = a; // cache A register in local variable
549        for(i=0;i<n;i++){
550            // NOTE: (a < 0x8000) is equivalent to ((a & 0x8000)==0)
551            // since 'a' is always less than or equal to 0xFFFF
552
553            // NOTE: conditional exchange guarantees that A for MPS is
554            // always greater than 0x4000 (i.e. 0.375)
555            // => one renormalization shift is enough for MPS
556            // => no need to do a renormalization while loop for MPS
557
558            ctxt = cX[i];
559            li = I[ctxt];
560            q=qe[li]; // Retrieve current LPS prob.
561
562            if(bits[i]==mPS[ctxt]){ // -- Code MPS
563
564                la -= q; // Interval division associated with MPS coding
565
566                if(la>=0x8000){ // Interval big enough
567                    c += q;
568                }
569                else { // Interval too short
570                    if(la<q) // Probabilities are inverted
571                        la = q;
572                    else
573                        c += q;
574
575                    I[ctxt]=nMPS[li];
576
577                    // -- Renormalization (MPS: no need for while loop)
578                    la <<= 1; // a is doubled
579                    c <<= 1; // c is doubled
580                    cT--;
581                    if(cT==0) {
582                        byteOut();
583                    }
584                    // -- End of renormalization
585                }
586            }// End Code MPS --
587            else{ // -- Code LPS
588                la -= q; // Interval division according to LPS coding
589
590                if(la<q)
591                    c += q;
592                else
593                    la = q;
594                if(switchLM[li]!=0){
595                    mPS[ctxt]=1-mPS[ctxt];
596                }
597                I[ctxt]=nLPS[li];
598
599                // -- Renormalization
600
601                // sligthly better than normal loop
602                nc = 0;
603                do {
604                    la <<= 1;
605                    nc++; // count number of necessary shifts
606                } while (la<0x8000);
607                if (cT > nc) {
608                    c <<= nc;
609                    cT -= nc;
610                }
611                else {
612                    do {
613                        c <<= cT;
614                        nc -= cT;
615                        // cT = 0; // not necessary
616                        byteOut();
617                    } while (cT <= nc);
618                    c <<= nc;
619                    cT -= nc;
620                }
621
622                // -- End of renormalization
623            }
624        }
625        a = la; // save cached A register
626    }
627
628
629    /**
630     * This function performs the arithmetic encoding of one symbol. The
631     * function receives a bit that is to be encoded and a context with which
632     * to encode it.
633     *
634     * <P>Each context has a current MPS and an index describing what the
635     * current probability is for the LPS. Each bit is encoded and if the
636     * probability of the LPS exceeds .5, the MPS and LPS are switched.
637     *
638     * @param bit The symbol to be encoded, must be 0 or 1.
639     *
640     * @param context the context with which to encode the symbol.
641     * */
642    public final void codeSymbol(int bit, int context){
643        int q;
644        int li; // local cache of I[context]
645        int la;
646        int n;
647
648        // NOTE: (a < 0x8000) is equivalent to ((a & 0x8000)==0)
649        // since 'a' is always less than or equal to 0xFFFF
650
651        // NOTE: conditional exchange guarantees that A for MPS is
652        // always greater than 0x4000 (i.e. 0.375)
653        // => one renormalization shift is enough for MPS
654        // => no need to do a renormalization while loop for MPS
655
656        li = I[context];
657        q=qe[li]; // Retrieve current LPS prob.
658
659        if(bit==mPS[context]){// -- Code MPS
660
661            a -= q; // Interval division associated with MPS coding
662
663            if(a>=0x8000){ // Interval big enough
664                c += q;
665            }
666            else { // Interval too short
667                if(a<q) // Probabilities are inverted
668                    a = q;
669                else
670                    c += q;
671
672                I[context]=nMPS[li];
673
674                // -- Renormalization (MPS: no need for while loop)
675                a <<= 1; // a is doubled
676                c <<= 1; // c is doubled
677                cT--;
678                if(cT==0) {
679                    byteOut();
680                }
681                // -- End of renormalization
682            }
683        }// End Code MPS --
684        else{ // -- Code LPS
685
686            la = a; // cache A register in local variable
687            la -= q; // Interval division according to LPS coding
688
689            if(la<q)
690                c += q;
691            else
692                la = q;
693            if(switchLM[li]!=0){
694                mPS[context]=1-mPS[context];
695            }
696            I[context]=nLPS[li];
697
698            // -- Renormalization
699
700            // sligthly better than normal loop
701            n = 0;
702            do {
703                la <<= 1;
704                n++; // count number of necessary shifts
705            } while (la<0x8000);
706            if (cT > n) {
707                c <<= n;
708                cT -= n;
709            }
710            else {
711                do {
712                    c <<= cT;
713                    n -= cT;
714                    // cT = 0; // not necessary
715                    byteOut();
716                } while (cT <= n);
717                c <<= n;
718                cT -= n;
719            }
720
721            // -- End of renormalization
722            a = la; // save cached A register
723        }
724    }
725
726    /**
727     * This function puts one byte of compressed bits in the out out stream.
728     * the highest 8 bits of c are then put in b to be the next byte to
729     * write. This method delays the output of any 0xFF bytes until a non 0xFF
730     * byte has to be written to the output bit stream (the 'delFF' variable
731     * signals if there is a delayed 0xff byte).
732     * */
733    private void byteOut(){
734        if(nrOfWrittenBytes >= 0){
735            if(b==0xFF){
736                // Delay 0xFF byte
737                delFF = true;
738                b=c>>>20;
739                c &= 0xFFFFF;
740                cT=7;
741            }
742            else if(c < 0x8000000){
743                // Write delayed 0xFF bytes
744                if (delFF) {
745                    out.write(0xFF);
746                    delFF = false;
747                    nrOfWrittenBytes++;
748                }
749                out.write(b);
750                nrOfWrittenBytes++;
751                b=c>>>19;
752                c &= 0x7FFFF;
753                cT=8;
754            }
755            else{
756                b++;
757                if(b==0xFF){
758                    // Delay 0xFF byte
759                    delFF = true;
760                    c &= 0x7FFFFFF;
761                    b=c>>>20;
762                    c &= 0xFFFFF;
763                    cT=7;
764                }
765                else{
766                    // Write delayed 0xFF bytes
767                    if (delFF) {
768                        out.write(0xFF);
769                        delFF = false;
770                        nrOfWrittenBytes++;
771                    }
772                    out.write(b);
773                    nrOfWrittenBytes++;
774                    b=((c>>>19)&0xFF);
775                    c &= 0x7FFFF;
776                    cT=8;
777                }
778            }
779        }
780        else {
781            // NOTE: carry bit can never be set if the byte buffer was empty
782            b= (c>>>19);
783            c &= 0x7FFFF;
784            cT=8;
785            nrOfWrittenBytes++;
786        }
787    }
788
789    /**
790     * This function flushes the remaining encoded bits and makes sure that
791     * enough information is written to the bit stream to be able to finish
792     * decoding, and then it reinitializes the internal state of the MQ coder
793     * but without modifying the context states.
794     *
795     * <P>After calling this method the 'finishLengthCalculation()' method
796     * should be called, after cmopensating the returned length for the length
797     * of previous coded segments, so that the length calculation is finalized.
798     *
799     * <P>The type of termination used depends on the one specified at the
800     * constructor.
801     *
802     * @return The length of the arithmetic codeword after termination, in
803     * bytes.
804     * */
805    public int terminate(){
806        switch (ttype) {
807        case TERM_FULL:
808            //sets the remaining bits of the last byte of the coded bits.
809            int tempc=c+a;
810            c=c|0xFFFF;
811            if(c>=tempc)
812                c=c-0x8000;
813
814            int remainingBits = 27-cT;
815
816            // Flushes remainingBits
817            do{
818                c <<= cT;
819                if(b != 0xFF)
820                    remainingBits -= 8;
821                else
822                    remainingBits -= 7;
823                byteOut();
824            }
825            while(remainingBits > 0);
826
827            b |= (1<<(-remainingBits))-1;
828            if (b==0xFF) { // Delay 0xFF bytes
829                delFF = true;
830            }
831            else {
832                // Write delayed 0xFF bytes
833                if (delFF) {
834                    out.write(0xFF);
835                    delFF = false;
836                    nrOfWrittenBytes++;
837                }
838                out.write(b);
839                nrOfWrittenBytes++;
840            }
841            break;
842        case TERM_PRED_ER:
843        case TERM_EASY:
844            // The predictable error resilient and easy termination are the
845            // same, except for the fact that the easy one can modify the
846            // spare bits in the last byte to maximize the likelihood of
847            // having a 0xFF, while the error resilient one can not touch
848            // these bits.
849
850            // In the predictable error resilient case the spare bits will be
851            // recalculated by the decoder and it will check if they are the
852            // same as as in the codestream and then deduce an error
853            // probability from there.
854
855            int k; // number of bits to push out
856
857            k = (11-cT)+1;
858
859            c <<= cT;
860            for (; k > 0; k-=cT, c<<=cT){
861              byteOut();
862            }
863
864            // Make any spare bits 1s if in easy termination
865            if (k < 0 && ttype == TERM_EASY) {
866                // At this stage there is never a carry bit in C, so we can
867                // freely modify the (-k) least significant bits.
868                b |= (1<<(-k))-1;
869            }
870
871            byteOut(); // Push contents of byte buffer
872            break;
873        case TERM_NEAR_OPT:
874
875            // This algorithm terminates in the shortest possible way, besides
876            // the fact any previous 0xFF 0x7F sequences are not
877            // eliminated. The probabalility of having those sequences is
878            // extremely low.
879
880            // The calculation of the length is based on the fact that the
881            // decoder will pad the codestream with an endless string of
882            // (binary) 1s. If the codestream, padded with 1s, is within the
883            // bounds of the current interval then correct decoding is
884            // guaranteed. The lower inclusive bound of the current interval
885            // is the value of C (i.e. if only lower intervals would be coded
886            // in the future). The upper exclusive bound of the current
887            // interval is C+A (i.e. if only upper intervals would be coded in
888            // the future). We therefore calculate the minimum length that
889            // would be needed so that padding with 1s gives a codestream
890            // within the interval.
891
892            // In general, such a calculation needs the value of the next byte
893            // that appears in the codestream. Here, since we are terminating,
894            // the next value can be anything we want that lies within the
895            // interval, we use the lower bound since this minimizes the
896            // length. To calculate the necessary length at any other place
897            // than the termination it is necessary to know the next bytes
898            // that will appear in the codestream, which involves storing the
899            // codestream and the sate of the MQCoder at various points (a
900            // worst case approach can be used, but it is much more
901            // complicated and the calculated length would be only marginally
902            // better than much simple calculations, if not the same).
903
904            int cLow;
905            int cUp;
906            int bLow;
907            int bUp;
908
909            // Initialize the upper (exclusive) and lower bound (inclusive) of
910            // the valid interval (the actual interval is the concatenation of
911            // bUp and cUp, and bLow and cLow).
912            cLow = c;
913            cUp = c+a;
914            bLow = bUp = b;
915
916            // We start by normalizing the C register to the sate cT = 0
917            // (i.e., just before byteOut() is called)
918            cLow <<= cT;
919            cUp  <<= cT;
920            // Progate eventual carry bits and reset them in Clow, Cup NOTE:
921            // carry bit can never be set if the byte buffer was empty so no
922            // problem with propagating a carry into an empty byte buffer.
923            if ((cLow & (1<<27)) != 0) { // Carry bit in cLow
924                if (bLow == 0xFF) {
925                    // We can not propagate carry bit, do bit stuffing
926                    delFF = true; // delay 0xFF
927                    // Get next byte buffer
928                    bLow = cLow>>>20;
929                    bUp = cUp>>>20;
930                    cLow &= 0xFFFFF;
931                    cUp &= 0xFFFFF;
932                    // Normalize to cT = 0
933                    cLow <<= 7;
934                    cUp <<= 7;
935                }
936                else { // we can propagate carry bit
937                    bLow++; // propagate
938                    cLow &= ~(1<<27); // reset carry in cLow
939                }
940            }
941            if ((cUp & (1<<27)) != 0) {
942                bUp++; // propagate
943                cUp &= ~(1<<27); // reset carry
944            }
945
946            // From now on there can never be a carry bit on cLow, since we
947            // always output bLow.
948
949            // Loop testing for the condition and doing byte output if they
950            // are not met.
951            while(true){
952                // If decoder's codestream is within interval stop
953                // If preceding byte is 0xFF only values [0,127] are valid
954                if(delFF){ // If delayed 0xFF
955                    if (bLow <= 127 && bUp > 127) break;
956                    // We will write more bytes so output delayed 0xFF now
957                    out.write(0xFF);
958                    nrOfWrittenBytes++;
959                    delFF = false;
960                }
961                else{ // No delayed 0xFF
962                    if (bLow <= 255 && bUp > 255) break;
963                }
964
965                // Output next byte
966                // We could output anything within the interval, but using
967                // bLow simplifies things a lot.
968
969                // We should not have any carry bit here
970
971                // Output bLow
972                if (bLow < 255) {
973                    // Transfer byte bits from C to B
974                    // (if the byte buffer was empty output nothing)
975                    if (nrOfWrittenBytes >= 0) out.write(bLow);
976                    nrOfWrittenBytes++;
977                    bUp -= bLow;
978                    bUp <<= 8;
979                    // Here bLow would be 0
980                    bUp |= (cUp >>> 19) & 0xFF;
981                    bLow = (cLow>>> 19) & 0xFF;
982                    // Clear upper bits (just pushed out) from cUp Clow.
983                    cLow &= 0x7FFFF;
984                    cUp  &= 0x7FFFF;
985                    // Goto next state where CT is 0
986                    cLow <<= 8;
987                    cUp  <<= 8;
988                    // Here there can be no carry on Cup, Clow
989                }
990                else { // bLow = 0xFF
991                    // Transfer byte bits from C to B
992                    // Since the byte to output is 0xFF we can delay it
993                    delFF = true;
994                    bUp -= bLow;
995                    bUp <<= 7;
996                    // Here bLow would be 0
997                    bUp |= (cUp>>20) & 0x7F;
998                    bLow = (cLow>>20) & 0x7F;
999                    // Clear upper bits (just pushed out) from cUp Clow.
1000                    cLow &= 0xFFFFF;
1001                    cUp  &= 0xFFFFF;
1002                    // Goto next state where CT is 0
1003                    cLow <<= 7;
1004                    cUp  <<= 7;
1005                    // Here there can be no carry on Cup, Clow
1006                }
1007            }
1008            break;
1009        default:
1010            throw new Error("Illegal termination type code");
1011        }
1012
1013        // Reinitialize the state (without modifying the contexts)
1014        int len;
1015
1016        len = nrOfWrittenBytes;
1017        a = 0x8000;
1018        c = 0;
1019        b = 0;
1020        cT = 12;
1021        delFF = false;
1022        nrOfWrittenBytes = -1;
1023
1024        // Return the terminated length
1025        return len;
1026    }
1027
1028    /**
1029     * Returns the number of contexts in the arithmetic coder.
1030     *
1031     * @return The number of contexts
1032     * */
1033    public final int getNumCtxts(){
1034        return I.length;
1035    }
1036
1037    /**
1038     * Resets a context to the original probability distribution, and sets its
1039     * more probable symbol to 0.
1040     *
1041     * @param c The number of the context (it starts at 0).
1042     * */
1043    public final void resetCtxt(int c){
1044        I[c]=initStates[c];
1045        mPS[c] = 0;
1046    }
1047
1048    /**
1049     * Resets all contexts to their original probability distribution and sets
1050     * all more probable symbols to 0.
1051     * */
1052    public final void resetCtxts(){
1053        System.arraycopy(initStates,0,I,0,I.length);
1054        ArrayUtil.intArraySet(mPS,0);
1055    }
1056
1057    /**
1058     * Returns the number of bytes that are necessary from the compressed
1059     * output stream to decode all the symbols that have been coded this
1060     * far. The number of returned bytes does not include anything coded
1061     * previous to the last time the 'terminate()' or 'reset()' methods where
1062     * called.
1063     *
1064     * <P>The values returned by this method are then to be used in finishing
1065     * the length calculation with the 'finishLengthCalculation()' method,
1066     * after compensation of the offset in the number of bytes due to previous
1067     * terminated segments.
1068     *
1069     * <P>This method should not be called if the current coding pass is to be
1070     * terminated. The 'terminate()' method should be called instead.
1071     *
1072     * <P>The calculation is done based on the type of length calculation
1073     * specified at the constructor.
1074     *
1075     * @return The number of bytes in the compressed output stream necessary
1076     * to decode all the information coded this far.
1077     * */
1078    public final int getNumCodedBytes(){
1079        // NOTE: testing these algorithms for correctness is quite
1080        // difficult. One way is to modify the rate allocator so that not all
1081        // bit-planes are output if the distortion estimate for last passes is
1082        // the same as for the previous ones.
1083
1084        switch (ltype) {
1085        case LENGTH_LAZY_GOOD:
1086            // This one is a bit better than LENGTH_LAZY.
1087            int bitsInN3Bytes; // The minimum amount of bits that can be stored
1088                               // in the 3 bytes following the current byte
1089                               // buffer 'b'.
1090            if (b >= 0xFE) {
1091                // The byte after b can have a bit stuffed so ther could be
1092                // one less bit available
1093                bitsInN3Bytes = 22; // 7 + 8 + 7
1094            }
1095            else {
1096                // We are sure that next byte after current byte buffer has no
1097                // bit stuffing
1098                bitsInN3Bytes = 23; // 8 + 7 + 8
1099            }
1100            if ((11-cT+16) <= bitsInN3Bytes) {
1101                return nrOfWrittenBytes+(delFF ? 1 : 0)+1+3;
1102            }
1103            else {
1104                return nrOfWrittenBytes+(delFF ? 1 : 0)+1+4;
1105            }
1106        case LENGTH_LAZY:
1107            // This is the very basic one that appears in the VM text
1108            if ((27-cT) <= 22) {
1109                return nrOfWrittenBytes+(delFF ? 1 : 0)+1+3;
1110            }
1111            else {
1112                return nrOfWrittenBytes+(delFF ? 1 : 0)+1+4;
1113            }
1114        case LENGTH_NEAR_OPT:
1115            // This is the best length calculation implemented in this class.
1116            // It is almost always optimal. In order to calculate the length
1117            // it is necessary to know which bytes will follow in the MQ
1118            // bit stream, so we need to wait until termination to perform it.
1119            // Save the state to perform the calculation later, in
1120            // finishLengthCalculation()
1121            saveState();
1122            // Return current number of output bytes to use it later in
1123            // finishLengthCalculation()
1124            return nrOfWrittenBytes;
1125        default:
1126            throw new Error("Illegal length calculation type code");
1127        }
1128    }
1129
1130    /**
1131     * Reinitializes the MQ coder and the underlying 'ByteOutputBuffer' buffer
1132     * as if a new object was instantaited. All the data in the
1133     * 'ByteOutputBuffer' buffer is erased and the state and contexts of the
1134     * MQ coder are reinitialized). Additionally any saved MQ states are
1135     * discarded.
1136     * */
1137    public final void reset() {
1138
1139        // Reset the output buffer
1140        out.reset();
1141
1142        a=0x8000;
1143        c=0;
1144        b=0;
1145        if(b==0xFF)
1146            cT=13;
1147        else
1148            cT=12;
1149        resetCtxts();
1150        nrOfWrittenBytes = -1;
1151        delFF = false;
1152
1153        nSaved = 0;
1154    }
1155
1156    /**
1157     * Saves the current state of the MQ coder (just the registers, not the
1158     * contexts) so that a near optimal length calculation can be performed
1159     * later.
1160     * */
1161    private void saveState() {
1162        // Increase capacity if necessary
1163        if (nSaved == savedC.length) {
1164            Object tmp;
1165            tmp = savedC;
1166            savedC = new int[nSaved+SAVED_INC];
1167            System.arraycopy(tmp,0,savedC,0,nSaved);
1168            tmp = savedCT;
1169            savedCT = new int[nSaved+SAVED_INC];
1170            System.arraycopy(tmp,0,savedCT,0,nSaved);
1171            tmp = savedA;
1172            savedA = new int[nSaved+SAVED_INC];
1173            System.arraycopy(tmp,0,savedA,0,nSaved);
1174            tmp = savedB;
1175            savedB = new int[nSaved+SAVED_INC];
1176            System.arraycopy(tmp,0,savedB,0,nSaved);
1177            tmp = savedDelFF;
1178            savedDelFF = new boolean[nSaved+SAVED_INC];
1179            System.arraycopy(tmp,0,savedDelFF,0,nSaved);
1180        }
1181        // Save the current sate
1182        savedC[nSaved] = c;
1183        savedCT[nSaved] = cT;
1184        savedA[nSaved] = a;
1185        savedB[nSaved] = b;
1186        savedDelFF[nSaved] = delFF;
1187        nSaved++;
1188    }
1189
1190    /**
1191     * Terminates the calculation of the required length for each coding
1192     * pass. This method must be called just after the 'terminate()' one has
1193     * been called for each terminated MQ segment.
1194     *
1195     * <P>The values in 'rates' must have been compensated for any offset due
1196     * to previous terminated segments, so that the correct index to the
1197     * stored coded data is used.
1198     *
1199     * @param rates The array containing the values returned by
1200     * 'getNumCodedBytes()' for each coding pass.
1201     *
1202     * @param n The index in the 'rates' array of the last terminated length.
1203     * */
1204    public void finishLengthCalculation(int rates[], int n) {
1205        if (ltype != LENGTH_NEAR_OPT) {
1206            // For the simple calculations the only thing we need to do is to
1207            // ensure that the calculated lengths are no greater than the
1208            // terminated one
1209            if (n > 0 && rates[n-1] > rates[n]) {
1210                // We need correction
1211                int tl = rates[n]; // The terminated length
1212                n--;
1213                do {
1214                    rates[n--] = tl;
1215               }  while (n >= 0 && rates[n] > tl);
1216            }
1217        }
1218        else {
1219            // We need to perform the more sophisticated near optimal
1220            // calculation.
1221
1222            // The calculation of the length is based on the fact that the
1223            // decoder will pad the codestream with an endless string of
1224            // (binary) 1s after termination. If the codestream, padded with
1225            // 1s, is within the bounds of the current interval then correct
1226            // decoding is guaranteed. The lower inclusive bound of the
1227            // current interval is the value of C (i.e. if only lower
1228            // intervals would be coded in the future). The upper exclusive
1229            // bound of the current interval is C+A (i.e. if only upper
1230            // intervals would be coded in the future). We therefore calculate
1231            // the minimum length that would be needed so that padding with 1s
1232            // gives a codestream within the interval.
1233
1234            // In order to know what will be appended to the current base of
1235            // the interval we need to know what is in the MQ bit stream after
1236            // the current last output byte until the termination. This is why
1237            // this calculation has to be performed after the MQ segment has
1238            // been entirely coded and terminated.
1239
1240            int cLow; // lower bound on the C register for correct decoding
1241            int cUp;  // upper bound on the C register for correct decoding
1242            int bLow; // lower bound on the byte buffer for correct decoding
1243            int bUp;  // upper bound on the byte buffer for correct decoding
1244            int ridx; // index in the rates array of the pass we are
1245            // calculating
1246            int sidx; // index in the saved state array
1247            int clen; // current calculated length
1248            boolean cdFF; // the current delayed FF state
1249            int nb;   // the next byte of output
1250            int minlen; // minimum possible length
1251            int maxlen; // maximum possible length
1252
1253            // Start on the first pass of this segment
1254            ridx = n-nSaved;
1255            // Minimum allowable length is length of previous termination
1256            minlen = (ridx-1>=0) ? rates[ridx-1] : 0;
1257            // Maximum possible length is the terminated length
1258            maxlen = rates[n];
1259            for (sidx = 0; ridx < n; ridx++, sidx++) {
1260                // Load the initial values of the bounds
1261                cLow = savedC[sidx];
1262                cUp = savedC[sidx]+savedA[sidx];
1263                bLow = savedB[sidx];
1264                bUp = savedB[sidx];
1265                // Normalize to CT = 0 and propagate and reset any carry bits
1266                cLow <<= savedCT[sidx];
1267                if ((cLow & 0x8000000) != 0) {
1268                    bLow++;
1269                    cLow &= 0x7FFFFFF;
1270                }
1271                cUp <<= savedCT[sidx];
1272                if ((cUp & 0x8000000) != 0) {
1273                    bUp++;
1274                    cUp &= 0x7FFFFFF;
1275                }
1276                // Initialize current calculated length
1277                cdFF = savedDelFF[sidx];
1278                // rates[ridx] contains the number of bytes already output
1279                // when the state was saved, compensated for the offset in the
1280                // output stream.
1281                clen = rates[ridx]+(cdFF? 1 : 0);
1282                while (true) {
1283                    // If we are at end of coded data then this is the length
1284                    if (clen >= maxlen) {
1285                        clen = maxlen;
1286                        break;
1287                    }
1288                    // Check for sufficiency of coded data
1289                    if (cdFF) {
1290                        if (bLow < 128 && bUp >= 128) {
1291                            // We are done for this pass
1292                            clen--; // Don't need delayed FF
1293                            break;
1294                        }
1295                    }
1296                    else {
1297                        if (bLow < 256 && bUp >= 256) {
1298                            // We are done for this pass
1299                            break;
1300                        }
1301                    }
1302                    // Update bounds with next byte of coded data and
1303                    // normalize to CT = 0 again.
1304                    nb =  (clen >= minlen) ? out.getByte(clen) : 0;
1305                    bLow -= nb;
1306                    bUp -= nb;
1307                    clen++;
1308                    if (nb == 0xFF) {
1309                        bLow <<= 7;
1310                        bLow |= (cLow >> 20) & 0x7F;
1311                        cLow &= 0xFFFFF;
1312                        cLow <<= 7;
1313                        bUp <<= 7;
1314                        bUp |= (cUp >> 20) & 0x7F;
1315                        cUp &= 0xFFFFF;
1316                        cUp <<= 7;
1317                        cdFF = true;
1318                    }
1319                    else {
1320                        bLow <<= 8;
1321                        bLow |= (cLow >> 19) & 0xFF;
1322                        cLow &= 0x7FFFF;
1323                        cLow <<= 8;
1324                        bUp <<= 8;
1325                        bUp |= (cUp >> 19) & 0xFF;
1326                        cUp &= 0x7FFFF;
1327                        cUp <<= 8;
1328                        cdFF = false;
1329                    }
1330                    // Test again
1331                }
1332                // Store the rate found
1333                rates[ridx] = (clen>=minlen) ? clen : minlen;
1334            }
1335            // Reset the saved states
1336            nSaved = 0;
1337        }
1338    }
1339}