next up previous contents index
Next: Deflation/Inflation Coder ( DeflateCoder Up: Lossless Compression Previous: Static Arithmetic Coder (   Contents   Index


Prediction by Partial Matching ( PPMIICoder )

Definition

The PPMIICoder is based on the compression scheme ``Prediction by Partial Matching with Information Inheritance'' by D. Shkarin [81].
This coder works as follows: Suppose we have processed the first n - 1 symbols x1...xn-1 of the stream. Before reading the next symbol xn we try to guess it, i.e. for every symbol s we estimate the probability p(s) for the event ``xn = s''. This probability distribution determines how the next symbol is encoded: The higher p(s), the fewer bits are used for encoding s. If our estimation is good, which means that p(xn) is high, then we obtain a good compression rate.
In order to predict the probality distribution for the nth symbol the PPM approach considers the preceding k symbols xn-k...xn-1. We call these symbols the context of xn and k the order of the model. (For k = 0 we obtain the order-0 model from the previous section.) E.g., if the current context is ``req'', then we should predict the letter ``u'' as next symbol with high probability.
PPMII is a variant of PPM which usually achieves very accurate estimations.

The PPMIICoder combines very good compression rates with acceptable speed. (Shkarin [81] reports that his coder outperforms ZIP and BZIP2 with respect to compression rates and speed.) The only disadvantage of this coder is that it needs a fair amount of main memory to store the model. However, the user can set an upper bound on the memory usage. And he can specify which model restoration method the coder shall apply when it runs out of memory:

#include < LEDA/coding/PPMII.h >

Types

PPMIICoder::mr_method { mr_restart, mr_cut_off, mr_freeze }
  the different model restoration modes.

Creation

PPMIICoder C(streambuf* src_stream = 0, streambuf* tgt_stream = 0, bool own_streams = false)
    creates an instance C which uses the given source and target streams. If own_streams is set, then C is responsible for the destruction of the streams, otherwise the pointers src_stream and tgt_stream must be valid during the life-time of C.

PPMIICoder C(const char* src_file_name, const char* tgt_file_name)
    creates an instance C which uses file-streams for input and output.

Operations

Standard Operations

void C.encode() encodes the source stream and writes the output to the target stream.

void C.decode() decodes the source stream and writes the output to the target stream.

uint32 C.encode_memory_chunk(const char* in_buf, uint32 in_len, char* out_buf, uint32 out_len)
    encodes the memory chunk starting at in_buf with size in_len into the buffer starting at out_buf with size out_len. The function returns actual length of the encoded chunk which may be smaller than out_len. If the output buffer is too small for the encoded data the failure flag will be set (see below).

uint32 C.decode_memory_chunk(const char* in_buf, uint32 in_len, char* out_buf, uint32 out_len)
    decodes a memory chunk. The meaning of the parameters and the return value is the same as in the previous function.

streambuf* C.get_src_stream() returns the current source stream.

void C.set_src_stream(streambuf* src_stream, bool own_stream = false)
    sets the source stream (cf. constructor).

void C.set_src_file(const char* file_name)
    sets a file as source stream.

streambuf* C.get_tgt_stream() returns the current target stream.

void C.set_tgt_stream(streambuf* tgt_stream, bool own_Stream = false)
    sets the target stream (cf. constructor).

void C.set_tgt_file(const char* file_name)
    sets a file as target stream.

void C.reset(bool keep_parameters = true)
    puts C in the same state as the default constructor. If keep_parameters is false the parameters are set to their default values.

bool C.failed() returns true if an error occured.

bool C.finished() returns true if the coding is finished.

string C.get_description() provides a description for C.

Additional Operations

uint16 C.get_model_memory_bound()
    returns how much memory (in MB) C is allowed to use for storing the model.

void C.set_model_memory_bound(uint16 max_mem)
    determines the amount of memory available for the model (in MB).

byte C.get_model_order() returns the order of the model.

void C.set_model_order(byte order)
    sets the order of the model.
Precondition order $\in$ [2..16].

mr_method C.get_model_restoration_method()
    returns the model restoration method.

void C.set_model_restoration_method(mr_method method)
    sets the model restoration method.

Implementation

Our implementation encapsulates the code by D. Shkarin (the implementation of the PPMII model) and by D. Subbotin (the implementation of the range coder).


next up previous contents index
Next: Deflation/Inflation Coder ( DeflateCoder Up: Lossless Compression Previous: Static Arithmetic Coder (   Contents   Index