I don't know exactly how the zlib algorithm works internally, but from my understanding of DictGZIPOutputStream, calling write() already updates the CRC for the byte array it has just written. If your code then calls updateCRC() on the same bytes, the CRC is updated twice and no longer matches the data. When gzip -d is run on the output, it complains "invalid compressed data--crc error".
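To see why a second update changes the checksum, here is a minimal sketch using java.util.zip.CRC32 directly. The class name CrcDoubleUpdate and the helper crcAfter are made up for this illustration and are not part of the code in the question. It only shows that feeding the same bytes twice gives the CRC of the doubled data, which is what I assume happens when updateCRC() is called after write().

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical demo class, not part of the original Dict program.
public class CrcDoubleUpdate {

    // Feeds the same bytes into a fresh CRC32 the given number of times
    // and returns the resulting checksum.
    static long crcAfter(byte[] data, int times) {
        CRC32 crc = new CRC32();
        for (int i = 0; i < times; i++) {
            crc.update(data, 0, data.length);
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "hello world".getBytes(StandardCharsets.UTF_8);

        long once = crcAfter(data, 1);   // what the gzip trailer should contain
        long twice = crcAfter(data, 2);  // what it contains after an extra update

        System.out.printf("once=%08x twice=%08x%n", once, twice);
        System.out.println("match: " + (once == twice));
    }
}
```

The running CRC is stateful, so the second update behaves as if the data had been written twice, while the compressed stream only contains it once.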
I also noticed that you did not close the compressor after using it. When I ran the code pasted above, it gave the error "gzip: stdin: unexpected end of file". So always make sure flush() and close() are called at the end. With that said, here is what I have:
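As a rough illustration of the close() point, the sketch below compresses into a byte array with and without closing. The class CloseMatters and its helpers gzip and gunzip are names invented for this example. It assumes a plain GZIPOutputStream behaves the same as the subclass in the question, since the subclass does not override close().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original Dict program.
public class CloseMatters {

    // Compresses data into memory. When close is false the stream is only
    // flushed and deliberately left open, so the deflate stream is never
    // finished and no gzip trailer is written.
    static byte[] gzip(byte[] data, boolean close) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(sink);
        gz.write(data, 0, data.length);
        gz.flush();
        if (close) {
            gz.close(); // finishes compression and writes the CRC-32 and length trailer
        }
        return sink.toByteArray();
    }

    // Decompresses a complete gzip stream held in memory.
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello world, how are you?".getBytes(StandardCharsets.UTF_8);

        byte[] closed = gzip(data, true);
        byte[] unclosed = gzip(data, false);

        System.out.println("closed length:   " + closed.length);
        System.out.println("unclosed length: " + unclosed.length);
        System.out.println("round trip ok:   " + Arrays.equals(gunzip(closed), data));
    }
}
```

The unclosed output is shorter because the final deflate block and the trailer are missing, which is consistent with gzip reporting an unexpected end of file.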
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.zip.GZIPOutputStream;

    public class Dict {
        protected static final int BLOCK_SIZE = 128;
        protected static final int DICT_DIZE = 32;

        public static void main(String[] args) {
            InputStream stdinBytes = System.in;
            byte[] input = new byte[BLOCK_SIZE];
            byte[] dict = new byte[DICT_DIZE];
            int bytesRead = 0;
            try {
                DictGZIPOutputStream compressor = new DictGZIPOutputStream(System.out);
                bytesRead = stdinBytes.read(input, 0, BLOCK_SIZE);
                if (bytesRead >= DICT_DIZE) {
                    System.arraycopy(input, 0, dict, 0, DICT_DIZE);
                }
                do {
                    // write() also updates the CRC, so updateCRC() is not called here.
                    compressor.write(input, 0, bytesRead);
                    if (bytesRead == BLOCK_SIZE) {
                        // Use the last DICT_DIZE bytes of this full block as the dictionary.
                        System.arraycopy(input, BLOCK_SIZE - DICT_DIZE, dict, 0, DICT_DIZE);
                        compressor.setDictionary(dict);
                    }
                    bytesRead = stdinBytes.read(input, 0, BLOCK_SIZE);
                } while (bytesRead > 0);
                // Flush and close so the gzip trailer is written.
                compressor.flush();
                compressor.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        public static class DictGZIPOutputStream extends GZIPOutputStream {
            public DictGZIPOutputStream(OutputStream out) throws IOException {
                super(out);
            }

            public void setDictionary(byte[] b) {
                def.setDictionary(b);
            }

            // Kept from the question, but no longer called from main().
            public void updateCRC(byte[] input) {
                crc.update(input);
            }
        }
    }
The test result at the console:

    $ cat file.txt
    hello world, how are you?1e3djw hello world, how are you?1e3djw adfa asdfas
    $ cat file.txt | java Dict | gzip -d | cmp file.txt ; echo $?
    0