Question:

I'd like to stream a big log file over the network using asyncio. I retrieve the data from the database, format it, compress it with Python's zlib, and stream it over the network.

Here is basically the code I use:

    @asyncio.coroutine
    def logs(request):
        # ...
        yield from resp.prepare(request)
        # gzip magic number and compression format
        resp.write(b'\x1f\x8b\x08\x00\x00\x00\x00\x00')
        compressor = compressobj()
        for row in rows:
            ip, uid, date, url, answer, volume = row
            NCSA_ROW = '{} {} - [{}] "GET {} HTTP/1.0" {} {}\n'
            row = NCSA_ROW.format(ip, uid, date, url, answer, volume)
            row = row.encode('utf-8')
            data = compressor.compress(row)
            resp.write(data)
        resp.write(compressor.flush())
        return resp

The file that I retrieve cannot be opened with gunzip, and zcat raises the following error:

gzip: out.gz: unexpected end of file

Answer:

Your gzip header is wrong (8 bytes instead of 10), and you follow it with a zlib stream, which uses a different header and trailer. Even if you had written a correct gzip header and followed it with a raw deflate stream instead of a zlib stream, you would still not have written a gzip trailer.
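To make the difference between the two wrappers concrete, here is a small sketch that uses only the standard library to inspect the header and trailer of each format. The sample payload is made up for illustration and is not taken from the question's data.

```python
import gzip
import struct
import zlib

# Made-up sample row, standing in for one formatted log line.
payload = b'127.0.0.1 alice - [01/Jan/2016] "GET / HTTP/1.0" 200 512\n'

zlib_stream = zlib.compress(payload)  # zlib wrapper: 2-byte header, 4-byte trailer
gzip_stream = gzip.compress(payload)  # gzip wrapper: 10-byte header, 8-byte trailer

# A zlib stream starts with 0x78 (deflate, 32 KiB window), not with the gzip magic.
print(zlib_stream[:2].hex())
# A gzip stream starts with the magic bytes 1f 8b and the method byte 08.
print(gzip_stream[:3].hex())

# zlib trailer: Adler-32 of the uncompressed data, big-endian.
(adler,) = struct.unpack('>I', zlib_stream[-4:])
# gzip trailer: CRC-32 of the uncompressed data, then its length, little-endian.
crc, size = struct.unpack('<II', gzip_stream[-8:])

print(adler == zlib.adler32(payload))
print(crc == zlib.crc32(payload), size == len(payload))
```

Because the checksums and their byte order differ, a zlib stream cannot be turned into a gzip file just by prepending gzip magic bytes.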

To do this right, do not attempt to write your own gzip header. Instead, request that zlib write a complete gzip stream, which will include the correct header, compressed data, and trailer. You can do this by providing a wbits value of 31 to compressobj().
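A minimal sketch of that approach follows. The helper name gzip_chunks and the sample rows are hypothetical, and the network side is left out so the snippet runs on its own.

```python
import gzip
import zlib

def gzip_chunks(rows):
    """Yield the pieces of one complete gzip stream for an iterable of byte strings.

    gzip_chunks is a hypothetical helper, not part of zlib or asyncio.
    """
    # wbits=31 is 16 + 15: a 32 KiB window (15), plus 16 to select the gzip wrapper.
    compressor = zlib.compressobj(wbits=31)
    for row in rows:
        data = compressor.compress(row)
        if data:  # compress() may return b'' while it is still buffering input
            yield data
    # flush() emits any remaining compressed data followed by the gzip trailer.
    yield compressor.flush()

# Made-up sample rows, already encoded to bytes.
rows = [b'first log line\n', b'second log line\n']
stream = b''.join(gzip_chunks(rows))

# The result is a valid gzip file: the standard gzip module can read it back.
print(gzip.decompress(stream) == b''.join(rows))
```

In the handler from the question, each yielded chunk would presumably be passed to resp.write(), and the manual header write would be dropped entirely.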
