Problem description:

I am converting a small program from C to Python and I'm having trouble reading the file. It is a binary .dat file, shown here as a hex dump. Here are the first 132 bytes that I'm trying to read:

2400 0000 4c61 7a61 726f 2053 756e 6965

7200 ffff 0000 0000 7261 6a70 6f6f 7420

6279 776f 726b 2069 7363 6869 6f70 7562

6963 2073 6872 6f76 6574 6964 6520 6469

7373 7561 5275 746c 616e 642c 5665 726d

6f6e 742c 0d00 0000 7000 0000 0000 0000

0000 0000 0000 0000 4000 0000 0000 0000

ffff ffff 656e 2073 6f76 6572 6f62 6564

6965 6e74

The C code opens the file as fp and reads it like this:

TEXT_SHORT = 64;

fread(&(record->id), sizeof(int), 1, fp);

fread(&(record->name[0]), sizeof(char), TEXT_SHORT, fp);

fread(&(record->location[0]), sizeof(char), TEXT_SHORT, fp);

printf("%06d\n", record->id);

printf("%s\n", record->name);

printf("%s\n", record->location);

Then when printing the values, I get this:

36

Lazaro Sunier

Rutland,Vermont,

To convert this functionality to Python, I wrote the following code:

import struct

def read_file(file):
    # 4-byte int, then two fixed-width 64-byte character fields
    id = struct.unpack('i', file.read(4))[0]
    name = ''.join(struct.unpack('c'*64, file.read(64)))
    location = ''.join(struct.unpack('c'*64, file.read(64)))
    print(id)
    print(name)
    print(location)

Then I get this output

36

Lazaro Sunier��rajpoot bywork ischiopubic shrovetide dissua

Rutland,Vermont,p@����en soverobedient

I have been struggling with this for a while and have no idea why it is happening. Is there something that fread() does in the background that I need to implement in Python, or am I doing it wrong?

Answer:

Both the C and the Python code read the same 64-byte block, but Python does not treat \x00 as a string terminator. A printf("%s") in C stops at the first \0, whereas Python prints the whole buffer, trailing garbage included.
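To make the difference visible, here is a small sketch that looks at the 64-byte name field from the hex dump in the question both ways. It uses ctypes only as a stand-in for C's view of the buffer; the variable names are just for illustration.

```python
import ctypes

# The 64-byte name field, transcribed from the hex dump in the question.
raw = (b"Lazaro Sunier\x00\xff\xff\x00\x00\x00\x00"
       b"rajpoot bywork ischiopubic shrovetide dissua")

# Copy the bytes into a C-style char buffer.
buf = ctypes.create_string_buffer(raw)

print(len(raw))   # Python keeps all 64 bytes, NULs and leftovers included
print(buf.value)  # the NUL-terminated view, which is what printf("%s") would show
```

The second print should show only b'Lazaro Sunier', while raw still carries everything that happened to be in the buffer after the terminator.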

Just split the string at \0 and only keep the first part:

name = name.split(b"\0")[0]
location = location.split(b"\0")[0]

Incidentally, you can retrieve the 3 elements in a single line:

id, name, location = struct.unpack("i64s64s", file.read(132))
name = name.split(b"\0")[0]
location = location.split(b"\0")[0]
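Putting it together, here is a minimal self-contained sketch for Python 3 that parses the 132 bytes shown in the question. The helper names (c_string, read_record) are made up for this example, and the "<" prefix assumes the file was written on a little-endian machine with a 4-byte int, which matches the 2400 0000 at the start of the dump.

```python
import io
import struct

# The 132 bytes from the hex dump in the question.
HEX_DUMP = """
2400 0000 4c61 7a61 726f 2053 756e 6965
7200 ffff 0000 0000 7261 6a70 6f6f 7420
6279 776f 726b 2069 7363 6869 6f70 7562
6963 2073 6872 6f76 6574 6964 6520 6469
7373 7561 5275 746c 616e 642c 5665 726d
6f6e 742c 0d00 0000 7000 0000 0000 0000
0000 0000 0000 0000 4000 0000 0000 0000
ffff ffff 656e 2073 6f76 6572 6f62 6564
6965 6e74
"""
data = bytes.fromhex("".join(HEX_DUMP.split()))

# Assumption: little-endian 4-byte int followed by two 64-byte char arrays.
RECORD = struct.Struct("<i64s64s")


def c_string(field):
    """Return the bytes before the first NUL, decoded as ASCII text."""
    return field.split(b"\0", 1)[0].decode("ascii")


def read_record(fp):
    """Read one record from a binary file object and trim both text fields."""
    record_id, name, location = RECORD.unpack(fp.read(RECORD.size))
    return record_id, c_string(name), c_string(location)


print(read_record(io.BytesIO(data)))
```

With a real file you would pass open(path, "rb") instead of the BytesIO object. Note that the location field in this dump ends with a carriage return (the 0d byte before the first NUL), so you may also want to call .rstrip() on the result.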