06-30-07 12:21 AM
stenasc@yahoo.com writes:
> Hi,
>
> I have some C code which is being ported from Windoze to Unix. In the
> code there is an encryption function which encrypts text by XORing the
> hex values for the characters by FF. When we look at the encrypted
> characters they are all special characters.
>
> There are thousands of calls to the encryption function...every line
> has been encrypted separately. Another function in a separate
> application decrypts the encrypted file by XORing the encrypted
> characters in order to get the original text. Works fine under
> Windoze.
>
> Now...porting to Unix.
> When I try to decrypt using the text in a Unix environment, I get pure
> garbage....I'll illustrate below..
>
> a) library ieee; ....this is the line of text to be
> encrypted
>
> b)
> 00000001h: C4 9A 9A 9A 96 DF 86 8D 9E 8D 9D 96 93 ; (Hex values)
>
> Now in Unix, this encrypted code line b) looks like this....
>
> c)
> 00000012h: C4 9A EF BF BD EF BF BD EF BF BD DF 86 EF BF BD ;
> 00000022h: EF BF BD EF BF BD EF BF BD EF BF BD EF BF BD ;
Something is interpreting your encoded bytes as UTF-8 characters.
C4 9A is a valid two-byte sequence, so it is passed through. The
next byte is 9A, which is not valid as the first byte of a UTF-8
character. On encountering this invalid byte, that same something
replaces it with EF BF BD, the UTF-8 encoding of U+FFFD, the
replacement character (usually drawn as a reverse-colour question
mark). The same happens with the following 9A and with the 96.
DF 86 is again a valid UTF-8 sequence and is not modified. And so
it continues: each of the remaining six bytes is replaced in turn.
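
To make the mechanism concrete, here is a minimal sketch in C that
mimics what such a converter appears to be doing. The function names
(utf8_expected_len, utf8_lossy_copy) are made up for illustration, and
the validity check is deliberately simplified: it only looks at the
lead byte and counts continuation bytes, so it is not a full UTF-8
validator, and real converters may differ in detail.

```c
#include <stddef.h>

/* Hypothetical helper: length of the UTF-8 sequence that a lead byte
   announces, or 0 if the byte cannot start a sequence at all. */
static size_t utf8_expected_len(unsigned char b)
{
    if (b < 0x80) return 1;                /* plain ASCII */
    if (b >= 0xC2 && b <= 0xDF) return 2;  /* two-byte lead */
    if (b >= 0xE0 && b <= 0xEF) return 3;  /* three-byte lead */
    if (b >= 0xF0 && b <= 0xF4) return 4;  /* four-byte lead */
    return 0;                              /* 80..C1 and F5..FF */
}

/* Hypothetical helper: copy n bytes from in to out, passing through
   anything that looks like a UTF-8 sequence and replacing every other
   byte with EF BF BD (U+FFFD). out needs room for 3 * n bytes.
   Returns the number of bytes written. */
size_t utf8_lossy_copy(const unsigned char *in, size_t n,
                       unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        size_t len = utf8_expected_len(in[i]);
        size_t k;
        int ok = (len != 0 && i + len <= n);

        /* Every byte after the lead must be a continuation (10xxxxxx). */
        for (k = 1; ok && k < len; k++)
            if ((in[i + k] & 0xC0) != 0x80)
                ok = 0;

        if (ok) {
            for (k = 0; k < len; k++)
                out[o++] = in[i + k];
            i += len;
        } else {
            /* Invalid byte: emit the replacement character instead. */
            out[o++] = 0xEF;
            out[o++] = 0xBF;
            out[o++] = 0xBD;
            i += 1;
        }
    }
    return o;
}
```

Feeding the 13 bytes from b) through utf8_lossy_copy gives the 31
bytes shown in c). Note that the damage is not reversible: several
different input bytes all end up as the same EF BF BD, so XORing the
result with FF cannot bring back the original text.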
--
Måns Rullgård
mans@mansr.com