"Digging deep into encoding"
While performing file I/O, it is important to keep in mind whether the files being processed can contain special (non-ASCII) characters. If they can, check which encoding you are using.
Encoding used in:
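StreamReader: uses UTF-8 encoding by default unless otherwise specified.
StreamWriter: also defaults to UTF-8 (without a byte order mark) unless otherwise specified. The system default encoding is used only when it is passed in explicitly (Encoding.Default) or when the file comes from another tool that saves as ANSI.
If we check the default encoding (Encoding.Default) on .NET Framework on a Western-locale Windows machine, we get "Windows-1252" [commonly called ANSI]. So if a file is written with that encoding, either by our own code or by another application, and the reader relies on the StreamReader default, the file is ANSI-encoded but gets read as UTF-8!
No exception is thrown in this case. The bytes for the special characters are simply not valid UTF-8, so they are replaced and we see square characters/question marks in their place.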
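The mismatch can be reproduced at the byte level. The sketch below uses Python rather than .NET purely for illustration. `cp1252` is Python's name for Windows-1252, and the sample string is an assumption. It encodes text as ANSI and then decodes the same bytes as UTF-8 with invalid sequences replaced, which is roughly what a reader expecting UTF-8 does with an ANSI file.

```python
# Illustrative sketch: bytes written as Windows-1252, then read as UTF-8.
text = "café au lait"  # sample text (assumption) with one non-ASCII character

# In Windows-1252, 'é' is stored as the single byte 0xE9.
ansi_bytes = text.encode("cp1252")

# 0xE9 followed by a space is not a valid UTF-8 sequence, so the decoder
# substitutes U+FFFD (the replacement character) instead of raising.
misread = ansi_bytes.decode("utf-8", errors="replace")

# ascii() escapes non-ASCII characters so the output is console-safe.
print(ansi_bytes)
print(ascii(misread))
```

The replacement character U+FFFD is what usually gets rendered as a square or a question mark.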
Solution:
- Use the "Windows-1252" encoding when reading the file, or
- use UTF-8 explicitly for both writing and reading the files (the better approach)
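The second option amounts to naming the encoding on both sides instead of trusting any default. A minimal sketch of that idea, again in Python for illustration only (the file name and sample text are hypothetical); in .NET the equivalent would be passing an explicit encoding to both the writer and the reader:

```python
import os
import tempfile

text = "naïve café"  # sample text (assumption)

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# State the encoding explicitly when writing...
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# ...and state the same encoding explicitly when reading.
with open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()

print(round_tripped == text)
```

Because both sides agree on UTF-8, the special characters survive the round trip regardless of the machine's default code page.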
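Ironically:
You will see the garbled characters (if no encoding is specified) only when an ANSI file is read using UTF-8. Reading a UTF-8 file with the ANSI (Windows-1252) encoding specified can still work! The reason is that StreamReader checks for a byte order mark by default, so a UTF-8 file that starts with one is decoded as UTF-8 regardless of the encoding passed in. If the file has no byte order mark, the special characters are mis-decoded instead (for example, é shows up as Ã©).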
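One caveat is worth checking at the byte level: decoding UTF-8 bytes as Windows-1252 never raises an error, but it does not give back the original text either. The sketch below is in Python for illustration only, with an assumed sample string. It does not emulate StreamReader's byte-order-mark detection; it shows what happens to a UTF-8 file without a byte order mark.

```python
# Illustrative sketch: UTF-8 bytes (no byte order mark) decoded as Windows-1252.
text = "café"  # sample text (assumption)

# In UTF-8, 'é' is stored as two bytes: 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")

# Windows-1252 maps each byte to its own character, so there is no error,
# but 0xC3 0xA9 becomes the two characters 'Ã' and '©'.
as_ansi = utf8_bytes.decode("cp1252")

# ascii() escapes non-ASCII characters so the output is console-safe.
print(ascii(as_ansi))
```

So a "successful" read here only means that nothing failed; the text is intact only when the reader actually ended up decoding it as UTF-8.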