DBF and language code-page
Recently I had a problem importing data from a 10-year-old set of DBF tables. All was fine until it came to reading text with Polish diacritics. It worked on 9 out of 10 machines, all with identical configurations (or at least I hoped they were identical and couldn’t find any differences - Windows 7 x64 PL, .NET 4.5.2, the same regional options). On that single machine all the special letters got converted into eye-hurting characters and looked plain wrong.
As it turned out, the OleDbConnection class I used to connect (with the “Microsoft.Jet.OLEDB.4.0” provider) magically treated strings as Windows-1250 encoded, even though they were CP852 (Latin-2). Thanks to this site for helping me find that out.
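To make the mismatch concrete, here is a minimal sketch in Python (not the C# code from my application) of what happens when CP852 bytes are read as Windows-1250. The sample sentence and variable names are made up for illustration, and `cp852` and `cp1250` are Python's names for the two code pages.

```python
# Polish sample text stored the way a DOS-era DBF would hold it: as CP852 bytes.
original = "zażółć gęślą jaźń"
raw = original.encode("cp852")

# What a driver effectively does when it assumes Windows-1250:
# the same bytes are looked up in the wrong table.
# errors="replace" is used because Python's cp1250 codec leaves
# a few byte values undefined and would otherwise raise an error.
garbled = raw.decode("cp1250", errors="replace")

# Decoding with the code page the data was actually written in
# recovers the text.
recovered = raw.decode("cp852")

print(garbled)
print(recovered)
```

The bytes on disk are fine in both cases. Only the table used to interpret them differs, which is why the data looked correct on some machines and broken on another.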
I tried to enforce the encoding by setting byte 0x1D of the DBF header to the proper code page marker (I used 0x64), but it still didn’t help much. Following is the list of all possible values.
Value | Description |
---|---|
0x00 | No codepage defined |
0x01 | Codepage 437 (US MS-DOS) |
0x02 | Codepage 850 (International MS-DOS) |
0x03 | Codepage 1252 Windows ANSI |
0x04 | Codepage 10000 Standard Macintosh |
0x64 | Codepage 852 Eastern European MS-DOS |
0x65 | Codepage 866 Russian MS-DOS |
0x66 | Codepage 865 Nordic MS-DOS |
0x67 | Codepage 861 Icelandic MS-DOS |
0x68 | Codepage 895 Kamenicky (Czech) MS-DOS |
0x69 | Codepage 620 Mazovia (Polish) MS-DOS |
0x6A | Codepage 737 Greek MS-DOS (437G) |
0x6B | Codepage 857 Turkish MS-DOS |
0x78 | Codepage 950 Chinese (Hong Kong SAR, Taiwan) Windows |
0x79 | Codepage 949 Korean Windows |
0x7A | Codepage 936 Chinese (PRC, Singapore) Windows |
0x7B | Codepage 932 Japanese Windows |
0x7C | Codepage 874 Thai Windows |
0x7D | Codepage 1255 Hebrew Windows |
0x7E | Codepage 1256 Arabic Windows |
0x96 | Codepage 10007 Russian Macintosh |
0x97 | Codepage 10029 Macintosh EE |
0x98 | Codepage 10006 Greek Macintosh |
0xC8 | Codepage 1250 Eastern European Windows |
0xC9 | Codepage 1251 Russian Windows |
0xCA | Codepage 1254 Turkish Windows |
0xCB | Codepage 1253 Greek Windows |
all others | Unknown / invalid |
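For reference, reading and patching that header byte could look roughly like the Python sketch below. The function names and the `CODECS` mapping are hypothetical helpers made up for this example. The mapping covers only a few rows of the table and uses Python's codec names. The demo runs against a throwaway 32-byte dummy header, not a real table.

```python
import os
import tempfile

LANGUAGE_DRIVER_OFFSET = 0x1D  # byte 29 of the DBF header

# Partial, illustrative mapping from table values to Python codec names.
CODECS = {
    0x01: "cp437",
    0x02: "cp850",
    0x03: "cp1252",
    0x64: "cp852",
    0x65: "cp866",
    0xC8: "cp1250",
    0xC9: "cp1251",
}


def read_language_driver(path):
    """Return the code page marker stored at offset 0x1D."""
    with open(path, "rb") as f:
        f.seek(LANGUAGE_DRIVER_OFFSET)
        data = f.read(1)
    if len(data) != 1:
        raise ValueError("file is too short to contain a DBF header")
    return data[0]


def write_language_driver(path, value):
    """Overwrite the marker in place; record data is not touched."""
    if not 0 <= value <= 0xFF:
        raise ValueError("marker must fit in one byte")
    with open(path, "r+b") as f:
        f.seek(LANGUAGE_DRIVER_OFFSET)
        f.write(bytes([value]))


# Demo on a dummy 32-byte header filled with zeros (assumption: no real DBF).
with tempfile.NamedTemporaryFile(suffix=".dbf", delete=False) as tmp:
    tmp.write(bytes(32))
    demo_path = tmp.name

before = read_language_driver(demo_path)   # 0x00: no code page defined
write_language_driver(demo_path, 0x64)     # mark as CP852
after = read_language_driver(demo_path)
os.remove(demo_path)
```

Note that changing this byte only relabels the file. It does not transcode the stored text, and a driver is free to ignore it, which matches what I saw with Jet. Work on a copy of the table if you try it.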
Ultimately, the very old Visual FoxPro driver did the trick: after switching the provider to “VFPOLEDB.1”, the encoding was respected, which saved me from manually transcoding strings in my C# application.
Now you have seen everything!