Differences From Artifact [c806288ec6]:
- File src/OFStdIOStream_Win32Console.m, part of check-in [3a0fdb6701] at 2016-03-13 19:33:09 on branch trunk: OFStdIOStream_Win32Console: Improve writing

  When writing an incomplete surrogate, it now writes everything up to
  that incomplete surrogate, remembers the incomplete surrogate and writes
  it as soon as the surrogate is completed by a following write. (user: js, size: 7841)
To Artifact [c7b07d3dc1]:
- File src/OFStdIOStream_Win32Console.m, part of check-in [9d70e660ea] at 2016-03-13 20:04:47 on branch trunk: OFStdIOStream_Win32Console: Use U+FFFD, not U+FFFE

  U+FFFD is for unrepresentable characters, not U+FFFE. (user: js, size: 7841)
︙
Lines 27-41:
  * read.
  *
  * Therefore, instead of just using the UTF-8 codepage, this captures all reads
  * and writes to of_std{in,out,err} on the lowlevel, interprets the buffer as
  * UTF-8 and converts to / from UTF-16 to use ReadConsoleW() / WriteConsoleW().
  * Doing so is safe, as the console only supports text anyway and thus it does
  * not matter if binary gets garbled by the conversion (e.g. because invalid
- * UTF-8 gets converted to U+FFFE).
+ * UTF-8 gets converted to U+FFFD).
  *
  * In order to not do this when redirecting input / output to a file (as the
  * file would then be read / written in the wrong encoding and break reading /
  * writing binary), it checks that the handle is indeed a console.
  */

 #define OF_STDIO_STREAM_WIN32_CONSOLE_M
︙
Lines 245-259:
 	UTF8Len = of_string_utf8_decode(
 	    _incompleteUTF8Surrogate, _incompleteUTF8SurrogateLen, &c);

 	if (UTF8Len <= 0 || c > 0x10FFFF) {
 		assert(UTF8Len == 0 || UTF8Len < -4);

-		UTF16[0] = 0xFFFE;
+		UTF16[0] = 0xFFFD;
 		UTF16Len = 1;
 	} else {
 		if (c > 0xFFFF) {
 			c -= 0x10000;
 			UTF16[0] = 0xD800 | (c >> 10);
 			UTF16[1] = 0xDC00 | (c & 0x3FF);
 			UTF16Len = 2;
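The surrogate-pair arithmetic in this hunk can be isolated into a small helper. The following is only a sketch: `encode_utf16` is a hypothetical name that does not appear in the source file, and the explicit rejection of code points in the surrogate range is an added assumption (the original code only checks `c > 0x10FFFF`).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the file being diffed: encodes one code
 * point as UTF-16 and returns the number of 16-bit units written (1 or 2).
 */
static size_t
encode_utf16(uint32_t c, uint16_t out[2])
{
	/*
	 * Out of range, or (an assumption added here) a lone surrogate:
	 * substitute U+FFFD, the replacement character.
	 */
	if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
		out[0] = 0xFFFD;
		return 1;
	}

	if (c > 0xFFFF) {
		/* Split the 20 remaining bits into a high and a low half. */
		c -= 0x10000;
		out[0] = (uint16_t)(0xD800 | (c >> 10));
		out[1] = (uint16_t)(0xDC00 | (c & 0x3FF));
		return 2;
	}

	out[0] = (uint16_t)c;
	return 1;
}
```

For example, U+1F600 becomes the pair 0xD83D 0xDE00, while an out-of-range value such as 0x110000 yields the single unit 0xFFFD.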
︙
Lines 292-306:
 		    length - i);
 		_incompleteUTF8SurrogateLen = length - i;
 		break;
 	}

 	if (UTF8Len <= 0 || c > 0x10FFFF) {
-		tmp[j++] = 0xFFFE;
+		tmp[j++] = 0xFFFD;
 		i++;
 		continue;
 	}

 	if (c > 0xFFFF) {
 		c -= 0x10000;
 		tmp[j++] = 0xD800 | (c >> 10);
︙
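The earlier check-in describes holding back an incomplete UTF-8 sequence at the end of a write until a following write completes it. One way to find such a tail is sketched below. This is not the method used in the file, which decodes forward and copies the remainder when the decoder reports a truncated sequence; it is a backward scan, and `incomplete_utf8_tail` is a hypothetical name chosen for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns how many bytes at the end of buf belong to a
 * UTF-8 sequence that has started but is not yet complete (0 if none).
 */
static size_t
incomplete_utf8_tail(const unsigned char *buf, size_t len)
{
	size_t back;

	/* The lead byte of an incomplete sequence is among the last 3 bytes. */
	for (back = 1; back <= 3 && back <= len; back++) {
		unsigned char b = buf[len - back];
		size_t need;

		if ((b & 0xC0) == 0x80)
			continue;	/* continuation byte, keep looking */

		if ((b & 0xE0) == 0xC0)
			need = 2;
		else if ((b & 0xF0) == 0xE0)
			need = 3;
		else if ((b & 0xF8) == 0xF0)
			need = 4;
		else
			return 0;	/* ASCII or invalid lead byte */

		/* Incomplete only if fewer bytes are present than needed. */
		return (back < need ? back : 0);
	}

	return 0;
}
```

A caller could convert and write the first `len - tail` bytes immediately and keep the remaining `tail` bytes to prepend to the next write.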