ObjFW  Check-in [9d70e660ea]

Overview
Comment:OFStdIOStream_Win32Console: Use U+FFFD, not U+FFFE

U+FFFD is for unrepresentable characters, not U+FFFE.

SHA3-256: 9d70e660eaa6bd32842e8e6ab0ebd15558924447f4309552d48bf1cebbf4fcd1
User & Date: js on 2016-03-13 20:04:47
Context
2016-03-13
20:29
OFStdIOStream_Win32Console: Small read fix check-in: 976162aa79 user: js tags: trunk
20:04
OFStdIOStream_Win32Console: Use U+FFFD, not U+FFFE check-in: 9d70e660ea user: js tags: trunk
19:33
OFStdIOStream_Win32Console: Improve writing check-in: 3a0fdb6701 user: js tags: trunk
Changes

Modified src/OFStdIOStream_Win32Console.m from [c806288ec6] to [c7b07d3dc1].

Old version [c806288ec6], lines 27-41:

 * read.
 *
 * Therefore, instead of just using the UTF-8 codepage, this captures all reads
 * and writes to of_std{in,out,err} on the lowlevel, interprets the buffer as
 * UTF-8 and converts to / from UTF-16 to use ReadConsoleW() / WriteConsoleW().
 * Doing so is safe, as the console only supports text anyway and thus it does
 * not matter if binary gets garbled by the conversion (e.g. because invalid
 * UTF-8 gets converted to U+FFFE).
 *
 * In order to not do this when redirecting input / output to a file (as the
 * file would then be read / written in the wrong encoding and break reading /
 * writing binary), it checks that the handle is indeed a console.
 */

#define OF_STDIO_STREAM_WIN32_CONSOLE_M

New version [c7b07d3dc1], lines 27-41:

 * read.
 *
 * Therefore, instead of just using the UTF-8 codepage, this captures all reads
 * and writes to of_std{in,out,err} on the lowlevel, interprets the buffer as
 * UTF-8 and converts to / from UTF-16 to use ReadConsoleW() / WriteConsoleW().
 * Doing so is safe, as the console only supports text anyway and thus it does
 * not matter if binary gets garbled by the conversion (e.g. because invalid
 * UTF-8 gets converted to U+FFFD).
 *
 * In order to not do this when redirecting input / output to a file (as the
 * file would then be read / written in the wrong encoding and break reading /
 * writing binary), it checks that the handle is indeed a console.
 */

#define OF_STDIO_STREAM_WIN32_CONSOLE_M
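
The one-character change in this comment follows from how Unicode classifies the two code points: U+FFFD is the REPLACEMENT CHARACTER, whereas U+FFFE is a noncharacter (the byte-swapped form of the byte order mark U+FEFF). A minimal sketch of that classification in C is shown below; `is_noncharacter` is a hypothetical helper for illustration and is not part of ObjFW.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of ObjFW.
 *
 * Unicode reserves 66 noncharacters: U+FDD0..U+FDEF and the last two code
 * points of every plane (U+xxFFFE and U+xxFFFF).
 */
static bool
is_noncharacter(uint32_t c)
{
	if (c > 0x10FFFF)
		return false;	/* not a Unicode code point at all */

	if (c >= 0xFDD0 && c <= 0xFDEF)
		return true;

	/* Low 16 bits are 0xFFFE or 0xFFFF. */
	return (c & 0xFFFE) == 0xFFFE;
}
```

Under this check, `is_noncharacter(0xFFFE)` is true and `is_noncharacter(0xFFFD)` is false, which is why U+FFFD is the appropriate substitute for undecodable input.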

Old version [c806288ec6], lines 245-259:

		UTF8Len = of_string_utf8_decode(
		    _incompleteUTF8Surrogate, _incompleteUTF8SurrogateLen, &c);

		if (UTF8Len <= 0 || c > 0x10FFFF) {
			assert(UTF8Len == 0 || UTF8Len < -4);

			UTF16[0] = 0xFFFE;
			UTF16Len = 1;
		} else {
			if (c > 0xFFFF) {
				c -= 0x10000;
				UTF16[0] = 0xD800 | (c >> 10);
				UTF16[1] = 0xDC00 | (c & 0x3FF);
				UTF16Len = 2;

New version [c7b07d3dc1], lines 245-259:

		UTF8Len = of_string_utf8_decode(
		    _incompleteUTF8Surrogate, _incompleteUTF8SurrogateLen, &c);

		if (UTF8Len <= 0 || c > 0x10FFFF) {
			assert(UTF8Len == 0 || UTF8Len < -4);

			UTF16[0] = 0xFFFD;
			UTF16Len = 1;
		} else {
			if (c > 0xFFFF) {
				c -= 0x10000;
				UTF16[0] = 0xD800 | (c >> 10);
				UTF16[1] = 0xDC00 | (c & 0x3FF);
				UTF16Len = 2;

Old version [c806288ec6], lines 292-306:

				    length - i);
				_incompleteUTF8SurrogateLen = length - i;

				break;
			}

			if (UTF8Len <= 0 || c > 0x10FFFF) {
				tmp[j++] = 0xFFFE;
				i++;
				continue;
			}

			if (c > 0xFFFF) {
				c -= 0x10000;
				tmp[j++] = 0xD800 | (c >> 10);

New version [c7b07d3dc1], lines 292-306:

				    length - i);
				_incompleteUTF8SurrogateLen = length - i;

				break;
			}

			if (UTF8Len <= 0 || c > 0x10FFFF) {
				tmp[j++] = 0xFFFD;
				i++;
				continue;
			}

			if (c > 0xFFFF) {
				c -= 0x10000;
				tmp[j++] = 0xD800 | (c >> 10);
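
The pattern visible in these hunks (decode one code point, emit U+FFFD and skip a single byte on failure, otherwise emit one UTF-16 unit or a surrogate pair) can be sketched as a standalone C routine. This is a simplified illustration under stated assumptions, not the ObjFW implementation: `utf8_decode` is a hypothetical stand-in for `of_string_utf8_decode()` whose return convention is invented here, and a sequence truncated at the end of the buffer is treated as invalid rather than being kept for the next write as the original does with `_incompleteUTF8Surrogate`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical decoder (NOT of_string_utf8_decode()). Returns the number of
 * bytes consumed (1-4) and stores the code point in *c, or returns 0 if the
 * bytes at buf are not a valid, complete UTF-8 sequence.
 */
static size_t
utf8_decode(const unsigned char *buf, size_t len, uint32_t *c)
{
	size_t n, i;
	uint32_t min;

	if (len == 0)
		return 0;

	if (buf[0] < 0x80) {
		*c = buf[0];
		return 1;
	} else if ((buf[0] & 0xE0) == 0xC0) {
		n = 2; min = 0x80; *c = buf[0] & 0x1F;
	} else if ((buf[0] & 0xF0) == 0xE0) {
		n = 3; min = 0x800; *c = buf[0] & 0x0F;
	} else if ((buf[0] & 0xF8) == 0xF0) {
		n = 4; min = 0x10000; *c = buf[0] & 0x07;
	} else
		return 0;	/* stray continuation byte or invalid lead byte */

	if (len < n)
		return 0;	/* truncated sequence */

	for (i = 1; i < n; i++) {
		if ((buf[i] & 0xC0) != 0x80)
			return 0;
		*c = (*c << 6) | (buf[i] & 0x3F);
	}

	/* Reject overlong forms, surrogates and out-of-range values. */
	if (*c < min || (*c >= 0xD800 && *c <= 0xDFFF) || *c > 0x10FFFF)
		return 0;

	return n;
}

/*
 * Converts len bytes of UTF-8 to UTF-16, replacing each undecodable byte
 * with U+FFFD. out must have room for len units, since every input byte
 * produces at most one unit. Returns the number of units written.
 */
static size_t
utf8_to_utf16_lossy(const unsigned char *in, size_t len, uint16_t *out)
{
	size_t i = 0, j = 0;

	while (i < len) {
		uint32_t c;
		size_t n = utf8_decode(in + i, len - i, &c);

		if (n == 0) {
			out[j++] = 0xFFFD;	/* REPLACEMENT CHARACTER */
			i++;
			continue;
		}

		if (c > 0xFFFF) {
			/* Outside the BMP: split into a surrogate pair. */
			c -= 0x10000;
			out[j++] = (uint16_t)(0xD800 | (c >> 10));
			out[j++] = (uint16_t)(0xDC00 | (c & 0x3FF));
		} else
			out[j++] = (uint16_t)c;

		i += n;
	}

	return j;
}
```

For example, the input bytes `'A', 0xFF, 0xF0, 0x9F, 0x98, 0x80` yield the four UTF-16 units `0x0041, 0xFFFD, 0xD83D, 0xDE00`: the stray `0xFF` becomes U+FFFD and U+1F600 becomes a surrogate pair.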