Is the text corruption on the front page being investigated?

Fixed(?) Aug 8, 2021, 10:54
I hadn't spent any serious time on this issue until this past week, when I had some new ideas to investigate. TL;DR: I think I found the fix. Blue and I have not seen replacement characters displayed in place of accented characters anymore, in hundreds of refreshes.

Short story longer:
Perl I/O to file handles needs to have UTF-8 encoding set in order to pass through UTF-8 strings correctly. This is done via a pragma: use open ":std", ":encoding(UTF-8)";
The ":std" part also sets it for the standard streams (standard input, output, and error), the default I/O channels in Unix/Linux. The cgi-bin scripts all print to standard output, and via Apache that gets delivered to the browser. This pragma had been in place since February, but evidently it was insufficient to fully get rid of the question marks.
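For anyone who wants to see what the pragma does to ordinary file handles, here is a minimal sketch, not the actual site code. The helper name roundtrip_bytes and the scratch file are made up for illustration. It writes an accented string through a handle that gets its encoding from the pragma, then reads the raw bytes back:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                               # string literals in this file are UTF-8
use open ':std', ':encoding(UTF-8)';    # default layers for open(), plus the standard handles
use File::Temp qw(tempfile);

# Hypothetical helper: write $text to a scratch file through the default
# layers, then return the raw bytes that ended up on disk.
sub roundtrip_bytes {
    my ($text) = @_;
    my ($tmp_fh, $path) = tempfile(UNLINK => 1);
    close $tmp_fh;

    # No layer given in the mode, so the pragma's :encoding(UTF-8) applies.
    open(my $out, '>', $path) or die "open for write: $!";
    print {$out} $text;
    close $out;

    # An explicit layer in the mode takes precedence over the pragma default.
    open(my $in, '<:raw', $path) or die "open for read: $!";
    local $/;                           # slurp mode
    my $bytes = <$in>;
    close $in;
    return $bytes;
}

my $bytes = roundtrip_bytes("café");
printf "%d bytes on disk\n", length $bytes;
```

The "é" should come out as two bytes (C3 A9), which is what the browser expects when the page is served as UTF-8.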

Yesterday I came across a remark that this needs to be done before the stream (file handle) gets opened; otherwise the pragma no longer takes effect. Standard output is always opened when a process starts, but it took a while to sink in that this can also be the case for cgi-bin scripts invoked by Apache and mod_perl. So sometimes the pragma worked and I/O got encoded correctly, and in other cases it didn't. Consistent encoding can be achieved with another command applied to the existing, already opened handle: binmode STDOUT, ':encoding(UTF-8)'; (and similarly for STDIN).
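Roughly, the binmode approach looks like the sketch below. This is an illustration under assumptions, not the exact code in the scripts: the helper name ensure_utf8_layer is made up, and the check via PerlIO::get_layers is an extra precaution of mine so that an encoding layer is not pushed twice onto a handle that already has one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: put a UTF-8 encoding layer on an already opened
# handle, unless an encoding layer is already present on it.
sub ensure_utf8_layer {
    my ($fh) = @_;
    my @layers = PerlIO::get_layers($fh);
    return 1 if grep { /^encoding\(/ } @layers;   # already encoded, do nothing
    binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";
    return 1;
}

# Apply to the standard handles, which are open before the script runs.
ensure_utf8_layer(\*STDIN);
ensure_utf8_layer(\*STDOUT);

# A CGI-style response containing an accented character (U+00E9).
print "Content-Type: text/html; charset=UTF-8\n\n";
print "<p>caf\x{e9}</p>\n";
```

Without the guard, calling binmode with an encoding layer twice on the same handle would stack two layers and encode the output twice, which shows up as a different kind of garbage in the browser.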

So please let me know if you still see accented characters displayed incorrectly, but I am hopeful that the problem is solved. Phew.
-- Frans