= Message: 632 of 7128 ========================================= FTSC_PUBLIC =
From     : Michiel van der Vlist 2:280/5555 28 Nov 13 00:53:08
To       : Kees van Eeten 28 Nov 13 00:53:08
Subject  : UTF-8 in de the nodelist
FGHI     : area://FTSC_PUBLIC?msgid=2:280/5555+52968942
Reply to : area://FTSC_PUBLIC?msgid=2:280/5003.4+52966782
= Message encoding detected as: UTF-8 ========================================
==============================================================================
Hello Kees,
On Wednesday November 27 2013 22:37, you wrote to me:
 MvdV>> I have seen some arguments against using UTF-8 in messages. So
 MvdV>> far however I have seen no arguments against UTF-8 in the
 MvdV>> nodelist. Other than using ANY non ASCII characters that is.
 KE> In that way of reasoning, the current nodelist with it 7bit limit, is
 KE> encoded in all characterset, that are identical to ascii.
ASCII is a subset of all 8-bit character sets used in Fidonet. ASCII is also a subset of the Universal Character Set, also known as the Unicode character set. UTF-8 is not a character set; it is a character encoding scheme for Unicode. So are UTF-7, UTF-16 and UTF-32.
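To make the distinction between a character set and an encoding scheme concrete, here is a minimal Python sketch. It takes one code point from the Unicode character set and encodes it with three different encoding schemes. The variable names are illustrative only, and the big-endian variants of UTF-16 and UTF-32 are chosen here simply to avoid a byte order mark in the output.

```python
# One character from the Unicode character set, several encoding schemes.
ch = "\u00f6"  # LATIN SMALL LETTER O WITH DIAERESIS

# Big-endian variants are used so that no byte order mark is prepended.
schemes = ["utf-8", "utf-16-be", "utf-32-be"]
encoded = {name: ch.encode(name) for name in schemes}

for name, raw in encoded.items():
    # The code point is the same every time; only the byte sequence differs.
    print(f"{name:10} {raw.hex(' ')}")

# An ASCII character keeps its single-byte value in UTF-8.
ascii_a = "A".encode("utf-8")
print(f"{'A in utf-8':10} {ascii_a.hex()}")
```

The same abstract character comes out as two, two and four bytes respectively, while the ASCII letter A stays the single byte 41 in UTF-8.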
 KE> There is some experience, when 8 bit characters are used, but have
 KE> you ever tried to process and use a nodelist with e.g. Bjorn's name
 KE> encoded in 16 bit values.
There is no problem. Nodelist processors that allow 8-bit values in strings are character encoding scheme agnostic. For them the byte sequence C3 B6 is just two bytes. C3 B6 is the two-byte sequence that encodes the o with diaeresis in UTF-8, but the nodelist processor does not know that and does not have to, any more than it has to know that the single byte 41 encodes the letter A. Nodelist processors do not interpret the byte sequence that encodes the name of a sysop or the name of a city.
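As an illustration of what "encoding scheme agnostic" means in practice, the following Python sketch splits a nodelist-style line on commas while treating it purely as bytes. It is not taken from any real nodelist processor; the function name `split_fields` and the sample entry (node number, system, city and sysop name) are hypothetical and made up for this example.

```python
def split_fields(line: bytes) -> list:
    """Split one nodelist-style line on commas without decoding it."""
    return line.rstrip(b"\r\n").split(b",")


# Hypothetical entry; the sysop name contains the UTF-8 bytes C3 B6.
line = b",999,Example_BBS,Example_City,Bj\xc3\xb6rn_Example,-Unpublished-,300,CM\r\n"

fields = split_fields(line)
sysop = fields[4]

# The processor never interprets the name; it only carries the bytes along.
rebuilt = b",".join(fields) + b"\r\n"

print(len(fields))   # number of comma-separated fields
print(sysop.hex(" "))  # raw bytes of the sysop field, C3 B6 included
```

Splitting on the comma byte is safe here because every byte of a multi-byte UTF-8 sequence has a value of 0x80 or higher, so it can never be mistaken for the ASCII comma (0x2C). Only a program that displays the name needs to know the bytes are UTF-8.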
 MvdV>> That of course is not practical. The only practical downward
 MvdV>> compatible way is that the whole nodelist is in one and the
 MvdV>> same character encoding scheme. And since there is no single 8
 MvdV>> bit character encoding scheme that can fulfil the needs of all
 MvdV>> the users of the nodelist, UTF-8 is the only logical choice for
 MvdV>> encoding the nodelist.
 KE> Shure keep the 7 bit check in makenl and claim the nodelist is in
 KE> UTF-8 ;)
Strictly speaking, an ASCII-only text is UTF-8, but that is not what I meant, of course.
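The "strictly speaking" point can be checked with a short Python sketch: any byte string that passes a 7-bit check also decodes as UTF-8, but not the other way around. The helper names `is_7bit` and `is_valid_utf8` and the sample strings are assumptions made for this illustration; this is not how makenl implements its check.

```python
def is_7bit(data: bytes) -> bool:
    """True if every byte is in the ASCII range 00..7F."""
    return all(b < 0x80 for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sysop names in three encodings.
samples = {
    "ascii": b"Bjorn_Example",
    "utf-8": b"Bj\xc3\xb6rn_Example",
    "latin-1": b"Bj\xf6rn_Example",
}

results = {
    name: (is_7bit(raw), is_valid_utf8(raw)) for name, raw in samples.items()
}

for name, (seven_bit, utf8) in results.items():
    print(f"{name:8} 7bit={seven_bit} utf8={utf8}")
```

The ASCII sample passes both checks, the UTF-8 sample fails only the 7-bit check, and the Latin-1 sample fails both, because a lone F6 byte is not well-formed UTF-8.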