AV>> gremlin@warez:~/books > du -sh .
AV>> 216G
AV>> Over 250000 books, most of which are in Russian, so keeping them
AV>> in one-byte encoding saves me about 200 GB of disk space.

MV> Ah, yes, I remember now. Yes, with such a large library I can
MV> understand why you want to save on disk space. And of course if
MV> this is just local the encoding is not such an issue.
They are available from outside, but every file carries a charset tag, so conversion isn't a problem.
<?xml version="1.0" encoding="KOI8-R"?>
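For illustration, here is a minimal Python sketch of how someone on the receiving side might use that tag: it reads the charset from the XML declaration and re-encodes the file as UTF-8. The helper name `to_utf8` and the regular expression are assumptions made for this example, not part of any tool mentioned here, and the sketch only covers ASCII-compatible encodings such as KOI8-R.

```python
import re

# Finds the encoding name in an XML declaration at the very start of the file.
# Assumption: the source encoding is ASCII-compatible, so the declaration
# itself can be matched as plain bytes.
_DECL = re.compile(rb"<\?xml[^>]*?encoding=[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']")


def to_utf8(data: bytes) -> bytes:
    """Re-encode an XML document as UTF-8 using its own charset tag (hypothetical helper)."""
    m = _DECL.match(data)
    if m is None:
        # No charset tag found: leave the bytes untouched.
        return data
    charset = m.group(1).decode("ascii")
    # Keep the declaration, but swap the encoding name for UTF-8.
    head = data[:m.start(1)] + b"UTF-8" + data[m.end(1):m.end()]
    # Everything after the tag is decoded with the declared charset.
    body = data[m.end():].decode(charset).encode("utf-8")
    return head + body


if __name__ == "__main__":
    sample = '<?xml version="1.0" encoding="KOI8-R"?><title>Война и мир</title>'
    print(to_utf8(sample.encode("koi8-r")).decode("utf-8"))
```

Since the charset name travels with the file, the reader needs no outside knowledge about how the library is stored.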
MV> You could however also save (significantly more) by using compression
MV> while retaining UTF-8.
That would ruin the ability to search through the library directly, say, with `egrep`.
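A rough Python sketch of the difference, under stated assumptions: the sample titles are made up, `find_lines` is a hypothetical helper standing in for a plain substring search, and gzip stands in for whatever compressor one might pick. A file in a one-byte charset can be scanned for the encoded bytes as it is; a compressed copy has to be unpacked before the same search finds anything useful.

```python
import gzip


def find_lines(needle: str, data: bytes, charset: str = "koi8-r") -> list[str]:
    """Return the lines of `data` containing `needle` (hypothetical stand-in for a grep-like search)."""
    key = needle.encode(charset)
    return [line.decode(charset) for line in data.split(b"\n") if key in line]


if __name__ == "__main__":
    # Made-up sample, repeated so that the compressor has something to squeeze.
    text = "Война и мир\nПреступление и наказание\n" * 200
    plain = text.encode("koi8-r")
    packed = gzip.compress(plain)

    # The plain file can be searched as is.
    print(len(find_lines("мир", plain)))
    # The compressed stream generally does not contain the encoded pattern at all...
    print("мир".encode("koi8-r") in packed)
    # ...so every search needs a decompression pass first.
    print(len(find_lines("мир", gzip.decompress(packed))))
```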
MV> KOI8 encoding carries the risk that others may not be able to
MV> read it in 50 years when the world has gone completely Unicode
MV> and has forgotten about all the dozens of local character sets
MV> and accompanying translation utilities...
I once heard those very words about 5.25-inch floppies... Since then, I've heard them several times about various file formats and media. But most people who say this tend to forget how long the transition period is between the appearance of a new technology and the final disappearance of the old one.
MV> Respectfully Mikhail
You've missed a comma before your name :-)
-- 
Alexey V. Vissarionov aka Gremlin from Kremlin
gremlin.ru!gremlin; +vii-cmiii-ccxxix-lxxix-xlii