[Insight-developers] Image writing with unicode filename impossible with MSVC

Bill Lorensen bill.lorensen at gmail.com
Mon Oct 26 17:13:28 EDT 2009

Sounds great. Go for it.


On Mon, Oct 26, 2009 at 4:54 PM, Tom Vercauteren
<tom.vercauteren at m4x.org> wrote:
> Hey Bill,
>> Is there a way to test the portability of this solution without
>> touching any ITK classes? Can you check in a test that verifies the
>> functionality and portability of your proposed solution before we
>> commit to this solution?
> Sure, I can make a unit test out of my previous preliminary
> experiments with utfcpp:
> http://public.kitware.com/Bug/file_download.php?file_id=2574&type=bug
> This will just require committing one new test and four files from
> utfcpp (e.g. in itkExtHeaders). Would this be fine?
> Tom
>> On Mon, Oct 26, 2009 at 4:14 PM, Tom Vercauteren
>> <tom.vercauteren at m4x.org> wrote:
>>> Hi all,
>>> I'm back on the unicode filenames topic and really need your feedback
>>> before I start touching every IO class...
>>> I have uploaded a preliminary patch on the bug tracker that allows the
>>> use of utf-8 encoded strings on windows for several IO classes:
>>>  http://public.kitware.com/Bug/file_download.php?file_id=2601&type=bug
>>> Namely, what is working already is writing (and maybe reading) of
>>> unicode filenames on windows for the following formats:
>>> - jpeg
>>> - png
>>> - meta (mhd and mha)
>>> - tiff
>>> My approach was to convert the utf-8 encoded std::string to a utf-16
>>> encoded wstring (on windows only) when it becomes necessary. This is
>>> done using the utfcpp library:
>>>  http://utfcpp.sourceforge.net/
>>> For backward compatibility reasons, this conversion is activated by a
>>> cmake variable:
>>>  ITK_USE_REVIEW_UTF8_STRINGS
>>> For png, jpeg and tiff, no modifications were necessary to the
>>> underlying third party libraries.
>>> For metaio, one file had to be modified. For backward compatibility
>>> reasons, the new behavior is only activated if
>>>  METAIO_USE_REVIEW_UTF8_STRINGS
>>> is defined. Of course, turning ITK_USE_REVIEW_UTF8_STRINGS on in
>>> cmake turns METAIO_USE_REVIEW_UTF8_STRINGS on in the c++ code.
>>> Could you take a look at my preliminary patch and tell me if something
>>> along those lines could be accepted into ITK?
>>> Cheers,
>>> Tom
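The UTF-8 to UTF-16 conversion described in the message above can be sketched in portable C++ as follows. This is only an illustration, not the code from the patch: the patch relies on utfcpp, whereas the hypothetical helper `Utf8ToUtf16` below is hand-written so that the example has no external dependency. It returns a `std::u16string`; on Windows, where `wchar_t` is 16 bits wide, the same code units could be stored in a `std::wstring`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of ITK or utfcpp): decode a UTF-8 encoded
// std::string and re-encode it as UTF-16 code units.
// Throws std::invalid_argument if the input is not well-formed UTF-8.
std::u16string Utf8ToUtf16(const std::string& in)
{
  // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
  static const std::uint32_t minForLength[] = { 0x0, 0x80, 0x800, 0x10000 };

  std::u16string out;
  std::size_t i = 0;
  while (i < in.size())
  {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    std::uint32_t cp = 0;
    std::size_t extra = 0; // number of continuation bytes expected
    if (lead < 0x80)                { cp = lead;        extra = 0; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else { throw std::invalid_argument("invalid UTF-8 lead byte"); }

    if (i + extra >= in.size())
    {
      throw std::invalid_argument("truncated UTF-8 sequence");
    }
    for (std::size_t k = 1; k <= extra; ++k)
    {
      const unsigned char c = static_cast<unsigned char>(in[i + k]);
      if ((c & 0xC0) != 0x80)
      {
        throw std::invalid_argument("invalid UTF-8 continuation byte");
      }
      cp = (cp << 6) | (c & 0x3F);
    }
    i += extra + 1;

    // Reject overlong encodings, surrogate code points and out-of-range values.
    if (cp < minForLength[extra] || cp > 0x10FFFF ||
        (cp >= 0xD800 && cp <= 0xDFFF))
    {
      throw std::invalid_argument("invalid UTF-8 code point");
    }

    if (cp < 0x10000)
    {
      out.push_back(static_cast<char16_t>(cp));
    }
    else
    {
      // Code points above the BMP become a surrogate pair.
      cp -= 0x10000;
      assert(cp <= 0xFFFFF);
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
  }
  return out;
}
```

ASCII file names pass through unchanged, which is why the approach stays compatible with existing ASCII-only callers.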
>>> On Tue, Oct 20, 2009 at 18:40, Tom Vercauteren <tom.vercauteren at m4x.org> wrote:
>>>> Hi all,
>>>> Thanks for your constructive feedback.
>>>> Benjamin and I have looked a bit further into this issue and into utfcpp.
>>>> Unfortunately utfcpp does not provide the features we would
>>>> really like, namely:
>>>> - It does not define a separate utf8 string class, it uses std::string
>>>> as a container
>>>> - It does not allow the creation of a utf8 encoded std::string from a
>>>> std::string encoded with the default encoding
>>>> That being said, we can still make efficient use of it. Here is a proposal:
>>>> 1) We keep the current API that only allows users to set char* or
>>>> std::string filenames
>>>> 2) We specify in the documentation that these strings have to be
>>>> encoded in utf8 on MSVC (and other utf8-based systems as previously)
>>>> 3) On MSVC, we use utfcpp to check whether the filename actually is
>>>> encoded in utf8 and we throw an exception otherwise
>>>> 4) We write fopen-like functions in ITK (say itk::fopen) that work
>>>> with utf8 filenames (For MSVC, this will basically use utfcpp to
>>>> convert the utf8 encoded string to a utf16 encoded wstring and call
>>>> _wopen)
>>>> 5) We use itk::fopen when possible instead of fopen
>>>> Some preliminary experiments are shown here:
>>>> http://public.kitware.com/Bug/file_download.php?file_id=2574&type=bug
>>>> The only drawback of this approach is that it is not strictly backward
>>>> compatible for MSVC. More specifically it will work as previously with
>>>> ASCII filenames but will not work without prior utf8 conversion for
>>>> non-ASCII filenames that could be represented in the local codepage.
>>>> We could of course add a cmake switch to maintain strict backward
>>>> compatibility if deemed necessary.
>>>> Thoughts?
>>>> Tom
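Steps 3 to 5 of the proposal above could look roughly like the sketch below. Several things here are assumptions rather than what the thread specifies: the name `sketch::Utf8Fopen` is hypothetical (the message only suggests "say itk::fopen"), the Win32 function `MultiByteToWideChar` stands in for utfcpp for both validation and conversion, and `_wfopen` is used as the wide-character counterpart of `fopen`. On non-Windows systems the file name is passed to `std::fopen` unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

namespace sketch
{
// Hypothetical fopen-like wrapper that accepts a UTF-8 encoded file name.
// On Windows it throws std::invalid_argument if the name is not valid UTF-8.
std::FILE* Utf8Fopen(const std::string& utf8Name, const char* mode)
{
  assert(mode != nullptr);
#ifdef _WIN32
  // Convert UTF-8 to UTF-16; MB_ERR_INVALID_CHARS makes the call fail
  // on malformed input instead of silently replacing characters.
  auto widen = [](const std::string& s) -> std::wstring {
    if (s.empty())
    {
      return std::wstring();
    }
    const int length = static_cast<int>(s.size());
    const int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                      s.data(), length, nullptr, 0);
    if (n <= 0)
    {
      throw std::invalid_argument("string is not valid UTF-8");
    }
    std::wstring w(static_cast<std::size_t>(n), L'\0');
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        s.data(), length, &w[0], n);
    return w;
  };
  return _wfopen(widen(utf8Name).c_str(), widen(mode).c_str());
#else
  // Assumption: the platform's narrow file API already accepts UTF-8.
  return std::fopen(utf8Name.c_str(), mode);
#endif
}
} // namespace sketch
```

A caller would then replace `fopen(name, "rb")` with `sketch::Utf8Fopen(name, "rb")` and keep the returned `FILE*` handling as before.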
>>>> On Tue, Oct 20, 2009 at 14:56, Brad King <brad.king at kitware.com> wrote:
>>>>> Sean McBride wrote:
>>>>>> On 10/19/09 11:15 AM, Brad King said:
>>>>>>> As the primary maintainer of KWSys I prefer to put as little
>>>>>>> in the library as possible.
>>>>>> Perhaps I haven't been following closely enough, but do you mean you
>>>>>> wouldn't want to create a utf8 lib from scratch in KWSys or that you
>>>>>> don't even want a thin wrapper over utf-cpp in KWSys?
>>>>> Both.
>>>>> There is no reason to create a utf8 lib from scratch when there are
>>>>> plenty of third-party libraries available.  We cannot do a thin-wrapper
>>>>> because KWSys cannot have third-party dependencies.
>>>>> IMO KWSys already has too much.  Originally it was just supposed to avoid
>>>>> duplicate Kitware-written code that was copied between VTK and ITK.  It
>>>>> was a/my mistake to add things like the MD5 hash implementation to it.
>>>>>> If the latter, that means we'd end up with both a vtkUnicodeString and
>>>>>> itkUnicodeString?  What if CMake needs to process utf8?
>>>>> We already have zlib in all three projects, named vtkzlib, itkzlib, and
>>>>> cmzlib.  Each project mangles the symbols to avoid conflicts, and they
>>>>> all support sharing a system-installed version.
>>>>> -Brad
