Pango: Issues with illegal characters in family names (encoding problems on Windows?)

Eduard Braun
Hi all,

I'm a developer in the Inkscape project which uses Pango/Fontconfig to
render (and also query) available fonts on the system.

In the past there were multiple bug reports [1,2] about Inkscape
crashing due to fonts with illegal characters in their family name
(Inkscape creates the font map with "pango_ft2_font_map_new()", lists
the families with "pango_font_map_list_families()" and gets each name
with "pango_font_family_get_name()", see [3] for details).

I recently pushed a fix [4] which seems to avoid the crashes and shows
that they are caused by font family names containing illegal UTF-8
characters!

The question I'm asking myself is, whether
a) the font files themselves are invalid and contain characters in some
broken encoding
b) fontconfig has issues querying fonts with special characters on
Windows, or
c) Pango does something wrong when passing the list to Inkscape

As I have not nearly enough knowledge about Pango/Fontconfig (or font
formats to start with) I hope that somebody is able to figure out what
is going wrong and if it can be fixed somehow!

As a start, you can find two fonts that caused crashes in the past at [5]
and [6]. According to user reports, the fix in [4] prevents both crashes
(while obviously making it impossible to use the fonts).

Best Regards,
Eduard Braun


[1] https://bugs.launchpad.net/inkscape/+bug/1495386
[2] https://bugs.launchpad.net/inkscape/+bug/1508928
[3] http://bazaar.launchpad.net/~inkscape.dev/inkscape/trunk/view/head:/src/libnrtype/FontFactory.cpp
[4] http://bazaar.launchpad.net/~inkscape.dev/inkscape/trunk/revision/15687
[5] https://bugs.launchpad.net/inkscape/+bug/1495386/+attachment/4875733/+files/AESYSMAT.zip
[6] https://bugs.launchpad.net/inkscape/+bug/1508928/+attachment/4502390/+files/%E9%87%91%E6%A2%85%E6%AF%9B%E8%A1%8C%E6%9B%B8.ttf

_______________________________________________
gtk-i18n-list mailing list
[hidden email]
https://mail.gnome.org/mailman/listinfo/gtk-i18n-list

Re: Pango: Issues with illegal characters in family names (encoding problems on Windows?)

Behdad Esfahbod-3
Hi Eduard,

It's possible that there are bugs in the fonts.  For the ttf in [6] I cannot spot any from the output of fc-query.  The one in [5] is a Windows .fon font.  It's possible that there's a bug in FreeType driver for that format.

At any rate, I guess the right question you should be asking is, why is Inkscape crashing, and fix that.  Pango, etc, handle invalid UTF-8 just fine for example.

b

On Fri, May 12, 2017 at 11:44 AM, Eduard Braun <[hidden email]> wrote:
> [...]

Re: Pango: Issues with illegal characters in family names (encoding problems on Windows?)

Werner LEMBERG

> The one in [5] is a Windows .fon font.  It's possible that there's a
> bug in FreeType driver for that format.

As far as I can see, there is no FreeType bug.  According to the FNT
specification[1], the family name must be ASCII, which is not true for
this font.  The first character of the font's family name string is
0xC6, which corresponds to `Æ' in win1252 encoding.  This also
corresponds to the charset value specified in the font (value 0, which
is `FT_WinFNT_ID_CP1252').

In general, FreeType doesn't adjust the family name string.  For
`.fon' files, FreeType sets the font encoding to `FT_ENCODING_NONE'
(in almost all cases), and it's up to the application to use function
`FT_Get_WinFNT_Header' to extract the font's charset.  I assume that
the font's charset is also used for the encoding of the family name
string.  However, this is undocumented.
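
The byte-level situation described above can be reproduced in a few
lines of Python.  This is only an illustrative sketch: the byte string
is a hypothetical family name that merely starts with 0xC6 (it is not
the name actually stored in the font), and `is_valid_utf8' is a helper
made up for this example.

```python
# Hypothetical raw family name as it might come out of a .fon header:
# 0xC6 followed by plain ASCII letters (NOT the real font's name).
RAW_NAME = b"\xc6sys"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# In UTF-8, 0xC6 starts a two-byte sequence and must be followed by a
# continuation byte (0x80..0xBF); an ASCII letter there is an error.
name_is_valid = is_valid_utf8(RAW_NAME)
print(name_is_valid)

# Interpreted as Windows-1252 (charset value 0 in the FNT header), the
# same byte is a single character with code point U+00C6.
decoded = RAW_NAME.decode("cp1252")
print(hex(ord(decoded[0])))
```

So the same bytes are a perfectly good win1252 string but not a valid
UTF-8 string, which matches the symptoms reported for this font.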


    Werner


[1] http://support.microsoft.com/kb/65123

Re: Pango: Issues with illegal characters in family names (encoding problems on Windows?)

Eduard Braun
Hi Werner and Behdad,

Thanks for your answers!


On 18.05.2017 at 21:39, Behdad Esfahbod wrote:
> For the ttf in [6] I cannot spot any from the output of fc-query.
I'm afraid I wasn't able to reproduce a crash with that font myself, so I
can't rule out that the font attached by the user was not actually the
cause of the reported problems.

> At any rate, I guess the right question you should be asking is, why
> is Inkscape crashing, and fix that.  Pango, etc, handle invalid UTF-8
> just fine for example.
Well, Inkscape isn't crashing anymore (as I took the radical approach of
axing any font with illegal UTF-8 characters in its family name). The
crashing is not too surprising, as Inkscape (and especially GTK+ and glib)
usually expects UTF-8 strings to be valid, and I think that is a
good assumption: if a font name includes an invalid UTF-8 character,
something went wrong at some point, and I doubt it makes much sense to
work around that in Inkscape. As it's not a common bug and you both seem
positive this is not an issue in any of the involved libraries (Windows
is often "complicated" when it comes to character set conversion, so
there are often issues not exposed on *nix), I guess there's not much I
can (or want to) do about it at this time...
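
The "drop the font" strategy described above can be sketched roughly as
follows.  This is not the code from revision [4]; it only assumes that
the fix boils down to a UTF-8 validity check on each family name (in C,
GLib's g_utf8_validate() performs that kind of check).  The function
name and the sample names are hypothetical.

```python
from typing import Iterable, List


def usable_family_names(raw_names: Iterable[bytes]) -> List[str]:
    """Keep only names that are valid UTF-8; silently drop the rest."""
    kept = []
    for raw in raw_names:
        try:
            kept.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # Invalid name: the font is skipped and cannot be selected.
            continue
    return kept


# Hypothetical sample data: two well-formed names and one broken one.
families = [b"DejaVu Sans", b"\xc6sys", "Caf\u00e9 Mono".encode("utf-8")]
print(len(usable_family_names(families)))  # prints 2
```

The trade-off is the one already mentioned: nothing downstream ever
sees an invalid string, but the affected font disappears from the list.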


On 20.05.2017 at 08:10, Werner LEMBERG wrote:
>> The one in [5] is a Windows .fon font.  It's possible that there's a
>> bug in FreeType driver for that format.
> As far as I can see, there is no FreeType bug.  According to the FNT
> specification[1], the family name must be ASCII, which is not true for
> this font.
That's useful information! In that case I'd say this can safely be
judged as a "bug" in the font. While I guess it could be worked around,
as already stated above I don't think it's the job of any software to
work around the issues arising from fonts that do not follow the
respective format's specification.

Best Regards,
Eduard


Re: Pango: Issues with illegal characters in family names (encoding problems on Windows?)

Behdad Esfahbod-3
On Sun, May 21, 2017 at 4:02 PM, Eduard Braun <[hidden email]> wrote:
> On 18.05.2017 at 21:39, Behdad Esfahbod wrote:
>> At any rate, I guess the right question you should be asking is, why
>> is Inkscape crashing, and fix that.  Pango, etc, handle invalid UTF-8
>> just fine for example.
> Well, Inkscape isn't crashing anymore (as I took the radical approach of
> axing any font with illegal UTF-8 characters in family name). The
> crashing is not too surprising as Inkscape (especially GTK+ and glib at
> that) usually expect UTF-8 strings to be valid,

Can you point me to the crash stacktrace?  I couldn't find it browsing your links casually.

> and I think that is a good assumption:

Not necessarily.  I'm of the opinion that our libraries should not crash on invalid input to the extent possible.
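
One way to neither crash nor drop the font is to sanitise the name before handing it to code that expects valid UTF-8.  The following is only a sketch of that idea under stated assumptions, not something Pango or Inkscape is known to do: `FamilyEntry' and `make_entry' are hypothetical names, and the input is a made-up broken family name.

```python
from typing import NamedTuple


class FamilyEntry(NamedTuple):
    raw: bytes      # original bytes, kept for looking the font up again
    display: str    # always a valid string, safe to hand to UI code


def make_entry(raw: bytes) -> FamilyEntry:
    """Build an entry whose display name is valid even for broken input."""
    # errors="replace" substitutes U+FFFD for undecodable input instead
    # of raising an exception.
    display = raw.decode("utf-8", errors="replace")
    return FamilyEntry(raw=raw, display=display)


entry = make_entry(b"\xc6sys")  # hypothetical broken family name
print(entry.display.encode("utf-8"))
```

Keeping the raw bytes next to the sanitised string means the font can still be looked up by its original name, while everything user-visible only ever sees a valid string.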

 

Re: Pango: Issues with illegal characters in family names (encoding problems on Windows?)

Eduard Braun
On 22.05.2017 at 23:36, Behdad Esfahbod wrote:
> Can you point me to the crash stacktrace?  I couldn't find it browsing your links casually.

https://bugs.launchpad.net/inkscape/+bug/1495386/+attachment/4881592/+files/bt.txt


Re: Pango: Issues with illegal characters in family names (encoding problems on Windows?)

Behdad Esfahbod-3
On Mon, May 22, 2017 at 3:13 PM, Eduard Braun <[hidden email]> wrote:
> On 22.05.2017 at 23:36, Behdad Esfahbod wrote:
>> Can you point me to the crash stacktrace?  I couldn't find it browsing your links casually.
>
> https://bugs.launchpad.net/inkscape/+bug/1495386/+attachment/4881592/+files/bt.txt

That's not very helpful as it doesn't show which function in glib is crashing:

#1  0x00000000686238e9 in ?? ()
   from E:\Temp\Inkscape\inkscape\trunk_msys2\build64\inkscape\libglib-2.0-0.dll
#2  0x0000000001dc05ac in Inkscape::FontLister::new_font_family(Glib::ustring, bool) ()

Someone who can reproduce the issue needs to debug it and report what exactly is happening before I can recommend anything.  To me, Inkscape shouldn't need to do anything with the family name that would require it to be valid.  But I'm sure I'm wrong.  I just don't know how.
