Last time I wrote an introduction to adding OpenType features to Protest. I included links to the resources I used most or found helpful. Reading those materials is foundational, and highly recommended.

There are two types of features: substitutional and positional. The substitution features come first.

This post covers local language rules in the locl substitution feature.

Of the substitution features, here’s where I am in the feature order for Protest:

  1. script- and language-specific forms (locl)
  2. fractions (frac, numr, dnom)
  3. ordinals (ordn)
  4. all caps (case)
  5. various alternates (calt, salt, ss01, ss02, ss…)
  6. ligatures (liga, dlig)
  7. manual alternate access (aalt)

Note: I’m a noob, and my code may be jank! I am sharing what I’ve learned so you can learn from my failures as much as from my successes. If something in here doesn’t work the way it should (though everything seems to when testing), I will come back and make corrections.

Language Specific Features (locl)

Some languages have corner cases with glyph usage that necessitate a bit of tweaking for a font to output things correctly for that language.

Glyphs has a number of good articles on localizing a font for some of the more common Latin-based language issues. These are certainly worth the read, even though some of the information is specific to the Glyphs app.

The following are the lookups I’ve included in Protest’s locl feature.

Dutch ij acute

In Dutch, the ij digraph (which may also be a ligature in some cases) is a single sound, as in ijs (ice). When the i in the ij digraph is accented, both letters need to be accented. So when an acute i is typed followed by a j, the j needs to be substituted by an acute j.

First, I need to specify the script and language, then use contextual substitution to replace the j. I’ll then do the same for the uppercase as the lower. The script statement only needs to appear once, before the first language statement in the locl feature. It looks like this:

# Add jacute to iacute in ij pair for Dutch accented ij

script latn;
language NLD exclude_dflt;
    sub iacute j' by jacute;
    sub Iacute J' by Jacute;

The single quote mark (') marks the glyph to be substituted; the unmarked glyphs around it are only context and are left unchanged.
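To show where a snippet like this lives, here is a minimal sketch of the whole file around it. The languagesystem lines and the feature wrapper are my assumption of a typical setup, not copied from Protest’s actual feature file; the glyph names are the ones used above.

```fea
# Declared once at the top of the feature file (assumed setup)
languagesystem DFLT dflt;
languagesystem latn dflt;
languagesystem latn NLD;

feature locl {
    script latn;
    # exclude_dflt: Dutch does not inherit this feature's default-language rules
    language NLD exclude_dflt;
        # Only the marked j is replaced; iacute and Iacute are context
        sub iacute j' by jacute;
        sub Iacute J' by Jacute;
} locl;
```

The other lookups in this post would go inside the same feature block, each under its own language statement.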

German uppercase eszett

When typing in uppercase letters in German, it is appropriate to have the uppercase eszett (a relatively new development) instead of a lowercase eszett or subbing in SS.

To do this, I need to check to see that the surrounding letters are uppercase. I don’t want to check after just one uppercase letter, because it could just be a proper name or beginning of a sentence. If I check that there is an uppercase letter before and after, or that there is more than one uppercase letter before the eszett, that should work fine. Rainer Erich Scheichelbauer points out in his Glyphs tutorial that there are very few cases where this is a problem.

This is an instance where I now need a class, in this case the set of all possible uppercase letters. I selected all of them from my Font Window in FontLab. This was pretty easy when I searched for an uppercase A in the search bar, because it alphabetized everything with the uppercase first, putting all the uppercase in one place.

Once selected, I make sure there aren’t any rogue non-uppercase glyphs like .notdef and .null. I hit the plus in the lower left of the Classes Panel, and it creates a new class that I then rename “Uppercase”.

[Image: FontLab VI Classes Panel and Font Window]

Now I have a class I can refer to in my code. It looks like this:

# Substitute lowercase eszett by Uppercase Eszett in uppercase situations for German

language DEU exclude_dflt;
    sub @Uppercase germandbls' @Uppercase by germandbls.calt;
    sub @Uppercase @Uppercase germandbls' by germandbls.calt;

Now, Protest does have the uppercase Germandbls (U+1E9E), but this is a whole different character from the lowercase germandbls (U+00DF). And for reasons that still aren’t 100% clear to me, perhaps due to how software may handle or search characters, Rainer recommends creating a germandbls.calt glyph that’s identical to the uppercase Germandbls. This becomes the substitute rather than the U+1E9E character.
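For anyone not building classes in FontLab’s Classes Panel, a class can also be declared directly in the feature file. The sketch below assumes a hypothetical, heavily abbreviated Uppercase class (a real one would list every uppercase glyph in the font) and annotates which words each rule should catch.

```fea
# Hypothetical, abbreviated class: list every uppercase glyph in the font here
@Uppercase = [A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
              Adieresis Odieresis Udieresis];

feature locl {
    script latn;
    language DEU exclude_dflt;
        # STRAßE: uppercase on both sides of the eszett
        sub @Uppercase germandbls' @Uppercase by germandbls.calt;
        # GROß: two uppercase letters before the eszett
        sub @Uppercase @Uppercase germandbls' by germandbls.calt;
        # Straße and Fuß match neither rule and keep the lowercase eszett
} locl;
```

Those four words make a handy test string for checking the rules once the text is set to German.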

Dotless i languages

In Turkish (and Azeri, Tatar, Crimean Tatar, and Kazakh), dotless i (ı) is a letter in its own right. These languages use i as well, but it is considered an ı with a dot accent. This actually makes a ton of sense. So in Turkish, the uppercase of ı is I, and the uppercase of i is İ.

What all this means for the font is that, in dotless i languages, an i needs to be replaced by a separate i dot accent glyph, so that later substitutions (an fi ligature, for example) leave its dot alone. This glyph can be called idotaccent, i.TRK, i.loclTRK, idot… there are lots of names. Here’s what the code looks like:

# Substitute i by dotted i (idotaccent) in dotless i languages (AZE, CRT, KAZ, TAT, TRK)

language AZE exclude_dflt;
    lookup locl_i {
        sub i by idotaccent;
    } locl_i;

language CRT exclude_dflt;
    lookup locl_i;

language KAZ exclude_dflt;
    lookup locl_i;

language TAT exclude_dflt;
    lookup locl_i;

language TRK exclude_dflt;
    lookup locl_i;

Notice I made a single lookup that I can call for each language as appropriate.

Also, as far as the glyph name goes, I used idotaccent, but I’m considering changing it to i.TRK or some form with a dotted suffix, because I hear it’s more searchable in things like PDFs: the part of the name before the period tells software the glyph is still an i.
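Because every language calls the same lookup, a rename like that would touch only one line of code. Here is a sketch of how it might look, assuming the glyph has been renamed to i.loclTRK in the font and the lookup is declared once, ahead of the feature block (one possible arrangement, not necessarily how Protest does it):

```fea
# Assumes idotaccent has been renamed to i.loclTRK in the font
lookup locl_i {
    sub i by i.loclTRK;
} locl_i;

feature locl {
    script latn;
    language AZE exclude_dflt;
        lookup locl_i;
    language CRT exclude_dflt;
        lookup locl_i;
    language KAZ exclude_dflt;
        lookup locl_i;
    language TAT exclude_dflt;
        lookup locl_i;
    language TRK exclude_dflt;
        lookup locl_i;
} locl;
```

Declaring the lookup outside the feature keeps the definition separate from the language list, so adding another dotless i language is a two-line change.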

S comma accent

Once upon a time, in Romanian and Moldovan, they used S and T with cedillas (and of course lowercase s and t with cedillas—I’m using a shorthand for both cases). Over time it just seemed better to use a comma below accent instead of the cedilla. (Seems very reasonable.) However, when the characters were included in Unicode, they were still identified as S and T with cedilla.

Unicode does have S and T with comma accents, so that’s good. But when typing Romanian or Moldovan, one might end up with an archaic-looking S or T with a cedilla instead.

I wanted to make all the glyphs, so I had made S and T with cedillas. This is fine, as there are languages that use S with cedilla. I came to learn, however, that no one uses a T with cedilla. So the T with cedilla in Protest is actually a T with comma accent.

This leaves only the S with cedilla (and s with cedilla) to substitute in Romanian and Moldovan. It’s just as simple and straightforward as Turkish:

# Substitute Scedilla and scedilla by Scommaaccent and scommaaccent in Moldovan and Romanian

language MOL exclude_dflt;
    lookup scomma {
        sub Scedilla by Scommaaccent;
        sub scedilla by scommaaccent;
    } scomma;

language ROM exclude_dflt;
    lookup scomma;

Catalan geminated L

Catalan has a very interesting case amongst Latin script languages. It is closely related to Spanish, and has the ll (ʎ), which is one sound and can’t be divided. There can’t be a hyphen in between the two l’s in paella.

However, there are words in Catalan which have two l’s with a hard l sound, and can be divided by a hyphen in line breaks. These l’s are separated by a punt volat (flying point, or middle dot) in what is called the geminated l. It is used in words like paral·lel.

Often when typing the middle dot, it will be typed as a period centered. These are essentially the same, but the period centered can be too big. I’ve drawn my middle dot a touch smaller, like the dot accent.

So in Catalan, when someone types a period centered between two L’s, I want to replace it with a middle dot. Then, because the kerning will be too loose (especially with the uppercase L), I’m using an Ldot (or ldot) glyph, which is a ligature of an L (or l) with a middle dot.

The substitution happens in two steps: first the period centered between two L’s becomes a middle dot, then the L and middle dot pair becomes an Ldot.

Since I will have contextual alternates that will cycle through versions of the L or l, and because this is a ligature, I need to specify each version of L or l being used.

# Catalan geminated L

language CAT exclude_dflt;
    # Uppercase geminated L
    sub @L periodcentered' @L by middledot;
    lookup CATL {
        sub L middledot by Ldot;
        sub L.ss01 middledot by Ldot.ss01;
        sub L.ss02 middledot by Ldot.ss02;
        sub L.ss03 middledot by Ldot.ss03;
    } CATL;
    # Lowercase geminated l
    sub @l periodcentered' @l by middledot;
    lookup CATl {
        sub l middledot by ldot;
        sub l.ss01 middledot by ldot.ss01;
        sub l.ss02 middledot by ldot.ss02;
        sub l.ss03 middledot by ldot.ss03;
    } CATl;

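I know this isn’t the most elegant solution out there: because the L and its dot are a single ligature glyph, extra letterspacing won’t open up between them. So if someone wants to space out my font, they’ll have to either think twice or make it work. This may be something I revisit later.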
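The contextual rules above lean on two classes, @L and @l, whose definitions aren’t shown. A minimal sketch, assuming they simply collect the default and stylistic set versions named in the ligature rules:

```fea
# Assumed class contents, based on the glyph names used in the ligature rules
@L = [L L.ss01 L.ss02 L.ss03];
@l = [l l.ss01 l.ss02 l.ss03];
```

These would need to be declared before the locl feature so the contextual rules can refer to them.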

Testing locl features

I got a bit bent out of shape when trying to test the language features. I couldn’t find anything out there explicitly talking about testing locl features. I found how to change the language (dictionary) in Pages, but it didn’t seem to do anything.

I thought this could mean one of three things:

  1. This program doesn’t support the locl feature.
  2. I’m not implementing the language change correctly to trigger the locl feature.
  3. My code doesn’t work, which I wouldn’t know for sure unless I knew I was doing the testing correctly.

I also tried testing in Adobe InDesign. I thought surely the feature is supported, so at least I can eliminate problem 1. I changed the language dictionary in the InDesign preferences, and it still wasn’t working. I figured it had to be my code, which I pored over and tried tweaking… in vain.

Finally I asked about it on the TypeDrawers forum. It was kindly pointed out to me that the InDesign language change was not hidden in the preferences, but was right out there in the open on the Character panel. (Face palm.)

After testing this way, the language features worked. (My acute i doesn’t show up, but that’s a whole other issue for another post.)

I’m still not sure the best way to test locl features outside of InDesign. So if anyone has any suggestions, feel free to let me know in the comments.

Up Next

  • Part 3: frac, ordn, case, liga, dlig
  • Part 4: calt, aalt
  • Part 5: mark, mkmk
  • Kerning!