Let me say this upfront: I am neither a specialist in linguistics nor in international SEO (search engine optimization). But one question has been keeping my brain busy for some time: what is the right way to rewrite a URL if you want to hold your own in a foreign market, perhaps on alternative search engines, where the language uses characters beyond plain Latin?

You might know that I work as a Community Guide for OXID eSales, in close contact with the product management of OXID eShop, and we feel fully responsible for delivering an SEO-optimized shopping cart platform, not only to the domestic (German) market but beyond.

In OXID eShop, we use transliteration lists for several languages: for example, the German special character “ü” becomes “ue” in the URL. As far as I know, this works really well: searching Google for a word with a “ü” also leads you to the transliterated version. If you don’t use a transliteration list in OXID eShop, the character is simply left out of the URL, so you end up with “Grtel” (blt) instead of “Guertel” (belt) as it should be.
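
To make the difference concrete, here is a minimal sketch in Python of how such a transliteration list could be applied when building a URL segment. The table contents, the language codes and the function name `slugify` are my own assumptions for illustration; this is not the code or the lists that OXID eShop actually ships.

```python
import re

# Hypothetical per-language transliteration lists (illustrative only,
# not the lists shipped with OXID eShop).
TRANSLITERATION = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"},
    # The same character can need a different replacement in another language.
    "es": {"ü": "u", "ñ": "n", "á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u"},
}


def slugify(text, lang=None):
    """Build a URL segment from text, using the list for lang if there is one."""
    table = TRANSLITERATION.get(lang, {})
    # Replace every character that has an entry in the list.
    replaced = "".join(table.get(ch, ch) for ch in text)
    # Whatever is still not an ASCII letter, digit, space or hyphen is dropped.
    ascii_only = re.sub(r"[^A-Za-z0-9 -]", "", replaced)
    # Collapse runs of spaces and hyphens into a single hyphen.
    return re.sub(r"[ -]+", "-", ascii_only).strip("-")


print(slugify("Gürtel", "de"))  # Guertel
print(slugify("Gürtel"))        # Grtel (no list, so the character is lost)
```

Keeping one table per language, instead of one global table, is the design choice that matters here: the replacement for a character depends on the language of the page, not on the character alone.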

The same goes for some Slavic languages, which use “Š” for the “sh” sound. This could be transliterated to “sh” in the URL, but will search engine users still find “shunka” (for the Czech “šunka”, ham)?

It gets even more complicated with non-Latin scripts such as Russian, which is written in Cyrillic letters. Looking up “агрегат бензина” (gas-driven power supply) on Google leads to totally different search results than its transliteration “agregat benzina”.

Now that we know that Google is not the true north of the world, I realized that the results for “агрегат бензина” could be totally different again on yandex.ru, in all probability the most-used search engine in the world to the east of us.

Also, when searching Google for the Cyrillic version, Wikipedia comes up on the first page with a hybrid URL, which looks really down and dirty when you share the link:
http://ru.wikipedia.org/wiki/%D0%90%D0%91_(%D1%8D%D0%BB%D0%B5%D0%BA%D1%82%D1%80%D0%BE%D0%B0%D0%B3%D1%80%D0%B5%D0%B3%D0%B0%D1%82%D1%8B)
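
The ugly form is simply the percent-encoded UTF-8 bytes of the Cyrillic characters. As a small sketch using Python's standard `urllib.parse` module, the link can be decoded back into its readable form and re-encoded again; the variable names are mine, and keeping the parentheses unescaped via `safe="()"` is only there to reproduce the exact link above.

```python
from urllib.parse import quote, unquote

# The shared link from above, with the Cyrillic path percent-encoded.
encoded = (
    "http://ru.wikipedia.org/wiki/"
    "%D0%90%D0%91_(%D1%8D%D0%BB%D0%B5%D0%BA%D1%82%D1%80%D0%BE"
    "%D0%B0%D0%B3%D1%80%D0%B5%D0%B3%D0%B0%D1%82%D1%8B)"
)

# Decoding turns each %XX group back into bytes and reads them as UTF-8.
readable = unquote(encoded)
print(readable)  # http://ru.wikipedia.org/wiki/АБ_(электроагрегаты)

# Encoding goes the other way: every non-ASCII character becomes %XX groups.
title = readable.rsplit("/", 1)[1]
reencoded = quote(title, safe="()")
print(reencoded == encoded.rsplit("/", 1)[1])  # True
```

So a Cyrillic URL and its percent-encoded twin are the same address; which one you see depends on whether the browser or the tool you paste into decodes it for display.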

So which is better for search: Cyrillic URLs or Latin transliterations? And how do we design the rewritten URLs in standard software like OXID eShop?

  1. Katja says:

    You’re absolutely right, that’s an important question, and a complex one. It starts with the fact that transliteration lists should be different for different languages. The “ü” you mentioned, for example, is used in Spanish and Catalan as well (it’s called “diéresis”), but in a different way, which has to be transliterated differently. Normally, a “u” followed by a soft vowel (i or e) is not pronounced in Spanish/Catalan (e.g. “guerra” is pronounced “gerra”). The diéresis is used for a “u” followed by a soft vowel that is to be pronounced, as in “lingüística”, “vergüenza”, “multilingüe” etc. In Spanish or Catalan URLs, if it is transliterated (in many cases it’s not), it is replaced by a simple “u” (as for example in http://www.parkguell.es, the website of the Park Güell in Barcelona).
    If it were to be transliterated to “ue” as we do in German this would cause nothing but confusion…

  2. Katja says:

    My pleasure! :-)
    I just noticed that there’s a slight flaw in my explanation: it’s only after “g” and “q” that a “u” followed by a soft vowel is not pronounced. Seems I forgot to mention that. Doesn’t change anything for the transliteration problem though… :-)
