6 Quick Tips for International Websites

Note from the editors: After previously looking into various ways to handle internationalization for Google’s web-search, here’s a post from Google Web Studio team members with tips for web developers.

Many websites exist in more than one language, and more and more websites are made available for more than one language. Yet, building a website for more than one language doesn’t simply mean translation, or localization (L10N), and that’s it. It requires a few more things, all of which are related to internationalization (I18N). In this post we share a few tips for international websites.

1. Make pages I18N-ready in the markup, not the style sheets

Language and directionality are inherent to the contents of the document. If possible you should hence always use markup, not style sheets, for internationalization purposes. Use @lang and @dir, at least on the html element:

<html lang="ar" dir="rtl">

Avoid coming up with your own solutions like special classes or IDs.

As for I18N in style sheets, you can’t always rely on CSS: The CSS spec defines that conforming user agents may ignore properties like direction or unicode-bidi. (For XML, the situation changes again. XML doesn’t offer special internationalization markup, so here it’s advisable to use CSS.)

2. Use one style sheet for all locales

Instead of creating separate style sheets for LTR and RTL directionality, or even each language, bundle everything in one style sheet. That makes your internationalization rules much easier to understand and maintain.

So instead of embedding an alternative style sheet like

<link href="default.rtl.css" rel="stylesheet">

just use your existing

<link href="default.css" rel="stylesheet">

When taking this approach you’ll need to complement existing CSS rules by their international counterparts:

3. Use the [dir='rtl'] attribute selector

Since we recommend to stick with the style sheet you have (tip #2), you need a different way of selecting elements you need to style differently for the other directionality. As RTL contents require specific markup (tip #1), this should be easy: For most modern browsers, we can simply use [dir='rtl'].

Here’s an example:

aside {
 float: right;
 margin: 0 0 1em 1em;
}

[dir='rtl'] aside {
 float: left;
 margin: 0 1em 1em 0;
}

4. Use the :lang() pseudo class

To target documents of a particular language, use the :lang() pseudo class. (Note that we’re talking documents here, not text snippets, as targeting snippets of a particular language makes things a little more complex.)

For example, if you discover that bold formatting doesn’t work very well for Chinese documents (which indeed it does not), use the following:

:lang(zh) strong,
:lang(zh) b {
 font-weight: normal;
 color: #900;
}

5. Mirror left- and right-related values

When working with both LTR and RTL contents it’s important to mirror all the values that change directionality. Among the properties to watch out for is everything related to borders, margins, and paddings, but also position-related properties, float, or text-align.

For example, what’s text-align: left in LTR needs to be text-align: right in RTL.

There are tools to make it easy to “flip” directionality. One of them is CSSJanus, though it has been written for the “separate style sheet” realm, not the “same style sheet” one.

6. Keep an eye on the details

Watch out for the following items:
  • Images designed for left or right, like arrows or backgrounds, light sources in box-shadow and text-shadow values, and JavaScript positioning and animations: These may require being swapped and accommodated for in the opposite directionality.
  • Font sizes and fonts, especially for non-Latin alphabets: Depending on the script and font, the default font size may be too small. Consider tweaking the size and, if necessary, the font.
  • CSS specificity: When using the [dir='rtl'] (or [dir='ltr']) hook (tip #2), you’re using a selector of higher specificity. This can lead to issues. Just have an eye out, and adjust accordingly.

If you have any questions or feedback, check the Internationalization Webmaster Help Forum, or leave your comments here.

Introducing "x-default hreflang" for international landing pages

Webmaster Level: All

The homepages of multinational and multilingual websites are sometimes configured to point visitors to localized pages, either via redirects or by changing the content to reflect the user’s language. Today we’ll introduce a new rel-alternate-hreflang annotation that the webmaster can use to specify such homepages that is supported by both Google and Yandex.

To see this in action, let’s look at an example. The website example.com has content that targets users around the world as follows:

Map of the world illustrating which hreflang code to use for which locale

In this case, the webmaster can annotate this cluster of pages using rel-alternate-hreflang using Sitemaps or using HTML link tags like this:



<link rel="alternate" href="http://example.com/en-gb" hreflang="en-gb" />
<link rel="alternate" href="http://example.com/en-us" hreflang="en-us" />
<link rel="alternate" href="http://example.com/en-au" hreflang="en-au" />
<link rel="alternate" href="http://example.com/" hreflang="x-default" />

The new x-default hreflang attribute value signals to our algorithms that this page doesn’t target any specific language or locale and is the default page when no other page is better suited. For example, it would be the page our algorithms try to show French-speaking searchers worldwide or English-speaking searchers on google.ca.

The same annotation applies for homepages that dynamically alter their contents based on a user’s perceived geolocation or the Accept-Language headers. The x-default hreflang value signals to our algorithms that such a page doesn’t target a specific language or locale.

As always, if you have any questions or feedback, please tell us in the Internationalization Webmaster Help Forum.

How search results may differ based on accented characters and interface languages

When a searcher enters a query that includes a word with accented characters, our algorithms consider web pages that contain versions of that word both with and without the accent. For instance, if a searcher enters [México], we'll return results for pages about both "Mexico" and "México."



Conversely, if a searcher enters a query without using accented characters, but a word in that query could be spelled with them, our algorithms consider web pages with both the accented and non-accented versions of the word. So if a searcher enters [Mexico], we'll return results for pages about both "Mexico" and "México."



How the searcher's interface language comes into play
The searcher's interface language is taken into account during this process. For instance, the set of accented characters that are treated as equivalent to non-accented characters varies based on the searcher's interface language, as language-level rules for accenting differ.

Also, documents in the chosen interface language tend to be considered more relevant. If a searcher's interface language is English, our algorithms assume that the queries are in English and that the searcher prefers English language documents returned.

This means that the search results for the same query can vary depending on the language interface of the searcher. They can also vary depending on the location of the searcher (which is based on IP address) and if the searcher chooses to see results only from the specified language. If the searcher has personalized search enabled, that will also influence the search results.

The example below illustrates the results returned when a searcher queries [Mexico] with the interface language set to Spanish.



Note that when the interface language is set to Spanish, more results with accented characters are returned, even though the query didn't include the accented character.

How to restrict search results
To obtain search results for only a specific version of the word (with or without accented characters), you can place a + before the word. For instance, the search [+Mexico] returns only pages about "Mexico" (and not "México"). The search [+México] returns only pages about "México" and not "Mexico." Note that you may see some search results that don't appear to use the version of word you specified in your query, but that version of the word may appear within the content of the page or in anchor text to the page, rather than in the title or description listed in the results. (You can see the top anchor text used to link to your site by choosing Statistics > Page analysis in webmaster tools.)

The example below illustrates the results returned when a searcher queries [+Mexico].