
Using Non-English Letters

If You’re Posting a Page in Another Language, You Need to Know Character Codes

By , About.com Guide

Website localization is the process of writing websites for local audiences. Primarily, that means writing or translating the site into the local language. If you're trying to create a site that caters to non-English speakers, you need to know how to write characters that English doesn't use.

Note for advanced readers: this article does not address the needs of double-byte languages such as Japanese or Chinese. It is meant as an overview of special characters within the Latin-1 character set.

English and HTML

HTML was developed in English, and many pages are written in English, but that doesn't mean you can't write pages in other languages. It's easy to write pages in languages whose alphabets are similar to the English one. Their characters are covered by the Latin-1 character set. This includes languages such as:

  • English
  • French
  • Spanish
  • German

Using Special Characters

But standard English doesn't include accented characters, letters with tildes or umlauts, or ligatures (two letters joined together). So how do you write languages that do have those characters? HTML has an encoding system that lets you display these characters even if they aren't on your keyboard.
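As a minimal sketch of how that encoding system works, the Python snippet below uses the standard library's `html.unescape` to resolve a few character references roughly the way a browser would when rendering a page. The sample strings and the helper name `decode_entities` are illustrative assumptions, not taken from any real page.

```python
import html

# Illustrative samples: named and numeric references for Latin-1 characters.
samples = [
    "r&eacute;sum&eacute;",   # named reference for e with acute accent
    "r&#233;sum&#233;",       # numeric reference for the same character
    "ma&ntilde;ana",          # n with tilde
    "&Aacute;vila",           # capital A with acute accent
    "encyclop&aelig;dia",     # ae ligature
]

def decode_entities(text: str) -> str:
    """Resolve HTML character references into the characters they stand for."""
    return html.unescape(text)

for s in samples:
    print(f"{s} -> {decode_entities(s)}")
```

The named and numeric forms of a reference resolve to the same character, so either can be typed on a keyboard that has no accented keys.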

One common problem on non-English pages is that the non-English characters, such as Á and ñ, are stored using platform-specific computer codes rather than HTML codes. For many people viewing the page, this doesn't appear to be a problem. The Á and ñ display correctly as long as the viewers are using the same operating system as the author. For example, my co-worker has written pages for our French Canada site that look correct on his Mac and my Mac, but when we look at them on my Windows PC, strange characters appear.

This is because the web page content was written in Microsoft Word and then copied and pasted directly into the HTML. Word uses codes to define the special characters, but these codes differ across systems. If you convert those codes to HTML codes, the characters will display correctly on the web.
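The conversion to HTML codes can be sketched in a few lines of Python. This is only one possible approach, assuming the text has already been read into a Python string. The function name `to_html_entities` is hypothetical, and the lookup table `codepoint2name` comes from the standard library's `html.entities` module.

```python
from html.entities import codepoint2name

def to_html_entities(text: str) -> str:
    """Replace each non-ASCII character with an HTML character reference."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)                           # plain ASCII passes through unchanged
        elif code in codepoint2name:
            out.append(f"&{codepoint2name[code]};")  # named reference, e.g. &eacute;
        else:
            out.append(f"&#{code};")                 # numeric fallback for anything else
    return "".join(out)

print(to_html_entities("résumé"))   # r&eacute;sum&eacute;
print(to_html_entities("El Niño"))  # El Ni&ntilde;o
```

Because the result contains only ASCII characters, it no longer depends on which character encoding the editor, server, or browser assumes. Note that this sketch deliberately leaves `&`, `<`, and `>` alone, so it should only be applied to text content, not to markup that needs separate escaping.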

Don’t Just Copy and Paste Special Characters

It can be very tempting, when writing a word with special characters like résumé, to just copy the é and paste it into your web document. As I mentioned above, it will probably look fine on your own screen, but it can show up as strange characters for other viewers.

Don’t believe me? Here is a page that shows pasted special characters. I just pasted the word résumé into the HTML without encoding the accented e characters. Instead of é I got é when I viewed it locally. And when I uploaded the file, the About.com CMS converted them to Ž. Not exactly what I was expecting. Using the correct codes results in the correct character.
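One plausible way such strange characters arise can be reproduced in Python: the bytes of a pasted character are written using one character encoding and read back using another. The encodings chosen here (UTF-8, Mac Roman, Windows-1252) are assumptions for illustration, since the article does not say which ones were involved, and the helper name `misread` is hypothetical.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one character encoding, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")

# Saved as UTF-8 but displayed as Windows-1252: each é turns into two characters.
print(misread("résumé", "utf-8", "cp1252"))      # rÃ©sumÃ©

# Saved as Mac Roman but displayed as Windows-1252: each é turns into Ž.
print(misread("résumé", "mac_roman", "cp1252"))  # rŽsumŽ
```

An HTML code such as `&eacute;` avoids this mismatch because it is made up entirely of ASCII characters, which these encodings all represent with the same bytes.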

Character Codes

Here are some of the non-English languages, and the special codes used in HTML to write the special characters in them:

©2012 About.com. All rights reserved. 

A part of The New York Times Company.