In HTML4, to set a document's character encoding with a
META element, you would write:
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
What is important to notice are the quotation marks around the value of the
content attribute, "text/html; charset=iso-8859-1". They indicate that the entire string
text/html; charset=iso-8859-1 is the
value of that one attribute.
But a lot of web developers were writing the character encoding
META element without any quotation marks:
<meta http-equiv=content-type content=text/html; charset=iso-8859-1>
So, to interpret this, browsers had to look at the element and recognize that
charset was not a separate attribute, but rather a part of the
content attribute. Luckily for us, browser manufacturers are smart (and kind): they built their browsers to recognize what the developer was trying to say and to set the character encoding correctly, even though the element was written incorrectly.
HTML5 Cut Out the Extra Stuff
The HTML5 editors saw that developers were writing the meta tag incorrectly and that browsers were interpreting it anyway, and decided to make a shortened syntax valid. Now with HTML5 you can declare your character encoding with a much shorter, easier-to-remember META element.
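As a minimal sketch, the shortened HTML5 declaration looks like the first element below. The value utf-8 is only an assumed example; substitute the encoding your pages actually use. The longer HTML4-style form is shown for comparison and is still accepted in HTML5, but a document should contain only one of the two.

```html
<!-- HTML5 shorthand: charset is an attribute in its own right -->
<meta charset="utf-8">

<!-- HTML4-style form, still valid in HTML5 (use one or the other, not both) -->
<meta http-equiv="content-type" content="text/html; charset=utf-8">
```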
Always Include the Character Encoding
You should always declare a character encoding for your web pages, even if you never use any special characters. If you don't, your site can become vulnerable to a cross-site scripting (XSS) attack using UTF-7 in any browser that is willing to treat the page as UTF-7.
The attacker sees that your site has no character encoding defined and tricks the browser into treating the page as UTF-7. The attacker then injects UTF-7 encoded scripts into the web page, which slip past filters that only look for ordinary angle brackets, and your site is hacked.
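To make the attack more concrete, here is a rough illustration of what UTF-7 encoded markup looks like. The page and the payload are hypothetical. In UTF-7, +ADw- stands for < and +AD4- stands for >, so the injected text contains no angle brackets for a filter to escape, yet it becomes a script element once a browser decodes the page as UTF-7.

```html
<!-- Hypothetical page with no charset declaration that echoes user input -->
<html>
<head>
<title>Search results</title>
</head>
<body>
<!-- Text submitted by the attacker; it contains no literal angle brackets -->
<p>You searched for: +ADw-script+AD4-alert(1)+ADw-/script+AD4-</p>
<!-- A browser that decodes this page as UTF-7 would read the text as:
     <script>alert(1)</script> -->
</body>
</html>
```

With an explicit encoding declared, the browser has no reason to guess, and the same text is displayed as harmless characters.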
The Character Encoding Should be the First Line of Your HTML After the Root and Head Elements
This ensures that the browser knows what the character encoding is before it does anything else. Your HTML should read:
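A minimal sketch of such a page follows, assuming UTF-8 as the encoding. The lang value, title, and body text are placeholders; the point is only that the META element comes immediately after the opening head tag, before the title or anything else.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<!-- Character encoding first, before any other content in the head -->
<meta charset="utf-8">
<title>Example page</title>
</head>
<body>
<p>Page content goes here.</p>
</body>
</html>
```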
You can also specify the character encoding in the HTTP headers. This is even more secure than adding it to the HTML, but you need access to the server configuration or .htaccess files.
In Apache, you can set the default character set for your entire site by adding:
AddDefaultCharset UTF-8 to your root
.htaccess file. Apache's default character set is