If you've written HTML, you've almost certainly used attributes without even realizing it. For example, the image tag requires at least one attribute, src, in order to display an image. But in HTML, if you include an attribute that is incorrect or invalid, browsers will simply ignore it.
In XML, like HTML, an attribute is a part of an element that provides additional information about that element. You might think of an attribute as an adjective describing the element it is within. For example, if you have an element "dog", it might have an attribute color="white": <dog color="white" />
Attributes are written as name="value" pairs. Thus, in XML, you would never write <dog white />; that would not be well-formed. One way to choose attribute names is to think of the general category each adjective belongs to. If you're describing your "dog" element as "big", "white", and "smart", then you should probably have three attributes: size, color, and intelligence. Then you could have one element <dog color="white" size="big" intelligence="smart" /> and another element <dog color="calico" size="medium" intelligence="stupid" />
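To see the name="value" rule in action, here is a minimal sketch using Python's standard xml.etree.ElementTree module (the choice of Python is an assumption for illustration; any XML parser behaves similarly). It reads the attributes of a well-formed dog element and shows that the bare-word form is rejected.

```python
import xml.etree.ElementTree as ET

# Attributes written as name="value" pairs parse into a plain dictionary.
dog = ET.fromstring('<dog color="white" size="big" intelligence="smart" />')
print(dog.attrib)

# A bare word with no ="value" part is not well-formed XML,
# so the parser raises an error instead of ignoring it.
try:
    ET.fromstring('<dog white />')
    bare_word_accepted = True
except ET.ParseError as err:
    bare_word_accepted = False
    print("rejected:", err)
```

Unlike an HTML browser, the XML parser does not skip over the mistake; the whole document fails to parse.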
An XML attribute can be one of the following types:
- CDATA
CDATA is character data. This means that any string of non-markup characters is legal as the attribute value. So, if the color attribute of our dog element uses CDATA, your DTD might look like this:
<!ATTLIST dog color CDATA #IMPLIED>
This would allow both <dog color="white" /> and <dog color="black and white" />
- ENTITY and ENTITIES
The ENTITY attribute type indicates that the attribute value is the name of an unparsed external entity declared in the DTD; ENTITIES allows a space-separated list of such names. In order for the parser to know what to do with the entity, the entity declaration must name a notation, and that notation must also be declared in your DTD.
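As a rough sketch of how these three declarations fit together, the following Python example feeds a small document to the standard library's expat parser and records the notation and unparsed entity it reports. The names pets, shasta_photo, and shasta.gif, and the system identifier "image/gif", are invented for this illustration and are not part of the dog DTD in this article.

```python
import xml.parsers.expat

# Hypothetical document: the entity name, file name, and notation
# identifier below are made up for this sketch.
DOC = """<!DOCTYPE pets [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY shasta_photo SYSTEM "shasta.gif" NDATA gif>
  <!ELEMENT pets (photo*)>
  <!ELEMENT photo EMPTY>
  <!ATTLIST photo src ENTITY #REQUIRED>
]>
<pets><photo src="shasta_photo" /></pets>"""

notations = {}
unparsed_entities = {}

def on_notation(name, base, system_id, public_id):
    # Called once for each <!NOTATION ...> declaration.
    notations[name] = system_id

def on_unparsed_entity(name, base, system_id, public_id, notation_name):
    # Called once for each <!ENTITY ... NDATA ...> declaration.
    unparsed_entities[name] = (system_id, notation_name)

parser = xml.parsers.expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.UnparsedEntityDeclHandler = on_unparsed_entity
parser.Parse(DOC, True)

print(notations)
print(unparsed_entities)
```

The src attribute holds only the entity name; the file location and its notation come from the entity declaration in the DTD.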
- Enumeration
Enumeration allows you to define a specific list of values that the attribute value must match. With your dog element, you might want to restrict the "intelligence" attribute to only "smart" or "stupid". Your DTD for this might look like:
<!ATTLIST dog intelligence (smart | stupid) #IMPLIED>
- ID
Use the ID attribute type if you want to specify a unique identifier for each element. So, if I had a database write out XML of my dogs, I might use the ID type for their names, as all my dogs have different names. But this type is more often used with truly unique identifiers, like "d001", "d002", and "d003". An ID value must be a valid XML name, so it cannot begin with a digit or contain spaces. Your DTD for this attribute type, and an element that uses it, might look like this:
<!ATTLIST dog id ID #REQUIRED>
<dog id="d004" />
- IDREF and IDREFS
You can use the IDREF type to reference an ID that has been assigned to another element; IDREFS holds a space-separated list of such references. In my pets data, I might have another element describing the dogs' bad habits. With the IDREFS attribute type, I can refer to the ids of all the dogs that share one particular bad habit:
<habits type="digging" dogs="d002 d004" />
- NMTOKEN and NMTOKENS
The NMTOKEN attribute type is similar to CDATA, but with more restrictions on what the value can contain: it is restricted to letters, digits, periods, hyphens, underscores, and colons, with no whitespace. NMTOKENS allows a space-separated list of such tokens.
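The character restriction on name tokens can be approximated with a regular expression. The sketch below is an ASCII-only simplification (the XML specification also allows many non-ASCII letters and digits), and the helper names is_nmtoken and is_nmtokens are invented for this example.

```python
import re

# ASCII-only approximation of an XML name token. The real rule also
# permits many non-ASCII letters and digits, which this sketch ignores.
NMTOKEN_RE = re.compile(r"[A-Za-z0-9._:-]+")

def is_nmtoken(value: str) -> bool:
    """Return True if value looks like a single NMTOKEN."""
    return NMTOKEN_RE.fullmatch(value) is not None

def is_nmtokens(value: str) -> bool:
    """Return True if value is a non-empty, whitespace-separated list of NMTOKENs."""
    tokens = value.split()
    return bool(tokens) and all(is_nmtoken(token) for token in tokens)

print(is_nmtoken("d001"))              # a single token
print(is_nmtoken("black and white"))   # contains spaces, so not one token
print(is_nmtokens("black and white"))  # but acceptable as a list of tokens
```

A value such as "black and white" is fine as CDATA and as NMTOKENS, but not as a single NMTOKEN.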
- NOTATION
A NOTATION attribute declares that its value names a notation declared elsewhere in the DTD. For example, if you wanted to include graphics in your document, you might include a notation declaration defining a JPEG:
<!NOTATION jpg PUBLIC "JPeG Image" >
And when you create your graphic element, you would refer to the type of image in the attribute list. Every notation named in the list (here, jpg and gif) must have its own declaration:
<!ATTLIST image filetype NOTATION (jpg | gif) #IMPLIED>
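Attribute types are only enforced by a validating parser. Python's standard xml.etree.ElementTree module does not validate against a DTD, so as a rough illustration of what the Enumeration and ID types described above mean in practice, the following sketch checks those two rules by hand. The function name check_dogs and the wrapper element pets are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Allowed values copied from the enumeration example for "intelligence".
ALLOWED_INTELLIGENCE = {"smart", "stupid"}

def check_dogs(xml_text):
    """Return a list of the kinds of problems a validating parser would flag."""
    problems = []
    seen_ids = set()
    for dog in ET.fromstring(xml_text).iter("dog"):
        dog_id = dog.get("id")
        if dog_id is None:
            problems.append("missing id")
        elif dog_id in seen_ids:
            # ID values must be unique within the document.
            problems.append(f"duplicate id: {dog_id}")
        else:
            seen_ids.add(dog_id)
        level = dog.get("intelligence")
        if level is not None and level not in ALLOWED_INTELLIGENCE:
            # Enumerated values must match one of the listed choices.
            problems.append(f"bad intelligence value: {level}")
    return problems

# <pets> is a hypothetical root element added so the fragments parse.
good = '<pets><dog id="d001" intelligence="smart" /><dog id="d002" /></pets>'
bad = '<pets><dog id="d001" intelligence="clever" /><dog id="d001" /></pets>'
print(check_dogs(good))
print(check_dogs(bad))
```

This is not a substitute for real validation; it only makes the two constraints concrete.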
So, if I were creating elements to list my dogs in an XML document, the DTD might look something like this (this is not a complete DTD):
<!ELEMENT dog (#PCDATA)>
<!ATTLIST dog
name CDATA #REQUIRED
id ID #REQUIRED
size (small | medium | large) #IMPLIED
color CDATA #IMPLIED
intelligence (smart | stupid) #IMPLIED
photo IDREF #IMPLIED
>
<!NOTATION jpg PUBLIC "JPeG Image" >
<!NOTATION gif PUBLIC "GIF Image" >
<!ELEMENT photo (#PCDATA)>
<!ATTLIST photo
src CDATA #REQUIRED
id ID #REQUIRED
filetype NOTATION (jpg | gif) #IMPLIED
>
<!ELEMENT habits (#PCDATA)>
<!ATTLIST habits
type (digging | barking | howling) #REQUIRED
dogs IDREFS #IMPLIED
>
And this would result in a section of an XML document that might look like this:
<dog name="Homer" id="d001" size="medium" color="black and white" intelligence="smart" />
<photo src="shasta" id="p001" filetype="gif" />
<dog name="Shasta" id="d004" size="large" color="white" intelligence="smart" photo="p001" />
<dog name="Calico" id="d002" intelligence="stupid" />
<habits type="digging" dogs="d002 d004" />
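To show how the IDREF and IDREFS values in this sample point back to ID values, here is a small sketch that follows the references by hand with Python's standard library. The pets root element and the REF_ATTRS table are assumptions added for this example; a validating parser would perform the same check from the DTD itself.

```python
import xml.etree.ElementTree as ET

# The sample elements from above, wrapped in a hypothetical <pets> root
# so that the fragment parses as a single document.
SAMPLE = """<pets>
<dog name="Homer" id="d001" size="medium" color="black and white" intelligence="smart" />
<photo src="shasta" id="p001" filetype="gif" />
<dog name="Shasta" id="d004" size="large" color="white" intelligence="smart" photo="p001" />
<dog name="Calico" id="d002" intelligence="stupid" />
<habits type="digging" dogs="d002 d004" />
</pets>"""

root = ET.fromstring(SAMPLE)

# Index every element by its id attribute (declared as type ID in the DTD).
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}

# Attributes declared as IDREF or IDREFS in the DTD, keyed by element name.
REF_ATTRS = {"dog": "photo", "habits": "dogs"}

# Collect any reference that does not match a declared ID.
dangling = []
for el in root.iter():
    attr = REF_ATTRS.get(el.tag)
    if attr is None or el.get(attr) is None:
        continue
    for ref in el.get(attr).split():
        if ref not in by_id:
            dangling.append((el.tag, ref))

# Follow the IDREFS on <habits> back to the dogs they name.
diggers = [by_id[ref].get("name") for ref in root.find("habits").get("dogs").split()]
print(dangling)
print(diggers)
```

Every reference in the sample resolves, so the list of dangling references is empty and the digging habit maps to the dogs with ids d002 and d004.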

