Programming Portal: XML


XML Logical Structure

HTML uses its tags as if they were style switches. The start tag turns a feature on, such as underlining, and an end tag turns it off again. XML uses its start tags and end tags as containers.

The start tag, the content, and the end tag together form a single element. Elements are the building blocks out of which an XML document is assembled. Each XML document must have exactly one root element, and all the other elements must be properly nested inside that element. This means that if an element contains other elements, those elements must be completely enclosed within it.
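As a rough illustration of the nesting rule, the sketch below feeds three small fragments to Python's standard xml.etree.ElementTree parser. The helper name is_well_formed and the sample strings are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested: <para> is completely enclosed within <text>.
nested = "<text><para>Hello</para></text>"
# Overlapping: <text> is closed while <para> is still open.
overlapping = "<text><para>Hello</text></para>"
# Two top-level elements, so there is no single root element.
two_roots = "<head/><body/>"

print(is_well_formed(nested))       # True
print(is_well_formed(overlapping))  # False
print(is_well_formed(two_roots))    # False
```

Only the first fragment is accepted. The other two break the rules described above: overlapping elements, and more than one root element.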

If we sketch out the structure of the elements in the home page example document, we obtain the kind of tree structure shown in the figure below.

As the figure shows, the document has a tree-like structure, with the root element (<home.page>) at the top of the tree (or the base, depending on how you look at it). All the elements inside this element are neatly contained within each other. An XML document must contain one and only one root element, and no element may lie partially or completely outside it, whether before or after it.

To make the relationships between elements easier to describe, we say that an element is the parent of the elements it directly contains. The elements directly inside an element are called its children. Elements that share the same parent element are called siblings.

In the simple example shown in the figure, <home.page> contains all the other elements, <text> is the parent of <para>, <title> is a child of <head>, and <title> and <banner> are siblings. Going down the element tree, each child element must be fully contained within its parent element. Sibling elements may not overlap.
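The parent, child, and sibling relationships can also be read off programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module on a condensed copy of the home page example. The names parent_of, children, and siblings are hypothetical helpers introduced here, not part of any library.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the home page example (text content shortened).
HOME_PAGE = """<?xml version="1.0"?>
<home.page>
  <head>
    <title>My Home Page</title>
    <banner source="topbanner.gif"/>
  </head>
  <body>
    <main.title>Welcome to My Home Page</main.title>
    <rule/>
    <text>
      <para>Sorry, this home page is still under construction.</para>
    </text>
  </body>
  <footer source="foot.gif"/>
</home.page>"""

root = ET.fromstring(HOME_PAGE)

# ElementTree elements do not store their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def children(element):
    """Tag names of the elements directly contained in the given element."""
    return [child.tag for child in element]


def siblings(element):
    """Tag names of the other elements that share this element's parent."""
    return [other.tag for other in parent_of[element] if other is not element]


title = root.find("head/title")
para = root.find("body/text/para")

print(root.tag)              # home.page
print(children(root))        # ['head', 'body', 'footer']
print(parent_of[title].tag)  # head
print(siblings(title))       # ['banner']
print(parent_of[para].tag)   # text
```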

The arrangement of the elements in an XML document is called its logical structure. As you will see next, an XML document also has a physical structure. To be usable, the logical and physical structures of an XML document must be synchronous: they must be completely and properly nested inside each other.

XML anatomy part two
TESTING CONSTRAINTS PART TWO

LIFE CYCLE TESTING

TEST METRICS

Independent Software Testing

Test Process

Testing verification and validation

Functional and structural testing

Static and dynamic testing

V model testing

Eleven steps of V model testing

Structural testing

Execution testing technique

Recovery Testing technique

Operation testing technique

Compliance software testing technique

Security testing technique

XML Document Anatomy part two

This post continues XML Document Anatomy part one. Reading that post first will make the present topic easier to follow.

The Root Element (Lines 2 through 23)

Each XML document must have only one root element, and all the other elements must be completely enclosed in that element. Line 2 identifies the start of the <home.page> element (the start tag), and line 23 identifies the end of the element (the end tag).

Note that unlike HTML, in which a <P> tag might often be used as a sort of formatting instruction to insert a blank line between paragraphs of text, in XML an element normally consists of three things: a start tag, content (either text or other elements), and an end tag.

An XML element doesn’t always have content. Empty elements, such as the IMG element in HTML that simply points to an external graphics file through its SRC attribute, have no content. An empty element may have an end tag, but it can instead use a special form of start tag that allows the explicit end tag to be omitted.

The name you use in the element start tag must exactly match the name you use in the end tag. If you want to use an odd combination of cases to increase the legibility of long names, you must use exactly the same combination in the end tag.

XML is case sensitive, recognizing the difference between uppercase letters (A–Z) and lowercase letters (a–z). In applications that aren’t case sensitive, mixed-case characters are usually converted, that is, folded into one case or the other. The ASCII character set usually folds to uppercase characters. Unicode usually folds to lowercase characters. XML has to account for this, and for the fact that it might have to deal with languages in which the case folding is uncertain. Therefore, XML performs no case folding at all: names are compared exactly as written (and the XML declaration itself has to be in lowercase).
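To see case sensitivity in practice, here is a small sketch using Python's standard xml.etree.ElementTree parser. The helper name parses and the element names are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if the text is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Start and end tag use exactly the same mix of cases: accepted.
print(parses("<Main.Title>Welcome</Main.Title>"))  # True
# The end tag differs only in case: rejected as a mismatched tag.
print(parses("<Main.Title>Welcome</main.title>"))  # False

# Names that differ only in case are distinct element types.
doc = ET.fromstring("<page><Title>A</Title><title>b</title></page>")
print([child.tag for child in doc])  # ['Title', 'title']
```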

An Empty Element (Line 13)

Empty elements are a special case in XML. In SGML and HTML, it is obvious from the DTD’s definition of an empty element that it is empty and has no content. XML, in keeping with its developers’ design goals, requires you to be much more explicit. Indeed, you might not use a DTD at all, so it could be hard to decide whether an element is or should be empty. Therefore, empty elements have to be very clearly identified as such. To do so, there is a special empty-tag close delimiter, />, as in the following:

<empty_element/>

To maintain a certain degree of backward compatibility with SGML (until such time as the SGML standard is updated to allow the use of empty-tag close delimiters), and to make the conversion of existing SGML and HTML code into XML a little easier (a process called normalization, which adds end tags to all elements and is supported by a lot of SGML tools), you can use an end tag instead of the special empty-tag close delimiter. The element

<graphic source="file.gif"/>

is therefore interchangeable with

<graphic source="file.gif"></graphic>
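The equivalence of the two forms can be checked with a short sketch, again assuming Python's standard xml.etree.ElementTree parser. The summary helper is a hypothetical name used only here.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<graphic source="file.gif"/>')
long_form = ET.fromstring('<graphic source="file.gif"></graphic>')


def summary(element):
    """Collect what the parser saw: name, attributes, text, child count."""
    return (element.tag, dict(element.attrib), element.text, len(element))


# Both spellings produce the same element as far as the parser is concerned.
print(summary(short_form) == summary(long_form))  # True

# When written back out, ElementTree uses the short empty-tag form by default.
print(ET.tostring(long_form, encoding="unicode"))
```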

Attributes (Lines 7 and 22)

Element tags can include one or more optional or mandatory attributes that give further information about the elements they delimit.

Attributes can only be specified in the element start tag. The syntax for specifying an attribute is

<element.type.name attribute.name="attribute.value">

If elements were nouns, attributes would be adjectives. We could therefore say

<fruit taste="sharp">

or even

<problem size="huge" cause="unknown" solution="run.away">

An attribute can only be specified in an element start tag.
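As a minimal sketch of how a program sees attributes, the example below reads them with Python's standard xml.etree.ElementTree module. The rejected helper and the attribute name colour are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# The start tag carries three attributes; the element itself is empty.
problem = ET.fromstring(
    '<problem size="huge" cause="unknown" solution="run.away"/>'
)

print(problem.attrib)                # all attributes as a dictionary
print(problem.get("size"))           # huge
print(problem.get("colour", "n/a"))  # n/a (attribute is not present)


def rejected(text):
    """Return True if the parser refuses the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return True
    return False


# An attribute placed in an end tag is not well-formed XML.
print(rejected('<fruit>lemon</fruit taste="sharp">'))    # True
# Neither is the same attribute name used twice in one start tag.
print(rejected('<fruit taste="sharp" taste="sweet"/>'))  # True
```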

In direct contrast to SGML and HTML, in which multiple declarations are considered to be fatal errors, XML deals with multiple declarations of attributes in the DTD in a unique manner. If an element type is declared once with one set of attributes and then declared again with a different set of attributes, the two sets of attributes are merged. The first declaration of a given attribute for a particular element is the only one that counts, and any later declarations of that attribute are ignored. The XML processor might warn you about the appearance of multiple declarations, but it is not required to do so and processing can continue as normal. This applies to declarations only: within a single start tag, each attribute name may appear just once.

Xml introduction

XML Document Anatomy

XML’s rules for distinguishing between markup and content are:

1. The start of markup is identified by either the less-than symbol (<) or the ampersand character (&).

2. Three other characters are also treated as markup characters: the greater-than symbol (>), the apostrophe or single quote ('), and the double quotation mark (").

3. If you want to use any of the preceding special characters as normal characters, you must “escape” them by using the general entities that represent them (&lt;, &amp;, &gt;, &apos;, and &quot;).

To escape a character means to conceal it from a subsequent software package or process. The term is often used in computing to refer to prefixing certain characters in programming languages with a special character string to prevent them from being interpreted as special characters.

Originally, the ESC (escape) character was used to prefix commands sent to the printer itself to control such things as the font or page size, and to distinguish the command strings from printable characters.

4. Everything that is not markup is content (character data).
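The escaping rule in point 3 can be sketched with the escape function from Python's standard xml.sax.saxutils module. By default it handles &, <, and >. The extra mapping for the two quote characters, and the names EXTRA and raw, are assumptions added for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() covers &, < and > by default; the quote characters are added here.
EXTRA = {"'": "&apos;", '"': "&quot;"}

raw = 'Fish & "chips" < 5'
escaped = escape(raw, EXTRA)
print(escaped)  # Fish &amp; &quot;chips&quot; &lt; 5

# Unescaped, the & and < would be read as the start of markup.
try:
    ET.fromstring("<note>" + raw + "</note>")
    unescaped_ok = True
except ET.ParseError:
    unescaped_ok = False
print(unescaped_ok)  # False

# Escaped, the parser hands back the original characters as content.
note = ET.fromstring("<note>" + escaped + "</note>")
print(note.text == raw)  # True
```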

The following code shows the XML code for a Web home page. This is a very simple example, but it contains all the important parts that you will find in nearly all XML documents.


     1: <?xml version="1.0"?>
     2: <home.page>
     3:   <head>
     4:     <title>
     5:       My Home Page
     6:     </title>
     7:     <banner source="topbanner.gif"/>
     8:   </head>
     9:   <body>
    10:     <main.title>
    11:       Welcome to My Home Page
    12:     </main.title>
    13:     <rule/>
    14:     <text>
    15:       <para>
    16:         Sorry, this home page is still
    17:         under construction. Please come
    18:         back soon!
    19:       </para>
    20:     </text>
    21:   </body>
    22:   <footer source="foot.gif"/>
    23: </home.page>

XML Introduction

Problems with HTML:

1. HTML has weak syntactic checking and validation constraints:

There are formal definitions of the structure of HTML documents: HTML is an SGML application, and there is a document type definition (DTD) for every version of HTML. In practice, however, Web browsers are designed to accept almost anything that looks even slightly like HTML, so these definitions are rarely enforced. The only tag that is compulsory in an HTML document is the TITLE tag, and it is one of the least commonly used tags there is.

2. HTML content awareness problems:

Searching the Web is complicated by the fact that HTML doesn’t give you a way to describe the information content, i.e. the semantics, of documents. In XML you can use any tags you like (such as <NAME> instead of <H3>), but using attributes in HTML tags (such as <H3 CLASS="name">) can embed just as much semantic information as custom tags can.

Without any agreement on tag names, the value of custom tags becomes a bit doubtful. To make matters worse, the same tag name in one context can mean something completely different in another. Furthermore, there are the complications of foreign languages: seeing <inkoopprijs> isn’t going to help very much if you don’t know that it’s Dutch for “purchase price.”

3. HTML is not object-oriented:

Modern programmers have been making a long and difficult transition to object-oriented techniques. They want to leverage these skills and have such things as inheritance, and HTML has done very little to accommodate them.

4. HTML lacks a robust linking mechanism:

If you’ve spent a few hours on the Web, you’ve probably encountered at least one broken link. HTML’s links are one-to-one, with the linking hard coded in the source HTML files. If the location of one target file changes, a Webmaster may have to update dozens or even hundreds of other pages.

5. HTML is not reusable:

Depending on how well written they are, HTML pages and fragments of HTML code can be extremely difficult to reuse because they are so specifically tailored to their place in the web of associated pages.

The Standard Generalized Markup Language (SGML), from which XML is derived, is useful for making data storage independent of any one software package or software vendor. SGML is a metalanguage, that is, a language for describing markup languages. HTML is one such markup language and is therefore called an SGML application. In XML, these applications are often called markup languages, such as the Handheld Device Markup Language (HDML) and the FAQ markup language (QML).

But SGML is just too expensive and complicated for Web use on a large scale. Using SGML requires too much of an investment in time, tools, and training.

XML uses the features of SGML that it needs and tries to incorporate the lessons learned from HTML.

Advantages of XML:

1. XML can be used with existing Web protocols and mechanisms, and it does not impose any additional requirements. XML has been developed with the Web in mind: features of SGML that were too difficult to use on the Web were left out, and features that are needed for Web use either have been added or are inherited from applications that already work.

2. XML supports a wide variety of applications. It is difficult to support a lot of applications with just HTML; hence the growth of scripting languages. HTML is simply too specific. XML adopts the generic nature of SGML, but adds flexibility to make it truly extensible.

3. It is easy to write programs that process XML documents. One of the major strengths of HTML is that it’s easy for even a non-programmer to throw together a few lines of scripting code that do basic processing. HTML even includes some features of its own that enable you to carry out some basic processing.

4. XML documents are reasonably clear to anyone. A valid XML document:

Describes the structural rules that the markup attempts to follow
Lists any external resources (external entities) that are part of the document
Declares any internal resources (internal entities) that are used within the document
Lists the types of non-XML resources (notations) used and identifies any helper applications that might be needed
Lists any non-XML resources (binaries) that are used within the document and identifies any helper applications that might be needed

5. XML documents are easy to create. HTML is almost famous for its ease of use, and XML capitalizes on this strength.

Other Programming Courses :

Security testing and functional testing