1 A Quick Tour of the Foundations of Web Apps

If you are already familiar with HTML, XML and JavaScript, you may skip this chapter and immediately start developing a minimal web application by going to the next chapter.

1.1 The World Wide Web (WWW)

After the Internet had been established in the 1980s, Tim Berners-Lee[4] developed the idea and the first implementation of the WWW in 1989 at the European research institution CERN in Geneva, Switzerland. The WWW (or, simply, “the Web”) is based on the Internet technologies TCP/IP (the Transmission Control Protocol and the Internet Protocol) and DNS (the Domain Name System). Initially, the Web consisted of

  1. the Hypertext Transfer Protocol (HTTP),
  2. the Hypertext Markup Language (HTML), and
  3. web server programs, acting as HTTP servers, as well as web “user agents” (such as browsers), acting as HTTP clients.

Later, further important technology components were added to this set of basic web technologies:

the page/document style language Cascading Style Sheets (CSS), drafted in 1995 and standardized as CSS1 in 1996,

the web programming language JavaScript in 1995,

the Extensible Markup Language (XML), as the basis of web formats (like SVG and RDF/XML), in 1998,

the XML-based Scalable Vector Graphics (SVG) format in 2001,

the Resource Description Framework (RDF) for knowledge representation on the Web in 2004.

1.2 HTML and XML

HTML allows marking up (or describing) the structure of a human-readable web document or web user interface, while XML allows marking up the structure of all kinds of documents, data files and messages, whether they are human-readable or not. XML can also be used as the basis for defining a version of HTML that is called XHTML.

1.2.1 XML documents

XML provides a syntax for expressing structured information in the form of an XML document with nested elements and their attributes. The specific elements and attributes used in an XML document can come from any vocabulary, such as public standards or (private) user-defined XML formats. XML is used for specifying

document formats, such as XHTML5, the Scalable Vector Graphics (SVG) format or the DocBook format,

data interchange file formats, such as the Mathematical Markup Language (MathML) or the Universal Business Language (UBL),

message formats, such as the web service message format SOAP[5].

1.2.2 Unicode and UTF-8

XML is based on Unicode, which is a platform-independent character set that includes almost all characters from most of the world’s written languages, including Hindi, Burmese and Gaelic. Each character is assigned a unique integer code in the range between 0 and 1,114,111. For example, the Greek letter π has the code 960, so it can be inserted in an XML document as &#960; using the XML character reference syntax.
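The correspondence between a character, its integer code and its numeric reference can be checked with a few lines of JavaScript. This is only a minimal sketch: the helper name toCharRef is hypothetical, and only standard String methods are used.

```javascript
// Hypothetical helper: builds the decimal numeric character reference
// (of the form &#N;) for the first character of a string.
function toCharRef(character) {
  return "&#" + character.codePointAt(0) + ";";
}

console.log("π".codePointAt(0));        // integer code of the Greek letter pi
console.log(String.fromCodePoint(960)); // character that has the code 960
console.log(toCharRef("π"));            // reference that can be put into XML
```

Running the sketch prints 960, π and &#960;.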

Unicode includes legacy character sets like ASCII and ISO-8859-1 (Latin-1) as subsets.

The default encoding of an XML document is UTF-8, which uses only a single byte for ASCII characters, but two to four bytes for all other characters.
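The number of bytes that UTF-8 needs for a character can be inspected with the standard TextEncoder API, which is available in current browsers and in Node.js. The following is a minimal sketch; utf8Length is a hypothetical helper name.

```javascript
// A TextEncoder always encodes to UTF-8.
const encoder = new TextEncoder();

// Hypothetical helper: number of bytes a string occupies in UTF-8.
function utf8Length(text) {
  return encoder.encode(text).length;
}

// An ASCII letter, a Greek letter, the euro sign and an emoji
for (const character of ["a", "π", "€", "😀"]) {
  console.log(character, utf8Length(character));
}
```

The four sample characters need 1, 2, 3 and 4 bytes, respectively.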

Almost all Unicode characters are legal in a well-formed XML document. Illegal characters are the control characters with code 0 through 31, except for the carriage return, line feed and tab. It is therefore dangerous to copy text from a non-XML source into an XML document (often, the form feed character creates a problem).
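One way to guard against this problem is to check or clean a text before it is put into an XML document. The following sketch covers only the control characters named above; the function names are hypothetical.

```javascript
// Control characters with codes 0 through 31, except tab (9),
// line feed (10) and carriage return (13).
const illegalControlChars = /[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g;

// Hypothetical helper: true if the text contains such a character.
function hasIllegalXmlChars(text) {
  return text.search(illegalControlChars) !== -1;
}

// Hypothetical helper: returns the text without these characters.
function stripIllegalXmlChars(text) {
  return text.replace(illegalControlChars, "");
}

const pasted = "page 1\fpage 2"; // \f is the form feed character (code 12)
console.log(hasIllegalXmlChars(pasted));
console.log(stripIllegalXmlChars(pasted));
```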

1.2.3 XML namespaces

Generally, namespaces help to avoid name conflicts. They allow reusing the same (local) name in different namespace contexts. Many computational languages have some form of namespace concept, for instance, Java and PHP.

XML namespaces are identified with the help of a namespace URI, such as the SVG namespace URI “http://www.w3.org/2000/svg”, which is associated with a namespace prefix, such as svg. Such a namespace represents a collection of names, both for elements and attributes, and allows namespace-qualified names of the form prefix:name, such as svg:circle as a namespace-qualified name for SVG circle elements.
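To illustrate how a namespace-qualified name is resolved, the following sketch splits a name of the form prefix:name and looks up the namespace URI bound to the prefix. The names namespaceBindings and expandName are hypothetical, and the sketch ignores unprefixed names apart from returning them unchanged.

```javascript
// Prefix-to-URI bindings, as they could be declared in a document.
const namespaceBindings = new Map([
  ["svg", "http://www.w3.org/2000/svg"]
]);

// Hypothetical helper: splits a qualified name such as "svg:circle"
// into its namespace URI and its local name. Names without a prefix
// get the namespace URI null in this sketch.
function expandName(qualifiedName, bindings) {
  const separatorIndex = qualifiedName.indexOf(":");
  if (separatorIndex === -1) {
    return { namespaceURI: null, localName: qualifiedName };
  }
  const prefix = qualifiedName.slice(0, separatorIndex);
  const localName = qualifiedName.slice(separatorIndex + 1);
  if (!bindings.has(prefix)) {
    throw new Error("Undeclared namespace prefix: " + prefix);
  }
  return { namespaceURI: bindings.get(prefix), localName: localName };
}

console.log(expandName("svg:circle", namespaceBindings));
```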

A default namespace is declared in the start tag of an element with the help of the xmlns attribute, as in

<html xmlns="http://www.w3.org/1999/xhtml">

This example shows the start tag of the HTML root element, in which the XHTML namespace is declared as the default namespace.

In the same way, an svg element embedded in an HTML document can declare the SVG namespace in its start tag:

<svg xmlns="http://www.w3.org/2000/svg">

1.2.4 Correct XML documents

XML defines two syntactic correctness criteria. An XML document must be well-formed, and if it is based on a grammar (or schema), then it must also be valid with respect to that grammar, or, in other words, satisfy all rules of the grammar.

An XML document is called well-formed if it satisfies the following syntactic conditions:

  1. There must be exactly one root element.
  2. Each element has a start tag and an end tag; however, empty elements can be closed as <phone/> instead of <phone></phone>.

  3. Tags don’t overlap. For instance, we cannot have <author><name>Lee Hong</author></name>.
  4. Attribute names are unique within the scope of an element. For instance, a start tag such as <phone type="home" type="mobile"> is not correct, since it uses the attribute name type twice.

An XML document is called valid against a particular grammar (such as a DTD or an XML Schema) if

  1. it is well-formed,
  2. and it respects the grammar.
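The first three well-formedness conditions (a single root element, matching start and end tags, no overlapping tags) can be sketched as a stack-based check. This is a deliberately simplified illustration and not an XML parser: the hypothetical function isWellNested only recognizes tags without attributes and ignores comments, CDATA sections, processing instructions and text outside the root element.

```javascript
// Hypothetical helper: checks root-element and nesting conditions for
// fragments whose tags have no attributes.
function isWellNested(xml) {
  const tagPattern = /<(\/?)([A-Za-z_][\w.-]*)\s*(\/?)>/g;
  const openElements = []; // names of elements that are not yet closed
  let rootCount = 0;
  let match;
  while ((match = tagPattern.exec(xml)) !== null) {
    const isEndTag = match[1] === "/";
    const name = match[2];
    const isEmptyElement = match[3] === "/";
    if (isEndTag) {
      // An end tag must close the most recently opened element.
      // Names are compared case-sensitively, as required by XML.
      if (openElements.pop() !== name) return false;
    } else {
      if (openElements.length === 0) rootCount += 1;
      if (!isEmptyElement) openElements.push(name);
    }
  }
  return openElements.length === 0 && rootCount === 1;
}

console.log(isWellNested("<author><name>Lee Hong</name></author>"));
console.log(isWellNested("<author><name>Lee Hong</author></name>"));
```

The first call returns true, the second one false because the tags overlap.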

1.2.5 The evolution of HTML

The World Wide Web Consortium (W3C) has developed the following important versions of HTML:

1997: HTML 4 as an SGML-based language,

2000: XHTML 1 as an XML-based clean-up of HTML 4,

2014: (X)HTML 5 in cooperation (and competition) with the WHAT working group[6] supported by browser vendors.

As the inventor of the Web, Tim Berners-Lee developed a first version of HTML[7] in 1990. A few years later, in 1995, Tim Berners-Lee and Dan Connolly wrote the HTML 2[8] standard, which captured the common use of HTML elements at that time. In the following years, HTML was used and gradually extended by a growing community of early WWW adopters. This evolution of HTML, which led to a messy set of elements and attributes (called “tag soup”), was mainly controlled by browser vendors and their competition with each other. The development of XHTML in 2000 was an attempt by the W3C to clean up this mess, but it neglected to advance HTML’s functionality towards a richer user interface. That was the focus of the WHAT working group[9] led by Ian Hickson[10], who can be considered the mastermind and main author of HTML 5 and many of its accompanying JavaScript APIs that made HTML fit for mobile apps.

HTML was originally designed as a structure description language, and not as a presentation description language. But HTML4 has a lot of purely presentational elements such as font. XHTML took HTML back to its roots, dropping presentational elements and defining a simple and clear syntax, in support of the goals of

device independence,

accessibility, and

usability.

We adopt the symbolic equation

HTML = HTML5 = XHTML5

stating that when we say “HTML” or “HTML5”, we actually mean XHTML5 because we prefer the clear syntax of XML documents over the liberal and confusing HTML4-style syntax that is also allowed by HTML5.

The following simple example shows the basic code template to be used for any HTML document:

Notice that in line 1, the HTML5 document type is declared, such that browsers are instructed to use the HTML5 document object model (DOM). In the html start tag in line 2, using the default namespace declaration attribute xmlns, the XHTML namespace URI http://www.w3.org/1999/xhtml is declared as the default namespace for making sure that browsers, and other tools, understand that all non-qualified element names like html, head, body, etc. are from the XHTML namespace.

Also in the html start tag, we set the (default) language for the text content of all elements (here to “en” standing for English) using both the xml:lang attribute and the HTML lang attribute. This attribute duplication is a small price to pay for having a hybrid document that can be processed both by HTML and by XML tools.

Finally, in line 4, using an (empty) meta element with a charset attribute, we set the HTML document’s character encoding to UTF-8, which is also the default for XML documents.
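The template described above can be sketched as a list of lines in JavaScript, which makes the line numbers mentioned in the text easy to check. Lines 1, 2 and 4 follow the description; the title and the body content are placeholders that are not part of the description.

```javascript
// Lines of the document template; title and body content are placeholders.
const templateLines = [
  '<!DOCTYPE html>',
  '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">',
  ' <head>',
  '  <meta charset="UTF-8" />',
  '  <title>Title (placeholder)</title>',
  ' </head>',
  ' <body>',
  '  <p>Content (placeholder)</p>',
  ' </body>',
  '</html>'
];

// The complete template as a single string
const template = templateLines.join("\n");

// Print the template with line numbers
templateLines.forEach(function (line, index) {
  console.log((index + 1) + ": " + line);
});
```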

1.2.6 HTML forms

For user-interactive web applications, the web browser needs to render a user interface (UI). The traditional metaphor for a software application’s UI is that of a form. The special elements for data input, data output and user actions are called form controls or UI widgets. In HTML, a form element is a section of a web page consisting of block elements that contain form controls and labels on those controls.

Users complete a form by entering text into input fields and by selecting items from choice controls, including dropdown selection lists, radio button groups and checkbox groups. A completed form is submitted with the help of a submit button. When a user submits a form, it is normally sent to a web server either with the HTTP GET method or with the HTTP POST method. The standard encoding for the submission is called URL-encoded. It is represented by the Internet media type application/x-www-form-urlencoded. In this encoding, spaces become plus signs, and any other reserved characters become encoded as a percent sign and hexadecimal digits, as defined in RFC 1738.

Each form control has both an initial value and a current value, both of which are strings. The initial value is specified with the control element’s value attribute, except for the initial value of a textarea element, which is given by its initial contents. The control’s current value is first set to the initial value. Thereafter, the control’s current value may be modified through user interaction or scripts. When a form is submitted for processing, some controls have their name paired with their current value and these pairs are submitted with the form.
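The URL-encoded form of such name/value pairs can be reproduced with the standard URLSearchParams API, which is available in current browsers and in Node.js and serializes according to the current URL standard. In this minimal sketch, the control names and values are hypothetical.

```javascript
// Hypothetical name/value pairs of a submitted form
const formData = new URLSearchParams();
formData.append("title", "Tom & Jerry");
formData.append("year", "1940");

// Serialize as application/x-www-form-urlencoded
const encoded = formData.toString();
console.log(encoded);

// Decoding restores the original value
console.log(new URLSearchParams(encoded).get("title"));
```

The serialized string is title=Tom+%26+Jerry&year=1940: the spaces have become plus signs and the reserved character & has become %26.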

Labels are associated with a control by including the control as a child element within a label element (implicit labels), or by giving the control an id value and referencing this ID in the for attribute of the label element (explicit labels).

In the simple user interfaces of our “Getting Started” applications, we only need four types of form controls:

  1. single line input fields created with an <input name="" /> element,
  2. single line output fields created with an <output name="" /> element,
  3. push buttons created with a <button type="button"></button> element, and
  4. dropdown selection lists created with a select element containing option elements, of the form <select name=""><option value=""></option></select>.

An example of an HTML form with implicit labels for creating such a user interface is

In an HTML-form-based data management user interface, we have a correspondence between the different kinds of properties defined in the model classes of an app and the form controls used for the input and output of their values. We have to distinguish between various kinds of model class attributes, which are mapped to various kinds of form fields. This mapping is also called data binding.

In general, an attribute of a model class can always be represented in the user interface by a plain input control (with the default setting type="text"), no matter which datatype has been defined as the range of the attribute in the model class. However, in special cases, other types of input controls (for instance, type="date"), or other widgets, may be used. For instance, if the attribute’s range is an enumeration, a select control or, if the number of possible choices is small enough (say, fewer than 8), a radio button group can be used.
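This mapping rule can be sketched as a small function. The function name suggestControl and the way a range is represented here (a datatype name as a string, or an array of enumeration literals) are assumptions made for this illustration only.

```javascript
// Hypothetical helper: suggests a form control for a model class attribute.
// range is either the name of a datatype (a string) or an array of
// enumeration literals.
function suggestControl(range) {
  if (Array.isArray(range)) {
    // Enumeration: radio buttons for few choices, otherwise a select list
    return range.length < 8 ? "radio button group" : "select";
  }
  if (range === "Date") return 'input type="date"';
  // A plain text input works for any other datatype
  return 'input type="text"';
}

console.log(suggestControl("String"));
console.log(suggestControl("Date"));
console.log(suggestControl(["paperback", "hardcover", "e-book"]));
```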

1.3 Styling Web Documents and User Interfaces with CSS

While HTML is used for defining the content structure of a web document or a web user interface, the Cascading Style Sheets (CSS) language is used for defining the presentation style of web pages, which means that you use it for telling the browser how you want your HTML (or XML) rendered: using which layout of content elements, which fonts and text styles, which colors, which backgrounds, and which animations. Normally, these settings are made in a separate CSS file that is associated with an HTML file via a special link element in the HTML’s head.

A first sketch of CSS[11] was proposed in October 1994 by Håkon W. Lie[12], who later became the CTO of the browser vendor Opera. While the official CSS1[13] standard dates back to December 1996, “most of it was hammered out on a whiteboard in Sophia-Antipolis” by Håkon W. Lie together with Bert Bos in July 1995 (as he explains in an interview[14]).

CSS is based on a form of rules that consist of selectors, which select the document element(s) to which a rule applies, and a list of property-value pairs that define the styling of the selected element(s) with the help of CSS properties such as font-size or color. There are two fundamental mechanisms for computing the CSS property values for any page element as a result of applying the given set of CSS rules: inheritance and the cascade.

The basic element of a CSS layout[15] is a rectangle, also called a “box”, with an inner content area, an optional border, an optional padding (between content and border) and an optional margin around the border. This structure is defined by the CSS box model.
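How the parts of a box add up horizontally can be sketched as a small calculation. The sketch assumes the default content-box sizing, in which the width refers to the content area only, and the same padding, border and margin on the left and on the right side; the function name outerWidth is hypothetical.

```javascript
// Hypothetical helper: total horizontal space taken by a box (in pixels),
// assuming content-box sizing and equal left and right sides.
function outerWidth({ content, padding = 0, border = 0, margin = 0 }) {
  return content + 2 * (padding + border + margin);
}

// 200 + 2 * (10 + 1 + 20)
console.log(outerWidth({ content: 200, padding: 10, border: 1, margin: 20 }));
```

For these sample values the result is 262 pixels.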

We will not go deeper into CSS in this book, since our focus here is on the logic and functionality of an app, and not so much on its beauty.

1.4 JavaScript – “the assembly language of the Web”

JavaScript was developed in 10 days in May 1995 by Brendan Eich[16], then working at Netscape[17], as the HTML scripting language for their browser Navigator 2 (more about history[18]). Brendan Eich said (at the O’Reilly Fluent conference in San Francisco in April 2015): “I did JavaScript in such a hurry, I never dreamed it would become the assembly language for the Web”.

JavaScript is a dynamic functional object-oriented programming language that can be used for

  1. Enriching a web page by generating browser-specific HTML content or CSS styling, inserting dynamic HTML content, or producing special audio-visual effects (animations).
  2. Enriching a web user interface by implementing advanced user interface components, validating user input on the client side, or automatically pre-filling certain form fields.
  3. Implementing a front-end web application with local or remote data storage[19].
  4. Implementing a front-end component for a distributed web application with remote data storage managed by a back-end component, which is a server-side program that is traditionally written in a server-side language such as PHP, Java or C#, but can nowadays also be written in JavaScript with NodeJS.
  5. Implementing a complete distributed web application where both the front-end and the back-end components are JavaScript programs.

The version of JavaScript that is currently fully supported by web browsers is called “ECMAScript 5.1”, or simply “ES5”, but the next two versions, called “ES6” and “ES7” (or “ES 2015” and “ES 2016”), are already partially supported by current browsers and back-end JS environments. In fact, as of May 2017, ES6 is fully supported in non-mobile browsers, except for its important new module concept.

1.4.1 JavaScript as an object-oriented language

JavaScript is object-oriented, but in a different way than classical OO programming languages such as Java and C++. In JavaScript, classes, unlike objects and functions, are not first-class citizens. Rather, classes have to be defined by following some code pattern in the form of special JS objects: either as constructor functions (possibly using the syntactic sugar of ES6 class declarations) or as factory objects.
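The three code patterns can be sketched as follows. The class name Book and its properties isbn and title are hypothetical, and the factory object shown here is just one simple way to write such a pattern.

```javascript
// 1. A class defined as a constructor function
function Book(isbn, title) {
  this.isbn = isbn;
  this.title = title;
}
Book.prototype.toString = function () {
  return this.isbn + ": " + this.title;
};

// 2. The same class written as an ES6 class declaration
class BookClass {
  constructor(isbn, title) {
    this.isbn = isbn;
    this.title = title;
  }
  toString() {
    return this.isbn + ": " + this.title;
  }
}

// 3. A factory object whose create method returns new instances
//    that use the factory object as their prototype
const BookFactory = {
  toString() {
    return this.isbn + ": " + this.title;
  },
  create(isbn, title) {
    const book = Object.create(this);
    book.isbn = isbn;
    book.title = title;
    return book;
  }
};

console.log(String(new Book("006251587X", "Weaving the Web")));
console.log(typeof BookClass); // an ES6 class is still a function
console.log(String(BookFactory.create("006251587X", "Weaving the Web")));
```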

However, objects can also be created without instantiating a class, in which case they are untyped, and properties as well as methods can be defined for specific objects independently of any class definition. At run time, properties and methods can be added to, or removed from, any object and class. This dynamism of JavaScript allows powerful forms of meta-programming, such as defining your own concepts of classes and enumerations (and other special datatypes).
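A minimal sketch of this dynamism, using a hypothetical object person:

```javascript
// An object created without instantiating a class
const person = { name: "Tom" };

// Properties and methods can be added at run time ...
person.age = 42;
person.greet = function () {
  return "Hello, " + this.name;
};

// ... and removed again
delete person.age;

console.log(person.greet());
console.log("age" in person);
```

The sketch prints "Hello, Tom" and false, since the age property no longer exists after it has been deleted.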

1.4.2 Further reading about JavaScript

Good open access books about JavaScript are

Speaking JavaScript[20], by Dr. Axel Rauschmayer.

Eloquent JavaScript[21], by Marijn Haverbeke.

1.5 Accessibility for Web Apps

The recommended approach to providing accessibility for web apps is defined by the Accessible Rich Internet Applications (ARIA) standard. As summarized by Bryan Garaventa[22] in his article on different forms of accessibility[23], there are three main aspects of accessibility for interactive web technologies: 1) keyboard accessibility, 2) screen reader accessibility, and 3) cognitive accessibility.

Further reading on ARIA:

  1. How browsers interact with screen readers, and where ARIA fits in the mix[24] by Bryan Garaventa
  2. The Accessibility Tree Training Guide[25] by whatsock.com
  3. The ARIA Role Conformance Matrices[26] by whatsock.com
  4. Mozilla’s ARIA overview article[27]
  5. W3C’s ARIA overview page[28]

1.6 Quiz Questions

If you would like to look up the answers for the following quiz questions, you can check our discussion forum[29]. If you don’t find an answer in the forum, you may create a post asking for an answer to a particular question.

1.6.1 Question 1: Well-Formed XML

Which of the following fragments represent well-formed XML? Select one or many:

  1. <emph><STRONGER>This is some text <bold>and this is more text. Here is even more.</bold> text.</STRONGER></emph>
  2. <stronger>This text is bold. <emph>And this is italicized and bold.</ emph></stronger><emph>And this is just italics.</emph>
  3. <STRONGER>This text is bold. <emph>And this is italicized and bold.</ STRONGER> And this is just italics.</emph>
  4. <strong>This text is bold. <em>and this is italicized and bold.</EM></ strong><em>and this is just italics.</em>

1.6.2 Question 2: HTML Forms

Recall that an HTML form is a section of an HTML document consisting of block elements that contain controls and labels on those controls. Which of the following form elements represent correct forms? Select one or many:
