XML

 

Regular publications such as journals can also be designed to suit this type of work. The same applies to some financial services, where articles are presented by searching a repository of content drawn from different providers. Because the content is structured in a predefined format, it can be retrieved and repurposed for different reading devices, which allows the service to be delivered in real time and presented in a reader-friendly manner.

About XML

Extensible Markup Language (XML) is a markup language for exchanging data between computer systems, and it now touches every aspect of print media, printing workflows, and printing applications. Today, XML is widely used in the publishing industry as a structured language that safeguards publishers' investment in their content. Once content is structured, a 20,000-word dictionary, for example, can be repurposed into many different formats to meet market needs quickly.
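As a rough illustration of repurposing, the sketch below parses two dictionary entries once and renders them in two formats, HTML and plain text. The element names (dictionary, entry, headword, pos, definition) and the sample entries are assumptions made for this example, not an industry schema.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical dictionary markup; the element names are illustrative only.
ENTRY_XML = """
<dictionary>
  <entry>
    <headword>ink</headword>
    <pos>noun</pos>
    <definition>A coloured fluid used for writing or printing.</definition>
  </entry>
  <entry>
    <headword>bindery</headword>
    <pos>noun</pos>
    <definition>A place where books are bound.</definition>
  </entry>
</dictionary>
"""

def read_entries(xml_text):
    """Parse the XML once and return each entry as a plain dict."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in entry}
        for entry in root.findall("entry")
    ]

def to_html(entries):
    """Render the entries as an HTML definition list."""
    items = "".join(
        f"<dt>{escape(e['headword'])} <i>{escape(e['pos'])}</i></dt>"
        f"<dd>{escape(e['definition'])}</dd>"
        for e in entries
    )
    return f"<dl>{items}</dl>"

def to_text(entries):
    """Render the entries as plain text, one entry per line."""
    return "\n".join(
        f"{e['headword']} ({e['pos']}): {e['definition']}" for e in entries
    )

entries = read_entries(ENTRY_XML)
html_output = to_html(entries)
text_output = to_text(entries)
print(html_output)
print(text_output)
```

The same parsed entries feed both renderers, so adding another output format would mean writing one more small function rather than re-keying the content.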

XML is used for publishing e-books on tablets and mobile devices in EPUB format. XML-based job dockets enable seamless communication from digital storefronts to imposition to ink-key settings to the bindery.

XML is an easy-to-learn language for non-technical audiences because there is no requirement to learn a programming language. The underlying syntax of XML uses simple, English-word tag names. For example, marking up an author as <Author>John Smith</Author> enables readers to search easily for articles written by John Smith. There are no encrypted files or special software needed to implement XML, and it can be viewed using a simple text editor or any browser (Firefox, Chrome, Internet Explorer, or Safari). The simplicity of XML has contributed greatly to its widespread use today. In this article, we will explain the basics of XML and demonstrate a number of practical print applications.
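To give a sense of how an Author tag makes content searchable, here is a minimal sketch using Python's standard xml.etree.ElementTree module. Only the Author tag comes from the text above; the Articles, Article, and Title elements, the sample titles, and the titles_by_author helper are assumptions invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical article collection; only the Author tag comes from the article text.
ARTICLES_XML = """
<Articles>
  <Article>
    <Title>Colour Management Basics</Title>
    <Author>John Smith </Author>
  </Article>
  <Article>
    <Title>Imposition Workflows</Title>
    <Author>Mary Jones</Author>
  </Article>
  <Article>
    <Title>Preparing Files for the Bindery</Title>
    <Author>John Smith</Author>
  </Article>
</Articles>
"""

def titles_by_author(xml_text, author):
    """Return the titles of all articles whose Author text matches exactly."""
    root = ET.fromstring(xml_text)
    titles = []
    for article in root.findall("Article"):
        # Strip stray spaces that may sit inside the Author element.
        name = (article.findtext("Author") or "").strip()
        if name == author:
            titles.append(article.findtext("Title"))
    return titles

smith_titles = titles_by_author(ARTICLES_XML, "John Smith")
print(smith_titles)
```

Because the author name sits in its own named element, the search does not have to guess where a name appears in free-running text.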

Markup Languages

In many technology areas today, markup languages form the fundamental method for data storage and for information exchange and display. Markup languages consist of open and close tags, with the information stored in between. Most of us are familiar with Hypertext Markup Language (HTML), the markup language used to encode web pages and web-based applications. In HTML, information is stored within predefined open and close tags, e.g. <h1>This is a heading</h1>, which a web browser such as Chrome or Firefox interprets and renders to display the content as This is a heading.

What is XML?

XML also uses open and close tags; however, unlike HTML, which has predefined tags, XML is an extensible markup language, which means that its tags can be defined, extended, or adapted to suit individual needs. XML is plain text, neither encrypted nor obfuscated, and the ability to view XML easily and understand the code is fully intentional, which has led to very wide adoption and implementation.
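The extensibility described here can be sketched in a few lines of Python. The PrintJob, Customer, Quantity, and PaperStock tag names below are hypothetical; XML itself predefines none of them, which is the point of the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical print-job vocabulary: none of these tag names are predefined by XML.
job = ET.Element("PrintJob")
ET.SubElement(job, "Customer").text = "Hope Street Books"
ET.SubElement(job, "Quantity").text = "500"
ET.SubElement(job, "PaperStock").text = "100gsm uncoated"

# Serialise to plain text that can be read in any text editor.
xml_text = ET.tostring(job, encoding="unicode")
print(xml_text)

# Reading the text back recovers the same data.
parsed = ET.fromstring(xml_text)
quantity = int(parsed.findtext("Quantity"))
print(quantity)
```

In practice, two parties exchanging such a file would first agree on the tag names, since the parser accepts any well-formed names.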

The main components of XML are elements, attributes, and schemas. Elements and attributes are the basic building blocks of XML; they are HTML-like structures used to store data and communicate information. A schema is a type of rulebook in which a technology area or industry group has agreed to use only specific XML element names and attributes. The data in an XML document is stored between open and close tags, and this structure is called an XML element, e.g. <Address>128 Hope Street</Address>, where Address is the element name and 128 Hope Street is the content of the element.

The opening and closing tags must match exactly, including upper- and lower-case letters, and the closing tag has an added forward slash (/). Further information pertaining to an element can be stored in an associated structure called an attribute. An attribute is always contained within the opening tag, and the attribute value is enclosed in quotation marks. Attributes are usually features or characteristics of an element, as shown in the following example, where the Address element has an attribute that indicates whether it is residential or business.

<Address>128 Hope Street</Address>
<Address Type="Residential">128 Hope Street</Address>
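These rules can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module to read the element name, attribute, and content of an Address element, and then shows that the parser rejects tags that differ in case and attribute values that are not quoted. The is_well_formed helper is a name introduced for this illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed: the attribute sits inside the opening tag, with its value in quotes.
element = ET.fromstring('<Address Type="Residential">128 Hope Street</Address>')
print(element.tag)          # element name
print(element.get("Type"))  # attribute value
print(element.text)         # element content

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Tag names are case-sensitive, so these open and close tags do not match.
mismatched_case = is_well_formed("<Address>128 Hope Street</address>")
# Attribute values must be enclosed in quotation marks.
unquoted_value = is_well_formed("<Address Type=Residential>128 Hope Street</Address>")
print(mismatched_case, unquoted_value)
```

A parser stops at the first well-formedness error instead of guessing what was meant, which is why small slips such as a lower-case closing tag matter.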

XML and EPUBs

So how does XML relate to e-books and the EPUB format? The EPUB format is widely supported and used by tablets, smartphones, and dedicated e-readers such as the Apple iPad, Kobo, and Nook; Amazon Kindle devices have generally accepted EPUB files only through conversion. The EPUB format differs from a PDF file in that the content of an EPUB is reflowable: if the user increases the font size, the index still directs the user to the correct place in the text. An EPUB is delivered as a single file; however, it is actually a zipped archive containing a set of resources that are managed using XML instructions.

EPUB files contain an XML directory file that points to the different files defining the book’s images, front matter, contents, etc. The EPUB format is useful for books, instruction manuals, flight manuals, and other long documents. Programs such as Adobe InDesign can take conventional documents and export them as XML-based EPUBs that can be viewed on a range of tablets and devices. The EPUB technology separates formatting (fonts, font sizes) from the content, allowing one to be changed easily without affecting the other. For this reason, a user can choose from a range of fonts in which to read a favorite EPUB title. The EPUB community has adopted XML as its underlying data storage format, so an understanding of XML syntax is very valuable in this area of digital publishing.
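The zipped structure can be explored with Python's standard zipfile and xml.etree.ElementTree modules. In an EPUB, the file META-INF/container.xml names the package document, which in turn lists the book's resources. The sketch below builds a tiny EPUB-like archive in memory and reads that pointer back; the path OEBPS/content.opf and the helper names are assumptions for the example, and the sample archive is not a complete, valid book.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the EPUB container format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

def build_sample_epub():
    """Create a tiny EPUB-like archive in memory (illustrative, not a valid book)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr(
            "META-INF/container.xml",
            '<?xml version="1.0"?>'
            '<container version="1.0" '
            'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
            "<rootfiles>"
            '<rootfile full-path="OEBPS/content.opf" '
            'media-type="application/oebps-package+xml"/>'
            "</rootfiles></container>",
        )
        # Placeholder package document; a real one lists every file in the book.
        archive.writestr("OEBPS/content.opf", "<package/>")
    buffer.seek(0)
    return buffer

def find_package_path(epub_file):
    """Return the path of the package document named in container.xml."""
    with zipfile.ZipFile(epub_file) as archive:
        container = ET.fromstring(archive.read("META-INF/container.xml"))
    rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")

package_path = find_package_path(build_sample_epub())
print(package_path)
```

The same find_package_path function should work on a real .epub file opened from disk, since zipfile.ZipFile accepts a file path as well as an in-memory buffer.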

Graphicraft has worked with XML for journal publication for many years. Since then, we have been heavily involved in dictionary work and in real-time document production, where a reader runs a contextual search against a large database and the search result is turned into a document in either PDF or HTML. Readers can then save it as an e-book and take it anywhere to read.