OpenDocument
From FreeBio
OpenDocument, short for the OASIS Open Document Format for Office Applications, is an open document file format for saving office documents such as spreadsheets, memos, charts, and presentations. The standards are developed by the OASIS consortium based upon the XML file format created by OpenOffice.org.
The OpenDocument format is intended to provide an open alternative to proprietary document formats including the popular DOC, XLS, and PPT formats used by Microsoft Office. Organizations and individuals that store their data in an open format avoid being locked in to a single software vendor, leaving them free to switch software if their current vendor goes out of business or changes their software or licensing terms to something less favorable.
Public policy implications
Since one objective of open formats like OpenDocument is to guarantee long-term access to data without legal or technical barriers, governments have become increasingly aware of open formats as a public policy issue.
In 2002, Dr. Edgar David Villanueva Nuñes, a lawyer and Congressman of the Republic of Perú, wrote a letter to Microsoft Peru raising questions about free and permanent document access with proprietary formats.
In early 2005, Eric Kriss, Secretary of Administration and Finance in Massachusetts, was the first State government official in the U.S. to connect open formats to a public policy purpose: "It is an overriding imperative of the American democratic system that we cannot have our public documents locked up in some kind of proprietary format, perhaps unreadable in the future, or subject to a proprietary system license that restricts access." [1]
Subsequently, in September 2005, Massachusetts became the first state to formally endorse OpenDocument formats for its public records and, at the same time, reject Microsoft's proprietary XML format. Microsoft Office, which has a nearly 100% market share among the state's employees, does not currently support OpenDocument formats. Microsoft has tentatively indicated that OpenDocument formats will not be supported in new versions of Office.
Standardization and licensing
Version 1.0 of the OpenDocument specification was developed over a long period by multiple organizations, including the developers of the office suites OpenOffice.org, StarOffice, KOffice, and WordPerfect, along with users who needed to manage complex documents (including Boeing and a Bible translation organization that needed sophisticated multi-language support). The group decided to build on an earlier version of the OpenOffice.org format, since this was already an XML format with most of the desired properties. After public review it was approved as an OASIS standard in May 2005.
The specification is available for free download and use [2]. It is licensable under reciprocal, royalty-free terms by any party; the terms of the license can be seen here: [3]. Reciprocal, royalty-free licensing terms are being promoted by some standards developing organizations, such as the W3C and OASIS, as a method for avoiding conflict over intellectual property concerns while still promoting innovation. See also software patent debate. In short, anyone can implement OpenDocument, without restraint.
All of this is in contrast with the competing "Microsoft Office Open XML" developed by Microsoft. Although the format is technically royalty-free, Microsoft imposes additional legal conditions on it that many analysts have determined will prevent many competitors from ever implementing the format. In particular, analysts have determined that the legal obligations for the Microsoft format are such that it cannot be used by competing programs licensed under the GNU General Public License, the most popular license for open source software, and possibly many other licenses as well. Microsoft has also stated that it is applying for a number of patents related to its format; these patents could be used later to force competitors to exit the market. Microsoft developed and controlled its format; it has not submitted its specification to any standards body or any other independent multi-vendor review and development process. Thus, there has been no external effort to ensure that others can easily implement it, or that patents cannot be used to prevent implementation. These attributes of Microsoft's format are especially concerning to some because formerly secret Microsoft documents (known as "Halloween I" and "Halloween II"), which were developed in collaboration with key people in Microsoft, recommended that Microsoft suppress competition by "de-commoditizing" protocols (creating proprietary formats that could not be used by others) and by attacking competitors through patent lawsuits.
The European Union recommended OpenOffice.org as the basis for standard file formats and document interchange, and the OpenDocument format is likely to become an ISO Standard.
File types
The most common file extensions are .odt for text documents, .ods for spreadsheets, .odp for presentations, .odg for graphics and .odb for databases. The full list is given below.
Documents
| File type | Extension | MIME type |
|---|---|---|
| Text | .odt | application/vnd.oasis.opendocument.text |
| Spreadsheet | .ods | application/vnd.oasis.opendocument.spreadsheet |
| Presentation | .odp | application/vnd.oasis.opendocument.presentation |
| Drawing | .odg | application/vnd.oasis.opendocument.graphics |
| Chart | .odc | application/vnd.oasis.opendocument.chart |
| Formula | .odf | application/vnd.oasis.opendocument.formula |
| Database | .odb | application/vnd.oasis.opendocument.database |
| Image | .odi | application/vnd.oasis.opendocument.image |
| Master Document | .odm | application/vnd.oasis.opendocument.text-master |
Templates
| File type | Extension | MIME type |
|---|---|---|
| Text | .ott | application/vnd.oasis.opendocument.text-template |
| Spreadsheet | .ots | application/vnd.oasis.opendocument.spreadsheet-template |
| Presentation | .otp | application/vnd.oasis.opendocument.presentation-template |
| Drawing | .otg | application/vnd.oasis.opendocument.graphics-template |
Format internals
An OpenDocument file is a ZIP archive containing a number of files and directories:
| XML files | Other files | Directories |
|---|---|---|
| content.xml | mimetype | META-INF/ |
| meta.xml | layout-cache | Thumbnails/ |
| settings.xml | | Pictures/ |
| styles.xml | | Configurations2/ |
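Because the package is an ordinary ZIP archive, its layout can be inspected with any ZIP library. The following is a minimal sketch in Python, not a reference implementation: it builds a small stand-in package in memory and separates top-level files from directories. The helper names `build_sample_package` and `summarize`, the placeholder file contents, and the entries `META-INF/manifest.xml` and `Pictures/example.png` are assumptions made for illustration.

```python
import io
import zipfile


def build_sample_package() -> bytes:
    """Build a tiny, hypothetical stand-in for an OpenDocument file in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        for name in ("content.xml", "meta.xml", "settings.xml", "styles.xml"):
            zf.writestr(name, "<placeholder/>")  # placeholder, not real ODF XML
        # Entry names below are illustrative assumptions.
        zf.writestr("META-INF/manifest.xml", "<placeholder/>")
        zf.writestr("Pictures/example.png", b"")
    return buf.getvalue()


def summarize(package_bytes: bytes):
    """Return (top-level files, top-level directories) found in the archive."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        names = zf.namelist()
    top_files = sorted(n for n in names if "/" not in n)
    directories = sorted({n.split("/", 1)[0] + "/" for n in names if "/" in n})
    return top_files, directories


if __name__ == "__main__":
    files, dirs = summarize(build_sample_package())
    print("files:", files)
    print("directories:", dirs)
```

With a real document, the bytes would come from reading the .odt or .ods file from disk instead of from `build_sample_package`.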
The OpenDocument format provides a strong separation between content, layout and metadata. The most notable components of the format are:
- content.xml: This is the most important file. It carries the actual content of the document (except for binary data, like images). The base format is inspired by HTML, and though far more complex, it should be reasonably legible to humans:
<text:h text:style-name="Heading_2">This is a title</text:h>
<text:p text:style-name="Text_body"/>
<text:p text:style-name="Text_body">
This is a paragraph. The formatting information
is in the Text_body style. The empty text:p tag
above is a blank paragraph (an empty line).
</text:p>
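As a rough illustration of how legible this markup is to a program as well, the sketch below pulls headings and paragraphs out of a fragment like the one above using Python's standard XML parser. The wrapping `<root>` element, the shortened paragraph text and the function name `extract_blocks` are assumptions for the example; a real content.xml has a deeper structure around these elements.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the "text:" prefix in OpenDocument files.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hypothetical wrapper around the fragment shown above, so it parses on its own.
SAMPLE = f"""<root xmlns:text="{TEXT_NS}">
  <text:h text:style-name="Heading_2">This is a title</text:h>
  <text:p text:style-name="Text_body"/>
  <text:p text:style-name="Text_body">This is a paragraph.</text:p>
</root>"""


def extract_blocks(xml_text):
    """Return (kind, style name, text) for each heading and paragraph, in order."""
    root = ET.fromstring(xml_text)
    blocks = []
    for elem in root.iter():
        if elem.tag == f"{{{TEXT_NS}}}h":
            kind = "heading"
        elif elem.tag == f"{{{TEXT_NS}}}p":
            kind = "paragraph"
        else:
            continue
        style = elem.get(f"{{{TEXT_NS}}}style-name")
        text = "".join(elem.itertext()).strip()
        blocks.append((kind, style, text))
    return blocks


if __name__ == "__main__":
    for block in extract_blocks(SAMPLE):
        print(block)
```

The empty paragraph comes back with an empty string as its text, which matches its role as a blank line.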
- styles.xml: OpenDocument makes heavy use of styles for formatting and layout. Most of the style information is here (though some is in content.xml). Style types include:
  - Paragraph styles
  - Page styles
  - Character styles
  - Frame styles
  - List styles
The OpenDocument format is unique in that you cannot avoid using styles for formatting. Even "manual" formatting is implemented through styles (the application dynamically makes new styles as needed).
- meta.xml: Contains the file metadata. For example, Author, "Last modified by", date of last modification, etc. The contents look somewhat like this:
<meta:creation-date>2003-09-10T15:31:11</meta:creation-date>
<dc:creator>Daniel Carrera</dc:creator>
<dc:date>2005-06-29T22:02:06</dc:date>
<dc:language>es-ES</dc:language>
<meta:document-statistic
meta:table-count="6" meta:object-count="0"
meta:page-count="59" meta:paragraph-count="676"
meta:image-count="2" meta:word-count="16701"
meta:character-count="98757"/>
The <dc:...> tags come from the Dublin Core metadata standard.
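The metadata fields shown above can be read with a namespace-aware XML parser. This is a small sketch under stated assumptions: the `<root>` wrapper, the function name `read_metadata` and the keys of the returned dictionary are invented for the example, and only a few of the statistics attributes are included.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the "meta:" and "dc:" prefixes.
NS = {
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical wrapper so the sample elements parse as one document.
SAMPLE = f"""<root xmlns:meta="{NS['meta']}" xmlns:dc="{NS['dc']}">
  <meta:creation-date>2003-09-10T15:31:11</meta:creation-date>
  <dc:creator>Daniel Carrera</dc:creator>
  <dc:date>2005-06-29T22:02:06</dc:date>
  <dc:language>es-ES</dc:language>
  <meta:document-statistic meta:page-count="59" meta:word-count="16701"/>
</root>"""


def read_metadata(xml_text):
    """Collect a few metadata fields into a plain dictionary."""
    root = ET.fromstring(xml_text)
    info = {
        "created": root.findtext(".//meta:creation-date", namespaces=NS),
        "creator": root.findtext(".//dc:creator", namespaces=NS),
        "modified": root.findtext(".//dc:date", namespaces=NS),
        "language": root.findtext(".//dc:language", namespaces=NS),
    }
    stats = root.find(".//meta:document-statistic", NS)
    if stats is not None:
        # Attributes are namespaced too, so they need the full {uri}name form.
        words = stats.get(f"{{{NS['meta']}}}word-count")
        info["word_count"] = int(words) if words is not None else None
    return info


if __name__ == "__main__":
    print(read_metadata(SAMPLE))
```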
- settings.xml: Settings include things like the zoom factor, or the cursor position. These are properties that are not content or layout.
- Pictures/: This folder contains all images in the document. They are referred to from content.xml using a <draw:image> tag, similar to the HTML <img> tag:
<draw:image
xlink:href="Pictures/10000000000005E80000049F21F631AB.tif"
xlink:type="simple" xlink:show="embed"
xlink:actuate="onLoad"/>
The layout information (width, anchor, etc) is provided by a <draw:frame> tag that contains the <draw:image> tag.
Most images are kept in their original format (GIF, JPEG, PNG) but bitmap images are converted to PNG for size considerations.
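To show how the frame and image elements relate, the sketch below lists each image reference together with one layout attribute taken from its enclosing frame. It is an illustration only: the `<root>` wrapper, the `svg:width="10cm"` value and the function name `list_images` are assumptions, and real frames carry more layout attributes than the single one read here.

```python
import xml.etree.ElementTree as ET

DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
SVG_NS = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical fragment: the frame and its width are made up for the example.
SAMPLE = f"""<root xmlns:draw="{DRAW_NS}" xmlns:svg="{SVG_NS}"
      xmlns:xlink="{XLINK_NS}">
  <draw:frame svg:width="10cm">
    <draw:image
      xlink:href="Pictures/10000000000005E80000049F21F631AB.tif"
      xlink:type="simple" xlink:show="embed"
      xlink:actuate="onLoad"/>
  </draw:frame>
</root>"""


def list_images(xml_text):
    """Return one dict per image with its href and the frame's width."""
    root = ET.fromstring(xml_text)
    images = []
    for frame in root.iter(f"{{{DRAW_NS}}}frame"):
        # The image is a direct child of the frame that positions it.
        for image in frame.findall(f"{{{DRAW_NS}}}image"):
            images.append({
                "href": image.get(f"{{{XLINK_NS}}}href"),
                "width": frame.get(f"{{{SVG_NS}}}width"),
            })
    return images


if __name__ == "__main__":
    for entry in list_images(SAMPLE):
        print(entry)
```

The `href` value is a path inside the ZIP archive, so the image bytes can be read from the same package.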
- mimetype: This is just a one-line file with the MIME type of the document. One implication of this is that the file extension is actually immaterial to the format. The file extension is only there for the benefit of the user.
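Since the document type is recorded inside the package, a program can identify a file without looking at its name. The sketch below reads the mimetype entry and maps it to a file type using a few rows from the tables above. The helper names `make_package` and `detect_kind` and the placeholder content.xml are assumptions for illustration.

```python
import io
import zipfile

# A subset of the MIME types from the file type tables.
KINDS = {
    "application/vnd.oasis.opendocument.text": "Text",
    "application/vnd.oasis.opendocument.spreadsheet": "Spreadsheet",
    "application/vnd.oasis.opendocument.presentation": "Presentation",
    "application/vnd.oasis.opendocument.graphics": "Drawing",
}


def make_package(mimetype: str) -> bytes:
    """Build a minimal, hypothetical package that declares the given MIME type."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", mimetype)
        zf.writestr("content.xml", "<placeholder/>")  # placeholder content
    return buf.getvalue()


def detect_kind(package_bytes: bytes) -> str:
    """Identify the document type from the mimetype entry, ignoring any file name."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        mimetype = zf.read("mimetype").decode("ascii").strip()
    return KINDS.get(mimetype, "Unknown")


if __name__ == "__main__":
    data = make_package("application/vnd.oasis.opendocument.spreadsheet")
    print(detect_kind(data))
```

No file name is involved at any point, which is the sense in which the extension does not matter to the format.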
OpenDocument is designed to reuse existing open XML standards whenever they are available, and it creates new tags only where no existing standard can provide the needed functionality. So, OpenDocument uses Dublin Core for metadata, MathML for formulae, SVG for vector graphics, SMIL for multimedia, etc.
Applications supporting OpenDocument
- Abiword 2.3, through the OpenWriter plugin
- eZ publish 3.6, with OpenOffice extension
- IBM Workplace
- Knomos case management 1.0 [4]
- KOffice 1.4, released on June 21st 2005
- OpenOffice.org 1.1.5 and 2.0 beta
- Scribus 1.2.2, imports OpenDocument Text and Graphics
- TextMaker 2005 beta [5]
- Visioo Writer 0.5.2 [6]
See also
- WordprocessingML
- List of document markup languages
- Comparison of document markup languages
- Open Document Architecture - An older standard file format that failed to gain acceptance.
- Open format
External links
- OASIS OpenDocument Essentials A book describing the OpenDocument format.
- OASIS Open Document Format Technical Committee
- Tim Bray of Sun on Open Office XML ISO Certification
- The Future Is Open: What OpenDocument Is And Why You Should Care ~ by Daniel Carrera
- The announcement of OpenDocument's approval from OASIS
- OpenDocument for Spreadsheets Morten Welinder's complaint that the spreadsheet spec doesn't define anything about formulas.
- And a reply from Eike Rathke.
- OpenFormula, a specification for spreadsheet formula extending OpenDocument
- Why OpenDocument Won (and Microsoft Office Open XML Didn’t)
- "Open XML Incompatible With GPL " by Peter Galli, June 20, 2005, eWeek.
- Halloween I
- Halloween II

