
Office Open XML: Difference between revisions

From Wikipedia, the free encyclopedia
m Planned and beta software: Fixed ref code on OO.o 3.0
Undid revision 201771907 by HAl (talk): info is not long, and makes a large difference to the summary

Revision as of 13:58, 31 March 2008


Office Open XML (often referred to as OOXML) is an XML-based file format specification for electronic documents such as spreadsheets, charts, presentations and word processing documents.

Microsoft originally developed the specification as a successor to its binary Microsoft Office file formats. The specification was later handed over to Ecma International to be developed as the Ecma 376 standard, under the stewardship of Ecma International Technical Committee TC45. Ecma 376 was published in December 2006[1] and can be freely downloaded from Ecma International.[2]

Background

Prior to the 2007 edition of Microsoft Office, its component applications (such as the word-processor Word and spreadsheet Excel) used binary file formats for storing data by default. Historically, these formats have been difficult for developers to work with natively, due to a lack of publicly available information on, and royalty-free access to, the format specifications. (Microsoft does offer a subset of these binary format specifications under a royalty-free covenant not to sue.[3]) While a level of support for the binary formats had been achieved by various applications, full interoperability remained elusive.

In 2000, Microsoft released an initial version of an XML-based format for Excel, which was incorporated in Office XP. In 2002, a new file format for Microsoft Word followed.[4] The Excel and Word formats - known as the Office 2003 XML formats - were later incorporated into the 2003 release of Microsoft Office.

In 2004, governments and the European Union recommended that Microsoft publish and standardize its XML Office formats through a standardization organization.[5] Microsoft announced[6] in November 2005 that it would standardize the new version of its XML-based formats through Ecma, as "Ecma Office Open XML."

File format and structure

Prior to Ecma standardization, the Microsoft Office 2003 XML formats used a single monolithic file, with embedded items such as pictures stored as binary-encoded blocks within the XML. Office Open XML does not support that layout; instead it uses a file package conforming to the Open Packaging Conventions. The package uses the ZIP file format and contains the individual files that form the basis of the document. In addition to Office markup, the package can also include embedded (binary) files in formats such as PNG, BMP, AVI or PDF.

Document markup languages

An Office Open XML file may contain several documents encoded in specialized markup languages corresponding to applications within the Microsoft Office product line. Office Open XML defines multiple vocabularies (using 27 namespaces and 89 schema modules).

The primary markup languages are:

  • WordprocessingML for word-processing
  • SpreadsheetML for spreadsheets
  • PresentationML for presentations
  • DrawingML for vector drawing, charts and, for example, text art (VML is also supported for drawing, though it is deprecated)

Shared markup language materials include:

  • Office Math Markup Language (OMML)
  • Extended properties
  • Custom properties
  • Variant Types
  • Custom XML data properties
  • Bibliography

In addition to the above markup languages, custom XML schemas can be used to extend Office Open XML.

The XML Schema of OOXML emphasizes reducing load time and improving parsing speed. In a test with applications current in April 2007, XML-based office documents were slower to load than binary formats.[7] To enhance performance, OOXML uses very short element names for common elements, and spreadsheets save dates as index numbers (starting from 1899 or from 1904). In order to be systematic and generic, OOXML typically uses separate child elements for data and metadata (element names ending in Pr for properties) rather than multiple attributes, which allows structured properties. OOXML does not use mixed content but uses elements to put a series of text runs (element name r) into paragraphs (element name p). The result is terse and highly nested, in contrast to HTML, for example, which is fairly flat, designed for humans to write in text editors, and more congenial for humans to read.
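The date index numbers mentioned above can be illustrated with a short sketch. It assumes two conventions that the text does not spell out: in the system starting from 1899, day 0 is taken as 1899-12-30 (which matches spreadsheet serials from 61 upward, because the scheme counts a non-existent 29 February 1900), and in the 1904 system, day 0 is 1904-01-01. The function names are hypothetical.

```python
from datetime import date, timedelta

# Assumed epochs (conventions, not quoted from the specification):
EPOCH_1900 = date(1899, 12, 30)  # valid for serials >= 61 only
EPOCH_1904 = date(1904, 1, 1)


def serial_to_date(serial: int, use_1904: bool = False) -> date:
    """Convert a whole-day spreadsheet serial number to a calendar date."""
    if not use_1904 and serial < 61:
        # Serials 1..60 are affected by the fictitious 1900-02-29.
        raise ValueError("serials below 61 need special handling")
    epoch = EPOCH_1904 if use_1904 else EPOCH_1900
    return epoch + timedelta(days=serial)


def date_to_serial(day: date, use_1904: bool = False) -> int:
    """Inverse of serial_to_date for dates on or after 1900-03-01."""
    epoch = EPOCH_1904 if use_1904 else EPOCH_1900
    return (day - epoch).days


if __name__ == "__main__":
    print(serial_to_date(39538))                      # 1900-style serial
    print(date_to_serial(date(2008, 3, 31), True))    # 1904-style serial
```

Under these assumptions the same calendar day has serials that differ by 1462 between the two systems, which is why a reader needs to know which date base a workbook uses before interpreting stored values.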

Office MathML (OMML)

Office Math Markup Language is a mathematical markup language which can be embedded in WordprocessingML, with intrinsic support for including word processing markup such as revision markings,[8] footnotes, comments, images and elaborate formatting and styles.[9] The OMML format differs from the World Wide Web Consortium (W3C) MathML recommendation, which does not support those office features, but is partially compatible with it[10] through relatively simple XSL Transformations.

DrawingML
Example of DrawingML text effects

DrawingML is the graphics markup language used in Office Open XML documents. Its major features are the graphics rendering of text elements, graphical vector based shape elements, graphical tables and charts.

The DrawingML table is the third table model in Office Open XML (alongside the table models in WordprocessingML and SpreadsheetML). It is optimized for graphical effects, and its main use is in presentations created with PresentationML markup. DrawingML contains graphics effects (such as shadows and reflection) that can be applied to its graphical elements. It also supports 3D effects, for instance to show graphical elements through a flexible camera viewpoint. Separate DrawingML theme parts can be created in an Office Open XML package and then applied to graphical elements throughout the package.[11]

DrawingML is unrelated to other vector graphics formats such as SVG. Those formats can be converted to DrawingML for native inclusion in an Office Open XML document. This differs from the approach of the OpenDocument format, which uses a subset of SVG and includes vector graphics as separate files.

Container structure

Office Open XML packages have characteristically different directory structures and names depending on the type of document. An application will use the relationships files to locate individual sections (files), with each having accompanying metadata, in particular MIME metadata.

Office Open XML format uses a ZIP package for storing XML and other data files.[12]

A basic package contains an XML file called [Content_Types].xml at the root, along with three directories: _rels, docProps, and a directory specific for the document type (for example, in a .docx word processing package, there would be a word directory). The word directory contains the document.xml file which is the core content of the document.

[Content_Types].xml
This file describes the contents of the package. It also contains a mapping for file extensions and overrides for specific URIs.
_rels
This directory contains relationships for the files within the package. To find the relationships for a specific file, look for the _rels directory that is a sibling of the file, and then for a file that has the original file name with a .rels appended to it. For example, if the content types file had any relationships, there would be a file called [Content_Types].xml.rels inside the _rels directory.
_rels/.rels
This file is where the package relationships are located. Applications look here first. Viewed in a text editor, the file lists each relationship for the package. In a minimal document containing only the basic document.xml file, the relationships detailed are metadata and document.xml.
word/document.xml
This file is the main part of any Word document. Viewed in an XML editor, it is a fairly simple XML file.
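The package layout described above can be explored with any ZIP library. The following is a minimal sketch rather than a reference implementation: it builds a tiny stand-in package in memory (the part bodies are placeholders, and docProps/core.xml is an illustrative file name), then looks up each part's content type from [Content_Types].xml. The content-types namespace URI and the two content-type strings are assumptions not given in the text above.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace of [Content_Types].xml (not stated in the article text).
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

CONTENT_TYPES = f"""<?xml version="1.0" encoding="UTF-8"?>
<Types xmlns="{CT_NS}">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""


def build_demo_package() -> bytes:
    """Build a stand-in package in memory; part bodies are placeholders."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("docProps/core.xml", "<coreProperties/>")
        zf.writestr("word/document.xml", "<document/>")
    return buf.getvalue()


def content_type_of(zf: zipfile.ZipFile, part_name: str):
    """Return the content type for a part: overrides first, then extension defaults."""
    root = ET.fromstring(zf.read("[Content_Types].xml"))
    for override in root.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == "/" + part_name:
            return override.get("ContentType")
    ext = part_name.rsplit(".", 1)[-1].lower()
    for default in root.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension", "").lower() == ext:
            return default.get("ContentType")
    return None


if __name__ == "__main__":
    with zipfile.ZipFile(io.BytesIO(build_demo_package())) as zf:
        for name in zf.namelist():
            print(name, "->", content_type_of(zf, name))
```

Checking overrides before extension defaults mirrors the description of [Content_Types].xml above, where overrides apply to specific URIs and the extension mapping is the fallback.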

Relationships

Relationship files in Office Open XML

An example relationship file (from word/_rels/document.xml.rels)

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Relationships
  xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
  <Relationship Id="rId1"
     Type="http://schemas.microsoft.com/office/2006/relationships/image"
     Target="http://en.wikipedia.org/images/wiki-en.png"
     TargetMode="External" />
  <Relationship Id="rId2"
     Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"
     Target="http://www.wikipedia.org"
     TargetMode="External" />
</Relationships>

As such, images referenced in the document can be found in the relationship file by looking for all relationships of type http://schemas.microsoft.com/office/2006/relationships/image. To change the image used, edit the relationship.

The following code shows an example of inline markup for a hyperlink:

<w:hyperlink w:rel="rId2" w:history="1"> 

In this example, the URL is represented by "rId2". The actual URL is in the accompanying relationships file, located by the corresponding "rId2" item. Linked images, templates, and other items are referenced in the same way.
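The lookup described above (from an ID such as "rId2" to the actual target) can be sketched as follows. The sample is modeled on the example relationship file; the function names are hypothetical, and the code matches elements by local name and relationship types by their final path segment so that it does not depend on a particular namespace URI, which may differ between versions of the specification. Treating a missing TargetMode as "Internal" is an assumption.

```python
import xml.etree.ElementTree as ET

# Sample modeled on the example relationship file in this section.
RELS_XML = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Relationships
  xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
  <Relationship Id="rId1"
     Type="http://schemas.microsoft.com/office/2006/relationships/image"
     Target="http://en.wikipedia.org/images/wiki-en.png"
     TargetMode="External" />
  <Relationship Id="rId2"
     Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"
     Target="http://www.wikipedia.org"
     TargetMode="External" />
</Relationships>"""


def load_relationships(xml_bytes: bytes) -> dict:
    """Map each relationship Id to a (type, target, target_mode) tuple."""
    rels = {}
    for el in ET.fromstring(xml_bytes):
        # Compare local names only, ignoring the namespace part of the tag.
        if el.tag.rsplit("}", 1)[-1] != "Relationship":
            continue
        rels[el.get("Id")] = (
            el.get("Type"),
            el.get("Target"),
            el.get("TargetMode", "Internal"),  # assumed default
        )
    return rels


def targets_of_kind(rels: dict, kind: str) -> list:
    """Return targets whose relationship type ends in the given segment."""
    return [target for (rel_type, target, _mode) in rels.values()
            if rel_type.endswith("/" + kind)]


if __name__ == "__main__":
    rels = load_relationships(RELS_XML)
    print(rels["rId2"][1])                 # target behind the hyperlink ID
    print(targets_of_kind(rels, "image"))  # all image targets
```

Because the markup only carries the ID, changing a link or image target requires editing the relationship file alone, without touching the document markup.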

Embedded or linked media file relations

Pictures can be embedded or linked using a tag:

<v:imagedata w:rel="rId1" o:title="example" />

This is the reference to the image file. All references are managed via relationships. For example, document.xml has a relationship to the image. There is a _rels directory in the same directory as document.xml, and inside _rels is a file called document.xml.rels. This file contains a relationship definition with a type, an ID and a location. The ID is the one referenced in the XML document. The type is a reference schema definition for the media type, and the location is either an internal location within the ZIP package or an external location defined with a URL.
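The naming rule for relationship files and the two kinds of location can be sketched in a few lines. This is an illustration under stated assumptions, not the specification's algorithm: the function names are hypothetical, the media and docProps paths in the usage lines are made-up examples, and internal targets are assumed to be resolved relative to the directory of the part that owns the relationship.

```python
import posixpath


def rels_path_for(part_name: str) -> str:
    """Return the relationships file for a part.

    The file sits in a sibling _rels directory and is named after the
    part with ".rels" appended. An empty part name stands for the
    package itself and yields "_rels/.rels".
    """
    directory, filename = posixpath.split(part_name)
    return posixpath.join(directory, "_rels", filename + ".rels")


def resolve_target(source_part: str, target: str,
                   target_mode: str = "Internal") -> str:
    """Turn a relationship target into a package path or an external URL."""
    if target_mode == "External":
        return target  # a URL outside the ZIP package, returned unchanged
    if target.startswith("/"):
        return target.lstrip("/")  # already relative to the package root
    base = posixpath.dirname(source_part)
    return posixpath.normpath(posixpath.join(base, target))


if __name__ == "__main__":
    print(rels_path_for("word/document.xml"))
    print(rels_path_for(""))
    print(resolve_target("word/document.xml", "media/image1.png"))
    print(resolve_target("word/document.xml", "http://www.wikipedia.org", "External"))
```

Using posixpath rather than os.path keeps forward slashes on every platform, which matches the path separators used inside ZIP archives.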

Licensing

Reasonable and Non Discriminatory

Ecma International provides specifications that "can be freely copied by all interested parties without restrictions".[13] Under the Ecma code of conduct in patent matters, participating and approving member organisations are required to make available their patent rights on a Reasonable and Non Discriminatory (RAND) basis. While making patent rights available on a RAND basis is considered a common minimum patent condition for a standard, international standardization has a clear preference for royalty-free patent licensing. That is why Microsoft, a main contributor to the standard, provided a Covenant Not to Sue[14] for its patent licensing. The covenant received a mixed reception, with some (like the Groklaw blog) identifying problems[15] and others (such as Lawrence Rosen, an attorney and lecturer at Stanford Law School) endorsing it.[16]

Open Specification Promise

Microsoft also added the format to their Open Specification Promise[17] in which

Microsoft irrevocably promises not to assert any Microsoft Necessary Claims against you for making, using, selling, offering for sale, importing or distributing any implementation to the extent it conforms to a Covered Specification

subject to certain restrictions. Office Open XML can therefore be used under the Covenant Not to Sue or the Open Specification Promise.

The Open Specification Promise was included in documents submitted to ISO in support of the Ecma 376 fast track submission.[18] Ecma International asserted that, "The OSP enables both open source and commercial software to implement [the specification]."[19]

In support of the licensing arrangements, Microsoft commissioned an analysis from the London legal firm Baker & McKenzie.[20]

Several standards and OSS licensing experts expressed support for the OSP in 2006. A 2006 article in Cover Pages quotes Lawrence Rosen, an attorney and lecturer at Stanford Law School, as saying,

"I'm pleased that this OSP is compatible with free and open source licenses."[21]

In 2006,[22] Mark Webbink, a lawyer and member of the board of the Software Freedom Law Center and a former employee of Linux vendor Red Hat, said,

"Red Hat believes that the text of the OSP gives sufficient flexibility to implement the listed specifications in software licensed under free and open source licenses. We commend Microsoft’s efforts to reach out to representatives from the open source community and solicit their feedback on this text, and Microsoft's willingness to make modifications in response to our comments."[23]

Standards lawyer Andy Updegrove said in 2006 the Open Specification Promise was

"what I consider to be a highly desirable tool for facilitating the implementation of open standards, in particular where those standards are of interest to the open source community."[24]

On March 12, 2008, the Software Freedom Law Center, which provides services to protect and advance free software and open source software, warned of problems with the Open Specification Promise as it relates to OOXML and the GPL.[25] In a published analysis of the promise, it states that:[26]

  • "Because of this narrow definition of the covered specifications, no future versions of any of the specifications are guaranteed to be covered under the OSP."[27]
  • "Any code that implements the specification may also do other things in other contexts, so in effect the OSP does not cover any actual code, only some uses of code."[27]
  • "...it permits implementation under free software licenses so long as the resulting code isn't used freely."[27]
  • "The OSP cannot be relied upon by GPL developers for their implementations not because its provisions conflict with GPL, but because it does not provide the freedom that the GPL requires."[27]

With Ecma International publishing the specification for free and patents made irrevocably available on a royalty-free basis through the Open Specification Promise, Office Open XML conforms to all characteristics of the European Union's definition of an open standard.[28]

Standardization

Microsoft's Office Open XML is currently an Ecma standard (Ecma-376, approved on 7 December 2006).

The specification is currently undergoing fast-track standardization within ISO/IEC as DIS 29500 (Draft International Standard 29500).[29] In a round of voting by ISO/IEC national body members in September 2007, the draft text was not approved as an international standard. A ballot resolution process has amended the text, and a final decision will be reached on its approval or disapproval at the end of March 2008.[30]

There have been reports that Microsoft and its partners have used questionable techniques in the process of standardization.[31][32][33][34][35][36][37] Microsoft denied this.[38]

Application support

The list here is not exhaustive. A more exhaustive list of implementations supporting Office Open XML can be found on Microsoft's Office Open XML community website.

Implementation

Office Open XML (as specified by Ecma 376) is the default Microsoft Office 2007 format. For older versions of Office (2000, XP and 2003) a compatibility pack is provided.[39] It is available for Windows 2000 and newer operating systems.[40] The compatibility pack does not require Microsoft Office. It can be used as a standalone converter with any product that reads Office's older binary formats, such as OpenOffice.org.[41]

Filters and Converters

  • OxygenOffice includes xmlfilter, the code that OpenOffice.org 3 will use to process Office Open XML files; xmlfilter is completely different from OdfConverter.[61] However, this filter only imports OOXML files and does not export them.
  • docXConverter by Panergy Ltd. converts from WordprocessingML to Rich Text Format (RTF) and from SpreadsheetML to Comma-separated values (CSV). docXConverter can be used to transfer WordprocessingML data to other applications that read RTF data such as Word 97.[62]
  • Google search supports a direct HTML view of Office Open XML files: files found in search results can be viewed in a converted HTML view.[63]

Planned and beta software


  • Corel has released a beta version of its Corel WordPerfect Office X3 edition that includes support for Office Open XML.[64]
  • The Office Open XML SDK, containing a set of managed code libraries to create and manipulate OOXML files programmatically, will be shipped by Microsoft. Currently available in a CTP release, version 1.0 will be released in May 2008.[65] The shipping version of the SDK will incorporate the changes made to the OOXML specification during the current ISO/IEC standardization process.[66] Version 2 of the OOXML SDK will support validating OOXML documents against the OOXML schema, as well as searching in OOXML documents.[65]
  • OpenOffice.org 3.0 alpha supports Office Open XML.[67]


References

  1. ^ "Ecma International approves Office Open XML standard" (Press release). Ecma International. 2006-12-07. Retrieved 2006-12-08.
  2. ^ Standard ECMA-376
  3. ^ "How to extract information from Office files by using Office file formats and schemas". Microsoft. 2007-03-27. Retrieved 2007-07-10.
  4. ^ Brian Jones (2007-01-25). "History of office XML formats (1998-2006)".
  5. ^ Telematics between Administrations Committee based on IDA expert group on open document formats (2004-05-25). "TAC approval on conclusions and recommendations on open document formats". IDABC - European eGovernment Services. Retrieved 2007-07-30.
  6. ^ "Microsoft Co-Sponsors Submission of Office Open XML Document Formats to Ecma International for Standardization". Microsoft. 2005-11-21.
  7. ^ George Ou (2007-04-27). "MS Office 2007 versus Open Office 2.2 shootout". ZDnet.com. Retrieved 2007-04-27.
  8. ^ Jesper Lund Stocholm (2008-01-29). "Do your math - OOXML and OMML". A Mooh Point blog. Retrieved 2008-02-12.
  9. ^ Murray Sargent (2007-06-05). "Science and Nature have difficulties with Word 2007 mathematics". MSDN blogs. Retrieved 2007-07-31.
  10. ^ David Carlisle (2007-05-09). "XHTML and MathML from Office 2007". David Carlisle. Retrieved 2007-09-20.
  11. ^ Wouter Van Vugt (2007-08-13). "Open XML Explained e-book". Openxmldeveloper.org. Retrieved 2007-09-14.
  12. ^ Tom Ngo (2006-12-11). "Office Open XML Overview" (PDF). Ecma International. p. 6. Retrieved 2007-01-23.
  13. ^ "What is Ecma International".
  14. ^ "Microsoft Covenant Regarding Office 2003 XML Reference Schemas". Microsoft. Retrieved 2006-07-11.
  15. ^ "2 Escape Hatches in MS's Covenant Not to Sue". Groklaw. Retrieved 2007-01-29.
  16. ^ Berlind, David (2005-11-28). "Top open source lawyer blesses new terms on Microsoft's XML file format". ZDNet. Retrieved 2007-01-27.
  17. ^ "Microsoft Open Specification Promise". Microsoft. 2006-09-12. Retrieved 2007-04-22.
  18. ^ Licensing conditions that Microsoft offers for Office Open XML
  19. ^ -Response Document- National Body Comments from 30-Day Review of the Fast Track Ballot for ISO/IEC DIS 29500 (ECMA-376) Office Open XML File Formats
  20. ^ Baker & McKenzie (2006). "Standardization and Licensing of Microsoft's Office Open XML Reference Schema" (PDF). Baker & McKenzie. Retrieved 2007-02-01.
  21. ^ "Microsoft's Open Specification Promise Eases Web Services Patent Concerns". xml.coverpages.org. 2006-09-12.
  22. ^ "Microsoft promises to hang patent fire over web services". 2006-09-12.
  23. ^ "Microsoft Open Specification Promise".
  24. ^ Peter Galli (2006-09-12). "Microsoft Promises Not to Sue over Web Services Specs".
  25. ^ "Software Freedom Law Center Publishes Analysis of Microsoft's Open Specification Promise new article". Software Freedom Law Center. March 12, 2008.
  26. ^ "Software Freedom Law Center Publishes Analysis of Microsoft's Open Specification Promise". Business Wire. March 12, 2008.
  27. ^ a b c d "Microsoft's Open Specification Promise: No Assurance for GPL". Software Freedom Law Center. March 12, 2008.
  28. ^ IDABC - European eGovernment Services (2004). "European Interoperability Framework for pan-European eGovernment Services". Retrieved 2007-07-30.
  29. ^ ISO/IEC DIS 29500, Information technology -- Office Open XML file formats
  30. ^ "Ballot resolution meeting addresses comments on draft ISO/IEC 29500 standard". ISO News and Media. 2008-03-05.
  31. ^ http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9033701&source=rss_news6
  32. ^ http://www.noooxml.org/forum/t-49321/porn-site-technique-used-to-promote-ooxml
  33. ^ http://www.noooxml.org/irregularities
  34. ^ http://www.openxml.info/index.php?option=com_content&task=category&sectionid=5&id=3&Itemid=9
  35. ^ http://www.groklaw.net/article.php?story=20070902123701843
  36. ^ http://polishlinux.org/poland/poland-vote-for-microsoft-ooxml/
  37. ^ http://polishlinux.org/poland/possible-manipulation-around-ooxml-process-in-poland/
  38. ^ http://blogs.msdn.com/dmahugh/archive/2007/08/30/oh-the-drama-of-it-all.aspx
  39. ^ "Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats (Version 3)". Microsoft. 2007-06-18. Retrieved 2007-09-04.
  40. ^ Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats
  41. ^ "Office Compatibility Pack Review". OpenOffice.org Ninja. 2008-02-06. Retrieved 2008-02-26.
  42. ^ Amazon. "Microsoft Office 2008 for Mac".
  43. ^ "Microsoft Office Open XML File Format Converter for Mac 0.21 (Beta)". Microsoft. 2008-03-06.
  44. ^ sherjo (2006-12-06). "Converters Coming! Free and (Fairly) Fast". The Office for Mac Team Blog. Retrieved 2007-03-18.
  45. ^ "Microsoft Office Mobile 6.1: Upgrade for Microsoft Office 2007 file formats". Microsoft. 2007-11-28. Retrieved 2007-11-29.
  46. ^ "Apple - iWork - Pages". Retrieved 2007-07-08.
  47. ^ "Apple - iWork - Numbers". Retrieved 2007-07-08.
  48. ^ "Apple - iWork - Keynote". Retrieved 2007-07-08.
  49. ^ "OS X leopard Text Edit to Support Office 2007?". uneasysilence. Retrieved 2007-02-14.
  50. ^ ""iPhone User's Guide"" (PDF). Apple, Inc.
  51. ^ "Power Edit MS Word 2007 (DOCX) Support". Retrieved 2007-10-09.
  52. ^ "Gnumeric 1.8 is Here!". www.gnome.org. Retrieved 2008-01-28.
  53. ^ "QuickOffice".
  54. ^ ""DocumentsToGo for PalmOS Premium Edition"". Dataviz.
  55. ^ "Datawatch Announces Availability of Monarch V.9.0; Supports Microsoft® Windows Vista™ and Extends Excel Capabilities". 2007-02-27.
  56. ^ www.textglow.net
  57. ^ andrew z (2008-01-31). "odf-converter 1.1 released". OpenOffice.org Ninja.
  58. ^ Convert OpenXML (.docx, etc.) in Linux using command line
  59. ^ "NeoOffice 2.2.1 for Mac OS X Released". trinity.neooffice.org. 2007-08-26. Retrieved 2007-10-09.
  60. ^ Raju Vegesna (2008-02-27). "Zoho Writer Update: DocX Support, Thesaurus, Group Sharing & More".
  61. ^ "OxygenOffice as a Word 2007 (.docx) converter". OpenOffice.org Ninja. 2008-02-25. Retrieved 2008-02-26.
  62. ^ "docXConverter - Features". panergy. Retrieved 2007-01-31.
  63. ^ Brian Jones (2008-01-17). "Google support for Open XML formats".
  64. ^ "OOXML/ODF beta for WordPerfect® Office now available". Corel. Retrieved 2007-10-04.
  65. ^ a b Doug Mahugh. "Open XML SDK roadmap". MSDN Blogs. Retrieved 2008-03-23.
  66. ^ Eric Lai (2008-03-12). "Microsoft releasing OOXML SDK".
  67. ^ Andrew Z (2008-03-19). "OpenOffice.org 3.0's new features, an early look".