
Office Open XML (OOXML)

Office Open XML (OOXML) is a file format that packages a set of XML-based files into a single ZIP-compressed file.
This ZIP packaging is a key characteristic of the OOXML format.

Common OOXML file formats include .xlsx, .docx, and .pptx.
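
Because every OOXML file is a ZIP package, a quick sanity check needs nothing beyond Python's standard library. The sketch below is only illustrative: `looks_like_ooxml` is a hypothetical helper name, and it checks only for the ZIP container and the `[Content_Types].xml` entry, not for full validity against the standard.

```python
import zipfile


def looks_like_ooxml(file) -> bool:
    """Return True if `file` (a path or a binary file object) is a ZIP
    archive that contains a [Content_Types].xml entry."""
    # Step 1: the container itself must be a ZIP archive.
    if not zipfile.is_zipfile(file):
        return False
    # Step 2: an OOXML package carries [Content_Types].xml at its root.
    with zipfile.ZipFile(file) as package:
        return "[Content_Types].xml" in package.namelist()
```

For example, `looks_like_ooxml("test.xlsx")` should return `True` for a regular workbook, while a plain text file renamed to `.xlsx` should return `False`.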

Standardization

  • ECMA-376[1]
  • ISO/IEC 29500[2]

File Structure: unzip xlsx

An .xlsx file is a ZIP archive, so you can extract it with any standard unzip tool.

unzip test.xlsx -d extracted
extracted/
├── _rels/
│   └── .rels
├── docProps/
│   └── core.xml
├── [Content_Types].xml
└── xl/
    ├── _rels/
    │   └── workbook.xml.rels
    ├── drawings/
    │   └── workbook.xml.rels
    ├── theme/
    ├── worksheets/
    ├── sharedStrings.xml
    ├── styles.xml
    ├── workbook.xml
    ...
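
The same inspection can be done without extracting anything to disk. This is a minimal sketch using Python's standard `zipfile` module; `list_package_entries` and `read_part` are hypothetical helper names chosen for illustration.

```python
import zipfile


def list_package_entries(file) -> list[str]:
    """Return the sorted entry names stored inside an OOXML package."""
    with zipfile.ZipFile(file) as package:
        return sorted(package.namelist())


def read_part(file, name: str) -> bytes:
    """Return the raw bytes of one entry, e.g. 'xl/workbook.xml'."""
    with zipfile.ZipFile(file) as package:
        return package.read(name)
```

Calling `list_package_entries("test.xlsx")` should give the same paths as the tree above, and `read_part("test.xlsx", "xl/workbook.xml")` returns the workbook XML as bytes.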

The xl/ directory contains the actual spreadsheet data.
_rels/, [Content_Types].xml, and docProps/ are package-level components defined by the OOXML standard, so they also appear in .docx and .pptx files.

  • _rels/: describes relationships between files.
  • [Content_Types].xml: declares the content type (MIME type) of each file in the package.
  • docProps/: stores document properties (metadata), such as the author and creation time.
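
To make these components concrete, here is a rough sketch that reads `[Content_Types].xml` and the package-level `_rels/.rels` with the standard library. The function names are hypothetical, and prefixing default extensions with a dot in the returned dictionary is just a convention chosen for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the packaging layer of OOXML.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def read_content_types(file) -> dict[str, str]:
    """Map each default extension (e.g. '.xml') or overridden part name
    (e.g. '/xl/workbook.xml') to its declared content type."""
    with zipfile.ZipFile(file) as package:
        root = ET.fromstring(package.read("[Content_Types].xml"))
    types = {}
    # <Default> applies to every file with a given extension.
    for default in root.findall(f"{CT_NS}Default"):
        types["." + default.get("Extension")] = default.get("ContentType")
    # <Override> applies to one specific part.
    for override in root.findall(f"{CT_NS}Override"):
        types[override.get("PartName")] = override.get("ContentType")
    return types


def read_root_relationships(file) -> list[tuple[str, str]]:
    """Return (relationship type, target) pairs from _rels/.rels."""
    with zipfile.ZipFile(file) as package:
        root = ET.fromstring(package.read("_rels/.rels"))
    return [
        (rel.get("Type"), rel.get("Target"))
        for rel in root.findall(f"{REL_NS}Relationship")
    ]
```

In a typical .xlsx file, one of the root relationships points at `xl/workbook.xml`, which is how a reader locates the main workbook part from the package root.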

References

  1. ECMA-376
  2. ISO/IEC 29500
