Extracting text/HTML out of Word (.docx) files
November 3, 2018
We are going to extract HTML from a Word (.docx) file.
.docx is an Office Open XML (OOXML) file, not an OpenDocument (ODF) one. It is a ZIP archive containing several XML documents; the main body text lives in word/document.xml.
By unzipping the file and locating the appropriate XML file, we can process the data and generate HTML.
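The steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a full converter: it assumes the body text is at word/document.xml (the conventional location), it maps each non-empty Word paragraph to a plain <p> element, and it ignores headings, tables, images and inline formatting. The function name docx_to_html is made up for this example.

```python
import html
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_html(path):
    """Return simple HTML with one <p> per non-empty Word paragraph.

    `path` may be a filename or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    # A .docx file is a ZIP archive; read the main document part from it.
    with zipfile.ZipFile(path) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # <w:p> is a paragraph; its visible text is split across <w:t> runs.
    for para in root.iter(W_NS + "p"):
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text:
            # Escape &, < and > so the text is safe to embed in HTML.
            paragraphs.append("<p>" + html.escape(text) + "</p>")
    return "\n".join(paragraphs)
```

Calling docx_to_html("report.docx") (the filename is a placeholder) returns the paragraphs as a string of <p> elements, which can then be wrapped in whatever page template is needed.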