DOCX Input: Work around buggy versions of Microsoft Word that convert newlines in the document summary into _x000d_. These sequences are now ignored when reading metadata from DOCX files. Fixes #1321343 [DOCX input known characters metadata](https://bugs.launchpad.net/calibre/+bug/1321343)

Kovid Goyal 2014-05-20 22:19:25 +05:30
parent 245617c745
commit b37e932668


@@ -54,6 +54,7 @@ def read_doc_props(raw, mi):
     desc = XPath('//dc:description')(root)
     if desc:
         raw = etree.tostring(desc[0], method='text', encoding=unicode)
+        raw = raw.replace('_x000d_', '')  # Word 2007 mangles newlines in the summary
         mi.comments = raw
     langs = []
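
The effect of the added line can be illustrated with a minimal, standalone sketch. It is not the calibre code: it assumes Python 3 and the standard-library `xml.etree.ElementTree` instead of the lxml-based `XPath` helper used in the diff, and both the `read_description` function and the sample XML are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace used by the dc:description element.
DC_NS = {'dc': 'http://purl.org/dc/elements/1.1/'}

# Hypothetical sample of a DOCX core-properties part in which a newline in
# the summary is preceded by the literal text "_x000d_".
SAMPLE_CORE_XML = (
    '<cp:coreProperties '
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<dc:description>First line_x000d_\nSecond line</dc:description>'
    '</cp:coreProperties>'
)


def read_description(raw_xml):
    """Return the text of dc:description with '_x000d_' removed, or None."""
    root = ET.fromstring(raw_xml)
    desc = root.find('.//dc:description', DC_NS)
    if desc is None:
        return None
    # Collect all text inside the element, similar in spirit to
    # etree.tostring(..., method='text') in the diff.
    text = ''.join(desc.itertext())
    # Drop the escape sequence left behind by the buggy Word versions.
    return text.replace('_x000d_', '')


comments = read_description(SAMPLE_CORE_XML)
```

Only the literal `_x000d_` text is removed; the real newline that follows it in the sample is left in place, so the summary keeps its line breaks.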