DOCX export fails for question text containing certain valid but problematic HTML structures · Issue #3527 · DMPRoadmap/roadmap · GitHub

Description

@don-stuckey

I have encountered the following issue for DMPs created from certain templates.

On the "Download" page of an affected DMP, select the "docx" format and download the plan. When the user then tries to open the exported .docx file on their local machine, the document does not open and the following error message is shown instead:

[Image: Word error message]

This appears to happen with DMPs created from templates with questions that have been saved in the database in a certain way.

The question fields in a template support rich-text formatting, and the value of a question field is saved in the database as a raw HTML string. It seems that if the value of the question field is of the following form:

(i) 'some text<ul><li>an item</li></ul>'

...and the user exports the DMP as a .docx file, then the aforementioned error message is displayed when the user attempts to open the exported .docx file.

Although (i) is valid HTML once it is wrapped in a parent block element, e.g. a <body></body> element, it nonetheless causes problems for the parser that generates the .docx file.

In contrast, the parser can handle the following (also valid) HTML without any problems:

(ii) '<p>some text</p><ul><li>an item</li></ul>'

Users don't appear to be able to produce HTML like (i) by typing content into a question field in a template; manual input typically results in HTML like (ii). However, they can still copy and paste content from a Word or PDF document, which can result in HTML like (i) being saved to the database. Given that copy-pasting from a Word or PDF document is a very common user action, roadmap should be able to handle question text stored in the database as HTML like (i).
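One possible way to handle values like (i) is to normalize the question text before it reaches the .docx generator, wrapping any bare top-level text in a <p> element so that it takes the shape of (ii). The sketch below illustrates that idea with the Python standard library only. It is not roadmap's code (roadmap is a Rails application, so a real fix would live in Ruby), and the name wrap_bare_text and the list of block-level tags are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Assumption: a deliberately small set of block-level tags; extend as needed.
BLOCK_TAGS = {"p", "ul", "ol", "table", "div", "blockquote", "pre",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class _BareTextWrapper(HTMLParser):
    """Collects output, wrapping top-level inline runs in <p>...</p>."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []          # finished output fragments
        self.run = []          # pending top-level text and inline markup
        self.block_depth = 0   # how many block elements we are inside

    def _emit(self, fragment):
        # Inside a block element, content passes through unchanged;
        # at the top level it is buffered until the run ends.
        (self.out if self.block_depth else self.run).append(fragment)

    def flush_run(self):
        text = "".join(self.run).strip()
        if text:
            self.out.append(f"<p>{text}</p>")
        self.run = []

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()
        if tag in BLOCK_TAGS:
            if self.block_depth == 0:
                self.flush_run()   # a block element ends the bare run
            self.block_depth += 1
            self.out.append(raw)
        else:
            self._emit(raw)

    def handle_startendtag(self, tag, attrs):
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self.block_depth:
            self.out.append(f"</{tag}>")
            self.block_depth -= 1
        else:
            self._emit(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")


def wrap_bare_text(html):
    """Return html with bare top-level text wrapped in <p> elements.

    Comments and declarations are dropped by this sketch.
    """
    parser = _BareTextWrapper()
    parser.feed(html)
    parser.close()
    parser.flush_run()  # wrap any trailing bare text
    return "".join(parser.out)
```

Under these assumptions, wrap_bare_text applied to (i) yields (ii), and applying it to (ii) leaves the value unchanged, so comparing a stored value with its normalized form could also serve as a quick check for affected questions.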

To reproduce the issue, you can carry out the following steps:

  1. Create a fresh local instance of roadmap, and seed the database.
  2. In the UI, create a template and add a phase, a section, and a question.
  3. Go into the database and set the text field of the question to a value of the form (i), e.g. with this command (adjusting the id to match your question): update questions set text = 'some text<ul></ul>' where id = 40;
  4. In the UI, create a DMP from the template and provide an answer to the question.
  5. Go to the Download page, select 'docx' for formatting, and then click on "Download plan" to export the DMP.
  6. Try to open the .docx file and you should see the Word error message referred to above.

Note that if you carry out the above steps but use (ii) instead of (i) in step 3, the exported .docx file should open without any problems.
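When repeating these steps, it may help to inspect the exported file without opening Word. A .docx file is a ZIP archive of XML parts, so a first check is whether each part parses as XML. The sketch below does that with the Python standard library; the function name find_malformed_xml_parts is hypothetical. Note the limits of this check: it only detects XML that is not well-formed, and Word may also reject a file whose XML is well-formed but structurally invalid, so an empty result does not prove the file will open. This issue does not establish which of the two is happening here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def find_malformed_xml_parts(source):
    """Return (part_name, error_message) pairs for XML parts that fail to parse.

    `source` is a filesystem path or the raw bytes of a .docx file.
    """
    if isinstance(source, (bytes, bytearray)):
        source = io.BytesIO(source)

    problems = []
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            # Only the XML parts (including relationship files) are checked.
            if not name.endswith((".xml", ".rels")):
                continue
            try:
                ET.fromstring(archive.read(name))
            except ET.ParseError as exc:
                problems.append((name, str(exc)))
    return problems
```

Running this on the exports produced with (i) and with (ii) in step 3 and comparing the results would show whether the two files differ at the level of XML well-formedness.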
