Tuesday, December 4, 2018

Parse MS Word text with nlapiXMLToPDF

Certain customizations require converting rich text field data into PDF with the nlapiXMLToPDF function. In some business scenarios, those rich text fields hold content that is copied and pasted directly from MS Word documents. However, the BFO PDF conversion library that nlapiXMLToPDF uses cannot parse MS Word formatted text. The BFO FAQ states this under the entry linked below:

                Can I convert Microsoft Office documents to PDF?

In contrast, the BFO library can parse properly-formatted HTML data.  A good way to convert MS Word data into HTML is to use the following web site:

                Convert Word To Clean HTML Documents (http://www.word2cleanhtml.com)

To use this web site, follow these steps:

1.  Copy and paste the MS Word content into the Home box.

2.  Press the Convert to Clean HTML button.  This generates HTML code without the MS-specific formatting.

3.  Click the Source Editing mode button at the top left of the rich text field and paste the generated HTML.

This generated HTML content can now be properly parsed by the BFO PDF library.
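
As a rough illustration of where the cleaned HTML ends up, the sketch below wraps the field value in a BFO report document and passes it to nlapiXMLToPDF. It is only a sketch under assumptions: the Suitelet function name, the record type salesorder, the field ID custbody_rich_text, and the two helper functions are hypothetical, and the list of MS Word markers is a heuristic, not an exhaustive check. It also assumes the cleaned HTML is well-formed XML.

```javascript
// Heuristic markers that MS Word typically leaves in pasted HTML (not exhaustive).
var WORD_MARKERS = [
  /class="?Mso/i,          // e.g. class="MsoNormal"
  /mso-[a-z-]+\s*:/i,      // e.g. style="mso-margin-top-alt:auto"
  /<\/?o:p\b/i,            // Office namespace tags such as <o:p>
  /<!--\[if\s[^\]]*mso/i   // conditional comments such as <!--[if gte mso 9]>
];

// Hypothetical helper: returns true if the HTML still looks like raw MS Word output.
function looksLikeWordMarkup(html) {
  for (var i = 0; i < WORD_MARKERS.length; i++) {
    if (WORD_MARKERS[i].test(html)) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: wraps clean HTML in the BFO report XML that nlapiXMLToPDF expects.
function buildPdfXml(html) {
  html = html || '';
  if (looksLikeWordMarkup(html)) {
    throw new Error('Field still contains MS Word markup; convert it to clean HTML first.');
  }
  return '<?xml version="1.0"?>\n' +
    '<!DOCTYPE pdf PUBLIC "-//big.faceless.org//report" "report-1.1.dtd">\n' +
    '<pdf>\n<body>\n' + html + '\n</body>\n</pdf>';
}

// Hypothetical Suitelet entry point (SuiteScript 1.0); record type and field ID are assumptions.
function renderRichTextPdf(request, response) {
  var rec = nlapiLoadRecord('salesorder', request.getParameter('id'));
  var html = rec.getFieldValue('custbody_rich_text');
  var file = nlapiXMLToPDF(buildPdfXml(html));
  response.setContentType('PDF', 'richtext.pdf');
  response.write(file.getValue());
}
```

Failing early on leftover MS Word markup is a design choice for this sketch: it gives a clearer message than a parse error from the PDF library, at the cost of possibly rejecting content that would have rendered anyway.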
