HTML Support Limitations and Troubleshooting in Word Processing Document API
Word Processing Document API is not designed to fully support the HTML format. Expect HTML interpretation to be limited and optimized for documents rather than web pages. This topic describes these limitations, possible issues, and troubleshooting steps you may take when you process HTML documents. The same limitations apply to the DevExpress.XtraReports.UI.XRRichText component.
#Supported HTML Tags
RichEditDocumentServer converts HTML files into its internal document model. Some HTML tags and CSS attributes have no counterparts in Open XML and RTF formats. This means that HTML documents may lose some of their formatting if you use RichEditDocumentServer to process them.
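The conversion described above can be sketched as a simple load-and-save round trip. This is a minimal sketch, not a definitive implementation: the file names input.html and output.docx are placeholders chosen for illustration, and the project is assumed to reference the DevExpress Word Processing Document API assemblies.

```csharp
using DevExpress.XtraRichEdit;

public static class HtmlToDocxExample
{
    public static void Main()
    {
        using (var wordProcessor = new RichEditDocumentServer())
        {
            // Parse the HTML file into the internal document model.
            // "input.html" is a placeholder path (assumption).
            wordProcessor.LoadDocument("input.html", DocumentFormat.Html);

            // Save the model as Open XML (DOCX). Formatting that has no
            // Open XML counterpart may be lost at this point.
            // "output.docx" is a placeholder path (assumption).
            wordProcessor.SaveDocument("output.docx", DocumentFormat.OpenXml);
        }
    }
}
```

Opening the resulting file next to the source HTML in a browser is a quick way to see which formatting did not survive the conversion.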
The table below lists HTML tags that the RichEditDocumentServer supports. External links are processed for inline pictures and style sheets (CSS files). The id and class attributes are interpreted for all tags, including those not listed below.
Tag | Attributes | Notes |
---|---|---|
a | dir | |
b | dir | |
background-color | The tag specifies the Paragraph | |
base | | |
basefont | size, color, face, dir | |
big | dir | |
blockquote | dir | |
br | dir | |
center | dir | |
code | dir | |
del | cite, datetime | |
div | page-break-before, page-break-after, page-break-inside, background-color, border (CSS), dir | Only the always value is supported for the page-break-before property. |
em | dir | |
font | size, color, face, dir | |
h1-h6 | align, dir | Specify a paragraph’s Paragraph. |
head | | |
html | | |
hr | align, color, noshade, size, width | |
i | dir | |
ins | cite, datetime | |
img | align, src, height, width | If the align attribute is not specified, the image is treated as inline. |
li | type, value, dir | |
link | href, type, media, dir | |
meta | | |
ol | type, value, align, dir | |
p | align, dir, line-height | |
small | | |
span | | |
strike | dir | |
strong | dir | |
style | | |
sub | dir | |
sup | dir | |
table | align, bgcolor, border, bordercolor, cellpadding, cellspacing, dir, width | The dir attribute reorders table columns and applies the table indent (specified by the Table. |
td | align, bgcolor, bordercolor, colspan, height, nowrap, rowspan, text-align, valign, width | The align attribute is supported in Internet Explorer only. The bgcolor attribute specifies the Table The Rich |
th | any allowed | |
tr | align, bgcolor, bordercolor, height, text-align, valign | The align attribute is supported in Internet Explorer only. |
u | dir | |
ul | dir | |
#Unsupported HTML Tags
- <base> tag with href attribute;
- <div> tag with border, align, and float CSS attributes;
- <li> tag with list-style-image CSS attribute;
- <margin> tag;
- <script> tag;
- <tab> tag;
- <table> tag with cols attribute;
- <td> tag with bordercolor and nowrap attributes;
- <title> tag;
- !important declaration;
- word-wrap, break-word, and letter-spacing CSS properties;
- CSS3 shapes;
- <ul> tag with type attribute.
#Unsupported Document Elements
The following document elements are not exported to HTML format:
- Headers and footers
- OLE objects
#Mso-prefixed Styles
Microsoft Word and other word processing applications allow you to save HTML documents with Microsoft Office mso-prefixed styles. For example, these styles are applied when you select Web Page type in the Save As dialog in Microsoft Word. Word Processing Document API does not support these style types. To create a document compatible with DevExpress Word Processing Document API, save Microsoft Word documents in the Web Page, Filtered format.
#Media Queries
Word Processing Document API processes media queries as regular CSS styles. The HtmlDocumentImporterOptions.IgnoreMediaQueries option allows you to ignore media rules in HTML documents.
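As a hedged illustration of the option above, the following sketch turns on IgnoreMediaQueries before it loads an HTML file. It assumes that the importer options are reached through the Options.Import.Html property of RichEditDocumentServer; the file name input.html is a placeholder.

```csharp
using DevExpress.XtraRichEdit;

public static class MediaQueryImportExample
{
    public static void Main()
    {
        using (var wordProcessor = new RichEditDocumentServer())
        {
            // Skip @media rules instead of applying them as regular CSS styles.
            // The option must be set before the document is loaded.
            wordProcessor.Options.Import.Html.IgnoreMediaQueries = true;

            // "input.html" is a placeholder path (assumption).
            wordProcessor.LoadDocument("input.html", DocumentFormat.Html);
        }
    }
}
```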
#Default Font Parameters and Formatting
Word Processing Document API uses Times New Roman, size 12, as the default font in HTML documents.
The Normal style is exported as the default style. To change this behavior, set the HtmlDocumentExporterOptions.DefaultCharacterPropertiesExportToCss property to false and explicitly set the Document.DefaultCharacterProperties.ForeColor to Black.
wordProcessor.Options.Export.Html.DefaultCharacterPropertiesExportToCss = false;
wordProcessor.Document.DefaultCharacterProperties.ForeColor = System.Drawing.Color.Black;
You can also use this API to reduce the number of span tags in an HTML document.
#Troubleshooting
Try one of the following solutions if issues occur with the HTML document output or an exception is thrown:
- Make sure that the HTML document does not contain unsupported tags or elements (OLE objects, headers, and footers).
- Enable exceptions when debugging the project. Refer to the following article for more information: Obtain an Exception’s Call Stack. Note that RichEditDocumentServer may throw an exception when it loads a document with images. Refer to the following article for more information: Use Word Processing Document API to Load HTML Files or Export Documents to HTML
- Use global policy settings to spot, analyze, and prohibit unwanted download requests. Refer to the following article for more information: Suppress Control Requests to Download Data from External URLs
- Compare the export result with the output of Microsoft Word or another word processing application. If other word processors produce a different result (apart from the described limitations), please contact the DevExpress Support Team.
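The first check in the list above can be partly automated before a file reaches RichEditDocumentServer. The sketch below is a rough pre-flight scan and not part of the DevExpress API: the HtmlPreflight class and its FindUnsupportedTags method are hypothetical names, the scan covers only the tags this topic lists as unsupported regardless of attributes (margin, script, tab, title), and it relies on a regular expression rather than a real HTML parser, so it can report matches inside comments or text.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the DevExpress API.
public static class HtmlPreflight
{
    // Tags listed as unsupported regardless of their attributes.
    private static readonly string[] UnsupportedTags = { "margin", "script", "tab", "title" };

    // Returns the unsupported tag names that occur as opening tags in the markup.
    public static List<string> FindUnsupportedTags(string html)
    {
        var found = new List<string>();
        foreach (string tag in UnsupportedTags)
        {
            // Match "<tag" followed by whitespace, ">" or "/",
            // so that "<tab" does not match "<table".
            string pattern = "<" + tag + @"[\s>/]";
            if (Regex.IsMatch(html, pattern, RegexOptions.IgnoreCase))
            {
                found.Add(tag);
            }
        }
        return found;
    }
}
```

A call such as HtmlPreflight.FindUnsupportedTags(File.ReadAllText(path)) returns the names to review; an empty list does not guarantee a clean import, because attribute-level and CSS limitations are not checked.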