html-to-xlsx
recipe generates Excel xslx files from HTML tables. This isn't a full HTML -> Excel conversion but a rather pragmatic and fast way to create Excel files from jsreport. The recipe reads input table and extract a couple of CSS style properties using a specific HTML engine (which defaults to chrome), and finally uses the styles to create the Excel cells.
The following CSS properties are supported:
background-color
- cell background colorcolor
- cell foreground colorborder
- all the border-[left|right|top|bottom]-width
, border-[left|right|top|bottom]-style
, border-[left|right|top|bottom]-color
will be transformed into Excel cells borders.text-align
- text horizontal align in the Excel cellvertical-align
- vertical align in the Excel cellwidth
- the Excel column will get the highest width; it can be little bit inaccurate because of pixel to Excel points conversionheight
- the Excel row will get the highest heightfont-family
- font family, defaults to Calibri
font-size
- font size, defaults to 16px
font-style
- normal
, and italic
styles are supportedfont-weight
- control whether the cell's text should be bold or nottext-decoration
- underline
and line-through
are supportedoverflow
- the Excel cell will have text wrap enabled if this is set to scroll.The following HTML attributes are supported:
colspan
- numeric value that merges current column with columns to the rightrowspan
- numeric value that merges current row with rows below.htmlEngine
- String
(supported values here depends on the HTML engines that you have available in your jsreport installation, by default just chrome
is available but you can additionally install better performing cheerio as HTML engine too)waitForJS
- Boolean
whether to wait for the JavaScript trigger to be enabled before trying to read the HTML tables on the page or not. defaults to false
.insertToXlsxTemplate
- Boolean
controls if the result of the HTML to Excel tables conversion should be added as new sheets of existing xlsx template, it needs you to set an xlsx template to work. Default is false
.Each table detected on the HTML source is converted to a new sheet in the final xlsx file. The sheets names are by default Sheet1
, Sheet2
etc. However, you can specify a custom sheet name using the name
or data-sheet-name
attribute on the table
element where the data-sheet-name
has precedence.
<table name="Data1">
<tr>
<td>1</td>
</tr>
</table>
<table data-sheet-name="Data2">
<tr>
<td>2</td>
</tr>
</table>
To produce a cell with specific data type you need to use the data-cell-type
on the td
element. The supported data types are number
, boolean
, date
, datetime
and formula
:
<table>
<tr>
<td data-cell-type="number">10</td>
<td data-cell-type="boolean" style="width: 85px">1</td>
<td data-cell-type="date">2019-01-22</td>
<td data-cell-type="datetime">2019-01-22T17:31:36.000-05:00</td>
</tr>
</table>
Excel supports setting cell string format. Add the following attributes to the td
element:
data-cell-format-str
-> Specify the raw string formatdata-cell-format-enum
-> Select an existing formatPossible values of the data-cell-format-enum
are:
0
-> format equal to general
1
-> format equal to 0
2
-> format equal to 0.00
3
-> format equal to #,##0
4
-> format equal to #,##0.00
9
-> format equal to 0%
10
-> format equal to 0.00%
11
-> format equal to 0.00e+00
12
-> format equal to # ?/?
13
-> format equal to # ??/??
14
-> format equal to mm-dd-yy
15
-> format equal to d-mmm-yy
16
-> format equal to d-mmm
17
-> format equal to mmm-yy
18
-> format equal to h:mm am/pm
19
-> format equal to h:mm:ss am/pm
20
-> format equal to h:mm
21
-> format equal to h:mm:ss
22
-> format equal to m/d/yy h:mm
37
-> format equal to #,##0 ;(#,##0)
38
-> format equal to #,##0 ;[red](#,##0)
39
-> format equal to #,##0.00;(#,##0.00)
40
-> format equal to #,##0.00;[red](#,##0.00)
41
-> format equal to _(* #,##0_);_(* (#,##0);_(* "-"_);_(@_)
42
-> format equal to _("$"* #,##0_);_("$* (#,##0);_("$"* "-"_);_(@_)
43
-> format equal to _(* #,##0.00_);_(* (#,##0.00);_(* "-"??_);_(@_)
44
-> format equal to _("$"* #,##0.00_);_("$"* (#,##0.00);_("$"* "-"??_);_(@_)
45
-> format equal to mm:ss
46
-> format equal to [h]:mm:ss
47
-> format equal to mmss.0
48
-> format equal to ##0.0e+0
49
-> format equal to @
<style>
td {
width: 60px;
padding: 5px;
}
</style>
<table>
<tr>
<td data-cell-type="number" data-cell-format-str="0.00">10</td>
<td data-cell-type="number" data-cell-format-enum="3">100000</td>
<td data-cell-type="date" data-cell-format-str="m/d/yyy">2019-01-22</td>
</tr>
</table>
Setting the format is also required when the cell needs to have a specific format category which depends on the computer locale. The cell is otherwise categorized by Excel as General
.
For example, using data-cell-type="date"
makes the cell a date and you can use it in the date-based calculations. However, the cell format category in Excel is displayed as General
and not Date
. To rectify this, you need to use data-cell-format-str
to match your locale.
A formula cell can be specified using data-cell-type="formula"
on the td
element.
<table>
<tr>
<td data-cell-type="number">10</td>
<td data-cell-type="number">10</td>
<td data-cell-type="formula">=SUM(A1, B1)</td>
</tr>
</table>
You can use the following CSS styles to change the default font-family for all cells in table.
td {
font-family: 'Verdana';
font-size: 18px;
}
You can use the following flow to produce a complex xlsx with charts or pivot tables.
html-to-xlsx
template<table>
but with name
attribute matching the existing sheet with dataSee the example Chart with html-to-xlsx recipe
In this case, the data source for the chart is dynamic and needs to use a named formula. To define such a formula, use "Name Manager" in the "Formulas" ribbon.
=OFFSET(Data!$A$2;;;COUNTIF(Data!$A$2:$A$999999;"<>"))
This formula defines the range of non-empty cells starting A2
in column A
.
To apply this named formula, you click on the chart's series and in the Excel formula box update the SERIES
function
=SERIES("Sales";book1.xlsx!ChartMonths;book1.xlsx!ChartValues;1)
You may need to postpone conversion of tables until some JavaScript async
tasks are processed. If this is the case; set htmlToXlsx.waitForJS = true
in the API options or Wait for conversion trigger
in the studio menu. When set, the conversion won't start until you set window.JSREPORT_READY_TO_START = true
inside your template's JavaScript.
...
<script>
// do some calculations or something async
setTimeout(function() {
window.JSREPORT_READY_TO_START = true; //this will start the conversion and read the existing tables on the page
}, 500);
...
</script>
When using phantomjs
as the engine there are cases when a row height ends with a larger height than the actual content. This is caused by a phantomjs
bug that retrieves a larger height when the content of cells contains white space characters.
There are two possible workarounds:
letter-spacing
CSS property with some negative value (demo)<!-- without "letter-spacing" the row would be more larger -->
<table style="letter-spacing: -4px">
<tr>
<td> From Date: N/A</td>
<td> To Date: N/A </td>
<td> Search Text: N/A </td>
<td> Sort Order: N/A </td>
<td> Sort Key: N/A </td>
<td> Filter: N/A </td>
</tr>
</table>
line-height: 0
with a specific height
(demo)<!-- without "line-height" and "height" the row would be more larger -->
<table style="line-height: 0">
<tr style="height: 20px">
<td> From Date: N/A</td>
<td> To Date: N/A </td>
<td> Search Text: N/A </td>
<td> Sort Order: N/A </td>
<td> Sort Key: N/A </td>
<td> Filter: N/A </td>
</tr>
</table>
The chrome engine can have performance problems when evaluating huge tables with many cells. For these cases the recipe provides an additional helper which splits large tables into chunks and runs evaluation in batches. Usage is like each
or jsrender for
handlebar helpers.
<table>
{{#htmlToXlsxEachRows people}}
<tr>
<td>{{name}}</td>
<td>{{address}}</td>
</tr>
{{/htmlToXlsxEachRows}}
</table>
Although the htmlToXlsxEachRows
helper prevents Chrome from hanging, the rendering can still be slow. This is because Chrome needs to create DOM elements for the whole table and evaluate every single cell. Fortunately, there is a better option for large tables – using the custom HTML engine cheerio-page-eval.
This custom engine is experimental and requires manual installation through NPM.
npm i cheerio-page-eval
restart jsreport
Afterward, you can select it in the studio HTML to xlsx menu and start using it. This engine doesn't create DOM representation like Chrome, so it has much better performance. However, the lack of DOM also introduces some limitations.
<style>
tag. You need to use in-line styles on cells.<script>
tags. The helpers and templating engines aren't limited.htmlToXlsxEachRows
helper also works with the cheerio engine and can significantly improve rendering memory footprint on long tables.
See general documentation for office preview in studio here.
The html-to-xlsx
will be always limited and you may miss some features that aren't yet implemented in it. In this case you can use xlsx recipe and postprocess the html-to-xlsx
and modify what you need using low level xlsx
helpers.
You can specify the template the standard way by using name
or shortid
, or alternatively you can also send it in the API request. If you have the Excel template stored as an asset you can also reference it in the request.
{
"template": {
"recipe": "html-to-xlsx",
"engine": "handlebars",
"content": "<table></table>",
"htmlToXlsx": {
"templateAssetShortid": "xxxx"
}
},
"data": {}
}
If you don't have the xlsx template stored as an asset you can send it directly in the API request.
{
"template": {
"recipe": "html-to-xlsx",
"engine": "handlebars",
"content": "<table></table>",
"htmlToXlsx": {
"templateAsset": {
"content": "base64 encoded word file",
"encoding":"base64"
}
}
},
"data": {}
}