Convert Tables to HTML Activity
The Convert Tables to HTML activity enables the extraction and conversion of table data from PDF files into HTML format. This activity processes specified pages within a PDF, identifies table structures, and outputs the data as a clean HTML file.
| Field | Description | Requirement |
|---|---|---|
| Pdf Name | The reference name for the PDF file to be processed. | Required |
| Table Type | Specifies the structure of the table to be converted: BASIC, COMPLEX, or STRIPLESS. | Required |
| Start Page | The starting page number within the PDF document for table extraction. | Optional |
| End Page | The ending page number within the PDF document for table extraction. | Optional |
| Maximum Font Size | The maximum font size to consider during table detection. | Optional |
| Ignore Line Count | The number of initial lines from the start page to exclude from the HTML output. | Optional |
| Output Path | The full directory path and filename for the generated HTML file. | Required |
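
The field rules in the table above can be sketched as a small pre-flight check. This is only an illustration, not part of the activity itself: the `ConvertTablesConfig` class and `validate` function are hypothetical names, and the assumptions that page numbers start at 1, that End Page must not precede Start Page, and that Ignore Line Count cannot be negative are inferred from the field descriptions rather than stated by the product.

```python
from dataclasses import dataclass
from typing import List, Optional

# Table Type values listed in this document.
TABLE_TYPES = {"BASIC", "COMPLEX", "STRIPLESS"}


@dataclass
class ConvertTablesConfig:
    """Hypothetical container mirroring the activity's fields."""
    pdf_name: str                              # Required
    table_type: str                            # Required
    output_path: str                           # Required
    start_page: Optional[int] = None           # Optional
    end_page: Optional[int] = None             # Optional
    maximum_font_size: Optional[float] = None  # Optional
    ignore_line_count: Optional[int] = None    # Optional


def validate(cfg: ConvertTablesConfig) -> List[str]:
    """Return a list of problems; an empty list means the fields look consistent."""
    errors = []
    if not cfg.pdf_name:
        errors.append("Pdf Name is required")
    if cfg.table_type not in TABLE_TYPES:
        errors.append("Table Type must be one of: " + ", ".join(sorted(TABLE_TYPES)))
    if not cfg.output_path:
        errors.append("Output Path is required")
    # Assumption: pages are numbered from 1, as in the example Start Page: 1.
    if cfg.start_page is not None and cfg.start_page < 1:
        errors.append("Start Page must be 1 or greater")
    # Assumption: the page range must not be reversed.
    if (cfg.start_page is not None and cfg.end_page is not None
            and cfg.end_page < cfg.start_page):
        errors.append("End Page must not be less than Start Page")
    # Assumption: a negative number of ignored lines is meaningless.
    if cfg.ignore_line_count is not None and cfg.ignore_line_count < 0:
        errors.append("Ignore Line Count must not be negative")
    return errors
```

Calling `validate` with the values from the implementation example (Pdf Name `${RobustaPdf}`, Table Type BASIC, Start Page 1, End Page 1, and an Output Path) returns an empty list under these assumptions.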
Table Types & Examples
BASIC
- Format: String
- Example Result: "BASIC"
COMPLEX
- Format: String
- Example Result: "COMPLEX"
STRIPLESS
- Format: String
- Example Result: "STRIPLESS"
Implementation Examples
Field Setup
- Pdf Name: ${RobustaPdf}
- Table Type: BASIC
- Start Page: 1
- End Page: 1
- Output Path: C:\Robusta\robusta.html
Execution Parameters
- Pdf Name: ${RobustaPdf}
- Table Type: BASIC
- Start Page: 1
- End Page: 1
- Output Path: C:\Robusta\robusta.html
Technical Notes
Strikethrough lines may occur when the BASIC table type is selected. If this is not desired, change the table type to COMPLEX.