|Tool Name||Excel File (XLSX)|
|Tool Web Site||https://tools.ietf.org/html/rfc4180|
|Supported Methodology||[File System] Data Store (Physical Data Model) via XLSX File|
Import tool: ISO Excel File (XLSX) N/A (https://tools.ietf.org/html/rfc4180)
Import interface: [File System] Data Store (Physical Data Model) via XLSX File from Excel File (XLSX)
Import bridge: 'ExcelFile' 10.1.0
This bridge detects (reverse engineers) the metadata from a data file in the Excel XML format (XLSX).
Such Excel files are detected by their .XLSX file extension.
The bridge can detect a header row and use it to create the field names; otherwise, generic field names are created.
The bridge samples up to 1000 rows to detect the file data types, such as DATE, NUMBER, STRING.
If an Excel file has multiple sheets, each sheet is imported as the equivalent of a file/table with the same sheet name.
The bridge uses the machine's locale to read files and allows you to specify the character set encoding that the files use.
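The header and data type detection described above can be illustrated with a minimal Python sketch. It is not the bridge's actual algorithm, which is not documented here. The function names (`classify`, `infer_type`, `describe_sheet`), the generic `Field1`, `Field2`, ... naming scheme, and the rule that mixed or empty columns fall back to STRING are all assumptions made for illustration. Rows are assumed to have already been read from a sheet into lists of Python values.

```python
from datetime import date, datetime


def classify(value):
    """Map one cell value to DATE, NUMBER, or STRING (assumed rules)."""
    if isinstance(value, bool):
        return "STRING"  # assumption: booleans are not treated as numbers
    if isinstance(value, (int, float)):
        return "NUMBER"
    if isinstance(value, (date, datetime)):
        return "DATE"
    return "STRING"


def infer_type(values):
    """Pick a single type for a column from its sampled, non-empty cells."""
    kinds = {classify(v) for v in values if v is not None and v != ""}
    if len(kinds) == 1:
        return kinds.pop()
    return "STRING"  # assumption: empty or mixed columns fall back to STRING


def describe_sheet(rows, header_row=1, max_rows=1000):
    """Return {field name: type} for one sheet.

    header_row is 1-based; pass None when the sheet has no header,
    in which case hypothetical generic names (Field1, Field2, ...) are used.
    At most max_rows data rows are sampled.
    """
    rows = list(rows)
    if header_row is None:
        data = rows[:max_rows]
        width = max((len(r) for r in data), default=0)
        names = [f"Field{i + 1}" for i in range(width)]
    else:
        names = [str(cell) for cell in rows[header_row - 1]]
        data = rows[header_row:header_row + max_rows]
    return {
        name: infer_type([r[i] for r in data if i < len(r)])
        for i, name in enumerate(names)
    }
```

Lowering `max_rows` trades accuracy for speed: a column whose first sampled rows are all numeric is reported as NUMBER even if a later, unsampled row contains text.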
|File||Path to file to import||FILE||*.XLSX||Mandatory|
|Header row number||
The row number that contains the file header describing its field names.
When empty, the header is assumed to be in the first row.
|Number of rows to read||
The maximum number of rows to sample from files. The rows are used to identify file format details, like field data types.
When empty, the number of rows is assumed to be 1000.
|Miscellaneous||Specify miscellaneous options identified with a -letter and value.
For example, -m 4G -f 100 -j -Dname=value -Xms1G
-m the maximum Java memory size, as a whole number (e.g. -m 4G or -m 2500M).
-v set environment variable(s) (e.g. -v var1=value -v var2="value with spaces").
-j the last option that is followed by Java command line options (e.g. -j -Dname=value -Xms1G).
-hadoop key1=val1;key2=val2 to manually set Hadoop configuration options.
-tps 10 maximum thread pool size.
-tl 3600s processing time limit, in seconds (s), minutes (m), or hours (h).
-fl 1000 limit on the number of files processed.
-delimited.top_rows_skip 1 number of rows to skip while processing CSV files.
-delimited.extra_separators ~,||,|~ comma-separated extra delimiters, each of which will be used while processing CSV files.
-delimited.no_header by default, the bridge automatically tries to detect headers while processing CSV files (based on the header column types); use this option to disable header import (e.g. to hide sensitive data).
-fresh.partition.models - use to import the latest modified files when processing partitions defined in the Partitioned directories parameter.
-subst K: C:/test - use to associate a root path part with a drive or another path.
-skip.download - use to disable downloading of dependencies and use only the download cache.
-prescript [cmd] - runs a script command before bridge execution. Example: -prescript "script.bat"
The script must be located in the bin directory, and have a .bat or .sh extension.
The script path must not include any parent directory symbol (..).
The script should return exit code 0 to indicate success, or another value to indicate failure.
-disable.partitions.autodetection - use this option to disable automatic partition detection (when the "Partition directories" option is empty).
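The constraints on the -tl and -prescript values can be checked before a bridge run. The following Python sketch is only an illustration of the rules stated above, not part of the bridge. The helper names `parse_time_limit` and `is_valid_prescript` are hypothetical, and the assumptions that the time limit is a whole number followed by a single unit letter and that the extension check is case-insensitive are ours.

```python
import re

# Seconds per unit letter accepted by the -tl option (s, m, h).
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_time_limit(text):
    """Convert a -tl value such as '3600s', '60m', or '1h' to seconds.

    Assumption: the value is a whole number followed by one unit letter.
    """
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported time limit: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def is_valid_prescript(path):
    """Check a -prescript value against the documented rules.

    The path must not contain a parent directory symbol (..) and must
    end in .bat or .sh. Whether the file really exists in the bin
    directory is not checked here.
    """
    parts = path.replace("\\", "/").split("/")
    if ".." in parts:
        return False
    return path.lower().endswith((".bat", ".sh"))
```

For example, `parse_time_limit("60m")` and `parse_time_limit("1h")` both yield 3600 seconds, and `is_valid_prescript("../script.sh")` is rejected because of the parent directory symbol.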
Mapping information is not available