Saves the data at a URL as a Magpie table by copying it into a Magpie data source.
Syntax
save url "<url>" as [temp | temporary] table { <table spec> | <table name> } [in schema <schema reference>] [with infer schema] [with format <file format>] [with compression <compression>] [with quote char "<quote character>"] [with escape char "<escape character>"] [with null value "<null value>"] [with delimiter "<delimiter>"] [with encoding "<encoding>"] [with date format "<date format>"] [with timestamp format "<timestamp format>"] [with multiline] [with header] [with ignore leading white space] [with ignore trailing white space] [with data source <data source reference>] [with partitions <partition count>] [partition by "<partition columns>"] [with replace | with replace with delete]
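As a minimal sketch of the syntax above, the following command saves a CSV file that has a header row as a table and asks Magpie to infer the column types. The URL and the table name orders are hypothetical placeholders. The unquoted table name follows the syntax line literally and is an assumption about how names are written in practice.

```
save url "https://example.com/data/orders.csv" as table orders with infer schema with header
```

Because csv is the default file format, no with format clause is needed here.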
Parameters
table spec
JSON. A specification for saving the table. Note that fields are not used and persistence mapping is optional when saving a table from a URL.
table name
String. The name of the table to save the data as.
schema reference
String. The name of the schema to save the table in. Defaults to the current schema.
url
String. The URL of the data file to save as a table. Supported URL schemes: http, https, hdfs, file, s3a.
infer schema
None. If present, Magpie infers the schema by attempting to identify the data type of each field in the file. Only used for csv files.
file format
String. The format of the source file. Supported formats: text, parquet, csv, json, orc. Default is csv.
compression
String. If the file is compressed, the type of compression. Options: gzip, zip. Default is none.
quote character
String. The character optionally used to enclose fields within the file. Only used for csv files. Default is the double quote character (").
escape character
String. The character optionally used to escape quotation marks within a quoted field. Only used for csv files. Default is the double quote character (").
null value
String. Fields with this value are converted to null. Only used for csv files. By default, empty fields are converted to null.
delimiter
String. The character used to separate fields within the file. Only used for csv files. Default is a comma (,).
encoding
String. The encoding of the file. Only used for csv and json files. Default is UTF-8 for csv and newline-delimited json, and auto-detected for multi-line json.
date format
String. The Java date format used to identify fields as dates within the file. Only used for csv and json files. Default is yyyy-MM-dd.
timestamp format
String. The Java datetime format used to identify fields as timestamps within the file. Only used for csv and json files. Default is yyyy-MM-dd'T'HH:mm:ss.SSSXXX for json and yyyy-MM-dd HH:mm:ss for csv.
multiline
None. If present, a single record may span multiple lines. Only used for csv and json files.
header
None. If present, the first line of the file is used as the field names for the resulting table. Only used for csv files.
ignore leading white space
None. If present, leading white space is trimmed from each field. Only used for csv files.
ignore trailing white space
None. If present, trailing white space is trimmed from each field. Only used for csv files.
data source reference
String. The name of the data source to save the table with. Defaults to the default data source for the repository.
partition count
Integer. The exact number of partitions to save the resulting table with. Defaults to 1.
partition columns
String. A comma-separated list of column names to partition the resulting table by. Default is unpartitioned.
with replace [with delete]
None. If present and a table with the same name as the target table already exists, this command drops the existing table and replaces it with the new table. If with delete is also present, the underlying data for the existing table is deleted. If no such table exists, this option has no effect. By default, an error is thrown if a table with the same name as the target table exists.
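Example
The following sketch combines several of the options above, listed in the same order as the syntax line. It saves a gzip-compressed, pipe-delimited CSV file with a header row, treats the string NA as null, writes the table with 4 partitions, and replaces any existing table of the same name. The URL, the table name orders, and the schema name sales are hypothetical. The unquoted schema, format, and compression values follow the syntax line literally and are an assumption.

```
save url "https://example.com/exports/orders.csv.gz" as table orders in schema sales with infer schema with format csv with compression gzip with null value "NA" with delimiter "|" with header with partitions 4 with replace
```

Using with replace with delete in place of with replace would also delete the underlying data of the existing table.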