A persistence mapping defines the underlying data for a Magpie table. Usually, this is a location on a Magpie data source (e.g., a file or files in a filesystem or a table in a database), but sometimes this can be a virtual mapping (such as a SQL query executed by Magpie). There are many types of persistence mappings, and each has different required fields based on the mappingType
field.
Distributed File System Mapping (e.g., Amazon S3, Google Cloud Storage, Azure Storage)
Specification
{ "sourceName": "<string>", "file": <file spec>, "mapping": [ { "sourceField": "<string>", "targetField": "<string>" } ], "isWritable": boolean, "mappingType": "FileSystem" }
Structure Values
Field Name | Type | Description | Required | Default |
---|---|---|---|---|
sourceName | String | Name of the Magpie HDFS data source containing the data. | Required | |
file | An object which describes the details of the underlying file or folder on the filesystem. | Required | ||
mapping | Array of | List of mappings from source field name to target field name. Default is to use the source field names in Magpie. | Optional | |
isWritable | Boolean | Whether Magpie is allowed to write to or delete the underlying filesystem files. | Optional |
|
mappingType | String | Type of persistence mapping, | Required |
Delimited Text File Example
{ "sourceName": "my_s3_source", "file": { "path": "/my_data", "fileName": "transactions", "delimiter": "\t", "compression": "Gzip", "header": true, "multiLine": false, "ignoreLeadingWhiteSpace": false, "ignoreTrailingWhiteSpace": false, "quoteCharacter": "\"", "escapeCharacter": "\"", "dateFormat": "yyyy-MM-dd", "timestampFormat": "yyyy-MM-dd'T'HH:mm:ss.SSSXXX", "format": "DelimitedText" }, "mapping": [ { "sourceField": "uid", "targetField": "uid" }, { "sourceField": "price", "targetField": "invoice_val" }, "isWritable": false, "mappingType": "FileSystem" }
JDBC Mapping (e.g., most relational databases)
Specification
{ "sourceName": "<string>", "expression": <database expression spec>, "fetchSize": <integer>, "mapping": [ { "sourceField": "<string>", "targetField": "<string>" } ], "isWritable": <boolean>, "mappingType": "jdbc" }
Structure Values
Field Name | Type | Description | Required | Default |
---|---|---|---|---|
sourceName | String | Name of the Magpie database data source containing the data. | Required | |
expression | Defines the location of the data in the database, either a table or a SQL query. | Required | ||
fetchSize | Integer | The number of rows to fetch at once from a JDBC connection when reading from the table. | Optional | |
mapping | Array of | List of mappings from source to target fields. Default is to use the source field names in Magpie. | Optional | |
isWritable | Boolean | Whether Magpie is allowed to write to or delete the underlying table. Only valid for mappings with table expressions. | Optional |
|
mappingType | String | Type of persistence mapping, | Required |
Example
{ "sourceName": "my_rdbms", "expression": { "schemaName": "dbo", "tableName": "transactions" }, "fetchSize": 50000, "isWritable": false, "mappingType": "Jdbc" }
PostgreSQL Bulk Mapping
Specification
{ "sourceName": "<string>", "expression": <database expression spec>, "mapping": [ { "sourceField": "<string>", "targetField": "<string>" } ], "isWritable": <boolean>, "mappingType": "PostgresBulk" }
Structure Values
Field Name | Type | Description | Required | Default |
---|---|---|---|---|
sourceName | String | Name of the Magpie Postgres data source containing the data. | Required | |
expression | Defines the location of the data in the database, either a table or a SQL query. | Required | ||
mapping | Array of | List of mappings from source to target fields. Default is to use the source field names in Magpie. | Optional | |
isWritable | Boolean | Whether Magpie is allowed to write to or delete the underlying table. Only valid for mappings with table expressions. | Optional |
|
mappingType | String | Type of persistence mapping, | Required |
Example
{ "sourceName": "my_rdbms", "expression": { "schemaName": "public", "tableName": "transactions" }, "isWritable": false, "mappingType": "PostgresBulk" }
Amazon Redshift Bulk Access Mapping
Specification
{ "sourceName": "<string>", "expression": <database expression spec>, "mapping": [ { "sourceField": "<string>", "targetField": "<string>" } ], "stagingFile": <staging file spec>, "isWritable": <boolean>, "mappingType": "RedshiftBulk" }
Structure Values
Field Name | Type | Description | Required | Default |
---|---|---|---|---|
sourceName | String | Name of the Magpie Redshift data source containing the data. | Required | |
expression | Defines the location of the data in the database, either a table or a SQL query. | Required | ||
mapping | Array of | List of mappings from source to target fields. Default is to use the source field names in Magpie. | Optional | |
stagingFile | A specification of a location to use to stage reads from and writes to Redshift. | Optional | ||
isWritable | Boolean | Whether Magpie is allowed to write to or delete the underlying table. Only valid for mappings with table expressions. | Optional |
|
mappingType | String | Type of persistence mapping, | Required |
Example
{ "sourceName": "my_redshift_db", "expression": { "schemaName": "bi", "tableName": "transactions" }, "stagingFile": { "sourceName": "my_temporary_s3_dump", "path": "/staging", "fileName": "/transactions_temp" }, "mappingType": "RedshiftBulk" }
Snowflake Bulk Mapping
Specification
{ "sourceName": "<string>", "expression": <database expression spec>, "mapping": [ { "sourceField": "<string>", "targetField": "<string>" } ], "isWritable": <boolean>, "mappingType": "SnowflakeBulk" }
Structure Values
Field Name | Type | Description | Required | Default |
---|---|---|---|---|
sourceName | String | Name of the Magpie Snowflake data source containing the data. | Required | |
expression | Defines the location of the data in the database, either a table or a SQL query. | Required | ||
mapping | Array of | List of mappings from source to target fields. Default is to use the source field names in Magpie. | Optional | |
isWritable | Boolean | Whether Magpie is allowed to write to or delete the underlying table. Only valid for mappings with table expressions. | Optional |
|
mappingType | String | Type of persistence mapping, | Required |
Example
{ "sourceName": "my_snowflake_source", "expression": { "schemaName": "public", "tableName": "transactions" }, "isWritable": false, "mappingType": "SnowflakeBulk" }
Google BigQuery Mapping
Specification
{ "sourceName": "<string>", "expression": <bigquery expression spec>, "mapping": [ { "sourceField": "<string>", "targetField": "<string>" } ], "stagingFile": <staging file spec>, “isWritable”: <boolean>, "mappingType": "BigQuery" }
Structure Values
Field Name | Type | Description | Required | Default |
---|---|---|---|---|
sourceName | String | Name of the Magpie Redshift data source containing the data. | Required | |
expression | Defines the location of the data in the bigquery project via a dataset ID and a table ID. | Required | ||
mapping | Array of | List of mappings from source to target fields. Default is to use the source field names in Magpie. | Optional | |
stagingFile | A specification of a location to use to stage writes to BigQuery. | Optional | ||
isWritable | Boolean | Whether Magpie is allowed to write to or delete the underlying table. | Optional | true |
mappingType | String | Type of persistence mapping, | Required |
Example
{ "sourceName": "my_bigquery_source", "expression": { "datasetId": "sample-bq-dataset", "tableId": "sample-bq-table" }, "stagingFile": { "sourceName": "my_temporary_gs_dump", "path": "staging", "fileName": "temp_file" }, “isWritable”: true, "mappingType": "BigQuery" }
Kafka Subscription Mapping
Specification
{ "sourceName": String, "subscription": <subscription spec>, "keyType": "<string>", "valueType": "<string>", "startingOffsets": <offsets spec>, "endingOffsets": <offsets spec>, "isWritable": <boolean>, "mappingType": "Kafka" }
Structure Values
Field Name | Type | Description | Required | Default |
---|---|---|---|---|
sourceName | String | Name of the Magpie Kafka data source containing the data. | Required | |
subscription | Defines which data Magpie should include in the Kafka query. | Required | ||
keyType | String | The data type of keys in the loaded topic or topics. Supported value types:
| Optional |
|
valueType | String | The data type of message values in the loaded topic or topics. Supported value types:
| Optional |
|
startingOffsets | The start point for loading messages from the configured topic or topics. Must be either | Optional |
| |
endingOffsets | The end point for loading messages from the configured topic or topics. Must be either | Optional |
| |
isWritable | Boolean | Whether Magpie is allowed to write to the underlying topics. | Optional |
|
mappingType | String | Type of persistence mapping, | Required |
Example
{ "sourceName": "kafka_source", "subscription": { "subscriptionType": "Topic", "topic": "topicA" }, "keyType": "String", "valueType": "String", "startingOffsets": "Earliest", "endingOffsets": "Latest", "isWritable": true, "mappingType": "Kafka" }
Generic Spark Mapping
Specification
{ "sourceName": "<string>", "path": "<string>", "mapping": [ { "sourceField": "<string>", "targetField": "<string>" } ], "options": <options spec>, "isWritable": <boolean>, "mappingType": "Generic" }
Structure Values
Field Name | Type | Description | Required | Default |
---|---|---|---|---|
sourceName | String | Name of data source containing table. Data source object must exist. | Required | |
path | String | The | Required | |
mapping | Array of | List of mappings from source to target fields. | Optional | None |
options | Object | An object of string key-value pairs to supply to Spark as data frame read or write options. This will be merged with the data source options set, with values here overriding duplicate values specified on the data source. | Required | |
isWritable | Boolean | Whether Magpie is allowed to write to the underlying source. | Optional |
|
mappingType | String | Type of data source, | Required |
Example
{ "sourceName": "my_elastic_source", "path": "media/movies", "options": { "es.read.field.as.array.include": "actor, genre" }, "mappingType": "Generic" }
SQL Query
Specification
{ "sql": "<string>", "mappingType": "Sql" }
Structure Values
Field Name | Type | Description | Required | Default |
---|---|---|---|---|
sql | String | SQL query string to execute in Magpie. | Required | |
mappingType | String | Type of persistence mapping, | Required |
Example
{ "sql": "SELECT product, sum(revenue) AS sales FROM transactions", "mappingType": "Sql" }