Through its generic data source type, Magpie can access any type of data source that the underlying Spark environment supports. This gives you more flexibility to integrate data from sources that Magpie does not yet fully support, but it may require a more detailed understanding of how sources of that type are configured.
Generic sources have a few key attributes that must be configured when they are created. Please view the Data Source JSON specification for more details. These attributes are set either during the initial creation of the source or afterward through an ALTER statement that uses a JSON specification to set the values. An example of such a command is shown below.
Once a generic source has been created, the tables within it can be used as the underlying storage for tables within Magpie. Note that any options specified when creating a data source are combined with the options present in the Create Table command, so settings shared by several tables need to be written only once. If an option with the same key is specified in both locations, the value in the table command overrides the value in the data source specification. Also note that not all generic sources support tables; some support only streams.
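The override rule described above can be illustrated with a minimal Python sketch. This is not Magpie's implementation. The function name `merge_options` and the option keys and values are hypothetical, chosen only to show how a table-level option takes precedence over a source-level option with the same key.

```python
def merge_options(source_options, table_options):
    """Combine data source options with Create Table options.

    When a key appears in both, the value from table_options is used.
    Neither input dictionary is modified.
    """
    merged = dict(source_options)  # start from the source-level options
    merged.update(table_options)   # table-level values win on key collisions
    return merged


# Hypothetical option names and values, for illustration only.
source_options = {
    "url": "jdbc:postgresql://db.example.com/sales",
    "fetchsize": "1000",
}
table_options = {
    "dbtable": "public.orders",
    "fetchsize": "5000",  # same key as in the source: this value overrides
}

effective = merge_options(source_options, table_options)
print(effective)
```

In this sketch, `url` is inherited from the source, `dbtable` comes only from the table, and `fetchsize` takes the table's value because it is specified in both places.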
The following is an example of creating a Magpie table that references an underlying generic source table:
This will result in a table within the Magpie Context that in turn references the source table in the generic source. When this table is queried within Magpie, Magpie will "reach" into the source and pull the data before combining it with other local data.
To access this type of data source, you may need to adjust the security configuration of your cloud environment. Please reach out to a member of the Silectis team with any questions or support requests.