Magpie presents a notebook-style interface that allows users to create shareable records of their work. Notebooks can be organized into folders and shared with other users by setting the appropriate permissions.
Creating and Organizing Notebooks
To create a new notebook, log in and either click the "Create new note" link on the Magpie home page or choose "Create new note" from the Notebook dropdown in the nav bar.
To place a notebook in a folder, prefix the notebook name with the folder name and a slash. For example, naming a notebook "Reports/Weekly Summary" places "Weekly Summary" in the "Reports" folder.
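The slash convention can be illustrated with a small Python sketch. This is not Magpie code: the function name split_notebook_name is hypothetical, and treating several slashes as nested folders is an assumption, since the text above only describes a single folder name.

```python
from typing import List, Tuple


def split_notebook_name(full_name: str) -> Tuple[List[str], str]:
    """Split a name such as 'Folder/Notebook' into (folders, notebook).

    Hypothetical helper for illustration only; nested folders are assumed.
    """
    # Drop empty segments so stray or doubled slashes are ignored.
    parts = [part.strip() for part in full_name.split("/") if part.strip()]
    if not parts:
        raise ValueError("notebook name must not be empty")
    # Everything before the last segment is treated as the folder path.
    return parts[:-1], parts[-1]


if __name__ == "__main__":
    print(split_notebook_name("Reports/Weekly Summary"))
    print(split_notebook_name("Scratch"))
```

A name without a slash yields an empty folder list, which corresponds to a notebook that is not placed in any folder.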
Working within Notebooks
Once you have created a notebook, you can begin adding paragraphs to it. A paragraph is a container for a set of commands that are executed in succession as a unit. A new notebook starts with an initial paragraph. Additional paragraphs are added by clicking directly below the last paragraph in a notebook.
When creating a paragraph, you can assign an interpreter to it. The interpreter determines how Magpie processes the commands in the paragraph, so a different scripting or programming language may be used depending on the interpreter selected. The interpreter is set by adding a percent sign followed by an identifier at the beginning of the paragraph. For example, adding "%sql" at the beginning of a paragraph causes it to use the SQL interpreter and to expect SQL queries in the paragraph.
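As a rough illustration of the percent-sign convention, the following Python sketch separates the interpreter identifier from the rest of a paragraph. It is not how Magpie is implemented: the function name split_directive, the regular expression, and the default identifier "magpie" are all assumptions made for this example. Only "%sql" is taken from the text above.

```python
import re
from typing import Tuple

# Assumed identifier for the default interpreter; the real one may differ.
DEFAULT_INTERPRETER = "magpie"

# A percent sign and an identifier at the very start of the paragraph,
# followed by optional spaces and at most one newline.
_DIRECTIVE = re.compile(r"^\s*%([A-Za-z][\w.]*)[ \t]*\n?")


def split_directive(paragraph: str) -> Tuple[str, str]:
    """Return (interpreter, body) for the text of one paragraph."""
    match = _DIRECTIVE.match(paragraph)
    if match is None:
        # No directive present: fall back to the default interpreter.
        return DEFAULT_INTERPRETER, paragraph
    return match.group(1), paragraph[match.end():]


if __name__ == "__main__":
    print(split_directive("%sql\nSELECT 1"))
    print(split_directive("some command without a directive"))
```

The sketch treats a paragraph without a directive as belonging to the default interpreter, mirroring the behavior described for Magpie's default interpreter.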
The table below describes each of the interpreters available in Magpie.
Magpie: This is the default interpreter within Magpie. It allows Magpie-specific commands to be used to create new objects in Magpie, ingest data from sources, create new jobs, list and describe existing objects, and profile data.
SQL: The SQL interpreter executes SQL statements and returns a tabular result.
Python: The Python interpreter allows you to write scripts in Python that run within the Magpie context and can access tables that have been created by Magpie as dataframes. It also provides access to the full set of PySpark libraries and can be used to build, train, and execute quantitative models using the Spark ML libraries. Magpie Python API documentation is available here.
Scala: The Scala interpreter allows you to write scripts in Scala, leveraging the Spark libraries to build custom processing components, similar to the Python interpreter. Magpie Scala API documentation is available here.
R: The R interpreter allows you to write scripts in R, leveraging the Spark R libraries to build custom processing components, similar to the Python interpreter. Magpie R API documentation is available here.
Markdown: The Markdown interpreter allows you to create richly formatted blocks within notebooks using the Markdown format. This is useful for creating inline documentation within notebooks.
Once you have selected an interpreter, you can write scripts and execute them by clicking the arrow icon in the upper-right corner of the paragraph. The results are displayed below the script editor and can be text feedback on the execution, a table when a data set is returned, or a specialized display when exploring Magpie metadata or profiling tables.
Paragraphs can be resized and moved within the notebook, allowing for a number of different layouts. Paragraphs can also be deleted, cloned, or disabled. As shown in the figure below, this is done by clicking the gear icon in the upper-right corner of the paragraph.
Magpie can offer completion suggestions for commands, queries, and method calls while you write scripts in the notebook. Syntax autocomplete is currently supported for the Magpie, SQL, Python, and Scala interpreters. To generate autocomplete suggestions, press the Tab key while editing a paragraph. Suggested completions appear in a dropdown list. Use the arrow keys to select a completion, then press Enter or Tab again to insert it.
Currently, only syntax autocomplete is supported, and Magpie is not yet able to suggest table, schema, or other metadata names in commands and queries.
Setting Notebook Permissions
By default, the notebooks you create in Magpie are accessible only to you. To share a notebook with your teammates, update its permissions so that it becomes visible to others. Permissions are set by clicking the lock icon in the upper-right corner of the notebook and entering usernames in the appropriate field. Permissions can also be assigned using User Roles, and the example below shows analysis_team roles being applied to make permissions easier to manage. Note that leaving any of the permission type fields blank grants that permission on the note to all users of the cluster.
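The blank-field rule is easy to overlook, so here is a minimal Python sketch of the logic as described above. It is an illustration under stated assumptions, not Magpie's implementation: the function name is_granted is hypothetical, and the sketch assumes that usernames and role names are entered in the same field.

```python
from typing import Iterable, Set


def is_granted(user: str, roles: Iterable[str], allowed: Set[str]) -> bool:
    """Decide whether a user holds one permission type on a note.

    `allowed` holds the usernames and role names entered in that
    permission field (assumed here to share a single field).
    """
    if not allowed:
        # A blank field grants the permission to every user of the cluster.
        return True
    # Otherwise the user must be listed directly or through a role.
    return user in allowed or any(role in allowed for role in roles)


if __name__ == "__main__":
    # Blank field: anyone is granted the permission.
    print(is_granted("dana", [], set()))
    # Granted through the analysis_team role.
    print(is_granted("dana", ["analysis_team"], {"analysis_team"}))
    # Not listed and no matching role.
    print(is_granted("lee", [], {"analysis_team"}))
```

The practical consequence is that restricting a notebook requires filling in every permission field, because any field left empty is open to all users.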