Because many people in an organization are likely to use a data lake for critical analysis, it’s important to be able to track who has accessed or changed data and when. If there is an issue with the quality of a table, we want to be able to identify who last modified the table, when, and how. Organizations also benefit from monitoring for unusual data access, and they may need to report on data access for compliance purposes.
In Magpie, there are two primary mechanisms for understanding how users are interacting with the data.
The first is the Activity History command, which shows the activities of a specified user or of the organization as a whole. Clicking on an individual row shows the actual command that was executed along with the permissions that were used. Failed commands are also logged. Example commands:
activity history of user email@example.com limit 30;
activity history of organization silectis limit 30;
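Once activity records are available outside Magpie, the "who last modified this table, and when" question can be answered with a small filter. The following is a minimal Python sketch under stated assumptions: the record layout (user, timestamp, object, action, succeeded), the action names, and the sample data are all hypothetical and chosen for illustration; they are not Magpie's actual Activity History output format. Failed commands are kept in the records, since they are logged too, but they are excluded when looking for the last successful change.

```python
from datetime import datetime

# Hypothetical activity records. The field names and action values are
# assumptions for illustration, not Magpie's real output format.
SAMPLE_RECORDS = [
    {"user": "alice@example.com", "timestamp": "2021-03-01T09:15:00",
     "object": "crime_incidents", "action": "alter", "succeeded": True},
    {"user": "bob@example.com", "timestamp": "2021-03-02T14:30:00",
     "object": "crime_incidents", "action": "alter", "succeeded": True},
    # A failed command: still logged, but it did not change the table.
    {"user": "carol@example.com", "timestamp": "2021-03-03T08:00:00",
     "object": "crime_incidents", "action": "drop", "succeeded": False},
    # A read: relevant for access monitoring, not for modification history.
    {"user": "alice@example.com", "timestamp": "2021-03-04T10:00:00",
     "object": "crime_incidents", "action": "read", "succeeded": True},
]


def last_successful_change(records, object_name,
                           write_actions=("create", "alter", "drop")):
    """Return the most recent successful write against object_name, or None."""
    candidates = [
        r for r in records
        if r["object"] == object_name
        and r["action"] in write_actions
        and r["succeeded"]
    ]
    # Compare parsed timestamps rather than raw strings.
    return max(
        candidates,
        key=lambda r: datetime.fromisoformat(r["timestamp"]),
        default=None,
    )


if __name__ == "__main__":
    change = last_successful_change(SAMPLE_RECORDS, "crime_incidents")
    if change is not None:
        print(change["user"], change["timestamp"], change["action"])
```

With the sample data above, the later failed drop and the read are both skipped, so the function returns the second alter record.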
The second mechanism for reviewing user activity is the Usage History command, which shows the recent usage of a given metadata object and, optionally, that object’s children.
In this example, we’ll inspect a schema and specify cascade to also include any tables within the schema.
usage history of schema washington_dc cascade