This workflow monitors a specific Google Drive folder for new CSV files, identifies columns that contain personally identifiable information (PII), removes those columns, and uploads the sanitized files back to Google Drive. A Google Drive trigger detects each new upload; the workflow then downloads the file, extracts its tabular data, and sends it to OpenAI’s GPT-4 model to identify the PII columns. The cleaned CSV is saved to a designated folder, which helps support data privacy and compliance requirements.
The process begins with the Google Drive trigger, which watches a specified folder for new files. When a new CSV file is detected, it is downloaded and its content extracted. The data is then sent to OpenAI’s GPT-4, which analyzes the table and returns the names of columns containing PII. Next, the workflow removes these columns from the data, reconstructs the CSV without PII, and generates a new filename indicating the removal of sensitive data. Finally, the sanitized file is uploaded to a different Google Drive folder for secure storage.
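The column-removal and renaming steps described above can be sketched in a few lines of Python. This is only an illustration, not the workflow's actual node logic: the function names `strip_pii_columns` and `sanitized_filename`, the case-insensitive header matching, and the `_no_pii` filename suffix are all assumptions made for the example, and the list of PII column names stands in for whatever GPT-4 returns.

```python
import csv
import io
from pathlib import PurePosixPath


def strip_pii_columns(csv_text: str, pii_columns: list[str]) -> str:
    """Return csv_text with the named columns removed (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    if not rows:
        return ""
    header = rows[0]
    # Assumption: match column names case-insensitively, ignoring surrounding spaces.
    drop = {name.strip().lower() for name in pii_columns}
    keep = [i for i, name in enumerate(header) if name.strip().lower() not in drop]
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        # Skip indexes that a short (ragged) row does not have.
        writer.writerow([row[i] for i in keep if i < len(row)])
    return out.getvalue()


def sanitized_filename(original: str, suffix: str = "_no_pii") -> str:
    """Insert a suffix before the extension (the suffix is an assumed convention)."""
    path = PurePosixPath(original)
    return f"{path.stem}{suffix}{path.suffix}"


if __name__ == "__main__":
    sample = "name,email,plan\nAda,ada@example.com,pro\nBob,bob@example.com,free\n"
    # In the workflow, this list would come from the model's analysis of the table.
    print(strip_pii_columns(sample, ["name", "Email"]))
    print(sanitized_filename("customers.csv"))
```

Matching on header names rather than column positions keeps the sketch robust when files arrive with columns in a different order. It does depend on the model returning names that correspond to the file's headers.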
This workflow is useful when organizations need to automate the anonymization of data shared via Google Drive, for example when preparing datasets for analysis or sharing files without exposing the sensitive fields they originally contained.