
I’ve had a number of interesting conversations with current and prospective customers recently and, unsurprisingly, concerns around protecting data and preventing data leaks always bubble up to the top of the conversation.
There are a number of factors behind the increasing focus on securing data inside organisations. Changes to working habits over the last few years, brought about by flexible/hybrid work in reaction to the COVID-19 global pandemic and by technology advances that see more staff using mobile devices for work, have created challenges for organisations trying to secure their data.
42% of organisations say at least half of their data is “dark” – that is, unknown or unused for business purposes. As more organisations embrace hybrid work, data is going to increasingly reside on devices that leave the workplace and use less secure networks.
2022 State of Data Governance and Empowerment Analyst Report (erwin.com)
With this in mind, more organisations will need to develop a strategy to manage the mission-critical data they possess and to protect it from both accidental and malicious leakage, which can result in significant reputational and financial damage. I’ve lost count of the number of IT leaders who have said to me:
Sam, my priority is keeping our organisation off the front page of the newspaper!
(Said every IT Manager ever!)
To this end, I read with interest the recently released Crash Course in Microsoft Purview that you can read below. It contains a very high-level overview of data estate management that can be seen through a three-point drumbeat.
Having sat through a number of presentations and interactive sessions around AI at EduTech and elsewhere (read my summaries here), it’s very clear that the power of AI for an organisation will be limited to some extent by the volume of data that can be fed into the underlying models.
This creates both risk and opportunity. In the context of this blog post, the risk is that internal organisational data is not suitably classified and protected, meaning AI-powered tools can access it and share it with employees who are not authorised to see that content. This would be a classic example of an unintended consequence: organisations that rush to empower employees with powerful tools unwittingly create data leaks because they have not adequately secured the data sets those tools draw on to generate answers.
We’ve all heard it before: data is the new oil. However, just like oil, the value of data is only fully realised when it has been refined and made readily accessible to authorised users. Organisations are going to need to be increasingly aware of how their data estate is governed, and what steps they have undertaken to mitigate data loss and insider risks.
Sam McNeill, September 2023
I do encourage you to read the full Crash Course in Microsoft Purview PDF below. However, I’m going to attempt to summarise its key ideas for quicker consumption and add some of my own insights along the way.
Data Visibility & Governance
This step is often overlooked by organisations because it sits in the “too hard” basket. With data often residing across multiple clouds, and in both 1st and 3rd party apps and productivity suites, finding a way to locate all company data and then manage/restrict access to it is the first step in a robust data management strategy.
From a Microsoft Purview perspective, there are four steps and associated tools to achieve this:
- Create a unified map of all your data
  - Microsoft Purview Data Map captures metadata about data that is present in a range of sources, including analytics, SaaS and operational systems across on-premises, hybrid and multi-cloud environments.
  - It has built-in scanning and classification tools, which is great, but you need to understand what it will find and how that will be made accessible to people inside your organisation. As the old adage says, with great power comes great responsibility: know what data is going to be mined and who will ultimately have access to it.
- Make data easily discoverable and maximise its value
  - Microsoft Purview Data Catalog is designed to make it easier for users in your organisation to find data when they don’t know exactly what they’re hunting for. This is often helpful where there is a large volume of unstructured data.
- Create policies to access, move and share data
  - Microsoft Purview Data Use Management allows data owners to flag their data as available for access policies that allow/restrict access at an organisational level. Typical examples include DevOps policies, Data Owner access policies and Self-Service access policies, where users can request access to certain data and the data owner can approve/deny the request.
- Limit the growth of real-time data by eliminating duplicates
  - Microsoft Purview Azure Storage in-place data sharing is a preview service that helps eliminate duplication by sharing data ‘in place’ from across various Azure storage locations like Azure Data Lake Storage Gen2 and Azure Storage accounts.
  - Interestingly, this can work both within and across organisations, so data can be shared with users and partners without the need for duplication. This is critical because in a cloud-first world, duplication increases storage costs and duplicated data brings an increased risk of data leaks.
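
To illustrate why duplication is worth hunting down, here is a minimal Python sketch that groups byte-identical files by content hash. It is not how Purview's in-place sharing works (that avoids making copies in the first place); the function name `find_duplicates` and the sample file names are assumptions made purely for illustration.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for name, content in files.items():
        # Identical content always produces the same SHA-256 digest.
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(name)
    # Keep only digests seen more than once; sort for stable output.
    return sorted(sorted(names) for names in by_digest.values() if len(names) > 1)


# Hypothetical sample data standing in for files spread across storage locations.
sample = {
    "finance/budget.xlsx": b"q1,q2,q3",
    "shared/budget-copy.xlsx": b"q1,q2,q3",
    "hr/policy.docx": b"leave policy",
}
duplicates = find_duplicates(sample)
print(duplicates)
```

Every group returned is a set of copies that each need storing, securing and eventually deleting, which is the cost and risk that sharing data in place is meant to avoid.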
Data Loss Prevention
When it comes to implementing DLP, the key is to structure data and automate the access permissions based on an understanding of what is business-critical/sensitive data across your digital estate. The crash course in Purview recommends these steps:
- Locate and manage sensitive information even within unstructured data
  - Microsoft Purview Information Protection is the hero tool here, as it allows you to automatically discover and protect data across the M365 ecosystem.
  - Unstructured data is everywhere and can be made up of files in OneDrive, documents in SharePoint, emails in Exchange Online and messages in Teams. All of this can be a potential source of data leakage, either through inadvertent employee activity or a more determined leak by a bad actor inside an organisation.
  - Through the use of trainable data classifiers, sensitive data can be quickly identified and secured, e.g. hunting for documents that contain information such as credit card numbers, or Personally Identifiable Information (PII) such as IRD details, social security numbers, driver’s licences or passports. Alternatively, if you have sensitive information that is unique to your own organisation, you can create classifiers that will hunt for it and protect it.
- Deploy DLP policies to restrict data leakages
  - Microsoft Purview Data Loss Prevention is a single location for policy management, has deep integration into Information Protection and provides a unified view for alerting on and remediating data leaks.
  - Once data is identified as being important, you need to prevent risky or unauthorised sharing/transferring/usage of this data across endpoints/apps/services. DLP policies help users understand and take the correct actions when using sensitive data, e.g. allowing a user to open/view a document but not print or email it.
  - At an organisational level, it’s important to balance the need for security with the priority of employees remaining productive. In simple terms: the tension between ‘getting stuff done’ and not creating a data leak.
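
The two ideas above (spot the sensitive content, then restrict what can be done with it) can be sketched in a few lines of Python. This is a generic illustration rather than Purview's implementation: it looks for candidate credit card numbers with a regular expression, confirms them with the standard Luhn checksum, and then applies a made-up policy that mirrors the "view but not print or email" example. The names `contains_card_number` and `allowed_actions` and the action labels are assumptions.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def contains_card_number(text: str) -> bool:
    """Detect at least one plausible credit card number in the text."""
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            return True
    return False


def allowed_actions(text: str) -> set[str]:
    """Hypothetical policy: sensitive documents can be viewed but not printed or emailed."""
    if contains_card_number(text):
        return {"view"}
    return {"view", "print", "email"}


# 4111 1111 1111 1111 is a widely used test card number that passes the Luhn check.
print(allowed_actions("Customer card: 4111 1111 1111 1111"))
print(sorted(allowed_actions("Minutes from Tuesday's team meeting")))
```

The checksum step matters because a bare pattern match flags any long run of digits, and every false positive is a user blocked from ‘getting stuff done’ for no reason.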

I remember that in my last few weeks at Microsoft I tried to upload an Excel document from my work laptop to my personal Dropbox.com account (it was a spreadsheet I was recording my eBike odometer readings on each month!). Given I was exiting MSFT, a policy had been applied to my account to look for potential data leaks, i.e. in case I was trying to exfiltrate company data to third party cloud accounts. When I attempted to upload the Excel doc I was prompted to confirm that this was a compliant action before it would proceed. I was able to confirm this was a personal document and the upload continued. This would have automatically created an audit log entry. No doubt Microsoft could have blocked such uploads completely based on policy.
Sam McNeill – September 2023
- Build cross-platform support for applying labels and protecting files
  - Microsoft Purview Information Protection SDK: this Software Development Kit allows you to apply labels and protection to apps/files outside of the M365 ecosystem, e.g. an organisational HR or Finance app.
  - The key to this is extending the labels and protections created inside M365 to the wider organisational digital estate and standardising on a consistent labelling schema and protection service.
- Utilise data connectors to bring in data from external sources
  - Microsoft Purview Data Connectors: these are pre-built data connectors to other commonly used third party tools such as messaging apps (e.g. Slack) and video conferencing platforms (e.g. Zoom).
  - By using these tools, you can extend the reach of the unified data detection, classification and protection to all tools used in the organisation by ingesting data from this wider network of apps, and then apply Purview tools like eDiscovery, Communication Compliance and Insider Risk Management to mitigate risk.
- Automate data classification and governance at scale
  - Microsoft Purview Data Lifecycle Management helps organisations meet their regulatory compliance requirements by applying policy to retain/delete content appropriately.
  - I’m seeing an increasing focus on organisations removing Personally Identifiable Information (PII) once there is no longer a business need to retain it, which avoids the risk of a leak of this type of sensitive information.
  - Once files are automatically classified, policy can be applied to automate the retention/deletion of the files. This can apply to both first and third party content from a Microsoft perspective.
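
As a rough sketch of what "classify, then automate retention/deletion" means in practice, the Python below maps a classification label to a retention period and decides what to do with a file of a given age. The labels, periods and function name are all hypothetical; real retention periods come from your regulatory obligations, and Purview expresses them as retention labels and policies rather than code like this.

```python
from datetime import date, timedelta

# Hypothetical labels and retention periods, for illustration only.
RETENTION_DAYS = {
    "financial-record": 7 * 365,
    "pii": 365,
    "general": 3 * 365,
}


def retention_action(label: str, created: date, today: date) -> str:
    """Return 'retain', 'delete', or 'review' for a classified file."""
    days = RETENTION_DAYS.get(label)
    if days is None:
        # Unclassified or unknown labels are never deleted automatically.
        return "review"
    age = today - created
    return "delete" if age >= timedelta(days=days) else "retain"


today = date(2023, 9, 1)
print(retention_action("pii", date(2022, 1, 1), today))
print(retention_action("financial-record", date(2022, 1, 1), today))
print(retention_action("unlabelled", date(2022, 1, 1), today))
```

Sending unknown labels to "review" rather than "delete" is a deliberate choice in this sketch: automated deletion is only safe once classification can be trusted.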

Data Risk Management
Many organisations, particularly schools in the education sector, run on a high trust model and have not adequately considered (or prepared for) the possibility of a disgruntled employee actively exfiltrating data from the organisation. This can be done in any number of subtle ways that are not necessarily obvious to the organisation.
For example, Microsoft Secure Score (which I’ve blogged on a few times) recommends turning off the ability for employees to automatically Cc or Bcc emails to an external email address, thus removing an easy (and often overlooked) way for sensitive company data, customer contact information and topics of discussion to be sent and stored outside of the organisation. With a growing volume of data, more platforms to communicate on and a priority around flexible work environments, organisations need to think harder than ever before about how they address potential and real risk around data management.
- Leverage machine learning to identify potential insider risks
  - Microsoft Purview Insider Risk Management: through the use of machine learning to evaluate a broad range of signals, potentially risky or unauthorised activity can be identified for investigation.
  - Some scary data points: 93% of organisations are concerned about insider risks, 25% of all data breaches are due to insider activity and there is a 77 day average time to contain an insider incident.
- Detect potential violations in communications related to regulatory compliance or business conduct
  - Microsoft Purview Communication Compliance works to detect inappropriate communications of sensitive information inside your organisation, as well as harassing or threatening language and adult content.
  - When investigations take place, Purview will automatically pseudonymise usernames to allow for ‘privacy by design’, and it supports Role Based Access Controls (RBAC) so only authorised users can review content.
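
For anyone unfamiliar with pseudonymisation, here is a minimal Python sketch of the general technique using a keyed hash (HMAC-SHA256). I don't know how Purview implements this internally, so treat the approach, the `pseudonymise` function name and the sample key as assumptions. The point is that the same user always maps to the same alias, so an investigator can follow a pattern of activity without seeing who the person is.

```python
import hashlib
import hmac


def pseudonymise(username: str, secret_key: bytes) -> str:
    """Replace a username with a stable alias derived from a keyed hash."""
    digest = hmac.new(
        secret_key, username.lower().encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Shorten the digest so the alias is readable in a review queue.
    return f"user-{digest[:12]}"


# Hypothetical key; in practice it would be stored securely and access-controlled.
key = b"example-secret-key"
alias_a = pseudonymise("jane.doe@example.com", key)
alias_b = pseudonymise("Jane.Doe@example.com", key)
alias_c = pseudonymise("john.smith@example.com", key)
print(alias_a == alias_b)  # same person, same alias
print(alias_a == alias_c)  # different people, different aliases
```

Only someone holding the key (and the authority to use it) can work out which real user sits behind an alias, which pairs naturally with the RBAC controls mentioned above.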
I’ve seen Communication Compliance in action first hand, through the use of Optical Character Recognition (OCR) on screenshots I’ve posted in a Teams chat. I was sharing a screenshot containing demo user credentials with a colleague and Teams automatically deleted the content, returning a message saying that the post violated company policy on sharing passwords! It was cool to see the tech in action, even if it was frustrating given the post had only demo user info.
Sam McNeill, September 2023
- Store and retain core business records
  - Microsoft Purview Records Management for Documents and Emails provides a solution to manage regulatory, legal and business-critical records and allows an organisation to quickly demonstrate it is meeting its obligations.
- Power forensic investigations with audit-ready reporting and insights
  - Microsoft Purview Audit provides the ability to retain records for up to ten years. Should an incident occur, or should there be suspicion of inappropriate behaviour, a forensic investigation can be conducted to check what a user searched for inside SharePoint/Emails/Teams and which files/emails they accessed, replied to or forwarded.
  - Furthermore, an organisation with this license can get faster API access when querying O365 data with Purview Audit, increasing from the default 2000 requests per minute to up to 4000.
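
To put the two rate limits into perspective, here is a quick back-of-the-envelope calculation in Python. The workload of one million requests is a made-up figure, and the sketch assumes the limit is the only bottleneck and that the full 4000 requests per minute is actually granted.

```python
def minutes_needed(total_requests: int, requests_per_minute: int) -> int:
    """Minimum whole minutes to issue all requests at a fixed rate limit."""
    # Ceiling division: a partially used final minute still counts.
    return -(-total_requests // requests_per_minute)


DEFAULT_LIMIT = 2000   # requests per minute, from the figure above
HIGHER_LIMIT = 4000    # "up to" figure, from the figure above
workload = 1_000_000   # hypothetical number of API requests for an investigation

print(minutes_needed(workload, DEFAULT_LIMIT))  # 500 minutes, over 8 hours
print(minutes_needed(workload, HIGHER_LIMIT))   # 250 minutes, just over 4 hours
```

Halving the wall-clock time of a large query matters most during an active incident, when time to contain is the number everyone is watching.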


- Identify legal risks and investigate
  - Microsoft Purview eDiscovery helps find and preserve specific data. The process it follows is: Preserve, Collect, Analyse, Review, Export. With the help of machine learning, large volumes of data can be trawled through to find the relevant content.
- Continuously assess and track your compliance effectiveness
  - Microsoft Purview Compliance Manager is a tool similar to the Microsoft Secure Score that I referenced earlier, providing guidance on what can be done to improve an organisation’s compliance and showing how it’s tracking over time.
  - It works in a multi-cloud environment and provides continuous control assessment, plus regulatory updates if you work in a regulated industry.
- Automate workflows and reporting
  - Microsoft Purview eDiscovery API for Microsoft Graph has recently moved to general availability. It allows organisations that routinely perform eDiscovery requests and investigations to speed up the process by automating it through an API and Data Connectors.
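
To give a feel for what automating eDiscovery through Microsoft Graph might look like, the Python sketch below builds (but does not send) a request to create a new eDiscovery case. The endpoint path and the `displayName`/`description` fields reflect my understanding of the Graph `security` API and should be checked against the current Microsoft Graph documentation; the helper name, token placeholder and case details are hypothetical.

```python
import json
import urllib.request

# Believed to be the Graph v1.0 collection for eDiscovery cases; verify before use.
GRAPH_CASES_URL = "https://graph.microsoft.com/v1.0/security/cases/ediscoveryCases"


def build_create_case_request(
    access_token: str, display_name: str, description: str
) -> urllib.request.Request:
    """Build a POST request that would create an eDiscovery case."""
    body = json.dumps(
        {"displayName": display_name, "description": description}
    ).encode("utf-8")
    return urllib.request.Request(
        GRAPH_CASES_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder token; a real one comes from your identity platform with eDiscovery permissions.
request = build_create_case_request(
    "<access-token>", "Example investigation", "Created by an automation sketch"
)
print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request), which needs a valid token.
```

Building the request separately from sending it keeps the sketch safe to run anywhere, and makes it easy to log or review exactly what an automated workflow is about to ask for.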
Final Thoughts
Microsoft Purview is in no way the only tool in this space, but if you’re an organisation already in the M365 ecosystem this may be a smart way to assist with compliance and improve your security posture across your digital estate. The three principles of Discover, Understand and Govern can be applied to any tool that you choose to manage your data. The most important consideration is that you’ve got a plan and are getting started on implementing it!