Post 7: Securely Deploying AI Solutions

Protecting AI systems has become an expansive topic, because the potential for misuse varies with the type of deployment. While most of the available guidance is presented as generalized frameworks, in this post I will condense those recommendations into tangible, actionable steps you can take to protect yourself, your data, and your organization.

It's crucial to emphasize that sharing confidential or sensitive information with anyone, including a shared large language model (LLM) service like ChatGPT, is never recommended. In its own documentation, OpenAI directly states that ChatGPT leverages information its users and human trainers provide (https://help.openai.com/en/articles/7842364-how-chatgpt-and-our-language-models-are-developed#h_2df02d4917). Data leakage from inputs and conversations is also possible: in March 2023, a software bug let users see other users’ conversation titles, and in a separate incident a portion of subscribers’ payment information was exposed.

Once shared with these platforms and incorporated into an LLM’s training base, information is effectively impossible to delete, hence the importance of never sharing confidential data with ChatGPT or any other online service that lacks strict data-governance guardrails. Categories of information to avoid include sensitive company data, intellectual property (including proprietary software code), financial information, and personal data, especially anything that could serve as a security question and answer, a username, or a password. If you are working with publicly available yet company-specific information, you can sanitize or obfuscate the data by replacing the company name with a placeholder such as ‘XYZ,’ then restoring the real name when formulating the final document, as sketched below.
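As a minimal illustration of that sanitization step (the company names, terms, and helper functions here are hypothetical, not any vendor's tooling), the sketch below swaps identifying terms for neutral placeholders before text is pasted into a shared LLM, then swaps them back afterward:

```python
import re

# Hypothetical mapping of identifying terms to neutral placeholders.
# Longer terms come first so "contoso.com" is handled before "Contoso".
# In practice the term list would come from your organization's policy.
PLACEHOLDERS = {
    "contoso.com": "example.com",
    "Contoso": "XYZ",
}

def sanitize(text: str) -> str:
    """Replace identifying terms with placeholders before sharing text."""
    for real, placeholder in PLACEHOLDERS.items():
        text = re.sub(re.escape(real), placeholder, text, flags=re.IGNORECASE)
    return text

def restore(text: str) -> str:
    """Swap placeholders back when assembling the final document."""
    for real, placeholder in PLACEHOLDERS.items():
        text = text.replace(placeholder, real)
    return text

draft = "Contoso's Q3 roadmap targets contoso.com customers."
safe = sanitize(draft)
print(safe)           # XYZ's Q3 roadmap targets example.com customers.
print(restore(safe))  # Contoso's Q3 roadmap targets contoso.com customers.
```

A simple lookup table like this only round-trips cleanly if every placeholder is unambiguous, which is why each real term maps to its own distinct stand-in.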

Security controls improve when leveraging enterprise-grade tools such as Microsoft Copilot, but concerns still exist, and a risk assessment should be made before enabling it organization-wide. Per Microsoft, Copilot for M365 is compliant with the existing privacy, security, and compliance commitments Microsoft makes to its M365 commercial customers, including the General Data Protection Regulation (GDPR) and the European Union (EU) Data Boundary. However, as a cloud-based service, it is still technically feasible that data could be leaked or exposed through software error or misconfiguration. In 2024, the U.S. House of Representatives banned Microsoft Copilot use by congressional staffers, stating “the Microsoft Copilot application has been deemed by the Office of Cybersecurity to be a risk to users due to the threat of leaking House data to non-House approved cloud services.” Gartner has also stated that “internal exposure of insufficiently protected sensitive information is a serious and realistic threat. External web queries that go outside the Microsoft Service Boundary also present risks that can’t be monitored.”

Even with enterprise-level solutions, unauthorized access remains a concern. Deploying Copilot is especially problematic in organizations that have maintained poor control over their file and folder permission structures, because users gain a natural-language interface for surfacing content they may not have realized they could access. Microsoft suggests adopting a Zero Trust strategy when deploying Copilot across the organization. In this context, Zero Trust means that user authentication and authorization are explicitly verified, least-privileged access is enforced wherever possible, and, in the event of a breach, the level of exposure is kept to a minimum.
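To make the least-privilege idea concrete, here is a deliberately minimal sketch of a deny-by-default authorization check. The roles, resource names, and helper functions are all assumptions for illustration, not a Microsoft API:

```python
from dataclasses import dataclass

# Hypothetical allow-list: each role is granted only the resources it needs.
# Anything not explicitly listed is denied (deny-by-default).
ROLE_GRANTS = {
    "finance-analyst": {"quarterly-reports", "budget-models"},
    "hr-generalist": {"employee-handbook"},
}

@dataclass
class User:
    name: str
    role: str

def can_access(user: User, resource: str) -> bool:
    """Explicitly verify authorization; unknown roles or resources fail closed."""
    return resource in ROLE_GRANTS.get(user.role, set())

alice = User("alice", "hr-generalist")
print(can_access(alice, "employee-handbook"))   # True
print(can_access(alice, "quarterly-reports"))   # False: least privilege holds
```

The point of the deny-by-default shape is that a misconfigured or missing grant results in no access rather than accidental overexposure, which is exactly the failure mode a natural-language query interface can otherwise exploit.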

To remediate unintended access to tenant data, a tool such as Microsoft Purview can be leveraged to classify data, apply sensitivity labels and protection policies, and provide data loss prevention (DLP). Your organization can also extend Copilot’s reach by allowing the Microsoft Edge browser to analyze and summarize content accessed through Bing search and the user’s open tabs. Organizational data summarized by Copilot in Edge may include local resources on the user’s computer, intranet resources, Microsoft 365 sites, Microsoft Azure resources, and plug-ins and connectors to third-party cloud products and cloud-based SaaS applications. A full implementation of these capabilities offers the most complete set of Copilot features for the typical Microsoft 365 end user.
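As a rough illustration of what a DLP-style classification pass does (a simplified sketch, not Purview’s actual engine; the patterns and label names are assumptions), the following scans text for common sensitive-data patterns and assigns a sensitivity label:

```python
import re

# Simplified detection patterns; real DLP engines use far richer classifiers,
# checksums, and proximity rules rather than bare regular expressions.
PATTERNS = {
    "U.S. SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "Credit card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def classify(text: str) -> str:
    """Return a sensitivity label based on which patterns appear."""
    hits = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    if hits:
        return f"Confidential (matched: {', '.join(hits)})"
    return "General"

print(classify("Reimburse card 4111 1111 1111 1111 for travel."))
# Confidential (matched: Credit card)
print(classify("The team offsite is scheduled for May."))
# General
```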

In summary, there are significant concerns to weigh when leveraging AI within your organization, and the degree of control you have depends on the deployed solution. When leveraging ChatGPT, the onus is entirely on the end user to ensure the content of their conversations and data does not violate company policy. When leveraging Microsoft Copilot, enterprises should use the permissions constructs within their tenant to ensure users cannot reach data outside their purview. When enabling web search and summarization capabilities, answers can be limited to Bing and can be validated through grounding, ensuring that the answers provided to users are actually reflected in the source material. Finally, targeted capabilities can be extended to end users through plug-ins and custom connectors, providing another mechanism to enforce restrictions on data access.
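To illustrate the grounding idea, here is a deliberately naive sketch (production systems use semantic matching rather than substring checks, and every name here is hypothetical) that only accepts an answer if each of its sentences appears in the retrieved source material:

```python
def is_grounded(answer: str, sources: list[str]) -> bool:
    """Naive grounding check: every answer sentence must appear in a source."""
    corpus = " ".join(sources).lower()
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    return all(sentence.lower() in corpus for sentence in sentences)

sources = ["The outage began at 09:00 UTC and was resolved by 11:30 UTC."]
print(is_grounded("The outage began at 09:00 UTC", sources))   # True
print(is_grounded("The outage lasted four hours", sources))    # False
```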

Jayson Tobias
