Chalk X

Finding where to put your mark

How We Can Trust Generative AI

Summary:

  • Knowledge is the most important strategic resource for companies (Zack, M.H. 2015). Typically, you only want to use knowledge you trust, and while you can’t trust AI, you can trust the experts who validate what it creates.
  • There are new opportunities for companies to use AI as an organizational expert. That same opportunity poses the risk of exposing and sharing knowledge externally.
  • AI will be difficult to trust because, unlike other software, it cannot be audited. The most effective AI engines are based on neural networks that develop over time, and much like humans, we don’t know why they know what they know, and neither do they.

Parable:

Steve logs into his work laptop. He sees an e-mail in his inbox asking for his help:


To: steve@company.com 

From: CFO@company.com 

Subject: URG: Help needed with pricing!

Hey Steve,

I know you are an expert with spreadsheets. I wanted you to review the attached one and let me know ASAP if our new pricing looks good. 

Thanks, 

 CFO 

Knowing the importance of an email from such a high-level person in the company, Steve opens the spreadsheet immediately. He finds information about the company’s new product line and proposed pricing for each new product. He selects the column with pricing and asks a new AI engine, ApprAIsal, “Will the pricing in this spreadsheet net a 30% or greater profit for the company?” The bot replies, “Pricing for widgets on row 10 and cogs on row 12 needs to be increased by 10% or more.” Steve updates those rows and replies to the CFO with the updated spreadsheet.

Unbeknownst to Steve, only a few days earlier the CFO had generated this same spreadsheet using a different AI engine, Go2Mark. He asked, “Create pricing in column H, where we can capture most of the market as a new entrant.” The CFO approves the pricing and sends it over to the sales group. Using another AI engine, AIForce, the sales group generates customer discounts based on past sales.

In the end, there is a lackluster release of the new product line. Sales don’t match the projections on the CEO’s forecast dashboard, which is generated by yet another AI engine. As the complexity of the problem grows, the combined results of the AI engines become less useful because of the assumptions made by the engines and their users. At the same time, each user is satisfied with the results that were provided to them.


ChatGPT Warns Us

I asked ChatGPT what the risks of AI in companies are. Here is what it said:

 There are several risks associated with companies using AI, including:

  • Bias and discrimination: AI systems can inherit the biases and prejudices of their creators or the data they are trained on. This can lead to discriminatory outcomes and perpetuate existing inequalities.
  • Security breaches: As AI systems collect and process large amounts of sensitive data, they can become vulnerable to hacking, cyber attacks, and other forms of security breaches.
  • Lack of transparency: AI algorithms can be complex and difficult to understand, making it hard for users to determine how decisions are being made.
  • Unintended consequences: AI systems can make decisions that have unintended consequences or unexpected outcomes, particularly when dealing with complex and unpredictable situations.
  • Job displacement: AI systems can automate tasks previously performed by humans, leading to job displacement and workforce disruption.
  • Legal and ethical concerns: The use of AI can raise legal and ethical concerns around issues such as privacy, intellectual property, accountability, and transparency.
  • Over-reliance on AI: Companies may become overly reliant on AI systems, leading to a loss of human judgment and decision-making skills.

Overall, companies must be aware of these risks and take steps to mitigate them in order to ensure the responsible and ethical use of AI.

– ChatGPT

There is one thing not explicitly on this list but implied in the points above, and that is trust.

Trust

Knowledge is the most important strategic resource for companies (Zack, M.H. 2015). There are many realms of research and practice around Knowledge Management (KM) in corporations today. A high-level flow for knowledge management looks something like this: discover or create knowledge, store it, assess it, and then share it.

The first part of the process is the discovery or creation of new knowledge or intellectual property. That knowledge needs to be stored somewhere, and the storage varies based on the type of knowledge. Document repositories include things like file shares, SharePoint, wikis, and GitHub, a software source code repository. Software is just another kind of knowledge that needs to be managed. The goal of these repositories is to share important information with the rest of the company.

The knowledge that is stored should be used only when it is accurate, relevant, legal, and overall approved for use. In other words, we need to trust the data before we use it. How do we come to trust new information? Within a company, processes are typically established to verify the information; this is the assessment phase of knowledge management. Once verified, we assume it can be trusted and then share it with others.

Some companies have formal and/or informal workflows for validating, assessing, and labeling documents as verified. One formal way to do this is to have a review board of experts examine the work product before allowing it to be shared. Software repositories will automatically scan code for validity but may have a human gatekeeper to accept requests for code changes. If you are familiar with Git this is the pull request and merge process. Wikis do this informally by creating notifications when a document is changed, enabling other authors to assess the validity of changes. All of these methods have an audit trail of the approvers: review board meeting minutes, git commit history, or wiki change log.
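
As a toy illustration of that assess-and-share flow, a knowledge item could carry its own audit trail of approvers. The sketch below is purely illustrative and is not tied to any particular KM tool; all names are made up:

from dataclasses import dataclass, field

@dataclass
class KnowledgeItem:
    # A toy knowledge record: it moves from draft to verified and keeps an audit trail
    title: str
    content: str
    status: str = "draft"
    approvals: list = field(default_factory=list)

    def approve(self, expert, note=""):
        # Each sign-off is recorded, much like a commit history or a wiki change log
        self.approvals.append({"expert": expert, "note": note})
        self.status = "verified"

doc = KnowledgeItem("New product pricing", "Proposed prices for the new line")
doc.approve("review-board@company.com", "Margins checked against targets")
print(doc.status, doc.approvals)

The approvals list plays the same role as a review board’s minutes or a commit history: a record of who vouched for the knowledge.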

One of the simplest ways a company can assess or validate knowledge is by having an “experts list”. We rely on experts when there is an asymmetry of knowledge between us and them: they know far more about a topic than we do, so we have to trust their advice. The expert list is probably what we are most familiar with in our day-to-day life. We have a trusted list of go-to providers, like a trusted auto mechanic. We trust them because not only did they fix our car, but they didn’t overcharge us. We may have picked them originally because they had been in business for years and our friends, people we trust, recommended them.

Doctors gain trust similarly, but there is a greater asymmetry of knowledge and greater risk in their providing bad advice. To mitigate this, as a society we require that they complete more schooling and pass difficult tests. They must also practice with other doctors and be assessed by other experts in their field. We can see this trust documented in the form of diplomas on their walls. There are incentives for them not to lie, and extra years of training with other doctors before we trust their abilities on their own.

Trust is earned, verified (sometimes by experts), and continually assessed. Trust can also be aggregated: we can all think of brands we trust and brands we don’t. We can also think of brands we used to trust and no longer do because of some violation of that trust. Once trust is gone, it is hard to get back. The consequence of losing trust is a strong incentive for people to make sure that what they say is right. AI has no such incentive.

How do we trust AI?

AI engines can produce things that look like something we can trust. The output has good grammar and explanations for what it is doing, and a lot of the time what it creates seems to work. But functionality is not sufficient to earn trust.

AI doesn’t operate on the same incentives we have. It doesn’t care if we go out of business because it gave bad advice, or if we are fired for taking its advice. There is no visible process by which AI determines whether new knowledge can be trusted. There are no notifications of when or where it learned something new, the way Git or a wiki provides. No approval body sanctions its knowledge or its answers.

Let’s look at a simple AI-generated code sample. By default, ChatGPT creates the following user login code. 
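
The generated sample is not reproduced here verbatim, but a minimal sketch of code exhibiting the issues called out below (assuming Python with sqlite3 and hashlib; the table layout and names are illustrative) might look like this:

import hashlib
import sqlite3

def login(username, password):
    # Assumes a users table with columns (username, email, password_hash)
    conn = sqlite3.connect("users.db")
    cursor = conn.cursor()
    # SQL is built by string formatting, so the username can inject arbitrary SQL
    query = "SELECT * FROM users WHERE username = '%s'" % username
    cursor.execute(query)
    row = cursor.fetchone()
    conn.close()
    if row is None:
        return False
    # Compares against an unsalted MD5 hash; hashing is not encryption
    return row[2] == hashlib.md5(password.encode()).hexdigest()

username = input("Username: ")   # no format checks on the username
password = input("Password: ")   # typed password is echoed to the screen
print("Login successful" if login(username, password) else "Login failed")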

Some issues:

  • There are no checks on the user input for a SQL injection attack.
  • No special formatting checks for the username.
  • A sub-optimal SELECT statement.
  • No obscuring of the password on input, making it prone to attack.
  • No encrypting of the password. The code uses a hash, which is not encryption but may look or sound like encryption to the untrained eye.

I am not saying the code is worthless, but with simple prompts it isn’t finished or robust enough. Even when trying to fix this using ChatGPT, I couldn’t get the SQL injection prevention I would have preferred. I could get it to add some encryption that functioned, but it was configured in a way that, if not changed, could cause massive issues later on. It is worth noting that to get something close to the right answer you need an experienced person to write the prompts.
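
For contrast, here is a rough sketch of one way to close the injection and password-handling gaps above, using a parameterized query and a salted, iterated hash (again assuming Python with sqlite3; the names and schema are illustrative):

import hashlib
import secrets
import sqlite3
from getpass import getpass

def hash_password(password, salt=None):
    # Salted, iterated hash; the iteration count here is only an illustrative choice
    salt = salt if salt is not None else secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def login(username, password):
    # Assumes a users table with BLOB columns salt and password_hash
    conn = sqlite3.connect("users.db")
    cursor = conn.cursor()
    # Parameterized query: the driver handles escaping, closing the injection hole
    cursor.execute("SELECT salt, password_hash FROM users WHERE username = ?", (username,))
    row = cursor.fetchone()
    conn.close()
    if row is None:
        return False
    _, digest = hash_password(password, salt=row[0])
    return secrets.compare_digest(digest, row[1])

username = input("Username: ")
password = getpass("Password: ")   # password is not echoed back
print("Login successful" if login(username, password) else "Login failed")

Even a sketch like this still needs an expert to decide whether the iteration count, the salt storage, and the rest of the configuration are right for the situation, which is exactly the kind of judgment call an experienced person has to make.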

The risk is that a novice developer could ask a simple question, take what ChatGPT generates, and use it because it will run.

If you use generative AI without knowing what good looks like, then how do you know what to trust? It is like using spellcheck in a foreign language and not knowing which word to choose.

In the parable above, no one knows what good looks like. The scenario gets increasingly complex as a variety of disconnected AI prompts are used to create product pricing, and slightly different prompts and AI engines make choices based on different goals. You can ask an AI engine how it made some of its decisions and then have an expert audit that reasoning. The trick is that it won’t really know how it came up with the answer; it will generate a response based on how most people would answer the question “What caused you to make that decision?”

We think we are getting HAL from “2001: A Space Odyssey”, but instead we get Michael Scott from “The Office” in HAL’s clothes.

Giving Knowledge to our Competitors 

Not only can we not explicitly trust what AI creates, we also shouldn’t trust what it does with our data. Most companies are careful with what data they expose to the market and where they store it. With suppliers and customers, it is common to put Non-Disclosure Agreements in place before working together, to legally protect shared information. Most generative AI tools learn by doing, and if they are installed inside your company, where are they storing and using your data? AI vendors are offering various data inclusion and exclusion models. Ensuring that company data stays with the company, and doesn’t enrich the AI vendor’s models or your competitors’ abilities, is a key factor in bringing AI into a firm.

The possible upside of this AI data ingestion is the chance to create a neural network model of your own company. This means practicing good knowledge management hygiene and storing your current company data in a format that AI can use to build a model. The model could provide unique insights into your company, such as its culture, undiscovered experts, and new opportunities.

Ideally, you could audit the AI neural network to see what it has learned. Currently this is not possible, and some think it may never be. Without a way to check where something came from or where the information is going, the biggest problem becomes trust.

Final Thoughts

What we have always needed are experts we can trust. The need for experts is going to keep growing, because someone has to sort through, fix, and verify AI-generated content. At the same time, the need for novices will diminish. The long-term question is how we create experts. We get experts by starting with novices who create simple things, learn, and progress to more and more complex problems.

While AI can be a great companion for experts, it can be a tool for destruction in the hands of novices. We look at the functional output of ChatGPT and praise it for its ability. There was recently a news report that GPT-4 passed the Uniform Bar Exam. This doesn’t mean we can trust it. There are billboards full of people who passed the bar exam, but do we trust them only because of their ability to pass a test?

Lastly, when using something generated by AI, experts need to be transparent. We need to share where things came from. If something was generated by AI, then we need to attribute it to the AI, including the bot, its version, the prompts used, and so on. Finding the right combination of attribution is a start toward building trust around what AI creates. Additionally, we should all be careful about asking it to solve complex problems where the right answer isn’t knowable until it is too late.
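
There is no standard for what such an attribution should look like yet, but it could be as simple as a small structured record stored alongside the artifact. The field names below are purely illustrative:

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AIAttribution:
    # Illustrative fields only; adapt to whatever your organization decides to track
    tool: str            # e.g. "ChatGPT"
    model_version: str   # e.g. "gpt-4"
    prompts: list        # the exact prompts that produced the artifact
    reviewed_by: str     # the expert who validated the output before it was used
    generated_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

record = AIAttribution(
    tool="ChatGPT",
    model_version="gpt-4",
    prompts=["Write user login code in Python"],
    reviewed_by="steve@company.com",
)
print(record)

A record like this doesn’t make the output trustworthy by itself, but it gives an expert something concrete to audit.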
