
Microsoft ‘Cherry-picked’ Examples to Make its AI Seem Functional, Leaked Audio Revealed


According to a report by Business Insider, Microsoft “cherry-picked” examples of its generative AI’s output because the system would frequently "hallucinate" wrong responses.

The intel came from a leaked audio recording of an internal presentation on an early version of Microsoft’s Security Copilot, a ChatGPT-like artificial intelligence platform that Microsoft created to assist cybersecurity professionals.

The audio features a Microsoft researcher discussing the results of "threat hunter" testing, in which the AI examined a Windows security log for indications of potentially malicious behaviour.

"We had to cherry-pick a little bit to get an example that looked good because it would stray and because it's a stochastic model, it would give us different answers when we asked it the same questions," said Lloyd Greenwald, a Microsoft Security Partner giving the presentation, as quoted by BI.

"It wasn't that easy to get good answers," he added.

Security Copilot

Security Copilot, like any chatbot, allows users to enter a query into a chat window and receive responses in a conversational format, much like a customer-service chat. Security Copilot is largely built on OpenAI's GPT-4 large language model (LLM), which also powers Microsoft's other generative AI forays, such as the Bing Search assistant. Greenwald claims that these demonstrations were "initial explorations" of the possibilities of GPT-4 and that Microsoft was given early access to the technology.

Similar to Bing AI in its early days, whose responses were so ludicrous that it had to be "lobotomized," Security Copilot often "hallucinated" wrong answers in its early versions, an issue that appeared to be inherent to the technology. "Hallucination is a big problem with LLMs and there's a lot we do at Microsoft to try to eliminate hallucinations and part of that is grounding it with real data," Greenwald said in the audio, "but this is just taking the model without grounding it with any data."
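"Grounding" here generally means retrieving relevant real data and injecting it into the model's prompt so the answer is anchored to evidence rather than to the model's generic training data. The sketch below is purely illustrative and assumes nothing about Microsoft's actual implementation; all function names and log lines are hypothetical.

```python
# Illustrative sketch of grounding an LLM prompt with retrieved data.
# Without grounding, the model answers from its generic training data
# alone and may hallucinate; with grounding, the prompt carries the
# actual evidence the answer must be based on.

def retrieve_log_lines(logs, keyword):
    """Toy retriever: pull security-log lines mentioning the keyword."""
    return [line for line in logs if keyword.lower() in line.lower()]

def build_grounded_prompt(question, context_lines):
    """Prepend the retrieved evidence so the model answers from it."""
    context = "\n".join(context_lines)
    return (
        "Answer using ONLY the log lines below.\n"
        f"Logs:\n{context}\n\n"
        f"Question: {question}"
    )

logs = [
    "2023-11-02 09:14 LOGIN user=admin src=10.0.0.5",
    "2023-11-02 09:15 FAILED_LOGIN user=admin src=203.0.113.7",
    "2023-11-02 09:16 FILE_READ path=C:/payroll.xlsx",
]
evidence = retrieve_log_lines(logs, "failed_login")
prompt = build_grounded_prompt("Any suspicious sign-ins?", evidence)
```

The grounded prompt now contains the one relevant log line, so a model answering it has the real data in front of it instead of guessing.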

The LLM Microsoft used to build Security Copilot, GPT-4, was not trained on cybersecurity-specific data. Rather, it was used straight out of the box, relying solely on its massive generic dataset, which is standard practice.

Cherry on Top

Discussing other security-related queries, Greenwald revealed that "this is just what we demoed to the government."

However, it is unclear whether Microsoft used these “cherry-picked” examples in its demos to the government and other potential customers – or whether its researchers were upfront about how the examples were selected.

A spokeswoman for Microsoft told BI that "the technology discussed at the meeting was exploratory work that predated Security Copilot and was tested on simulations created from public data sets for the model evaluations," stating that "no customer data was used."