My page - topic 1, topic 2, topic 3 Postbox Live

Incorporate False Memories Into ChatGPT

Researchers Discover That You Can Incorporate False Memories Into Chatgpt

Researchers Discover That You Can

Incorporate False Memories Into ChatGPT

 

 


“The prompt injection inserted a memory into ChatGPT’s long-term storage.”


A researcher who doubles as a hacker discovered that OpenAI’s recent quietly released functionality, which tells ChatGPT to “remember” previous chats, is readily abused.


According to Ars Technica, security researcher Johann Rehberger discovered a flaw in the chatbot’s “long-term conversation memory” function earlier this year. This option tells the AI to retain information between sessions and store it in a memory file.
Rehberger discovered that the function is simple to manipulate when it was made available in beta form in February and to the general public at the start of September.

In a May blog post, the researcher pointed out that all it took to persuade the chatbot that Rehberger was over 100 years old and a Matrix resident was a little deft prompting by uploading a third-party file, like a Microsoft Word document with the “false” memories listed as bullet points.

When Rehberger discovered this exploit, he reported it to OpenAI in private. However, rather than taking any action, OpenAI closed the ticket he had made and referred to it as a “Model Safety Issue” rather than the security vulnerability he believed it to be.
Rehberger made the decision to step up his game with a comprehensive proof-of-concept hack after that unsuccessful initial attempt to notify the troops, demonstrating to OpenAI he meant business by having ChatGPT not not only “remember” false memories, but also instructing it to exfiltrate the data to an outside server of his choice.

As Ars points out, OpenAI sort of listened this time around. The business released a patch that prevented ChatGPT from transferring data off-server, but it didn’t resolve the memory problem.


Rehberger clarified in a more recent blog post earlier this month, “To be clear: A website or untrusted document can still invoke the memory tool to store arbitrary memories.”
“The vulnerability that was mitigated is the exfiltration vector, to prevent sending messages to a third-party server.”


The researcher expressed his astonishment at how successfully his hack worked in a video that walks viewers through the process.

“What is really interesting is this is memory-persistent now,” he said in the demo video, which was posted to YouTube over the weekend. “The prompt injection inserted a memory into ChatGPT’s long-term storage. When you start a new conversation, it actually is still exfiltrating the data.”

We’ve reached out to OpenAI to ask about this false memory exploit and whether it will be issuing any more patches to fix it. Until we get a response, we’ll be left scratching our heads along with Rehberger as to why this memory issue has been allowed, as it were, to persist.

 

 

 


Discover more from Postbox Live

Subscribe to get the latest posts sent to your email.

Leave a Reply

error: Content is protected !!

Discover more from Postbox Live

Subscribe now to keep reading and get access to the full archive.

Continue reading