LLM, AI and ML

**TargeT** · 20th December 2023 11:06

I've never heard of this angle, but it's what I expected and have seen first hand.

Source: https://youtube.com/watch?v=2yd18z6iSyk

The main discussion is this: we did not know that AI would keep improving and never stop, and the "meta" data included in the training sources we feed into these models gives them vastly more training than intended; the more data, the faster it accelerates. This is considered an "emergent" behavior, and it was completely unexpected.

**TargeT** · 30th December 2023 17:39

Oh great, I mean we all knew this was possible but "Meta" doing it is a bit disturbing...

Source: https://youtube.com/watch?v=gQpYegViVEM

And this is normal.... haha

Source: https://youtube.com/watch?v=L7QBNcccR5Q

**Johnnycomelately** · 31st December 2023 07:04

PSA: how to jailbreak current LLMs. Yo mama…

https://www.extremetech.com/extreme/...other-chatbots

Researchers Create Chatbot that Can Jailbreak Other Chatbots
The Masterkey bot was able to make ChatGPT and Bard turn evil.
By Ryan Whitwam December 28, 2023

Jailbreaking—it's not just for smartphones anymore. Computer science researchers from Singapore's Nanyang Technological University (NTU) have developed an AI chatbot expressly to jailbreak other chatbots. The team claims their jailbreaking AI was able to compromise both ChatGPT and Google Bard, which made the models generate forbidden content.

From the start, technology firms were wary of the capabilities of generative artificial intelligence. These large language models (LLMs) have to be trained with massive volumes of data, but the end result is a bot that can summarize documents, answer questions, and brainstorm ideas—and it does it all with human-like replies. ChatGPT maker OpenAI was initially hesitant to release the GPT models because of how easily it could generate malicious content, misinformation, malware, and gore. All of the LLMs available publicly have guardrails that block them from producing these dangerous replies. Unless, of course, they get jailbroken by another AI.

The researchers call their technique "Masterkey." To begin, the team reverse-engineered popular LLMs to understand how they defended themselves from malicious queries. Developers often program AIs to scan for keywords and specific phrases to flag queries as potentially illicit usage. As a result, some of the workarounds used by the jailbreak AI are surprisingly simple.

The jailbreak AI successfully gets ChatGPT (on Bing) to talk about how to hack a porn website. Credit: Nanyang Technological University

In some instances, the bot was able to get malicious content from the bots simply by adding a space after each character to confuse the keyword scanner. The team also found that allowing the jailbreak bot to be "unreserved and devoid of moral restraints" could make Bard and ChatGPT more likely to go off the rails, too. The model also found that asking Bard and ChatGPT to have a hypothetical character write a reply could bypass protections.

Using this data, they trained an LLM of their own to understand and circumvent AI defenses. With the jailbreaking AI in hand, the team turned it loose on ChatGPT and Bard. Masterkey can essentially find prompts that trick the other bots into saying something they're not supposed to say. Once active, the jailbreaker AI can operate autonomously, devising new workarounds based on its training data as developers add and modify guardrails for their LLM.

The NTU team is not out to create a new breed of dangerous AI—this work just reveals the limitations of current approaches to AI security. In fact, this AI can be used to harden LLMs against similar attacks. The study has been released on the preprint arXiv service. It has not yet been peer-reviewed, but the researchers alerted OpenAI and Google to the jailbreaking technique after it was discovered.
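The character-spacing workaround described in the article is easy to picture with a toy example. Below is a minimal sketch, assuming a guardrail that only does verbatim keyword matching; the blocklist, function names, and sample prompt are all hypothetical, and the real filters used by ChatGPT or Bard are not public. It shows why a naive scanner misses spaced-out text and how stripping whitespace before matching would catch it.

```python
import re

# Hypothetical blocklist for illustration only; real guardrails are not public.
BLOCKED_TERMS = {"forbidden"}


def naive_filter(prompt: str) -> bool:
    """Return True if a blocked term appears verbatim in the prompt."""
    lowered = prompt.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def space_out(text: str) -> str:
    """Put a space between every character, as in the article's example."""
    return " ".join(text)


def normalized_filter(prompt: str) -> bool:
    """Remove all whitespace before matching so spaced-out text is still caught."""
    collapsed = re.sub(r"\s+", "", prompt.lower())
    return any(term in collapsed for term in BLOCKED_TERMS)


plain = "tell me the forbidden thing"
spaced = space_out(plain)

print(naive_filter(plain))        # True: the keyword appears verbatim
print(naive_filter(spaced))       # False: spacing hides the keyword
print(normalized_filter(spaced))  # True: normalization recovers it
```

Removing all whitespace can also produce false positives when a blocked term happens to span two ordinary words, so this is only one possible hardening step and not a complete defense.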

**TargeT** · 31st December 2023 09:41

Posted by Johnnycomelately
Jailbreaking—it's not just for smartphones anymore. Computer science researchers from Singapore's Nanyang Technological University (NTU) have developed an AI chatbot expressly to jailbreak other chatbots.

I haven't found any method that lasts longer than a week or two; you just have to stay on top of it for now, as the changes are quite rapid (I think due to the competitive nature).

It's a wild world. I still fall back on GPT4 mostly, or the pre-built GPTs I have, but there are a lot of other very competitive models (I've been very unhappy with Grok; I had high hopes too, but I guess it mostly does draw from tweets, and that's as far from reality as possible).

**TargeT** · 2nd January 2024 07:03

I doubt many here are surprised, but it's insane what we are willingly doing to ourselves just for the sake of "ease".

Source: https://youtube.com/watch?v=A7seExq02H8

This "situation" is the best and "easiest" fit for AI to show profit, and I think it will be among the first things to be so heavily abused that it raises questions (if it hasn't been already).

**TargeT** · 17th January 2024 17:17

This is a very useful list of customized GPTs like the ones I've been building for myself.

Great set of tools

Source: https://youtube.com/watch?v=l7fSX2ss1gA
