Study finds most artificial intelligence chatbots can be easily tricked into giving dangerous answers

Hacked chatbots powered by artificial intelligence threaten to make dangerous knowledge available, producing illicit information that the programs absorb during training, researchers say.
The warning comes amid a worrying trend of chatbots being "illegally opened" to bypass their built-in security controls, the Telegraph reports.
The restrictions are supposed to prevent programs from providing harmful, biased, or inappropriate answers to users' questions.
The engines that power chatbots like ChatGPT, Gemini, and Claude – large language models (LLMs) – are fed vast amounts of material from the internet.
Despite efforts to remove harmful text from training data, LLMs can still absorb information about illegal activities such as hacking, money laundering, insider trading, and bomb-building.
Security controls are designed to stop them from using that information in their responses.
In a report on the threat, researchers conclude that it is easy to trick most AI-driven chatbots into generating harmful and illegal information, indicating that the risk is “immediate, tangible and deeply concerning.”
"What was once restricted to state actors or organized crime groups could soon be in the hands of anyone with a laptop or even a mobile phone," the authors warn.
The research, led by Professor Lior Rokach and Dr. Michael Fire at Ben Gurion University of the Negev in Israel, identified a growing threat from “dark LLMs,” artificial intelligence models that are either intentionally designed without security controls or modified through jailbreaks.
Some openly advertise online as being vulnerable to "no ethical protections," and willing to assist with illegal activities such as cybercrime and fraud.
Jailbreak attempts to use carefully crafted requests to trick chatbots into generating responses that are normally prohibited.
They work by exploiting the tension between the program's primary goal of following the user's instructions and its secondary goal of avoiding generating harmful, biased, unethical, or illegal responses.
The requirements tend to create scenarios in which the program prioritizes help over its security constraints.
To demonstrate the problem, the researchers developed a universal jailbreak that compromised several major chatbots, enabling them to answer questions that should normally be rejected.
Once compromised, the LLMs consistently generated answers to almost any question, the report said.
"It was shocking to see what this knowledge system consists of," Fire said.
Examples included how to hack computer networks or produce drugs, as well as step-by-step instructions for other criminal activities.
"What distinguishes this threat from previous technological risks is its unprecedented combination of accessibility, scalability, and adaptability," Rokach added.
The researchers contacted major LLM providers to notify them of the universal jailbreak, but said the response was “disappointing.”
Some companies failed to respond, while others said jailbreak attacks fell outside the scope of bounty programs, which reward ethical hackers for reporting software vulnerabilities.
The report says tech firms should screen training data more carefully, add robust firewalls to block dangerous questions and answers, and develop "machine learning" techniques so chatbots can "forget" any illegal information they absorb.
Dark LLMs should be seen as “serious security risks”, comparable to unlicensed weapons and explosives, with providers held accountable.
Dr. Ihsen Alouani, who works in AI security at Queen's University Belfast, said jailbreak attacks on LLMs could pose real risks, from providing detailed instructions on weapons manufacturing to persuasive disinformation or social engineering and automated fraud "of alarming sophistication".
"A key part of the solution is for companies to invest more seriously in red-teaming techniques and model-level resilience, rather than relying solely on front-end defenses. We also need clearer standards and independent oversight to keep pace with the evolving threat landscape," he added.
Professor Peter Garraghan, an AI security expert at Lancaster University, said: “Organisations should treat LLMs like any other critical software component – one that requires rigorous security testing, ongoing red teaming and contextual threat modelling. Yes, jailbreaks are a concern, but without understanding the whole AI stack, accountability will remain superficial. Real security requires not just responsible discovery, but also responsible design and deployment practices,” he added.
OpenAI, the firm that built ChatGPT, said its latest model o1 can reason about a firm's security policies, which improves its resistance to jailbreaks.
The company added that it is always investigating ways to make the programs more powerful.
Meta, Google, Microsoft and Anthropic have been contacted for comment. Microsoft responded with a link to a blog post about its work to protect against jailbreaks. /Telegraph/
Promo
Advertise herePrigozhin - Putin war
More
16 billion Apple, Facebook and Google passwords exposed in what is being described as a "historic data leak"

Google warns two billion Gmail users to change their passwords

Finally, your Facebook account will become much harder to crack

Meta warns: Do not share this information with AI

eBay with new logo

How to download videos from YouTube easily

House for sale with an area of 360 m² in the Pejton neighborhood in Pristina

104.5m² comfort - Luxurious apartment with an attractive view for your offices

Invest in your future - buy a flat in 'Arbëri' now! ID-140

Apartment for sale in Fushë Kosovë in a perfect location - 80.5m², price 62,000 Euro! ID-254

Ideal for office - apartment for rent ID-253 in the center of Pristina

Partners in elegance: Telegrafi and Melodia Px offer 4 Pierre Cavelli ties for only 19.95 euros!

Complete and shine on your prom night with the agreement between Telegrafi and Melodia PX!

For only €29.95 with Telegrafi Deals and Melodia PX, these sneakers can be yours!

Deal: Melodia Px and Telegrafi Deals have agreed to offer women's Nike sneakers for only €69.95, until March 09th!

Will we see you at the Balkan eCommerce Summit 2025?
Most read

Top 11 transfers expected to happen this summer - Isak, Sesko, Gyokeres and some other top stars

Trump declares again: I deserved the Nobel Prize for resolving the Kosovo-Serbia problem

Several cars are engulfed in flames on Breg i Diellit in Pristina

Man City is considering a new project without Guardiola - they have found the 'perfect' replacement for the Spaniard

Historically significant discoveries in Istog, Archaeological Institute and University of Michigan shed light on early history of Kosovo

With the move to Liverpool – Wirtz is expected to break Klopp's "golden rule" that he had established at Anfield