Breaking LLM Safety: Jailbreaks, Agents, and Threats
Hosted on Luma
Friday, January 30, 2026
Artificial Intelligence, Cybersecurity, Machine Learning, Natural Language Processing, Social Impact
Event Type: In person
Participants: 49
Est. Projects: 4
Organizers
Alex Johnson
alex@example.org
Jamie Rivera
jamie@example.org
Sam Chen
sam@example.org
Quality Score
72/100
High confidence
Organiser: 16/20
Event Maturity: 14/20
Sponsors: 18/25
Participants: 12/20
Operations: 12/15
Why this score
Strong organiser track record
Returning event
Well-sponsored
Missing data
Prize details
Code of conduct

Modern LLM security failures are not edge cases; they emerge from how these systems are trained, scaled, and deployed. The talk breaks down well-known jailbreaks and real-world attacks to show how safety guardrails are bypassed in practice. It explores how agentic systems and MCP expand the attack surface beyond prompts into memory, tools, and orchestration. We will examine the patterns behind emergent misalignment and why well-behaved models can fail in unexpected ways. Finally, the session covers concrete engineering strategies to anticipate, detect, and mitigate these threats before they scale.

About the speaker: Mohammed Arsalan is a generative AI consultant, prompt engineer, and AI safety researcher with over five years of experience at the intersection of LLM security, NLP, and adversarial AI. Based in Bengaluru, he is a multiple-time global hackathon winner across platforms such as Hugging Face, Cohere, Adobe, and MachineHack, and a Microsoft AI Startup selectee. He actively shares research and real-world insights on emerging generative AI attacks with a large professional audience. www.linkedin.com/in/sallu-mandya/

To attend online:
Add to calendar: https://bit.ly/4t73B1c
Gmeet link: https://meet.google.com/uiq-zbyc-ubw?hs=122&authuser=0