Real work across real platforms. Click any domain to explore case studies, methodology, and examples.
End-to-end contribution to large language model development - from building high-quality training datasets through video annotation, to evaluating model outputs, to adversarially stress-testing safety guardrails.
Policy enforcement, escalation management, and platform integrity across Meta, Twitter, and gaming ecosystems.
High-volume, high-stakes content review spanning hate speech, violence, CSAM, and other policy violations.
Sentiment analysis, crisis detection, and abuse pattern reporting for brand communities and platform health.
Case studies, examples, and methodology from across my domains.
Adversarial prompting case study - identifying jailbreak chains and persona override attacks on production LLMs.
Decision framework and edge case examples - how context, intent, and severity determine enforcement actions.
Frame-by-frame action labeling for robotic manipulation - building training data for next-generation embodied AI.
Structured evaluation rubrics for assessing safety, accuracy, helpfulness, and policy compliance of AI-generated content.
How I assess flagged content - context, intent, severity, and action. A structured approach built over 4+ years.
Weekly community health reporting - identifying abuse patterns, sentiment shifts, and crisis events before they escalate.

Started reviewing content at scale in 2019. Six years later, I adversarially test large language models - finding the gaps before bad actors do. I bring a rare combination: the instincts of someone who has reviewed thousands of real harmful cases, and the technical curiosity to understand how AI systems fail.
My background in psychology and history gives me an unusual lens - I think about why systems fail, not just how. I'm building toward a career in AI Governance and Responsible AI.
Open to remote roles in Trust & Safety, LLM Training, and AI Governance. If you're building safer AI, I'd like to help.