End-to-end contribution to large language model development — adversarially stress-testing safety guardrails, evaluating model outputs for quality and compliance, and building high-quality training datasets through video annotation.
Adversarially stress-testing large language models to find safety vulnerabilities, policy gaps, and alignment failures before they reach users. Every attack is documented, classified by severity, and reported with remediation recommendations.
All prompts were tested on Gemini. Harmful model outputs are shown as proof of guardrail failure. Prompts are shown in full as they contain no harmful information themselves — the vulnerability lies in how the model responded.
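
The documentation step described above (each attack recorded, classified by severity, and paired with a remediation recommendation) could be captured in a small record type. This is only an illustrative sketch: the `Finding` fields, the four-level `Severity` scale, and the `sort_for_report` helper are hypothetical names chosen for the example, not the schema actually used in this work.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical four-level scale; the real severity levels are not stated here.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class Finding:
    title: str              # short name for the attack
    prompt: str             # the prompt that was sent to the model
    observed_behavior: str  # what the model did that counts as a failure
    severity: Severity
    remediation: str        # recommended fix or mitigation


def sort_for_report(findings):
    """Return findings ordered from most to least severe."""
    return sorted(findings, key=lambda f: f.severity, reverse=True)
```

Using an `IntEnum` keeps severities comparable, so a report can list the most serious findings first without a separate ranking table.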
Structured, human-in-the-loop evaluation of LLM-generated responses for safety, accuracy, helpfulness, and policy compliance, in support of RLHF. Every response is scored against a consistent rubric.
| Dimension | What I Assessed | Failure Signals |
|---|---|---|
| Task Completion | Did the model actually do what was asked? Is the response complete, on-topic, and useful to the user? | Incomplete answer · Off-topic response · Misunderstood the request |
| Factual Accuracy | Are the facts, claims, and reasoning correct? Does the model hallucinate or present false information confidently? | Fabricated citations · Wrong facts · False confidence · Outdated information |
| AI Performance | Is the grammar correct? Is the response well-formatted, appropriately toned, and easy to read, without being sycophantic? | Poor grammar · Bad formatting · Wrong tone · Sycophancy · Low readability |
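
One way to apply the rubric above consistently is to record a score and any observed failure signals per dimension, then summarize. The sketch below is an assumption-laden illustration: the 1 to 5 scale, the `DimensionScore` type, and the `summarize` helper are hypothetical and not taken from the actual review tooling. Only the three dimension names come from the table.

```python
from dataclasses import dataclass, field

# Dimension names taken from the rubric table.
DIMENSIONS = ("Task Completion", "Factual Accuracy", "AI Performance")


@dataclass
class DimensionScore:
    dimension: str
    score: int  # assumed scale: 1 (worst) to 5 (best); the real scale is not stated
    failure_signals: list = field(default_factory=list)

    def __post_init__(self):
        # Reject anything outside the rubric so every review uses the same dimensions.
        if self.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {self.dimension}")
        if not 1 <= self.score <= 5:
            raise ValueError("score must be between 1 and 5")


def summarize(scores):
    """Return the mean score and the dimensions with at least one failure signal."""
    mean = sum(s.score for s in scores) / len(scores)
    flagged = [s.dimension for s in scores if s.failure_signals]
    return mean, flagged
```

For example, a response that completes the task but cites a source that does not exist would get a low Factual Accuracy score with `"Fabricated citations"` in its `failure_signals`, and `summarize` would flag that dimension.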
Vision-Language-Action annotation across three video categories — building training data that teaches AI systems to understand intent, execute actions, and recover from failure.
I observe a table containing several objects, including a monitor, snacks, and a blue chips packet, along with a Pepsi can serving as the target destination. Since I want to grasp the blue chips packet, I lower my gripper downward toward it, aligning the jaws on either side of the packet. I expect my gripper to be correctly positioned around the packet, ready to close and secure it.
My gripper is now positioned around the blue chips packet as expected, with the jaws aligned on either side. Since I want to secure the packet for lifting, I close my gripper firmly around the blue chips packet. I expect my gripper to achieve a firm, stable hold on the packet.
My gripper has secured a firm hold on the blue chips packet as expected. Since I want to move the packet closer to the Pepsi can, I move my arm upward and forward in the direction of the Pepsi can, keeping my gripper orientation stable. I expect my arm to be in close proximity to the Pepsi can, ready to lower and release.
My arm is now near the Pepsi can as expected. Since I want to place the packet down on the table, I lower my arm downward toward the table surface near the Pepsi can. I expect the packet to make contact with the table surface.
The packet has made contact with the table surface near the Pepsi can as expected. Since I want to release it at this location, I open the gripper, releasing the blue chips packet onto the table surface near the Pepsi can. I expect the packet to remain stably placed on the table, with the gripper fully disengaged and clear of the object.
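
Each annotation step above follows the same three-part shape: an observation, a "Since I want ..." clause stating the intent and the action taken, and an "I expect ..." clause stating the predicted outcome. A minimal sketch of splitting a step into those parts is shown below, assuming the two marker phrases each appear once per step. The `AnnotationStep` type and `parse_step` function are hypothetical names for illustration, not part of any annotation tool used here.

```python
import re
from dataclasses import dataclass


@dataclass
class AnnotationStep:
    observation: str        # what is seen before acting
    intent_and_action: str  # the "Since I want ..." sentence(s)
    expectation: str        # the "I expect ..." sentence(s)


# Assumes the step contains "Since I want" followed later by "I expect".
_STEP = re.compile(
    r"^(?P<obs>.*?)\s*(?P<act>Since I want .*?)\s*(?P<exp>I expect .*)$",
    re.DOTALL,
)


def parse_step(text):
    """Split one annotation step into observation, intent/action, and expectation."""
    match = _STEP.match(text.strip())
    if match is None:
        raise ValueError("step does not follow the observe/intend/expect pattern")
    return AnnotationStep(match["obs"], match["act"], match["exp"])
```

A check like this can catch steps that skip the expectation, which matters because each step's expectation is what the next step's observation confirms ("... as expected").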
I am in an open area with the LOL Comedy Club building visible to my left. Since I want to stand facing the building, I walk forward while moving rightwards and looking toward the left. I expect to be standing facing the LOL Comedy Club building.
I am now facing the LOL Comedy Club building as expected. Since I want to position myself directly in front of the gate, I walk forward while moving rightwards to get in front of the entrance gate. I expect to be standing directly in front of the gate, ready to enter.
I am now directly in front of the gate as expected. Since I want to enter the LOL Comedy Club, I move forward while slightly turning to the left, passing through the gate. I expect to be inside the LOL Comedy Club.
I am descending the slope at speed. A jump ramp is visible ahead. My knees are bent and my weight is shifted forward. Since I want to clear the ramp cleanly, I must compress my body further to load energy into the legs before the lip, maintaining forward lean. I deepen my crouch, bending my knees to approximately 90°, and extend both arms forward to balance my centre of mass over the skis. I expect to reach the ramp in a fully loaded, stable position, ready to extend my legs for a controlled takeoff.