I am a researcher specialising in reinforcement learning, multi-agent systems, and large language models. My work focuses on applying these methods in the medical and financial domains, designing multi-agent frameworks, improving foundation models, and leading various projects. I also led the team behind Laila, a fine-tuned version of Llama 3.1 (405B, 70B, and 8B) that assists biologists by interfacing with real lab equipment. Currently, I am conducting fundamental AI research in large-scale training and reinforcement learning.
Working on fundamental AI research in reinforcement learning and large-scale training.
I helped design Mava, a high-performance multi-agent reinforcement learning framework. I also led the team behind Laila, a fine-tuned version of Llama 3.1 designed to assist biologists by interfacing with real lab equipment. Additionally, I led a team in developing an assistant that autonomously experiments with internal repositories to discover code improvements, optimising for downstream performance metrics.
Conducted research on cutting-edge machine learning algorithms for biomedical applications, including optical character recognition, computer vision, and 3D body tracking, during three separate internship periods.
Conducted research on various machine learning recommender systems, culminating in a presentation on their potential applications within the company.
Doctorate focusing on reinforcement learning in multi-agent systems, completed under the supervision of Prof. Willie Brink, Prof. Herman A. Engelbrecht, and Dr Arnu Pretorius.
Graduated with a Cum Laude award. Research project focused on integrating deep neural networks with probabilistic models, supervised by Prof. Johan du Preez.
Graduated with multiple Cum Laude awards for academic excellence.
For my final six months of undergraduate engineering studies, I developed a test grading system for the Applied Mathematics Department to replace its costly, custom multiple-choice templates. The system automatically detects handwritten student numbers, surnames, numeric responses, and multiple-choice answers, built around a core data-driven machine learning pipeline. The most challenging aspect of the project was setting up a scalable and robust data pipeline: it needed to be resilient to human error, which can significantly degrade model performance, while scaling to handle the entire university's workload. I received the Magna Cum Laude award for this work, and the system was adopted by the department and later by the entire university. Over the years, I have continued to improve the system across multiple versions:
Version 1 (2016-2017): Used a Radon transform to locate the answer-sheet template. Probabilistic graphical models (PGMs) interpreted the filled-in bubbles, and a CNN identified handwritten digits. Read the report and learn more about PGMs.
Version 2 (2018-2019): A website was developed, allowing lecturers to upload tests and automatically grade them. However, manual checks were still necessary for uncertain cases.
Version 3 (2020-2023): A complete overhaul introduced QR codes for accurate positioning and a CNN-Transformer architecture for interpreting handwritten answers. The system continuously learns from lecturer corrections, improving its accuracy over time. The AutoGrade website is available here.
Version 4 (2024): The AutoGrader system has now been adopted by the entire university, a significant milestone in its development. Ongoing work focuses on incorporating new capabilities from the latest open-source foundation vision models, further improving the system's accuracy and versatility.
A project series exploring curiosity-driven reinforcement learning methods. The series focuses on building RL agents without explicit reward functions:
Curious Agents: An Introduction: In the first blog post, I provide background motivation for curiosity-based exploration in RL.
Curious Agents II: Solving MountainCar without Rewards: This post demonstrates training an agent to solve MountainCar without providing external rewards.
Curious Agents III: BYOL-Explore: Next in the series, I implement DeepMind's BYOL-Explore and demonstrate its effectiveness on JAX-based environments.
Curious Agents IV: BYOL-Hindsight: The latest post discusses BYOL-Hindsight, which addresses limitations in previous curiosity-based algorithms. I tested the algorithm in a custom 2D Minecraft-like environment.
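The core idea running through the series, rewarding an agent in proportion to the prediction error of a learned world model, can be sketched in a few lines. This is an illustrative toy (linear model, hypothetical dimensions); BYOL-Explore and BYOL-Hindsight predict in a learned latent space rather than raw observation space:

```python
import numpy as np

obs_dim, act_dim, lr = 3, 1, 0.1

# Toy linear dynamics model: predicts the next observation from (obs, action).
W = np.zeros((obs_dim + act_dim, obs_dim))

def intrinsic_reward(obs, act, next_obs):
    """Curiosity signal: the model's squared prediction error.
    High where the dynamics are still surprising, low once they are learned."""
    global W
    x = np.concatenate([obs, act])
    err = next_obs - x @ W
    W += lr * np.outer(x, err)  # one SGD step on the prediction loss
    return float(np.sum(err ** 2))

# On a fixed, learnable transition the curiosity reward decays toward zero:
obs, act, nxt = np.ones(obs_dim), np.array([0.5]), np.array([1.0, -1.0, 0.5])
rewards = [intrinsic_reward(obs, act, nxt) for _ in range(50)]
print(rewards[0] > rewards[-1])  # curiosity fades as the transition becomes predictable
```

This decay is exactly why such agents can solve tasks like MountainCar without external rewards: already-predictable transitions stop paying out, pushing the agent toward unfamiliar states.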