About

I am a Data Science student at the University of Melbourne with a focus on risk analysis, fraud detection, and behavioural analytics. My work centres on understanding how systems fail, how data can reveal manipulation, and how complex environments can be modelled and analysed.

I am particularly interested in financial crime, cybersecurity, and social systems where individual behaviour aggregates into measurable patterns.

Projects

Fraud Detection and Network Forensics (Ongoing)

Python | Statistical Modelling | Network Analysis

Building a fraud detection and transaction network forensics system using the Elliptic Bitcoin dataset, which contains labelled transaction data (licit vs illicit) together with engineered temporal and structural features. The dataset is modelled as a directed transaction graph, where nodes represent entities and edges represent transfers, allowing analysis of how funds propagate through the network over time.

The project combines statistical modelling and network analysis to identify suspicious behaviour at both the node and subgraph level. Using Python with libraries such as pandas and NumPy for data handling, scikit-learn for classification models, and NetworkX for graph construction and analysis, the work focuses on detecting anomalous transaction patterns, clustering behaviour, and structural signatures associated with illicit activity.
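The graph construction and centrality analysis described above can be sketched minimally with NetworkX. The edge list below is a toy placeholder, not the actual Elliptic schema, which uses anonymised transaction IDs and engineered features.

```python
import networkx as nx

# Illustrative edge list of (sender, receiver) transfers.
# Placeholder values only -- the real dataset provides anonymised IDs.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "a"), ("e", "c")]

G = nx.DiGraph()
G.add_edges_from(edges)

# Centrality measures serve as structural risk signals: nodes that
# many flows pass through, or that receive funds from many sources,
# stand out from the background.
in_degree = nx.in_degree_centrality(G)
betweenness = nx.betweenness_centrality(G)

# Rank nodes by betweenness to surface potential "hub" entities.
ranked = sorted(betweenness, key=betweenness.get, reverse=True)
print(ranked[:3])
```

In the toy graph, node `c` sits on most shortest paths, so it surfaces first; on the real transaction graph the same ranking highlights entities that funds disproportionately route through.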

Key objectives include:

  • Identifying high-risk nodes using supervised models trained on known illicit transactions
  • Detecting anomalous patterns through feature distributions, temporal shifts, and network centrality measures
  • Analysing transaction flows to uncover coordinated behaviour, layering patterns, and potential laundering structures
  • Exploring how fraud manifests within local neighbourhoods of the graph rather than isolated transactions

The expectation is that fraudulent activity will not appear as random noise, but as structured behaviour within the network, such as tightly connected clusters, unusual flow patterns, or nodes with disproportionate influence or connectivity. The project aims to bridge statistical inference with graph-based reasoning to better understand how illicit financial behaviour emerges and propagates.
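The supervised side of the pipeline can be sketched as follows. The features here are synthetic stand-ins for Elliptic's engineered node features, with illicit nodes drawn from a shifted distribution purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for engineered node features:
# illicit nodes drawn from a shifted distribution.
n_licit, n_illicit, n_features = 400, 60, 8
X_licit = rng.normal(0.0, 1.0, size=(n_licit, n_features))
X_illicit = rng.normal(1.5, 1.0, size=(n_illicit, n_features))
X = np.vstack([X_licit, X_illicit])
y = np.array([0] * n_licit + [1] * n_illicit)  # 1 = illicit

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Rank test nodes by predicted illicit probability (high-risk first).
risk = clf.predict_proba(X_test)[:, 1]
print("test accuracy:", clf.score(X_test, y_test))
```

Stratified splitting matters here because the class imbalance (few illicit nodes) mirrors the real dataset; the probability ranking, rather than the hard labels, is what feeds a risk review queue.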

Quantum Computing in Bioinformatics (Internship)

Qiskit | Optimisation | Scientific Computing | Research

Completed a research internship at the Walter and Eliza Hall Institute of Medical Research, focusing on improving the usability and correctness of a quantum protein-folding model implemented in Qiskit. The work addressed technical and structural issues in a codebase handed down across multiple interns, where poor documentation and incorrect parameterisation limited practical use of the model.

The onboarding process was redesigned by replacing an extensive set of academic readings with a structured glossary and streamlined documentation, allowing new contributors to understand key concepts and begin working with the model significantly faster. In parallel, the implementation was refactored through detailed inline commentary, clarifying theoretical background, modelling assumptions, and execution flow directly within the code.

Key components included:

  • Designing a simplified onboarding pathway by consolidating essential concepts into a concise reference document, reducing reliance on external academic sources
  • Refactoring a Jupyter notebook implementation with structured inline documentation to improve readability and accessibility for users without prior quantum computing experience
  • Diagnosing failures in the Hamiltonian formulation, where constraint penalties (e.g. backtracking and long-range folding) were incorrectly scaled, preventing meaningful interaction between amino acids
  • Rescaling constraint terms to restore valid interaction dynamics and enable the optimiser to explore physically meaningful folding configurations
  • Replacing gradient-based optimisation with a non-derivative method better suited to the discrete and non-linear structure of the problem space
  • Testing the corrected model on simulated configurations to confirm functional behaviour while identifying remaining computational inefficiencies
  • Migrating the project from a Google Drive notebook to a structured GitHub repository, introducing version control and improving reproducibility
  • Producing a comprehensive technical report and handover documentation to enable future contributors to quickly understand and extend the model
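The penalty-scaling and optimiser-swap issues above can be illustrated generically. This toy objective uses SciPy's COBYLA as a stand-in for the derivative-free optimiser actually used in the Qiskit workflow; the energy terms and penalty weight are invented for illustration, not taken from the folding model.

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for a penalised Hamiltonian-style objective: an
# "energy" term plus a soft constraint penalty. If the penalty
# weight is scaled far too high, it swamps the energy landscape
# and the optimiser never explores meaningful configurations --
# the same failure mode as the mis-scaled folding constraints.
def objective(x, penalty_weight):
    energy = (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2
    constraint = max(0.0, x[0] + x[1] - 2.0) ** 2
    return energy + penalty_weight * constraint

# Derivative-free optimisation (COBYLA), suited to objectives that
# are non-smooth or noisy, where gradient estimates are unreliable.
result = minimize(
    objective, x0=np.zeros(2), args=(10.0,), method="COBYLA"
)
print(result.x)
```

With a sensibly scaled penalty the optimiser settles near the unconstrained energy minimum at (1.0, -0.5), where the constraint is inactive; blowing the penalty weight up by orders of magnitude would flatten the energy term into numerical noise.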

The project emphasised practical problem-solving in complex optimisation systems, combining debugging, model correction, and documentation improvements to transform an unusable research prototype into a functional and maintainable framework for future work.

Where2 (Co-Founder & Developer)

React Native | Backend Systems | Supabase | Mobile Development

Co-founded and currently co-developing Where2, a mobile application designed to support group coordination through shared itineraries, messaging, and collaborative decision-making. The project spans the full product lifecycle, from initial concept and validation through to system design, implementation, and early-stage deployment.

Designed and implemented a multi-user system where plans, locations, and user interactions are synchronised in real time across devices. The application is built in React Native, with a backend architecture using Supabase to manage persistent state, authentication, and real-time updates across distributed users.

Key components included:

  • Designing a relational database schema to model users, plans, locations, and social relationships, including following systems and access control for shared plans
  • Implementing backend services for user identity, authentication, and state management across concurrent users interacting within the same plan
  • Handling real-time synchronisation of plan updates, messaging, and location data across multiple clients, ensuring consistency of shared state
  • Structuring application logic to manage user interactions such as invitations, following relationships, and collaborative editing of itineraries
  • Developing front-end interfaces in React Native based on iterative Figma prototypes, with emphasis on clarity of interaction and usability in multi-user workflows
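The relational model described above can be sketched in miniature. The production schema lives in Supabase (Postgres) with authentication and row-level security; the table and column names below are simplified stand-ins, shown here with SQLite for a self-contained illustration.

```python
import sqlite3

# Illustrative schema for users, plans, memberships, and follows.
# Names are simplified stand-ins for the actual Supabase schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT UNIQUE NOT NULL
);
CREATE TABLE plans (
    id INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL REFERENCES users(id),
    title TEXT NOT NULL
);
CREATE TABLE plan_members (          -- shared-plan access control
    plan_id INTEGER NOT NULL REFERENCES plans(id),
    user_id INTEGER NOT NULL REFERENCES users(id),
    role TEXT NOT NULL DEFAULT 'member',
    PRIMARY KEY (plan_id, user_id)
);
CREATE TABLE follows (               -- social following relation
    follower_id INTEGER NOT NULL REFERENCES users(id),
    followee_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (follower_id, followee_id)
);
""")

conn.execute("INSERT INTO users VALUES (1, 'ana'), (2, 'ben')")
conn.execute("INSERT INTO plans VALUES (1, 1, 'Weekend trip')")
conn.execute(
    "INSERT INTO plan_members VALUES (1, 1, 'owner'), (1, 2, 'member')"
)

# Access control as a join: which plans can user 2 see?
rows = conn.execute("""
    SELECT p.title FROM plans p
    JOIN plan_members m ON m.plan_id = p.id
    WHERE m.user_id = 2
""").fetchall()
print(rows)
```

Modelling membership as its own table, rather than a column on `plans`, is what makes per-plan roles and shared access straightforward to query and enforce.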

Beyond engineering, responsible for product direction and early-stage execution, including developing pitch materials, producing demo videos, and communicating the product to potential users and stakeholders. This involved framing the problem space, demonstrating functionality, and iterating on product positioning.

The project focuses on building a scalable, real-time system for coordinating group activity, balancing technical implementation with product viability. Emphasis is placed on managing shared state across users, designing robust backend structures, and ensuring consistent behaviour in a multi-user environment.

CSIRO Image2Biomass Datathon

Python | PyTorch | Computer Vision | Deep Learning

Developed an end-to-end image-based prediction pipeline to estimate pasture biomass from high-resolution aerial imagery as part of the CSIRO Image2Biomass datathon. The project focused on translating large, unstructured visual data into quantitative environmental measurements, modelling biomass as a regression problem across multiple targets including green, dry, and dead plant matter.

Raw wide-format images were preprocessed by splitting into square views and applying standardisation and normalisation, aligning the data with the input requirements of modern vision models and ensuring consistent feature extraction. A self-supervised Vision Transformer (DINO) was used as a feature extractor, with representations adapted for multi-output regression using PyTorch and optimised via mean squared error loss.
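The tiling and standardisation step can be sketched in NumPy. The tile size, image dimensions, and normalisation statistics below are illustrative; in the actual pipeline the resulting tiles fed a pretrained DINO Vision Transformer.

```python
import numpy as np

def tile_wide_image(img, tile):
    """Split a wide (H, W, C) image into square (tile, tile, C) views.

    Trailing columns that don't fill a complete tile are dropped;
    a real pipeline might pad or overlap instead -- this is a sketch.
    """
    h, w, c = img.shape
    assert tile <= h
    tiles = [img[:tile, x:x + tile] for x in range(0, w - tile + 1, tile)]
    return np.stack(tiles)

def normalise(tiles, mean, std):
    # Per-channel standardisation, as expected by pretrained vision models.
    return (tiles - mean) / std

rng = np.random.default_rng(0)
img = rng.random((224, 896, 3))            # one wide aerial frame
tiles = tile_wide_image(img, 224)          # four square 224x224 views
tiles = normalise(tiles, mean=0.5, std=0.25)
print(tiles.shape)
```

Splitting before feature extraction keeps each view at the resolution the backbone was trained on, so texture cues relevant to biomass are not lost to aggressive downscaling.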

Key components included:

  • Designing a preprocessing pipeline to transform large aerial images into model-compatible inputs while preserving spatial and textural information relevant to biomass estimation
  • Training multiple regression models on different data splits and combining predictions using weighted averaging to improve robustness and reduce overfitting
  • Implementing cross-validation to ensure generalisation under limited labelled data and evaluating model performance across multiple biomass targets
  • Optimising GPU usage through mixed-precision computation, controlled batch sizing, release of intermediate tensors, and disabling gradient tracking during inference to manage memory constraints efficiently
  • Applying post-processing adjustments to correct systematic prediction bias and improve alignment with observed biomass distributions

The project emphasised building a reliable prediction system under real-world constraints, balancing model performance, computational efficiency, and robustness to noisy data. Final outputs were structured for scientific evaluation and operational use, reflecting practical requirements in environmental monitoring and agricultural analytics.

Flood Simulation

QGIS | Flood Modelling | Spatial Analysis

Developed a geospatial flood modelling and risk assessment framework to analyse inundation patterns and evaluate mitigation strategies across flood-prone regions, including areas in Queensland and St Kilda. The project used GIS-based simulation to model flood behaviour under varying severity scenarios, treating flooding as a spatial system influenced by terrain, water flow, and environmental conditions rather than a static event.

Using tools such as ArcGIS, QGIS, GRASS GIS, and raster-based modelling techniques, the analysis simulated flood propagation across different landscapes and assessed how water levels and flow paths responded to structural interventions. Mitigation strategies including levees, temporary flood barriers, and sandbagging were incorporated into the model as physical constraints, allowing evaluation of their impact on flood extent, depth, and affected infrastructure.
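The raster approach can be reduced to a simplified "bathtub" connectivity model: a cell floods if it lies below the water level and is connected to the water source, and an intervention such as a levee is represented by raising terrain cells. The actual analysis used QGIS/GRASS hydrological tooling; the DEM and levee placement below are toy values.

```python
import numpy as np
from collections import deque

def inundate(dem, water_level, source):
    """Bathtub flood model: a cell floods if its elevation is below
    water_level AND it is 4-connected to the water source.
    Raising cells in `dem` (e.g. a levee line) blocks propagation."""
    flooded = np.zeros(dem.shape, dtype=bool)
    queue = deque([source])
    while queue:
        r, c = queue.popleft()
        if flooded[r, c] or dem[r, c] >= water_level:
            continue
        flooded[r, c] = True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < dem.shape[0] and 0 <= nc < dem.shape[1]:
                queue.append((nr, nc))
    return flooded

# Toy DEM: a low-lying plain with a river channel at column 0.
dem = np.ones((5, 6))
dem[:, 0] = 0.0                     # river channel
base = inundate(dem, water_level=1.5, source=(0, 0))

# Levee scenario: raise a wall of cells between river and plain.
dem_levee = dem.copy()
dem_levee[:, 1] = 5.0               # levee as raised terrain
mitigated = inundate(dem_levee, water_level=1.5, source=(0, 0))
print(base.sum(), mitigated.sum())
```

Comparing flooded-cell counts (or the infrastructure layers they intersect) across the baseline and intervention scenarios is what feeds the downstream cost-benefit analysis.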

Key components included:

  • Constructing raster-based flood models to simulate inundation under different severity levels and environmental conditions
  • Incorporating intervention scenarios by modifying terrain and flow constraints to represent flood protection measures
  • Analysing spatial impact in terms of affected regions, infrastructure exposure, and changes in flood extent under each mitigation strategy
  • Conducting a long-term cost analysis over a 50-year horizon, incorporating factors such as infrastructure durability, labour requirements, deployment logistics, and estimated damage to buildings and interiors
  • Accounting for worsening flood conditions under climate change scenarios to evaluate robustness of different mitigation approaches over time

The project integrated spatial modelling with economic reasoning to identify optimal flood mitigation strategies, providing recommendations on both the type and placement of interventions based on simulated outcomes and cost-benefit considerations. The objective was not only to model flood behaviour, but to support decision-making under uncertainty in a real-world risk management context.

Crash Data Analysis

Python | Data Analysis | Visualisation

Analysed a structured traffic incident dataset to explore how environmental and road conditions relate to accident severity. The dataset included features such as lighting conditions, speed limits, road type, intersection design, and time of day, allowing examination of how different contextual factors correlate with more severe outcomes.

The project focused on exploratory data analysis rather than predictive modelling, using Python with libraries such as pandas and NumPy for data manipulation, and matplotlib and seaborn for visualisation. This involved cleaning and structuring the dataset, examining distributions, and generating comparative visualisations to identify patterns across different conditions.
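The shape of the comparative analysis can be sketched with a tiny synthetic dataset; the column names and values below are illustrative, not the actual dataset's schema.

```python
import pandas as pd

# Tiny synthetic stand-in for the crash dataset; the real data has
# many more records and fields (road type, intersection design, ...).
df = pd.DataFrame({
    "lighting": ["day", "day", "night", "night", "night", "day",
                 "night", "day", "night", "day"],
    "speed_limit": [50, 60, 100, 60, 100, 50, 80, 100, 60, 80],
    "severity": ["minor", "minor", "serious", "minor", "serious",
                 "minor", "serious", "minor", "minor", "minor"],
})

# Proportion of each severity level under each lighting condition.
severity_by_light = pd.crosstab(
    df["lighting"], df["severity"], normalize="index"
)
print(severity_by_light)

# Combine features: severity counts by lighting and speed band.
df["speed_band"] = pd.cut(df["speed_limit"], bins=[0, 60, 100],
                          labels=["<=60", "61-100"])
combo = (
    df.groupby(["lighting", "speed_band"], observed=True)["severity"]
      .value_counts()
      .unstack(fill_value=0)
)
print(combo)
```

The row-normalised crosstab is what the bar charts visualise: within-condition severity proportions rather than raw counts, so over-represented conditions do not dominate the comparison.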

Key components included:

  • Exploring the distribution of accident severity across variables such as lighting, speed limits, and road type
  • Creating visual comparisons, such as bar charts and grouped plots, to identify differences in severity under varying conditions
  • Analysing how combinations of features, such as time of day and lighting conditions, relate to changes in observed outcomes
  • Identifying patterns and trends in the data that suggest higher-risk scenarios or environments

The project emphasised interpreting visual and descriptive patterns in the data to understand how accident severity varies across different contexts. Rather than building predictive models, the focus was on extracting meaningful insights from the data and communicating those findings clearly through visual analysis.

Skills

  • Programming & Development: Python, C, SQL, React Native, Supabase
  • Data Analysis & Machine Learning: pandas, NumPy, scikit-learn, PyTorch, statsmodels
  • Risk, Fraud & Security: Anomaly detection, transaction analysis, network investigation
  • Visualisation & Communication: Tableau, Power BI, matplotlib, seaborn

Contact

Open to internships and opportunities in data science, risk, and security.