Fraud Detection and Network Forensics (Ongoing)
Building a fraud detection and transaction network forensics system using the Elliptic Bitcoin dataset, which contains labelled transaction data (licit vs illicit, with many transactions left unlabelled) together with engineered temporal and structural features. The dataset is modelled as a directed transaction graph, where nodes represent transactions and edges represent flows of bitcoin between them, allowing analysis of how funds propagate through the network over time.
The project combines statistical modelling and network analysis to identify suspicious behaviour at both the node and subgraph level. It is written in Python, using pandas and NumPy for data handling, scikit-learn for classification models, and NetworkX for graph construction and analysis. The work focuses on detecting anomalous transaction patterns, clusters of related activity, and structural signatures associated with illicit activity.
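As a minimal sketch of the graph construction step, the snippet below builds a directed graph from an edge list with NetworkX and attaches labels as node attributes. The toy data, the column names (`src`, `dst`, `tx_id`, `label`) and the helper `build_transaction_graph` are hypothetical stand-ins for illustration, not the dataset's actual schema or the project's code.

```python
import networkx as nx
import pandas as pd

# Toy stand-in for an edge list: each row is a flow from one transaction to another.
# Column names are assumptions made for this example.
edges = pd.DataFrame(
    {"src": ["t1", "t1", "t2", "t3"], "dst": ["t2", "t3", "t4", "t4"]}
)

# Toy labels; "unknown" marks transactions without a ground-truth class.
labels = pd.DataFrame(
    {
        "tx_id": ["t1", "t2", "t3", "t4"],
        "label": ["licit", "illicit", "unknown", "illicit"],
    }
)


def build_transaction_graph(edges: pd.DataFrame, labels: pd.DataFrame) -> nx.DiGraph:
    """Build a directed graph and store each node's label as an attribute."""
    graph = nx.from_pandas_edgelist(
        edges, source="src", target="dst", create_using=nx.DiGraph
    )
    label_by_node = labels.set_index("tx_id")["label"].to_dict()
    nx.set_node_attributes(graph, label_by_node, name="label")
    return graph


graph = build_transaction_graph(edges, labels)

# Every transaction reachable by following flows forward from t1.
downstream = nx.descendants(graph, "t1")
```

Keeping the graph directed preserves the direction of flow, so reachability queries such as `nx.descendants` trace where funds could have gone rather than merely which transactions are connected.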
Key objectives include:
- Identifying high-risk nodes using supervised models trained on transactions with known licit and illicit labels
- Detecting anomalous patterns through feature distributions, temporal shifts, and network centrality measures
- Analysing transaction flows to uncover coordinated behaviour, layering patterns, and potential laundering structures
- Exploring how fraud manifests within local neighbourhoods of the graph rather than in isolated transactions
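The objectives above could be approached along the following lines. This is an illustrative sketch on a made-up six-node graph, not the project's pipeline: the helpers `structural_features` and `illicit_neighbour_fraction`, the choice of a random forest, and the particular centrality measures are all assumptions made for the example.

```python
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy graph; 1 = illicit, 0 = licit, node "f" is unlabelled.
graph = nx.DiGraph(
    [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c"), ("e", "c"), ("d", "f")]
)
labels = {"a": 0, "b": 1, "c": 1, "d": 0, "e": 0}


def structural_features(graph: nx.DiGraph) -> dict:
    """Per-node features: in-degree, out-degree, betweenness centrality."""
    betweenness = nx.betweenness_centrality(graph)
    return {
        node: [graph.in_degree(node), graph.out_degree(node), betweenness[node]]
        for node in graph
    }


def illicit_neighbour_fraction(graph: nx.DiGraph, labels: dict, node) -> float:
    """Share of a node's labelled direct neighbours that are illicit."""
    neighbours = set(graph.predecessors(node)) | set(graph.successors(node))
    known = [labels[n] for n in neighbours if n in labels]
    return sum(known) / len(known) if known else 0.0


features = structural_features(graph)

# Train only on labelled nodes, then score every node in the graph.
train_nodes = sorted(labels)
X_train = np.array([features[n] for n in train_nodes])
y_train = np.array([labels[n] for n in train_nodes])
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

all_nodes = list(graph.nodes)
X_all = np.array([features[n] for n in all_nodes])
# Column 1 of predict_proba corresponds to class 1 (illicit).
risk = dict(zip(all_nodes, model.predict_proba(X_all)[:, 1]))

neighbourhood_signal = {
    n: illicit_neighbour_fraction(graph, labels, n) for n in all_nodes
}
```

The neighbourhood fraction is deliberately kept separate from the classifier's features here. Feeding label-derived neighbourhood statistics into a supervised model would require computing them from training labels only, to avoid leaking test labels into the features.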
The expectation is that fraudulent activity will not appear as random noise, but as structured behaviour within the network, such as tightly connected clusters, unusual flow patterns, or nodes with disproportionate influence or connectivity. The project aims to bridge statistical inference with graph-based reasoning to better understand how illicit financial behaviour emerges and propagates.