Algorithm allowing the re-identification of Bitcoin users from heuristics based on the blockchain transaction history.
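To make the idea of a transaction-history heuristic concrete, here is a minimal sketch of one well-known example, the multi-input (common-input-ownership) heuristic, which treats all input addresses of a single transaction as belonging to the same user. Whether this project uses exactly this heuristic is an assumption; the `transactions` format with an `inputs` key and the names `UnionFind` and `cluster_addresses` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure with path compression."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        # Register unseen items as their own root.
        self.parent.setdefault(item, item)
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node directly at the root.
        while self.parent[item] != root:
            self.parent[item], item = root, self.parent[item]
        return root

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of one transaction.

    `transactions` is a hypothetical list of dicts, each with an "inputs"
    list of address strings. Returns a sorted list of sorted address lists.
    """
    uf = UnionFind()
    for tx in transactions:
        inputs = tx["inputs"]
        for address in inputs:
            uf.find(address)  # make sure single-input addresses are recorded
        for other in inputs[1:]:
            uf.union(inputs[0], other)

    clusters = defaultdict(set)
    for address in list(uf.parent):
        clusters[uf.find(address)].add(address)
    return sorted(sorted(members) for members in clusters.values())


example = [
    {"inputs": ["A", "B"]},
    {"inputs": ["B", "C"]},
    {"inputs": ["D"]},
]
print(cluster_addresses(example))
```

In this toy input, A and B are spent together, then B and C, so all three end up in one cluster while D stays alone. Real heuristics must also deal with cases such as CoinJoin transactions, where shared inputs do not imply shared ownership.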
-
Install the Python dependencies from the requirements file:
pip3 install -r requirements.txt
-
Set up Apache Spark; an instance will be launched by the PySpark executor
-
Set up Neo4j with the Neo4j Graph Algorithms plugin and the APOC Procedures plugin, then launch a Neo4j instance
-
Create a config file in /app in the following format:
databaseconfig.py: PySpark and Neo4j database configuration
#!/usr/bin/env python
pyspark = {
    'hdfs_path': '/path/to_json_file_or_directory',
    'memory': '2g'  # Allocated memory for Spark
}
neo4j = {
    'uri': 'bolt://localhost:7687',  # Bolt instance URI
    'user': 'neo4j',
    'password': 'password'
}
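Because a typo in this file would only surface once Spark or Neo4j is started, it can help to check the values first. The following is a minimal sketch of such a check, not part of the project: `validate_config` is a hypothetical helper, the required keys are taken from the example config above, the memory pattern assumes JVM-style size strings such as `512m` or `2g`, and the scheme list assumes the URI schemes accepted by current Neo4j drivers.

```python
import re
from urllib.parse import urlparse

# Same shape as the example databaseconfig.py above.
pyspark = {"hdfs_path": "/path/to_json_file_or_directory", "memory": "2g"}
neo4j = {"uri": "bolt://localhost:7687", "user": "neo4j", "password": "password"}

# Assumption: URI schemes understood by Neo4j drivers.
NEO4J_SCHEMES = {"bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc"}
# Assumption: JVM-style memory strings, e.g. "512m" or "2g".
MEMORY_PATTERN = re.compile(r"^\d+[kmgt]$")


def validate_config(pyspark_cfg, neo4j_cfg):
    """Return a list of problems found; an empty list means the config looks usable."""
    problems = []

    # Every key from the example config must be present and non-empty.
    for key in ("hdfs_path", "memory"):
        if not pyspark_cfg.get(key):
            problems.append(f"pyspark['{key}'] is missing or empty")
    for key in ("uri", "user", "password"):
        if not neo4j_cfg.get(key):
            problems.append(f"neo4j['{key}'] is missing or empty")

    memory = pyspark_cfg.get("memory", "")
    if memory and not MEMORY_PATTERN.match(memory.lower()):
        problems.append(f"pyspark['memory'] has an unexpected format: {memory!r}")

    uri = neo4j_cfg.get("uri", "")
    if uri and urlparse(uri).scheme not in NEO4J_SCHEMES:
        problems.append(f"neo4j['uri'] has an unexpected scheme: {uri!r}")

    return problems


for problem in validate_config(pyspark, neo4j):
    print(problem)
```

Returning a list of messages instead of raising on the first error lets a user fix several mistakes in one pass.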
-
Running main.py will populate the Neo4j graph database with the discovered user network:
python3 main.py
-
PySpark : https://spark.apache.org/docs/0.9.0/python-programming-guide.html
-
Neo4j : https://neo4j.com/
-
Neo4j Graph Algorithms plugin : https://github.com/neo4j-contrib/neo4j-graph-algorithms
-
Neo4j APOC Procedures plugin : https://github.com/neo4j-contrib/neo4j-apoc-procedures
-
Tracking bitcoin users activity using community detection on a network of weak signals - https://arxiv.org/abs/1710.08158
-
Community detection in networks: A user guide - https://arxiv.org/abs/1608.00163
-
The Unreasonable Effectiveness of Address Clustering - https://arxiv.org/abs/1605.06369