Bitcoin network analysis: Spring 2022


Introduction
This is the page of the Bitcoin Network analysis course, part of the Master 2 Finance Technology Data at Paris 1 Pantheon Sorbonne.
The class covers: 1) Fundamentals of network science, in particular network description, node centralities, community organisation, etc. 2) Network analysis of the Bitcoin blockchain data.

Objectives
This class is designed to provide an introduction to network science and its application to financial transactions in general, and cryptocurrencies in particular.

Overview of the course
The course consists of 18 hours of classes, combining lectures and practical sessions (TP).

Attending Online

For now, it seems that we will be able to have live classes. In case you are sick or a contact case ("cas contact"), I'll also stream the class on Zoom at the following address: Zoom link

Overview of the class

Below is an overview of the class, with the slides of the presentations. The class is organized in 3 full days.

Day 1

Morning
  1. Introduction Slides
  2. Describing Networks Slides
  3. Gephi 1 (Q1,2,3,4)
Afternoon
  1. Centralities Slides
  2. Gephi 2 (Q5)
  3. Python, data structure introduction/reminder: Online notebook
  4. Networkx: Start with the tutorial
  5. Networkx getting started: Exercises
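
As a warm-up for the Networkx sessions listed above, here is a minimal sketch of describing a network and computing two node centralities. The toy graph and its node names are made up for illustration; they do not come from the course datasets.

```python
import networkx as nx

# Toy undirected graph: a triangle (a, b, c) plus a chain c - d - e.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")])

# Basic description of the network.
n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()
density = nx.density(G)

# Two node centralities: normalized degree and betweenness.
degree_c = nx.degree_centrality(G)
betweenness_c = nx.betweenness_centrality(G)

# Rank nodes from most to least central according to betweenness.
ranking = sorted(betweenness_c, key=betweenness_c.get, reverse=True)
print(n_nodes, n_edges, density)
print(ranking)
```

Node "c" bridges the triangle and the chain, which is why betweenness ranks it first even though "a" and "b" are also well connected locally.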

Day 2

Morning: Bitcoin Network description
  1. Lecture PDF
  2. Pandas reminder
  3. Start analyzing data: pick a day and download the data from the Datasets section (older dates are simpler, with less data). Describe it (total transactions, amounts spent, evolution per hour, top actors, etc.). Then, transform it into a graph using nx.from_pandas_edgelist and try to visualize and analyze the graph.
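
The analysis step above could be sketched as follows. This is only one possible approach: the rows are synthetic stand-ins for a real daily file, and treating "time" as a Unix timestamp in seconds is an assumption that should be checked against the actual data.

```python
import networkx as nx
import pandas as pd

# Synthetic stand-in for one day of transactions (values are made up).
# Column names follow the dataset description; "time" is ASSUMED to be
# a Unix timestamp in seconds.
df = pd.DataFrame({
    "src_identity": ["A", "A", "B", "C", "A"],
    "dst_identity": ["B", "C", "C", "D", "B"],
    "value": [150_000_000, 20_000_000, 5_000_000, 70_000_000, 50_000_000],
    "time": [1420070400, 1420074000, 1420074600, 1420110000, 1420113600],
})

# Descriptive statistics.
n_transactions = len(df)
total_btc = df["value"].sum() / 1e8  # 1 BTC = 100,000,000 satoshi
hour = pd.to_datetime(df["time"], unit="s").dt.hour
tx_per_hour = df.groupby(hour).size()
top_senders = (
    df.groupby("src_identity")["value"].sum().sort_values(ascending=False)
)

# Aggregate repeated transactions so that each (source, destination) pair
# becomes a single weighted edge, then build a directed graph.
edges = df.groupby(["src_identity", "dst_identity"], as_index=False)["value"].sum()
G = nx.from_pandas_edgelist(
    edges,
    source="src_identity",
    target="dst_identity",
    edge_attr="value",
    create_using=nx.DiGraph,
)
print(n_transactions, total_btc, G.number_of_nodes(), G.number_of_edges())
```

Aggregating before building the graph is a design choice: a DiGraph keeps only one edge per pair, so without the groupby later transactions would overwrite earlier ones instead of being summed.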

Afternoon: Communities - Project
  1. Lecture: communities PDF
  2. Community Experiment - Notebook solution
  3. Lecture: Project PDF
  4. Start thinking about the project

Day 3

Morning: Machine Learning on Graphs: Exercises - PDF - Proposed solution (without guarantee!)
Afternoon: Project
demo_ML-BTC.ipynb

Datasets

Here are some network datasets to play with:
  • GOT: small network of interactions between characters in the Game of Thrones book series
  • airports: medium network of airline connections between airports
  • Top 100 known actors: Cumulated transactions between the 100 top known actors according to WalletExplorer
  • Transactions by day: each file corresponds to a day. The format is "parquet", so load it with pd.read_parquet("myfile"). You first need to install the package "pyarrow", typically with "pip install pyarrow". Columns:
      • value: value in satoshi.
      • time: timestamp of the transaction.
      • src_identity: source of the transaction.
      • dst_identity: destination of the transaction.
      • PriceUSD: value of a bitcoin in USD for that day.
    The id of an actor can be: a name (known actor), an integer (group of addresses, "wallet"), or a bitcoin address (the address belongs to no cluster).
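
To make the units of the daily files concrete, here is a minimal sketch that converts amounts from satoshi to BTC and USD. The helper names (load_day, add_amounts) and the new columns (value_btc, value_usd) are hypothetical, and the sample rows are made up to stand in for a real file.

```python
import pandas as pd

SATOSHI_PER_BTC = 100_000_000  # 1 bitcoin = 10^8 satoshi


def load_day(path):
    """Load one daily file; requires pyarrow (pip install pyarrow)."""
    return pd.read_parquet(path)


def add_amounts(df):
    """Return a copy with amounts in BTC and USD (hypothetical column names)."""
    out = df.copy()
    out["value_btc"] = out["value"] / SATOSHI_PER_BTC
    out["value_usd"] = out["value_btc"] * out["PriceUSD"]
    return out


# Made-up rows with the documented columns, standing in for load_day("myfile").
sample = pd.DataFrame({
    "value": [250_000_000, 10_000_000],
    "time": [1420070400, 1420074000],
    "src_identity": ["SomeExchange", 12345],
    "dst_identity": [12345, "SomeExchange"],
    "PriceUSD": [300.0, 300.0],
})
converted = add_amounts(sample)
print(converted[["value", "value_btc", "value_usd"]])
```

With a real file, the last step would be add_amounts(load_day("myfile")) instead of using the sample frame.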

Evaluation: Project

Students work in groups of 2, 3 or 4.
They have to send me their project before Sunday, March 20, 2022, 23:59 (initially March 6, 2022).
Here are some examples of projects from previous years, provided without any guarantee: I picked a few projects at random. Examples

Tools

For the tutorials and the experimental part of the lectures, you need to use some software, detailed below.

Gephi

Gephi is software for basic graph manipulation and visualization. Although you can't do much in terms of graph analytics, it is convenient for exploring and visualizing graphs of small to medium size (< 1000 nodes).
It can be downloaded here: Gephi.
Gephi requires Java 8 and suffers from a few bugs on Windows (but there is no better alternative). Here are solutions to common problems: on Linux, you need to use the official JRE Java packages.

Python

Most of the experiments are done in Python. If you're not familiar with this language, there are numerous tutorials on the web; a good one, for instance, is from w3schools. If you want to be all set up for experiments, here is a list of packages we will use. Note that some of them are only available with pip, and not Anaconda. If you're using Anaconda, you can nevertheless install them using the pip command (pip install package_name).
  • networkx. Generic network analysis
  • notebook. Jupyter notebook
  • cdlib. Community detection
  • tnetwork. Temporal networks
  • scikit-learn. Machine learning/Data mining
  • seaborn. Plotting library
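
To check whether the packages listed above are already installed, a small script along these lines can help. It is only a sketch: the helper name missing_packages is hypothetical, and the mapping reflects the usual import names (note that scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec

# pip package name -> name used in "import ..." statements.
PACKAGES = {
    "networkx": "networkx",
    "notebook": "notebook",
    "cdlib": "cdlib",
    "tnetwork": "tnetwork",
    "scikit-learn": "sklearn",
    "seaborn": "seaborn",
}


def missing_packages(packages=PACKAGES):
    """Return the pip names of the packages that cannot be imported."""
    return [
        pip_name
        for pip_name, module in packages.items()
        if find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing, install with: pip install " + " ".join(missing))
else:
    print("All course packages are installed.")
```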