Influence Maximization in Continuous Time Diffusion Networks

The problem of finding the optimal set of source nodes in a diffusion network that maximizes the spread of information, influence, and diseases in a limited amount of time depends dramatically on the underlying temporal dynamics of the network. However, this still remains largely unexplored to date.

INFLUMAX finds the set of most influential source nodes in the continuous time influence maximization problem which accounts for the temporal dynamics of diffusion networks. It uses continuous time Markov chains to analytically compute the average total number of nodes reached by a diffusion process starting in any set of source nodes and exploits submodularity for finding a provable near-optimal solution efficiently.

Below, you can find some extra information:

About

INFLUMAX builds on the fully continuous time model of diffusion recently introduced by Gomez-Rodriguez et al. (ICML, 2011). This model accounts for temporally heterogeneous interactions within a diffusion network – it allows information (or influence) to spread at different rates across different edges, as shown in real-world examples. We compute the average total number of infected nodes analytically using the work of Kulkarni (Networks, 1986). The key observation is that the infection time of a node in a network with stochastic edge lengths is the length of the stochastic shortest path from the source nodes to the node. We then provide an approximation algorithm that finds a suboptimal set of source nodes with provable guarantees in terms of the average total number of infected nodes. For more details read our paper:

M. Gomez-Rodriguez, B. Schölkopf. Influence Maximization in Continuous Time Diffusion Networks.The 29th International Conference on Machine Learning (ICML), 2012.

Download the code

We provide an implementation of our algorithm in a comprehensive C++ package with some Fortran sections (Linux/MacOS). For networks operations, we used the library SNAP, developed by Jure Leskovec at Stanford University. For computing dominator trees, we hacked this code, by L. Georgiadis, R. E. Tarjan, and R. F. Werneck. For computing matrix exponentials, we hacked Expokit, by R. B. Sidje. If you have suggestions, comments, or you discover any bug in the the code, please contact manuelgr[at]stanford.edu.

Network Input format: The input file, with the network, should have two blocks separated by a blank line. Each line in the first block contains the id and name of a node:

<id>,<name>

Each line in the second block contains information about the edges:

<source>,<destination>,<alpha value>

Example of a valid input file:

0,nyt.com 1,latimes.com 2,huffingtonpost.com 3,salon.com 4,gizmodo.com 5,gawker.com 0,1,0.3 1,3,0.8 4,5,1.5 2,7,0.01 1,5,0.5

Within the package, you can find additional information in ReadMe.txt. We also provide code for generating synthetic networks, and sample input data as a toy.

Papers

To learn more about our algorithm, you can download our paper:

Moreover, we would like to acknowledge that we build on previous work: