December 2014, Vol. 26, No. 12, Pages 2827-2854
Automatic discovery of community structures in complex networks is a fundamental task in many disciplines, including physics, biology, and the social sciences. The most widely used criterion for characterizing the existence of a community structure in a network is modularity, a quantitative measure proposed by Newman and Girvan (2004). Community discovery can be formulated as the so-called modularity maximization problem, which consists of finding a partition of the nodes of a network with the highest modularity. In this letter, we propose a fast and scalable algorithm called DCAM for solving the modularity maximization problem. It is based on DC (difference of convex functions) programming and DCA (DC algorithms), an innovative approach in the nonconvex programming framework. The special structure of the problem is exploited to obtain an inexpensive DCA scheme that requires only a matrix-vector product at each iteration. Starting with a very large number of communities, DCAM outputs an optimal partition together with the optimal number of communities c*; that is, the number of communities is discovered automatically during DCAM’s iterations. Numerical experiments are performed on a variety of real-world network data sets with up to 4,194,304 nodes and 30,359,198 edges. Comparisons with eight reference algorithms show that the proposed approach outperforms them not only in quality and speed but also in scalability. Moreover, it achieves a very good trade-off between solution quality and run time.
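To make the objective concrete, the following minimal sketch evaluates the Newman-Girvan modularity of a given partition of an undirected, unweighted graph without self-loops. It is not DCAM and does not perform any maximization; it only scores a partition. The function name `modularity`, the edge-list input, and the node-to-label dictionary are assumptions made for this illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of a partition (illustrative sketch).

    edges: list of (u, v) pairs, one per undirected edge, no self-loops.
    community: dict mapping each node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    intra_edges = defaultdict(int)  # edges with both endpoints in community c
    degree_sum = defaultdict(int)   # sum of node degrees in community c

    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra_edges[cu] += 1

    # Q = sum over communities of (intra-community edge fraction)
    #     minus (expected fraction under random rewiring with same degrees).
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

print(modularity(edges, two_groups))  # 5/14, about 0.357
print(modularity(edges, one_group))   # 0.0
```

Splitting the graph into its two triangles scores higher than placing every node in one community, which is the kind of comparison the modularity maximization problem performs over all partitions.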