2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1994
1993
1992
1991
1990
1989
1988
1987
1986
1985
1984
1983
1982
1981
1980
1979
1978
1977
1976
1975
1974
1973
1972
1971
1970


Archive

About   Editorial Board   Contacts   Template   Publication Ethics   Peer Review Process

Theory of Probability and Mathematical Statistics
(Teoriya Imovirnostei ta Matematychna Statystyka)



PAGERANK FOR NETWORKS, GRAPHS AND MARKOV CHAINS

C. ENGSTROM, S. SILVESTROV

Download PDF

Abstract: In this work it is described how a partitioning of a graph into components can be used to calculate PageRank in a large network and how such a partitioning can be used to re-calculate PageRank as the network changes. Although considered problem is that of calculating PageRank, it is worth to note that the same partitioning method could be used when working with Markov chains in general or solving linear systems as long as the method used for solving a single component is chosen appropriately. An algorithm for calculating PageRank using a modi ed partitioning of the graph into strongly connected components is described. Moreover, the paper focuses also on the calculation of PageRank in a changing graph from two di erent perspectives, by considering speci c types of changes in the graph and calculating the di erence in rank before and after certain types of edge additions or removals between components. Moreover, some common speci c types of graphs for which it is possible to nd analytic expressions for PageRank are considered, and in particular the complete bipartite graph and how PageRank can be calculated for such a graph. Finally, several open directions and problems are described.

Keywords: PageRank, random walk, Markov chain, graph, strongly connected component.

Bibliography:
1. Google web graph. Google programming contest, 2002, http://snap.stanford.edu/data/web-Google.html.
2. F. Andersson and S. Silvestrov, The mathematics of Internet search engines, Acta Appl. Math. 104 (2008), 211-242.
3. K. E. Avrachenkov, J. A. Filar, and P. G. Howlett, Analytic Perturbation Theory and Its Applications, SIAM, Philadelphia, PA, 2013.
4. A. Arasu, J. Novak, A. Tomkins, and J. Tomlin, Pagerank computation and the structure of the web: Experiments and algorithms, Proceedings of the Eleventh International Conference on World Wide Web, Alternate Poster Tracks, 2002.
5. L. Becchetti and C. Castillo, The distribution of pagerank follows a power-law only for particular values of the damping factor, Proceedings of the 15th international conference on World Wide Web, ACM Press, 2006.
6. R. Bellman, Introduction to Matrix Analysis, Classics in Applied Mathematics, Society for Industrial and Applied Mathematics, 1970.
7. A. Berman and R. Plemmons, Nonnegative Matrices in the Mathematical Sciences, No. del 11 in Classics in Applied Mathematics, Society for Industrial and Applied Mathematics, 1994.
8. S. Brin and L. Page, The anatomy of a large-scale hypertextual web search engine, Proceedings of the Seventh International Conference on World Wide Web (Amsterdam, The Netherlands, 1998), Elsevier Science Publishers B. V., 107-117.
9. D. Dhyani, S. S. Bhowmick, and W.-K. Ng, Deriving and verifying statistical distribution of a hyperlink-based web page quality metric, Data Knowl. Eng. 46, no. 3 (Sept. 2003), 291-315.
10. C. Engstrom and S. Silvestrov, A componentwise pagerank algorithm, Applied Stochastic Models and Data Analysis (ASMDA 2015), The 16th Conference of the ASMDA International Society, 2015, 186-199.
11. C. Engstrom and S. Silvestrov, Non-normalized pagerank and random walks on n-partite graphs, 3rd Stochastic Modeling Techniques and Data Analysis International Conference (SMTDA2014), 2015, 193-202.
12. C. Engstrom and S. Silvestrov, Using graph partitioning to calculate pagerank in a changing network, 4th Stochastic Modeling Techniques and Data Analysis International Conference (SMTDA2016), 2016, 155-164.
13. C. Engstrom and S. Silvestrov, Calculating pagerank in a changing network with added or removed edges, International Conference in Nonlinear Problems in Aviation and Aerospace (ICNPAA), AIP Conference Proceedings, vol. 1798, 2017.
14. C. Engstrom and S. Silvestrov, Generalisation of the damping factor in PageRank for weighted networks, Modern Problems in Insurance Mathematics (D. Silvestrov and A. Martin-Lof, eds.),EAA series, Springer, Cham, 2014, 313-333.
15. C. Engstrom and S. Silvestrov, PageRank, a look at small changes in a line of nodes and the complete graph, Engineering Mathematics II. Algebraic, Stochastic and Analysis Structures for Networks, Data Classi cation and Optimization (S. Silvestrov and M. Rancic, eds.), Springer Proc. Math. Stat., vol. 179, Springer, Heidelberg, 2016, 223-247.
16. C. Engstrom and S. Silvestrov, PageRank, connecting a line of nodes with a complete graph, Engineering Mathematics II. Algebraic, Stochastic and Analysis Structures for Networks, Data Classi cation and Optimization (S. Silvestrov and M. Rancic, eds.), Springer Proc. Math. Stat., vol. 179, Springer, Heidelberg, 2016, 249-274.
17. F. Gantmacher, The Theory of Matrices, Chelsea, 1959.
18. M. Gyllenberg and D. S. Sivestrov, Quasi-Stationary Phenomena in Nonlinearly Perturbed Stochastic Systems, De Gruyter Exp. Math., vol. 44, Walter de Gruyter, Berlin, 2008.
19. T. Haveliwala and S. Kamvar, The second eigenvalue of the google matrix, Technical Report 2003-20, Stanford InfoLab, 2003.
20. H. Ishii, R. Tempo, E.-W. Bai, and F. Dabbene, Distributed randomized pagerank computation based on web aggregation, Decision and Control, 2009 held jointly with the 2009 28th Chinese Control Conference, CDC/CCC 2009, Proceedings of the 48th IEEE Conference, 3026-3031.
21. S. Kamvar and T. Haveliwala, The condition number of the pagerank problem, Technical Report 2003-36, Stanford InfoLab, June 2003.
22. S. Kamvar, T. Haveliwala, and G. Golub, Adaptive methods for the computation of pagerank, Linear Algebra Appl. 386 (2004), 51-65.
23. V. S. Koroliuk and N. Limnios, Stochastic Systems in Merging Phase Space, World Scienti c, Singapore, 2005.
24. P. Lancaster, Theory of Matrices, Academic Press, 1969.
25. A. N. Langville and C. D. Meyer, A reordering for the pagerank problem, SIAM J. Sci. Comput. 27, no. 6 (Dec. 2005), 2112-2120.
26. C. P. Lee, G. H. Golub, and S. A. Zenios, A two-stage algorithm for computing pagerank and multistage generalizations, Internet Mathematics 4 (2007), 299-327.
27. J. Leskovec and A. Krevl, SNAP Datasets: Stanford large network dataset collection, http://snap.stanford.edu/data, June 2014.
28. D. Silvestrov and S. Silvestrov, Asymptotic expansions for stationary distributions of perturbed semi-Markov processes, Engineering Mathematics II. Algebraic, Stochastic and Analysis Structures for Networks, Data Classi cation and Optimization (S. Silvestrov and M. Rancic, eds.), Springer Proc. Math. Stat., vol. 179, Springer, Heidelberg, 2016, 151-222.
29. D. Silvestrov and S. Silvestrov, Asymptotic expansions for stationary distributions of non-linearly perturbed semi-Markov processes. I, II (2016). Part I: arXiv:1603.04734, Part II: arXiv:1603.04743.
30. D. Silvestrov and S. Silvestrov, Nonlinearly Perturbed Semi-Markov Processes, SpringerBriefs Probab. Math. Stat., 2017. (To appear.)
31. R. Tarjan, Depth rst search and linear graph algorithms, SIAM J. Comput. 1 (1972), no. 2, 146-160.
32. Q. Yu, Z. Miao, G. Wu, and Y. Wei, Lumping algorithms for computing google's pagerank and its derivative, with attention to unreferenced nodes, Information Retrieval 15 (2012), no. 6, 503-526.