quarta-feira, 29 de janeiro de 2014

NetworkX - Working with Graphs in Python #2 - Random Walks

This week we will work with the concept of Random Walks on Graphs.

A graph is, in very simple therms, a structure that associate Nodes with Edges. On example, think about a country, where the cities are connected with roads, so a City is a Node, and a Street that connects to another city is a Edge.

A random walk is a mathematic formalism of a path consisted of random "steps", a random choice of edges. For the statistical purpose, is a special case of a Markov Chain.

I've used a graph that is widely knows as Les Miserables, that connects the homonym book characters in order of apparition. The nodes are the characters and the edges are the amount of times that two characters appear together.


The code below executes 10 times the random walk with size 10000, starting on different nodes and random sorting the next path. The execution shows the 10 most visited vertex during the random walks. Don't forget to import NetworkX packages and MatPlotLib to plot the graphs on a window.

Random Walk Algorithm

#Reads the graph from a GML File, named lesmis.gml G = nx.read_gml("lesmis.gml",relabel=True) #Start the counter contador = 0 #Start execution counter execucoes = 0 #Execute 10 times this command sequence while execucoes < 10: #Choose a random start node vertexid = rdm.choice(G.nodes()) #Dictionary that associate nodes with the amount of times it was visited VisitedVertex = {} print("Execucao #%d" % (execucoes + 1)) #Execute the random walk with size 10000 (10000 steps) while contador < 10000: #Accumulate the amount of times each vertex is visited if vertexid in VisitedVertex: VisitedVertex[vertexid] += 1 else: VisitedVertex[vertexid] = 1 #Visualize the vertex neighborhood Vertex_Neighbors = G.neighbors(vertexid) #Choose a vertex from the vertex neighborhood to start the next random walk vertexid = rdm.choice(Vertex_Neighbors) #Iteration counter increment contador = contador + 1 #Organize the vertex list in most visited decrescent order mostvisited = sorted(VisitedVertex, key = VisitedVertex.get,reverse = True) #Separate the top 10 most visited vertex top_10 = mostvisited[:10] print(top_10) #Restart the cycle execucoes = execucoes + 1 contador = 0 print("Feche a janela de exibicao para encerrar.") #Draw graph to a window nx.draw_circular(G) plt.savefig('myfig') plt.show() This is useful to visualize the nodes that have more connections than others inside a graph. On a social network, this corresponds to a person that have many people connected to its profile.

Random Walks is a real complex concept, and you can check on Graph Theory books for more information. This is just a didatic sample to help you understand the concept.

Hope it's useful! Enjoy! \o

Nenhum comentário:

Postar um comentário