Can we estimate the distinct paths given a group of nodes and connections
Clash Royale CLAN TAG#URR8PPP
up vote
1
down vote
favorite
If we have a system of 30,000 nodes and 126,000 connections, where there is not necessarily a path from any given node to another, are we able to get an estimate for the number of distinct paths (or indirect connections) in the system?
To give a small example, say we have the following nodes and connections:
a <-> b
b <-> c
x <-> y
This gives us the following paths (total = 8):
a-b, a-c, b-a, b-c, c-b, c-a, x-y, y-x
Given a system of 30k nodes and 126k connections, can we estimate the total "paths". I assume that the actual number will depend highly on how interconnected all the nodes are but is there way to calculate based on an even distribution what the total number might be?
I am not a mathematician, so I'm sorry if my terminology is wildly off, but hopefully I have put across what I am looking for.
graph-theory combinations
 |Â
show 1 more comment
up vote
1
down vote
favorite
If we have a system of 30,000 nodes and 126,000 connections, where there is not necessarily a path from any given node to another, are we able to get an estimate for the number of distinct paths (or indirect connections) in the system?
To give a small example, say we have the following nodes and connections:
a <-> b
b <-> c
x <-> y
This gives us the following paths (total = 8):
a-b, a-c, b-a, b-c, c-b, c-a, x-y, y-x
Given a system of 30k nodes and 126k connections, can we estimate the total "paths". I assume that the actual number will depend highly on how interconnected all the nodes are but is there way to calculate based on an even distribution what the total number might be?
I am not a mathematician, so I'm sorry if my terminology is wildly off, but hopefully I have put across what I am looking for.
graph-theory combinations
1
interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
– gt6989b
Jul 17 at 15:21
1
@gt6989b Yes, connections are all 2-way
– Jacob Regan
Jul 17 at 15:22
1
Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
– gt6989b
Jul 17 at 15:25
Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
– user145640
Jul 17 at 15:27
As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
– Ingix
Jul 18 at 13:15
 |Â
show 1 more comment
up vote
1
down vote
favorite
up vote
1
down vote
favorite
If we have a system of 30,000 nodes and 126,000 connections, where there is not necessarily a path from any given node to another, are we able to get an estimate for the number of distinct paths (or indirect connections) in the system?
To give a small example, say we have the following nodes and connections:
a <-> b
b <-> c
x <-> y
This gives us the following paths (total = 8):
a-b, a-c, b-a, b-c, c-b, c-a, x-y, y-x
Given a system of 30k nodes and 126k connections, can we estimate the total "paths". I assume that the actual number will depend highly on how interconnected all the nodes are but is there way to calculate based on an even distribution what the total number might be?
I am not a mathematician, so I'm sorry if my terminology is wildly off, but hopefully I have put across what I am looking for.
graph-theory combinations
If we have a system of 30,000 nodes and 126,000 connections, where there is not necessarily a path from any given node to another, are we able to get an estimate for the number of distinct paths (or indirect connections) in the system?
To give a small example, say we have the following nodes and connections:
a <-> b
b <-> c
x <-> y
This gives us the following paths (total = 8):
a-b, a-c, b-a, b-c, c-b, c-a, x-y, y-x
Given a system of 30k nodes and 126k connections, can we estimate the total "paths". I assume that the actual number will depend highly on how interconnected all the nodes are but is there way to calculate based on an even distribution what the total number might be?
I am not a mathematician, so I'm sorry if my terminology is wildly off, but hopefully I have put across what I am looking for.
graph-theory combinations
asked Jul 17 at 15:17
Jacob Regan
412
412
1
interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
– gt6989b
Jul 17 at 15:21
1
@gt6989b Yes, connections are all 2-way
– Jacob Regan
Jul 17 at 15:22
1
Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
– gt6989b
Jul 17 at 15:25
Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
– user145640
Jul 17 at 15:27
As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
– Ingix
Jul 18 at 13:15
 |Â
show 1 more comment
1
interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
– gt6989b
Jul 17 at 15:21
1
@gt6989b Yes, connections are all 2-way
– Jacob Regan
Jul 17 at 15:22
1
Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
– gt6989b
Jul 17 at 15:25
Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
– user145640
Jul 17 at 15:27
As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
– Ingix
Jul 18 at 13:15
1
1
interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
– gt6989b
Jul 17 at 15:21
interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
– gt6989b
Jul 17 at 15:21
1
1
@gt6989b Yes, connections are all 2-way
– Jacob Regan
Jul 17 at 15:22
@gt6989b Yes, connections are all 2-way
– Jacob Regan
Jul 17 at 15:22
1
1
Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
– gt6989b
Jul 17 at 15:25
Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
– gt6989b
Jul 17 at 15:25
Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
– user145640
Jul 17 at 15:27
Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
– user145640
Jul 17 at 15:27
As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
– Ingix
Jul 18 at 13:15
As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
– Ingix
Jul 18 at 13:15
 |Â
show 1 more comment
1 Answer
1
active
oldest
votes
up vote
0
down vote
I'm basing this on Jacob's rough explanation on the background of the question in the comments.
I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).
First you need to create a list of all your nodes. Then you need to apply a graph traversal algorithm (https://en.wikipedia.org/wiki/Graph_traversal). I don't think you need anything more fancy than breath-first-search oder depths-first-search and I don't think it matters in this case which one you take. The only thing that needs to be done is to use data structures that make it easy to check if a given node is contained inside a possibly large 'list' of nodes (so use hash sets or something similar).
Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.
If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.
In your database, add a field "Component" and make it 1 for all nodes in $C_1$, 2 for all nodes in $C_2$, etc. Then when you later want to check if 2 nodes are connected, just get their Component values and compare them: If they are the same, they are connected, otherwise they are not connected.
add a comment |Â
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
0
down vote
I'm basing this on Jacob's rough explanation on the background of the question in the comments.
I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).
First you need to create a list of all your nodes. Then you need to apply a graph traversal algorithm (https://en.wikipedia.org/wiki/Graph_traversal). I don't think you need anything more fancy than breath-first-search oder depths-first-search and I don't think it matters in this case which one you take. The only thing that needs to be done is to use data structures that make it easy to check if a given node is contained inside a possibly large 'list' of nodes (so use hash sets or something similar).
Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.
If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.
In your database, add a field "Component" and make it 1 for all nodes in $C_1$, 2 for all nodes in $C_2$, etc. Then when you later want to check if 2 nodes are connected, just get their Component values and compare them: If they are the same, they are connected, otherwise they are not connected.
add a comment |Â
up vote
0
down vote
I'm basing this on Jacob's rough explanation on the background of the question in the comments.
I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).
First you need to create a list of all your nodes. Then you need to apply a graph traversal algorithm (https://en.wikipedia.org/wiki/Graph_traversal). I don't think you need anything more fancy than breath-first-search oder depths-first-search and I don't think it matters in this case which one you take. The only thing that needs to be done is to use data structures that make it easy to check if a given node is contained inside a possibly large 'list' of nodes (so use hash sets or something similar).
Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.
If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.
In your database, add a field "Component" and make it 1 for all nodes in $C_1$, 2 for all nodes in $C_2$, etc. Then when you later want to check if 2 nodes are connected, just get their Component values and compare them: If they are the same, they are connected, otherwise they are not connected.
add a comment |Â
up vote
0
down vote
up vote
0
down vote
I'm basing this on Jacob's rough explanation on the background of the question in the comments.
I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).
First you need to create a list of all your nodes. Then you need to apply a graph traversal algorithm (https://en.wikipedia.org/wiki/Graph_traversal). I don't think you need anything more fancy than breath-first-search oder depths-first-search and I don't think it matters in this case which one you take. The only thing that needs to be done is to use data structures that make it easy to check if a given node is contained inside a possibly large 'list' of nodes (so use hash sets or something similar).
Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.
If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.
In your database, add a field "Component" and make it 1 for all nodes in $C_1$, 2 for all nodes in $C_2$, etc. Then when you later want to check if 2 nodes are connected, just get their Component values and compare them: If they are the same, they are connected, otherwise they are not connected.
I'm basing this on Jacob's rough explanation on the background of the question in the comments.
I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).
First you need to create a list of all your nodes. Then you need to apply a graph traversal algorithm (https://en.wikipedia.org/wiki/Graph_traversal). I don't think you need anything more fancy than breath-first-search oder depths-first-search and I don't think it matters in this case which one you take. The only thing that needs to be done is to use data structures that make it easy to check if a given node is contained inside a possibly large 'list' of nodes (so use hash sets or something similar).
Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.
If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.
In your database, add a field "Component" and make it 1 for all nodes in $C_1$, 2 for all nodes in $C_2$, etc. Then when you later want to check if 2 nodes are connected, just get their Component values and compare them: If they are the same, they are connected, otherwise they are not connected.
answered Jul 18 at 14:53


Ingix
2,205125
2,205125
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2854604%2fcan-we-estimate-the-distinct-paths-given-a-group-of-nodes-and-connections%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
1
interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
– gt6989b
Jul 17 at 15:21
1
@gt6989b Yes, connections are all 2-way
– Jacob Regan
Jul 17 at 15:22
1
Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
– gt6989b
Jul 17 at 15:25
Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
– user145640
Jul 17 at 15:27
As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
– Ingix
Jul 18 at 13:15