Can we estimate the distinct paths given a group of nodes and connections

up vote
1
down vote

favorite

If we have a system of 30,000 nodes and 126,000 connections, where there is not necessarily a path from any given node to another, are we able to get an estimate for the number of distinct paths (or indirect connections) in the system?

To give a small example, say we have the following nodes and connections:

a <-> b

b <-> c

x <-> y

This gives us the following paths (total = 8):

a-b, a-c, b-a, b-c, c-b, c-a, x-y, y-x

Given a system of 30k nodes and 126k connections, can we estimate the total "paths". I assume that the actual number will depend highly on how interconnected all the nodes are but is there way to calculate based on an even distribution what the total number might be?

I am not a mathematician, so I'm sorry if my terminology is wildly off, but hopefully I have put across what I am looking for.

asked Jul 17 at 15:17

Jacob Regan

412

1

interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
â€“Â gt6989b
Jul 17 at 15:21

1

@gt6989b Yes, connections are all 2-way
â€“Â Jacob Regan
Jul 17 at 15:22

1

Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
â€“Â gt6989b
Jul 17 at 15:25

Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
â€“Â user145640
Jul 17 at 15:27

As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
â€“Â Ingix
Jul 18 at 13:15

Â |Â
show 1 more comment

up vote
1
down vote

favorite

To give a small example, say we have the following nodes and connections:

a <-> b

b <-> c

x <-> y

This gives us the following paths (total = 8):

a-b, a-c, b-a, b-c, c-b, c-a, x-y, y-x

I am not a mathematician, so I'm sorry if my terminology is wildly off, but hopefully I have put across what I am looking for.

asked Jul 17 at 15:17

Jacob Regan

412

1

interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
â€“Â gt6989b
Jul 17 at 15:21

1

@gt6989b Yes, connections are all 2-way
â€“Â Jacob Regan
Jul 17 at 15:22

1

Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
â€“Â gt6989b
Jul 17 at 15:25

Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
â€“Â user145640
Jul 17 at 15:27

As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
â€“Â Ingix
Jul 18 at 13:15

Â |Â
show 1 more comment

up vote
1
down vote

favorite

To give a small example, say we have the following nodes and connections:

a <-> b

b <-> c

x <-> y

This gives us the following paths (total = 8):

a-b, a-c, b-a, b-c, c-b, c-a, x-y, y-x

I am not a mathematician, so I'm sorry if my terminology is wildly off, but hopefully I have put across what I am looking for.

asked Jul 17 at 15:17

Jacob Regan

412

To give a small example, say we have the following nodes and connections:

a <-> b

b <-> c

x <-> y

This gives us the following paths (total = 8):

a-b, a-c, b-a, b-c, c-b, c-a, x-y, y-x

I am not a mathematician, so I'm sorry if my terminology is wildly off, but hopefully I have put across what I am looking for.

asked Jul 17 at 15:17

Jacob Regan

412

asked Jul 17 at 15:17

Jacob Regan

412

asked Jul 17 at 15:17

Jacob Regan

412

asked Jul 17 at 15:17

Jacob Regan

412

1

interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
â€“Â gt6989b
Jul 17 at 15:21

1

@gt6989b Yes, connections are all 2-way
â€“Â Jacob Regan
Jul 17 at 15:22

1

Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
â€“Â gt6989b
Jul 17 at 15:25

Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
â€“Â user145640
Jul 17 at 15:27

As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
â€“Â Ingix
Jul 18 at 13:15

Â |Â
show 1 more comment

1

interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
â€“Â gt6989b
Jul 17 at 15:21

1

@gt6989b Yes, connections are all 2-way
â€“Â Jacob Regan
Jul 17 at 15:22

1

Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
â€“Â gt6989b
Jul 17 at 15:25

Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
â€“Â user145640
Jul 17 at 15:27

As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
â€“Â Ingix
Jul 18 at 13:15

interesting problem. Are your connections always symmetric (i.e $a to b implies b to a$)?
â€“Â gt6989b
Jul 17 at 15:21

@gt6989b Yes, connections are all 2-way
â€“Â Jacob Regan
Jul 17 at 15:22

Some thoughts to ponder. If you know the number and size of connected components (in your example, $3$ and $2$) then you can compute directly. To model this, consider a random graph on $n$ vertices and independently distribute edges with probability $p$. Then, there are an expected number of $m = pn(n-1)/2$ edges, so you can work out $p$ if you know $m$ and $n$. You can look at limiting theorems in random graphs to figure out the number/expected distribution of connected components
â€“Â gt6989b
Jul 17 at 15:25

Terminology notes: paths usually refer to sequences of nodes that each have a connection, e.g. (a,b,c) would be a path. What you are asking for would usually be referred to as something like "pairs of connected nodes".
â€“Â user145640
Jul 17 at 15:27

As gt6989b noted, once you can figure out the connected components (see en.wikipedia.org/wiki/Connected_component_%28graph_theory%29) of your network, you can calculate that number exactly. Depending one what your network consists of, using a random graph may yield incorrect results. Is there a way to find out automatically and within a reasonable time (<1s), to what other nodes a given node is directly connected? If yes, it should be relatively easy to find the connected components.
â€“Â Ingix
Jul 18 at 13:15

Â |Â
show 1 more comment

1 Answer
1

active

oldest

votes

up vote
0
down vote

I'm basing this on Jacob's rough explanation on the background of the question in the comments.

I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).

First you need to create a list of all your nodes. Then you need to apply a graph traversal algorithm (https://en.wikipedia.org/wiki/Graph_traversal). I don't think you need anything more fancy than breath-first-search oder depths-first-search and I don't think it matters in this case which one you take. The only thing that needs to be done is to use data structures that make it easy to check if a given node is contained inside a possibly large 'list' of nodes (so use hash sets or something similar).

Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.

If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.

In your database, add a field "Component" and make it 1 for all nodes in $C_1$, 2 for all nodes in $C_2$, etc. Then when you later want to check if 2 nodes are connected, just get their Component values and compare them: If they are the same, they are connected, otherwise they are not connected.

answered Jul 18 at 14:53

Ingix

2,205125

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "69"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2854604%2fcan-we-estimate-the-distinct-paths-given-a-group-of-nodes-and-connections%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
0
down vote

I'm basing this on Jacob's rough explanation on the background of the question in the comments.

I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).

Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.

If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.

answered Jul 18 at 14:53

Ingix

2,205125

add a commentÂ |Â

up vote
0
down vote

I'm basing this on Jacob's rough explanation on the background of the question in the comments.

I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).

Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.

If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.

answered Jul 18 at 14:53

Ingix

2,205125

add a commentÂ |Â

up vote
0
down vote

I'm basing this on Jacob's rough explanation on the background of the question in the comments.

I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).

Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.

If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.

answered Jul 18 at 14:53

Ingix

2,205125

I'm basing this on Jacob's rough explanation on the background of the question in the comments.

I'll describe what I'd to if this was a one-time effort (or only needed to be done infrequently like once a year).

Starting with any node, this finds all nodes that are (directly or indirectly) connected to it. The list of nodes thus found (including the starting node) form your first connected component $C_1$.

If there are nodes not in $C_1$, take one of them as starting node and do it all again, producing $C_2$. Continue until your components contain all nodes.

answered Jul 18 at 14:53

Ingix

2,205125

answered Jul 18 at 14:53

Ingix

2,205125

answered Jul 18 at 14:53

Ingix

2,205125

answered Jul 18 at 14:53

Ingix

2,205125

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

Search This Blog

ukmuiik