The Location of Papers in Topic Space-Time





Presentation Transcript

Slide1

The Location of Papers in Topic Space-Time

Tim Evans
Centre for Complexity Science
Social and Cultural Analytics Lab

Work done with: James Clough (pronounced "Cluff")

"Identification, location and temporal evolution of topics" meeting, part of COST Action TD1210 KNOWeSCAPE (http://knowescape.org/), Budapest, August 29-30, 2016.

© Imperial College London

Slide2

- Time & Networks
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide3

Networks

Network = Vertices + Edges

Slide4

Timed Edges = Temporal Edge Networks

- Communication networks: email, phone, letters
- Focus of much recent work [Holme & Saramäki 2012; Masuda & Lambiotte 2016]

Slide5

Timed Vertices = Temporal Vertex Networks

Each vertex represents one event occurring at one time.

- Time constrains flows
- Directed edges point in one direction only
- No loops

Slide6

Timed Vertices = Temporal Vertex Networks

Each vertex represents one event occurring at one time.

- Directed Acyclic Graph (DAG): the name used in computer science
- Poset (partially ordered set): the name used in discrete mathematics

[Figure: directed edge between vertices i and j]

 

Slide7

Applications - Flows

- Flow of ideas and innovations: citation networks (papers, patents, court judgements)
- Flow of projects: management of tasks, critical paths
- Flow of mathematical logic: spreadsheet formulae
- Flow of space-time causality: causal set approach to quantum gravity

Slide8

My Thesis

Constrained networks call for new network methods.

Slide9

Causality and Networks

The time constraint on vertices calls for new methods for temporal vertex networks:

- Transitive reduction [Clough, Gollings, Loach, Evans 2015]
- Longest path
- Dimension
- Geometry

Slide10

- Time & Networks
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide11

Networks and Geometry

Standard tools for data and network analysis use Euclidean space, where the distance $L$ between points $x$ and $y$ is given by $L^2 = \sum_i (x_i - y_i)^2$, e.g.

- Wireless networks, random geometric graphs (RGG), etc.
- Network visualisations
- Dimensional reduction of data: MDS (Multidimensional Scaling), PCA (Principal Component Analysis)

[Figure: Karate Club network visualisation, Evans 2010]

Slide12

MDS Clustering

Standard clustering tool using usual Euclidean distance measures

[Figure: KDD Cup data, arXiv hep-th 1992-2003; abstract text similarity and MDS; colours show 8 k-means clusters]

Slide13

Clusters vs Coordinates

The absence of visually distinct clusters suggests that continuous coordinates are better than discrete categories.

[Figure: KDD Cup data, arXiv hep-th 1992-2003; abstract text similarity and MDS; colours show 8 k-means clusters]

Slide14

Networks and Geometry

Recent work goes beyond simple Euclidean space

- "Flat" Euclidean: standard MDS
- Sphere: e.g. clustering population, air travel
- Hyperbolic: e.g. Internet backbone [Boguñá, Krioukov, Claffy 2009]

 

Slide15

Networks, Geometry and Time

Very little clustering work includes time

- Riemannian spaces (curvature 0, +1, -1)
- Non-Riemannian spaces (curvature 0, +1, -1)
- Space-times: de Sitter, anti-de Sitter, Minkowski

 

Slide16

Minkowski Space

My approach is to use the simplest space-time: Minkowski space in D = (d+1) space-time dimensions.

[Clough & Evans, Physica A 2016 (arXiv:1408.1274); ISSI 2015 Proceedings, arXiv:1507.01388; arXiv:1602.03103]

Slide17

Minkowski Space

- Measure spatial distance in terms of the time taken to reach that point (c.f. light years)
- Information flows into the forward light cone only

[Figure: light cone diagram with TIME and SPACE axes, showing the future time-like region, the past time-like region and two space-like regions, with points x and y]

Slide18

Causal Set - Null Model

- PPP (Poisson Point Process) in D-dimensional space
- Take one dimension as time
- Add an edge from point x to point y if the time difference from x to y is > 0 and the separation is time-like
- The resulting network is transitively complete

Standard model used in

- Discrete maths, e.g. Bollobás & Brightwell 1991
- Causal set quantum gravity, e.g. see Dowker 2006

[Figure: light cone of a point showing the Future, the Past and two Space-like regions, with points x and y]
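The null model above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, points are scattered in a unit box rather than a causal interval, a fixed number of points stands in for a true Poisson process, and units are chosen so that light cones have slope 1.

```python
import random


def sprinkle(n_points, dimension, seed=0):
    """Scatter points uniformly in the unit box; coordinate 0 is time.

    Assumption: a fixed point count approximates a Poisson point process,
    whose number of points would itself be random.
    """
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(dimension)) for _ in range(n_points)]


def precedes(x, y):
    """True if y lies inside the forward light cone of x (units with c = 1)."""
    dt = y[0] - x[0]
    dr2 = sum((b - a) ** 2 for a, b in zip(x[1:], y[1:]))
    return dt > 0 and dt * dt > dr2


def causal_set_edges(points):
    """Directed edges (i, j) for every time-like pair with j later than i."""
    return {
        (i, j)
        for i, x in enumerate(points)
        for j, y in enumerate(points)
        if i != j and precedes(x, y)
    }


points = sprinkle(n_points=50, dimension=2)  # 1+1 dimensions
edges = causal_set_edges(points)
```

Because every time-like pair is connected, the edge set is a DAG that is already transitively complete.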

Slide19

- Time & Networks
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide20

Dimension for Social DAG Spacetime

- The formulae used are derived for the causal set null model, i.e. a PPP (Poisson Point Process) in Minkowski space, e.g. volume ~ (length scale)^Dimension
- Do they work for social DAG networks?

Slide21

Comparison of Data Sources

[Figure: two panels of Myrheim-Meyer (MM) dimension against N, the number of nodes in the interval, with a scale for the number of runs. String theory (hep-th arXiv): D=2. Particle phenomenology (hep-ph arXiv): D=3.]

[Clough & Evans 2016, arXiv:1408.1274]

Slide22

Dimensions

Data                            Dimension
hep-th (String Theory)          2
hep-ph (Particle Physics)       3
quant-ph (quantum physics)      3
astro-ph (astrophysics)         3.5
US Patents                      >4
US Supreme Court Judgments      3 (short times), 2 (long times)

String theory appears to be a narrow field.

[Clough & Evans 2016, arXiv:1408.1274]

Slide23

- Time & Networks
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide24

Lorentzian MDS

- Create a Lorentzian distance matrix
- Length of the longest path = time-like separation
- Space-like distances come from the longest path between the first common neighbours in the forward and backward light cones
- Not unique for D>2 but seems to work for reasonable N

[Figure: space-time diagram with space and time axes]

[Clough & Evans 2016, arXiv:1602.03103]

Slide25

Lorentzian MDS

- Lorentzian distance matrix
- Largest negative eigenvalue: its eigenvector gives the time direction
- (D-1) large positive eigenvalues: their eigenvectors give the space directions

[Clough & Evans 2016, arXiv:1602.03103]
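The eigenvalue step can be illustrated with a short NumPy sketch. It rests on assumptions the slides do not confirm: it uses the double-centring of classical MDS, it starts from exact squared intervals computed from known coordinates rather than path-based estimates, and the function name `lorentzian_mds` is hypothetical.

```python
import numpy as np


def lorentzian_mds(sq_intervals, n_space):
    """Recover (time, space) coordinates from a matrix of squared intervals.

    sq_intervals[i, j] is assumed to be -(dt)^2 + (dx)^2, so time-like
    pairs are negative and space-like pairs are positive.
    """
    n = sq_intervals.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ sq_intervals @ centring  # centred inner products
    eigenvalues, eigenvectors = np.linalg.eigh(gram)  # ascending order
    # Most negative eigenvalue -> time direction.
    time = np.sqrt(-eigenvalues[0]) * eigenvectors[:, 0]
    # Largest positive eigenvalues -> space directions.
    space = np.sqrt(eigenvalues[-n_space:]) * eigenvectors[:, -n_space:]
    return time, space, eigenvalues


# Synthetic test case: 30 random points in 1+1 dimensional Minkowski space.
rng = np.random.default_rng(0)
true_t = rng.random(30)
true_x = rng.random((30, 1))
dt = true_t[:, None] - true_t[None, :]
dx2 = ((true_x[:, None, :] - true_x[None, :, :]) ** 2).sum(axis=-1)
sq_intervals = -dt**2 + dx2

time, space, eigenvalues = lorentzian_mds(sq_intervals, n_space=1)
```

The recovered coordinates agree with the originals only up to a translation, reflection or Lorentz transformation, so the check that matters is that they reproduce the input intervals.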

Slide26

DAG Null Model Distance Matrix Eigenvalues

[Figure: four panels of distance matrix eigenvalue against eigenvalue rank for a PPP in Minkowski spacetime with no space, 1 space, 2 space and 3 space dimensions; the time eigenvalue is marked in each panel]

Slide27

Lorentzian MDS works!

[Figure: original causal set alongside its 1+1 embedding, i.e. LMDS coordinates for 1+1 dimensions; a subset of 20 points out of 200 is shown for clarity]

Slide28

Real Citation Networks

Embed (i.e. find coordinates for) papers in citation networks from

- arXiv hep-th: 91% success
- arXiv hep-ph: 88% success

These citation networks are more embeddable in Minkowski spacetime than you would otherwise expect.

Slide29

arXiv hep-th 1992-2003

[Figure: embedding based on the top 200 most cited hep-th papers; node size = citations]

[Clough & Evans 2016, arXiv:1602.03103]

Slide30

arXiv hep-th 1992-2003

[Figure: embedding with a publication date scale running from 1990 to 2004]

Slide31

arXiv hep-th 1992-2003

Colours represent 8 k-means clusters based on abstract word similarity

Slide32

- Time & Networks
- Causally Invariant Measures
- Transitive Reduction & Innovation
- Longest Path & Centrality
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide33

Conclusions

- Temporal networks come in two types: time on vertices or time on edges
- Constraints on networks require new tools in network analysis: transitive reduction, dimension, longest path
- Geometry: MDS generalised to Lorentzian spacetime; reliable; might help to cluster papers

Tim Evans, Imperial College London
http://netplexity.org

Work done with: James Clough, Jamie Gollings, Tamar Loach, Sophia Goldberg, Hannah Anthony, Joana Teixeira, Chon Lei

Slide34

Bibliography

Albert, M. H. & Frieze, A. M., Random graph orders, Order, 1989, 6, 19-30.

Bombelli, L., Ph.D. thesis, Syracuse University, 1987.

Brightwell, G. & Gregory, R., Structure of random discrete spacetime, Phys. Rev. Lett., 1991, 66, 260-263.

Bruggemann, R.; Halfon, E.; Welzl, G.; Voigt, K. & Steinberg, C., Applying the Concept of Partially Ordered Sets on the Ranking of Near-Shore Sediments by a Battery of Tests, J. Chem. Inf. Model., 2001, 41, 918-925.

Bruggemann, R.; Münzer, B. & Halfon, E., An algebraic/graphical tool to compare ecosystems with respect to their pollution – the German river "Elbe" as an example - I: Hasse-diagrams, Chemosphere, 1994, 28, 863-872.

Clough, J. R.; Gollings, J.; Loach, T. V. & Evans, T. S., Transitive reduction of citation networks, J. Complex Networks, 2015, 3, 189-203 [10.1093/comnet/cnu039, arXiv:1310.8224].

Clough, J. R. & Evans, T. S., What is the dimension of citation space?, Physica A, 2016, 448, 235-247 [10.1016/j.physa.2015.12.053, arXiv:1408.1274].

Clough, J. R. & Evans, T. S., Time and Citation Networks, in "Proceedings of ISSI 2015 Istanbul: 15th International Society of Scientometrics and Informetrics Conference, Istanbul, Turkey, 29 June to 3 July, 2015", ISBN 978-975-518-381-7; ISSN 2175-1935 [arXiv:1507.01388].

Clough, J. R. & Evans, T. S., Embedding Graphs in Lorentzian Spacetime, arXiv:1602.03103.

Evans, T. S., Complex Networks, Contemporary Physics, 2004, 45, 455-474 [10.1080/00107510412331283531, arXiv:cond-mat/0405123].

Evans, T. S., Clique Graphs and Overlapping Communities, J. Stat. Mech., 2010, P12037 [10.1088/1742-5468/2010/12/P12037, arXiv:1009.0638].

Evans, T. S. & Lambiotte, R., Line Graphs, Link Partitions and Overlapping Communities, Phys. Rev. E, 2009, 80, 016105 [10.1103/PhysRevE.80.016105, arXiv:0903.2181].

Evans, T. S. & Lambiotte, R., Line Graphs of Weighted Networks for Overlapping Communities, Eur. Phys. J. B, 2010, 77, 265-272 [10.1140/epjb/e2010-00261-8, arXiv:0912.4389].

Expert, P.; Evans, T. S.; Blondel, V. D. & Lambiotte, R., Uncovering space-independent communities in spatial networks, PNAS, 2011, 108, 7663-7668 [10.1073/pnas.1018962108, arXiv:1012.3409].

Garfield, E.; Sher, I. H. & Torpie, R. J., The use of citation data in writing the history of science, DTIC Document, 1964.

Holme, P. & Saramäki, J., Temporal Networks, Physics Reports, 2012, 519, 97-125.

Hummon, N. P. & Doreian, P., Connectivity in a citation network: The development of DNA theory, Social Networks, 1989, 11, 39-63.

Masuda, N. & Lambiotte, R., A Guide To Temporal Networks, World Scientific, 2016, ISBN 9781786341143.

Meyer, D. A., The Dimension of Causal Sets, PhD thesis, MIT, 1988.

Meyer, D. A., Dimension of causal sets, 2006.

Myrheim, J., Statistical geometry, Technical report, CERN preprint TH-2538, 1978.

Reid, D. D., Manifold dimension of a causal set: Tests in conformally flat spacetimes, Phys. Rev. D, 2003, 67, 024034.

Winkler, P., Random orders, Order, 1985, 1, 317-331.

Tim Evans, Centre for Complexity Science, http://netplexity.org

Slide35


Slide36

- Time & Networks
- Causally Invariant Measures
- Transitive Reduction & Innovation
- Longest Path & Centrality
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide37

Transitive Reduction (Hasse Diagrams)

Remove all edges not needed for causal links

= Remove links provided all pairs of connected points remain connected
= Remove links implied by transitivity

Uniquely defined because of the causal structure.
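A minimal Python sketch of transitive reduction on a small DAG follows. The graph, the vertex names and the function names are hypothetical, and the simple reachability search is chosen for clarity rather than speed on large citation networks.

```python
def reachable(successors, start):
    """All vertices that can be reached from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def transitive_reduction(successors):
    """Drop every edge u -> v that is implied by a longer path from u to v."""
    reduced = {}
    for u, direct in successors.items():
        keep = set(direct)
        for w in direct:
            # Anything reachable through another successor w is implied.
            keep -= reachable(successors, w)
        reduced[u] = keep
    return reduced


# A new paper cites a recent paper and a standard paper;
# the recent paper also cites the standard paper.
citations = {
    "new": {"recent", "standard"},
    "recent": {"standard"},
    "standard": set(),
}
reduced = transitive_reduction(citations)
```

Here the direct edge from "new" to "standard" is removed because the path through "recent" already implies it.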

Slide38

Transitive Completion

Add all edges implied by causal links

= Add an edge if there is a path between a pair of points
= Edge for any pair of vertices related by the partial order

Uniquely defined because of the causal structure.
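Transitive completion can be sketched in the same spirit. The example chain and the function name are hypothetical, and the search is written for readability, not efficiency.

```python
def transitive_completion(successors):
    """Add an edge u -> v whenever some path leads from u to v."""
    completed = {}
    for node in successors:
        seen, stack = set(), [node]
        while stack:
            current = stack.pop()
            for nxt in successors.get(current, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        completed[node] = seen
    return completed


# A simple chain a -> b -> c -> d.
chain = {"a": {"b"}, "b": {"c"}, "c": {"d"}, "d": set()}
completed = transitive_completion(chain)
```

The chain gains the edges a -> c, a -> d and b -> d, giving one edge for every pair related by the partial order.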

Slide39

Real DAG data

- Edges represent some interaction, e.g. citations
- Not simply a causal relationship

[Figure: scale marked 0.0, 0.5 and 1.0, running from the transitively reduced network (Hasse diagram) through the real network to the transitively complete network]

Slide40

Two Types of Measurement

Causally invariant measures

- Only see the causal structure
- Not sensitive to the fraction of TR → TC completion
- e.g. longest path

Measures that are not causally invariant

- Sensitive to the TR → TC completion fraction
- e.g. degree = citation counts

Slide41

- Time & Networks
- Causally Invariant Measures
- Transitive Reduction & Innovation
- Longest Path & Centrality
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide42

Dimension

- Assume an embedding of the DAG in Minkowski space of D dimensions
- Information flows into the forward light cone only
- Find the dimension D

[Figure: light cone diagram showing the future time-like region, the past time-like region and two space-like regions, with points x and y]

Slide43

Causal Set Model

- PPP (Poisson Point Process) in D-dimensional space
- Take one dimension as time
- Add an edge from point x to point y if the time difference from x to y is > 0 and the separation is time-like
- The resulting network is transitively complete

Standard model used in

- Discrete maths, e.g. Bollobás & Brightwell 1991
- Causal set quantum gravity, e.g. see Dowker 2006

[Figure: light cone of a point showing the Future, the Past and two Space-like regions, with points x and y]

Slide44

Longest Path and Causal Sets

- The longest path is the best approximation to the space-time geodesics in the causal set models of Minkowski space [Brightwell & Gregory 1991]
- Also conjectured to be true of causal sets for general (curved) space-times

Slide45

Longest Path in Causal Sets

[Figure: Poisson Point Process in 1+1 dimensional Minkowski space, showing the edges of the light cones, the geodesic, the longest path (length 9) and the shortest path (length 4)]

Slide46

Box Counting Direct

- Choose an interval at random. Interval = set of points on paths from a given source node to a specified sink node, with source and sink chosen uniformly at random from the vertex set
- Measure volume = N, the number of points in the interval (N>200 best)
- Measure length of the interval = L, the longest path between source and sink
- Repeat many times
- Fit N against L to find D and m

[Lei, Teixeira, Clough, Evans, 2016 unpublished]

[Figure: interval between a source and a sink]
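The fitting step can be sketched as a log-log least-squares fit. The transcript does not preserve the fitted formula, so this assumes the power-law form N ≈ m L^D suggested by "volume ~ (length scale)^Dimension"; the function name and the synthetic data are hypothetical.

```python
import math


def fit_power_law(lengths, volumes):
    """Least-squares fit of log N = log m + D log L; returns (D, m)."""
    xs = [math.log(length) for length in lengths]
    ys = [math.log(volume) for volume in volumes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dimension = sxy / sxx
    prefactor = math.exp(mean_y - dimension * mean_x)
    return dimension, prefactor


# Synthetic data generated exactly from N = 2 * L**3, for illustration only.
lengths = [4, 8, 16, 32]
volumes = [2 * length**3 for length in lengths]
dimension, prefactor = fit_power_law(lengths, volumes)
```

On real data each (L, N) pair would come from one randomly chosen interval, and the scatter of the points around the fitted line indicates how well a single dimension describes the network.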

Slide47

Longest Path L Scaling for Large N

- Bollobás & Brightwell (1991) give the large-N scaling of the longest path length L for Minkowski (cone) space
- Greedy path case
- Finite size effects unknown?

Slide48

Silly Example

- N=19, L=8
- N=6, L=4
- N=4, L=4
- Fit (Excel!) to find D and m

[Figure: causal set in 1+1 Minkowski space; PPP with time-like pairs connected]

Slide49

Midpoint Method (box counting)

- Choose an interval: a random pair of points and all N nodes lying on paths between those two points
- Find the midpoint such that the two sub-intervals are of roughly equal size
- The sizes of the sub-intervals relative to N then give the dimension D

[Bombelli 1988, Reid 2003]

[Figure: causal set in 1+1 Minkowski space (PPP with time-like pairs connected); example with N=18, N_1=6 and N_2=4, giving D = 1.61 and D = 2.16]
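A rough sketch of the midpoint estimate is given below. It assumes the commonly quoted relation that halving the length of an interval reduces its volume by a factor 2^D, so that D ≈ log2(N / N_sub). The slide's own formula is not preserved in the transcript, and its quoted values (1.61 and 2.16) may follow a slightly different convention, so the function below is hypothetical.

```python
import math


def midpoint_dimension(n_interval, n_sub):
    """Estimate D from N_sub ~ N / 2**D (assumed midpoint scaling relation)."""
    return math.log2(n_interval / n_sub)


# Interval sizes taken from the slide's example: N = 18 with a sub-interval of 4.
estimate = midpoint_dimension(18, 4)
```

With N = 18 and a sub-interval of 4 nodes this gives a value close to the 2.16 quoted on the slide.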

Slide50

Myrheim-Meyer Dimension Estimator

- The number of causally connected pairs S_2 in an interval with N nodes in a D-dimensional causal set is a known function of N and D
- Assumes Minkowski space
- Causally invariant (works if not a transitively complete network)

[Myrheim 1978, Meyer 1988, Reid 2003]

Slide51

Example of Myrheim-Meyer Method

- N=4 internal points
- S_2=4 causally connected pairs
- Hence D=2.0
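The example can be reproduced numerically. The sketch assumes the form of the Myrheim-Meyer relation usually quoted in the causal set literature, $\langle S_2 \rangle = N^2 \, \Gamma(D+1)\,\Gamma(D/2) / (4\,\Gamma(3D/2))$, which is consistent with the numbers on the slide (N=4, S_2=4 gives D=2) but does not itself survive in the transcript. The function names and the bisection bounds are hypothetical choices.

```python
import math


def expected_pair_fraction(d):
    """f(D) = Gamma(D+1) Gamma(D/2) / (4 Gamma(3D/2)), so <S_2> = f(D) N^2."""
    log_f = math.lgamma(d + 1) + math.lgamma(d / 2) - math.lgamma(1.5 * d)
    return math.exp(log_f) / 4.0


def myrheim_meyer_dimension(n_nodes, n_pairs, lo=1.0, hi=20.0, tol=1e-10):
    """Solve f(D) = S_2 / N^2 for D by bisection; f decreases as D grows."""
    target = n_pairs / n_nodes**2
    if not expected_pair_fraction(hi) <= target <= expected_pair_fraction(lo):
        raise ValueError("pair fraction outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_pair_fraction(mid) > target:
            lo = mid  # too many pairs predicted, so D must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


dimension = myrheim_meyer_dimension(n_nodes=4, n_pairs=4)
```

For D=2 the fraction f(D) equals 1/4, so four nodes with four causally connected pairs return a dimension of 2, matching the slide.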

Slide52

Degree after Transitive Reduction

- Not an effective measure, as the dependence on dimension is too weak
- Results too noisy
- N.B. in 2D can find the whole degree distribution, which is Poisson in ln(N) plus corrections

[Bombelli et al 1987; Eichhorn & Mizera 2014; Clough & TSE 2016]

[Clough & TSE 2016]

Slide53

Dimension Measures for Social DAG

- Results derived for causal sets = PPP in Minkowski space
- Do they work for social DAG networks?

Slide54

Comparison of Dimension Measures

[Figure: MM dimension and midpoint dimension against N, the number of nodes in the interval, for hep-th arXiv, with a scale for # runs / 5000]

[Clough & Evans 2016, arXiv:1408.1274]


Slide57

- Time & Networks
- Causally Invariant Measures
- Transitive Reduction & Innovation
- Longest Path & Centrality
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide58

Transitive Reduction & Innovation

Conjecture: transitive reduction removes unnecessary and indirect influences on the innovation trail.

- Links left after transitive reduction are sufficient for innovation
- Remaining links are a "lower bound" on the necessary links

Slide59

Citation Counts

Citations of academic papers are sometimes made for poor reasons:

- Cite your own paper
- Cite a standard paper
- Just copy a citation from another paper? (80%? [Simkin and Roychowdhury 2003; Goldberg, Anthony, TSE 2015])

Transitive reduction removes poor citations; it may also remove some useful ones.

[Figure: New Paper, Recent Paper and Standard Paper linked by citations]

Slide60

Citation Counts

Conjecture: transitive reduction removes poor citations.

Assess papers with "reduced degree" = degree after transitive reduction, a causally invariant measure.

Suggestion: Reduced Degree < Paper Quality < Citation Count

[Figure: New Paper, Recent Paper and Standard Paper linked by citations]

Slide61

Degree Distribution before and after Transitive Reduction – arXiv:hep-th

[Figure: frequency against citation count, before TR and after TR]

- Lose 80% of edges
- hep-ph similar
- As in Simkin & Roychowdhury and Goldberg et al.

[Clough et al. 2015, arXiv:1310.8224]

Slide62

Degree Distribution before and after Transitive Reduction – US Supreme Court

[Figure: frequency against # citations, before TR and after TR]

- Lose 73% of edges
- Similar to hep-th

[Clough et al. 2015, arXiv:1310.8224]

Slide63

Degree Distribution before and after Transitive Reduction – US Patents

[Figure: frequency against # citations, before TR and after TR]

- Lose 15% of edges
- Very different to arXiv and court judgments

[Clough et al. 2015, arXiv:1310.8224]

Slide64

arXiv hep-th repository

[Figure: citation count after TR against citation count before TR, with the line of equal counts marked. Winner: 806 → 77. Loser: 1641 → 3.]

[Clough et al. 2015, arXiv:1310.8224]

Slide65

Transitive Reduction and Citation Networks

- Shows key differences between citation networks from different fields: a new test for models
- Finds large differences in "reduced degree" between papers of similar citation counts: an alternative recommendation system

[Clough et al. 2015, arXiv:1310.8224]

Slide66

- Time & Networks
- Causally Invariant Measures
- Transitive Reduction & Innovation
- Longest Path & Centrality
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide67

Path Length

The length of a path is the number of links on that path.

[Figure: example path with path length 6]

Slide68

Distance Between Vertices

The distance between vertices may be set equal to the length of the shortest path.

[Figure: example with distance between vertices = 4]

Slide69

Longest Path

Longest paths have no meaning for most networks, as such paths typically visit a large fraction of the network.

[Figure: example longest path of length 9]

Slide70

Time and Longest Path

Longest paths are well defined when a time is defined: you never go backwards.

[Figure: network drawn along a TIME axis with a longest path of length 5]

Slide71

Longest Path and DAG

- Longest paths are unchanged by transitive reduction
- They only use links which are essential for the causal structure
- Causally invariant

[Figure: DAG drawn along a TIME axis]
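The invariance of the longest path under transitive reduction can be checked on a toy DAG. The graph and the function name are hypothetical, and the recursive search is a sketch suited to small examples; very long chains would need an iterative version.

```python
from functools import lru_cache


def longest_path_length(successors, source, sink):
    """Number of links on the longest path from source to sink in a DAG.

    Returns None if the sink cannot be reached from the source.
    """

    @lru_cache(maxsize=None)
    def best_from(node):
        if node == sink:
            return 0
        options = [best_from(nxt) for nxt in successors.get(node, ())]
        options = [length for length in options if length is not None]
        return 1 + max(options) if options else None

    return best_from(source)


# The same toy citation network before and after transitive reduction.
full = {"new": ("recent", "standard"), "recent": ("standard",), "standard": ()}
reduced = {"new": ("recent",), "recent": ("standard",), "standard": ()}

length_full = longest_path_length(full, "new", "standard")
length_reduced = longest_path_length(reduced, "new", "standard")
```

Both networks give the same longest path, new → recent → standard, because the removed edge was only ever a shortcut.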

Slide72

- Time & Networks
- Causally Invariant Measures
- Transitive Reduction & Innovation
- Longest Path & Centrality
- Netometry – Networks & Geometry
- Dimension
- Coordinate Assignment
- Summary

Slide73

Centrality

Network centrality measures try to describe the importance of a vertex in a network, e.g. degree, (shortest path) betweenness, …

[Figure: network with node size showing degree and darker colour = higher betweenness]

Slide74

Centrality Measures for DAGs

It seems that we should use the longest path, not the shortest path, for DAGs.

Slide75

Small Example: DNA Citation Network

- 40 key events (mostly single papers) in the development of the theory of DNA
- Links are both direct citations and citations via an intermediary paper with a common author

[Asimov 1962; Garfield, Sher, Torpie 1964; Hummon & Doreian 1989]

Slide76

DNA Citation Network

[Figure: DNA citation network drawn along a TIME axis, with labels Miescher 1871, Watson & Crick '53, 27, Nobel, 32 and Ochoa '55; Fig. 1 of Hummon & Doreian 1989]

Slide77

Shortest Path Betweenness for DNA Network

[Figure: node size = shortest path betweenness; colour = longest path betweenness]

Slide78

Longest Path Betweenness for DNA Network

[Figure: node size = longest path betweenness; colour = ditto]

Slide79

Centrality for DAGs

- Longest path centrality shows promise
- Experimenting with other approaches
- Comparing with the "main path analysis" of the bibliometrics field [Hummon & Doreian 1989]

Slide80

Rankings & Cube Space

Slide81

Cube Box Space

- Each point has D coordinates
- Connect from x to y iff $x_i < y_i$ for every coordinate $i = 1, \dots, D$

[Figure: example for D=2]

[Bollobás & Brightwell 1991]
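A cube (box) space DAG can be built directly from coordinates. The sketch below assumes the usual definition in which x precedes y when every coordinate of x is strictly smaller than the matching coordinate of y; the points and the function names are hypothetical.

```python
def precedes(x, y):
    """True if every coordinate of x is strictly smaller than that of y."""
    return all(a < b for a, b in zip(x, y))


def cube_space_edges(points):
    """Directed edges (i, j) whenever point i precedes point j."""
    return {
        (i, j)
        for i, x in enumerate(points)
        for j, y in enumerate(points)
        if i != j and precedes(x, y)
    }


# Four points with D = 2 coordinates each, e.g. two ranking scores per item.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.9), (0.8, 0.5)]
edges = cube_space_edges(points)
```

Points 2 and 3 are unrelated because each beats the other on one coordinate, which is the situation that arises when several rankings disagree.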

Slide82

Box Space Representation of Journal Rankings

- Node = journal
- Directed edge if a journal has better IF, EF & CA
- Height = minimum longest path distance to root
- Top 1000 journals used; top 40 shown
- Hasse diagram technique [Bruggemann et al. 1994]

[Figure: Hasse diagram of journals with an arrow labelled "worse"]

Slide83

DAG Models

Slide84

DAG Models

- Causal set models: PPP in a space with at least one time direction [e.g. Bollobás & Brightwell 1991, Clough et al 2015, Clough & TSE 2016]
  - Minkowski (cone) space
  - Cube space (useful for multiple rankings)
- Growing network models, where growth = time [e.g. Price 1965, Barabási-Albert 1999, Chung-Lu 2000, …]
  - More realistic citation models: see Goldberg, Anthony, TSE 2015 and refs therein
- Any network model + vertex order
- Random partial orders, e.g. Winkler 1985, Albert & Frieze 1989

Slide85

Spatial Coordinates

- Similarities to network layout problems
- However, our work shows the data needs D>2
- We exploit the space-time structure for speed

Slide86

Optimisation Approach

Find coordinates which minimise a cost function whose terms depend on whether there is

- an edge from i to j (future time-like)
- an edge from j to i (past time-like)
- no edge from j to i (space-like)

Slide87

- Time & Networks
- Transitive Reduction
- Dimension
- Summary
- Longest Path
- Geometry
- Further Material

Slide88

Basic Statistics Before and After Transitive Reduction

Network             hep-th             US patents               US Supreme Court
                    before    after    before      after        before    after
# nodes             27383     27383    3764094     3764094      25376     25376
# edges             351237    62257    16510997    13996169     216198    59032
Clustering          0.249     0        0.0757      0            0.163     0
Mean in-degree      12.82     2.27     4.39        3.71         8.52      2.33
Median in-degree    4         2        2           2            5         2
1st Q in-degree     1         1        0           0            0         0
3rd Q in-degree     12        3        6           6            11        3
Gini coef           0.729     0.481    0.684       0.67         0.62      0.51

Slide89

Degree Distribution, Null Models and TR

Slide90

Degree by year of Publication – hep-th

Slide91

Degree by year of Publication – US Patents

Slide92

Degree by year of Publication – US Supreme Court

Slide93

Midpoint Dimension for US Supreme Court

Myrheim-Meyer dimension similar, with larger errors.

Slide94

US Patents, Myrheim-Meyer Dimension

Midpoint dimension similar but slightly worse.

Slide95

Cone vs Cube Space Models

[Figure: number of 2-chains S_2 against dimension D for the cone and cube space models]

- Identical at D=2, as expected
- Cube space gives larger dimension estimates for D>2, about 7% bigger if D~3

Slide96

- Time & Networks
- Transitive Reduction
- Dimension
- Longest Path
- Finding Coordinates
- Geometry
- Summary

Slide97

Finding Coordinates

Slide98

Old Material

Slide99

Timed Vertices + Edges

Temporal edge networks, e.g. communication networks (email, phone, letters), as in the review by Holme & Saramäki, Temporal Networks, 2012.

Slide100

Timed Edges = Temporal Edge Networks

- Communication networks: email, phone, letters
- Focus of much recent work [Holme & Saramäki 2012]

Slide101

Vertices + Timed Edges

Temporal vertex networks: each vertex represents an event happening at one time.

- Innovations: patents, research papers, court judgments
- Citation networks

Slide102

Timed Vertices = Temporal Vertex Networks

Each vertex represents an event occurring at one time.

- Time constrains flows
- Edges point in one direction only
- No loops

[Figure: network drawn along a time axis]

Slide103

Key Properties of Temporal Vertex Networks

- Directed: e.g. citation from newer to older paper (or the reverse if you prefer)
- Acyclic: a paper cannot cite a newer paper
- Directed Acyclic Graph = DAG

[Figure: DAG drawn along a time axis]

Slide104

Key Properties of Temporal Vertex Networks

- Edges of a DAG define a partial order of the set of vertices: a poset
- There are many ways to order the vertices (total or linear orders)

[Figure: vertices A to E shown in two orders, A C D B E and A B D C E]

Slide105

DAGs and Flows

DAGs represent many different types of flows, e.g. B, C → A

- Flow of ideas: innovations, citation networks (papers, patents, court judgements)
- Flow of projects: management of tasks, critical paths
- Flow of mathematical logic: spreadsheet formulae

[Figure: DAG on vertices A, B, C, D, E]

Slide106

Innovations

- Each vertex represents a discovery: patent, academic paper, law judgment
- Information is now copied from one node to all connected events in the future
- Different process from a random walk
- Similar to epidemics but on a different network

Slide107

Applications

- Flow of ideas: innovations, citation networks (papers, patents, court judgements)
- Flow of projects: management of tasks, critical paths
- Flow of mathematical logic: spreadsheet formulae
- Flow of space-time causality: causal set approach to quantum gravity

Spread like epidemics but on a very different network.

Slide108

Timed Vertices = Temporal Vertex Networks

Each vertex represents one event occurring at one time.

- Time constrains flows
- Directed edges point in one direction only
- No loops
- Directed Acyclic Graph (DAG)
- Poset (partially ordered set): discrete mathematics

Slide109

Applications

- Flow of ideas and innovations: citation networks (papers, patents, court judgements)
- Flow of projects: management of tasks, critical paths
- Flow of mathematical logic: spreadsheet formulae
- Flow of space-time causality: causal set approach to quantum gravity

Slide110

Temporal Edge/Vertex Network Relationship

The two types are related by the Line Graph transformation.

Slide111

Temporal Vertex -> Edge Network

Directed Line Graph transformation

[Figure: DAG on vertices A to E with edges numbered 1 to 7, and its line graph on vertices 1 to 7]

Slide112

Temporal Edge -> Vertex Network

(Directed) Line Graph transformation

[Figure: email network between persons A to E with edges labelled by the time of each email, and the resulting chain of emails in which each email is a vertex]

Slide113

Temporal Spread vs. Temporal Events

Spreading on a geographical network is very different from a temporal events network.

Arrival time is the shortest path distance from the source.

[Figure: geographical network on vertices A to E labelled with arrival times, and the resulting DAG]

Slide114

Temporal Spread vs. Temporal Events

Same geographical network, different source vertex.

Arrival time is the shortest path distance from the source.

[Figure: geographical network on vertices A to E labelled with arrival times, and the resulting DAG]

Slide115

Temporal Spread vs. Temporal Events

Same network, same process, different source vertex: different DAG.

[Figure: network on vertices A to E with two resulting DAGs, one with source E and one with source C]

Slide116

Temporal Edge/Vertex Network Relationship

The two types are related by the Line Graph transformation.

[c.f. Evans and Lambiotte 2009, 2010; Ahn et al. 2010]

Slide117

Networks and Geometry

Recent work combines networks and isotropic, homogeneous geometries:

- Spatial (curvature 0, +1, -1): Internet backbone [Boguñá, Krioukov, Claffy 2009]
- Space-time (curvature 0, +1, -1) [Krioukov et al 2013]

Only uses 2 dimensional space / space-times.

Slide118

Questions for "Netometry" (= Networks + Geometry)

- Dimension of space? Only 1 spatial dimension is used in curved space work, but we normally find more than 1 space dimension
- Assigning spatial coordinates: define similarity/difference between topics
- Do we need curvature? How do you measure it? Locally flat, so perhaps find post-hoc
- Non-metric spaces: cube box spaces [Bollobás & Brightwell 1991]

Slide119

Minkowskii Space

My approach is to use simplest space-timeMinkowski space in

D = (d+1) space-time dimensions© Imperial College London

Page 119

 

[Clough & Evans,

Physica

A 2016 (to appear)

arXiv:1408.1274

;

ISSI Proceedings,

arXiv1507.01388

; also in preparation]

Slide120

Dimension

Assume an embedding of the DAG in Minkowski space of D dimensions.

Information flows into the forward light cone only.

-> Find Dimension D

[Figure: light cone diagram with axes x and y, showing the future time-like, past time-like and space-like regions]
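As an illustration of the embedding picture, the following minimal Python sketch scatters random events in a patch of 1+1 dimensional Minkowski space and links each event to every later event inside its forward light cone. The function names, the unit-square patch, the number of points and the choice c = 1 are assumptions made for this sketch, not details taken from the slides.

```python
import random

def sprinkle(n, seed=0):
    """Return n random events (t, x) in the unit square (hypothetical patch)."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]

def causal_edges(points):
    """Directed edge i -> j when event j lies inside the forward light cone of i."""
    edges = []
    for i, (ti, xi) in enumerate(points):
        for j, (tj, xj) in enumerate(points):
            # time-like separation with j later than i (units with c = 1)
            if tj - ti > abs(xj - xi):
                edges.append((i, j))
    return edges

points = sprinkle(50)
edges = causal_edges(points)
print(len(points), "events,", len(edges), "causal links")
```

Because every link points forward in time, the result is automatically a DAG. Finding the dimension is the inverse problem: given only such a DAG, estimate the D of the space it could have come from.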

Slide121

Shortest Paths

The length of a path is the number of links on that path.


Slide123

Shortest Paths

The distance between vertices is the length of the shortest path.

Slide124

Shortest Paths

The length of a path is the number of links on that path.

The distance between vertices is the length of the shortest path.

The Betweenness of a vertex is the number of shortest paths passing through that vertex.

Slide125

Centrality

Network centrality measures try to describe the importance of a vertex in a network.

[Figure: example network with vertex size indicating degree and darker colour indicating higher betweenness]

Slide126

Path Length

The length of a path is the number of links on that path.

[Figure: example path with Path Length 6]

Slide127

Distance Between Vertices

The distance between vertices is the length of the shortest path.

[Figure: example with distance between vertices = 4]
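The definition of distance can be sketched with a breadth-first search, which reaches vertices in order of increasing number of links. The small network and the function name below are hypothetical and chosen only for illustration.

```python
from collections import deque

def distance(adj, source, target):
    """Number of links on a shortest path from source to target, or None."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:          # first visit is along a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return None                        # target not reachable from source

# Hypothetical undirected network given as adjacency lists
adj = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(distance(adj, "A", "E"))
```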

Slide128

Vertex Betweenness

The Betweenness of a vertex is the number of shortest paths passing through that vertex
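A minimal sketch of this definition, taken literally: for every unordered pair of other vertices, count the shortest paths between them that pass through the vertex. The function names and test networks are hypothetical, and counting raw paths over unordered pairs is an assumption of this sketch; commonly used betweenness measures instead sum the fraction of shortest paths through the vertex.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Return (dist, sigma): distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma

def path_count_betweenness(adj):
    """Count shortest paths through each vertex (undirected network)."""
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {v: 0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue                       # s and t are not connected
        for v in adj:
            if v in (s, t) or v not in dist_s:
                continue
            # v is on a shortest s-t path when the two distances add up exactly
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_t[v]
    return score

chain = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(path_count_betweenness(chain))
```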


Slide129

Longest Path

Longest paths have little meaning for most networks, as such paths typically visit a large fraction of the network.

[Figure: example network with a Longest Path of Length 9]

Slide130

Longest Path and DAG

Longest Path is the best approximation to the space-time geodesics in the Causal Set Models of Minkowski space [Brightwell & Gregory, 1991].

Also conjectured to be true of Causal Sets for general (curved) space-times.

Slide131

Longest Path and DAG

[Figure: Poisson Point Process in 1+1 dimensional Minkowski space, showing the Longest Path (9), the Geodesic, the Shortest Path (4) and the Edges of Light Cones]

Slide132

Box Space Representation of Journal Rankings

Node = Journal

Directed Edge if journal has better IF, EF & CA

Height = minimum longest path distance to root

Top 1000 journals used, Top 40 shown

[Figure: journals arranged by height, with the direction of "worse" journals marked]

Slide133

Conclusions

Temporal Networks come in two types: time on vertices or time on edges.

Constraints on networks require new tools in network analysis: Transitive Reduction, Dimension, Longest Path, Geometry.

Significant differences revealed between different fields and different types of data.

Tim Evans, Imperial College London
http://netplexity.org

Work done with: James Clough, Jamie Gollings, Tamar Loach

Slide134

Time & Networks

Transitive Reduction

Dimension

Longest Path

Geometry

Summary

Slide135

Time and Longest Path

Longest paths are well defined when a time is defined: you never go backwards.

[Figure: network laid out along a TIME axis, with a Longest Path of Length 5]
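Because a time ordering rules out cycles, the longest path can be found by visiting vertices in time (topological) order and extending the best path found so far. The sketch below is one way to do this, assuming Python 3.9+ for `graphlib`; the function name and the four-vertex DAG are hypothetical.

```python
from graphlib import TopologicalSorter

def longest_path_to(successors):
    """Longest path length (in links) ending at each vertex of a DAG.

    successors maps each vertex to the later vertices it links to.
    """
    preds = {v: set() for v in successors}
    for u, later in successors.items():
        for v in later:
            preds.setdefault(v, set()).add(u)
    length = {}
    # static_order yields every vertex after all of its predecessors
    for v in TopologicalSorter(preds).static_order():
        length[v] = max((length[u] + 1 for u in preds[v]), default=0)
    return length

# Hypothetical DAG: the shortest route a -> c -> d has 2 links,
# the longest route a -> b -> c -> d has 3.
dag = {"a": ["b", "c"], "b": ["c"], "c": ["d"], "d": []}
lengths = longest_path_to(dag)
print(lengths)
```

The same dynamic programme would not terminate meaningfully on a network with cycles, which is why the longest path only becomes a useful quantity once time forbids going backwards.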

Slide136

Longest Path and DAG

Longest Path is the best approximation to the space-time geodesics in the Causal Set Models of Minkowski space [Brightwell & Gregory, 1991].

Also conjectured to be true of Causal Sets for general (curved) space-times.

Slide137

Longest Path and DAG

[Figure: Poisson Point Process in 1+1 dimensional Minkowski space, showing the Longest Path (9), the Geodesic, the Shortest Path (4) and the Edges of Light Cones]

Slide138

Small Example: DNA Citation Network

40 Key events (mostly single papers) in the development of the theory of DNA.

Links are both direct citations and citations via an intermediary paper with a common author.

[Asimov 1962; Garfield, Sher, Torpie, 1964; Hummon & Doreian 1989]

Slide139

DNA Citation Network

[Figure: DNA citation network laid out along a TIME axis; labels include Miescher 1871, Watson & Crick '53, Ochoa '55, Nobel, and nodes 27 and 32] [Fig 1, Hummon & Doreian 1989]

Slide140

Shortest Path Betweenness for DNA Network

Size = Shortest Path Betweenness

Colour = Longest Path Betweenness

Slide141

Longest Path Betweenness for DNA Network

Size = Longest Path Betweenness

Colour = ditto

Slide142

Centrality for DAGs

Longest Path Centrality shows promise.

Experimenting with other approaches.

Comparing with “main path analysis” of bibliometrics field [Hummon & Doreian 1989].

Slide143

Temporal Edge/Vertex Network Relationship

Types related by Line Graph transformation [c.f. Evans and Lambiotte, 2009, 2010; Ahn et al. 2010]

Slide144

Temporal Vertex -> Edge Network

Directed Line Graph transformation

[Figure: temporal vertex network on vertices A to E with its edges numbered 1 to 7, and the directed line graph whose vertices are those edges 1 to 7]
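A minimal sketch of a directed line graph transformation: each edge of the original network becomes a vertex, and edge i is linked to edge j whenever i ends at the vertex where j starts. The function name and the example edge list are hypothetical and are not read off the slide's figure.

```python
def directed_line_graph(edges):
    """Return line-graph links (i, j) between edge indices.

    edges is a list of (tail, head) pairs; i -> j when edges[i] ends
    at the vertex where edges[j] begins.
    """
    leaving = {}
    for j, (tail, _head) in enumerate(edges):
        leaving.setdefault(tail, []).append(j)
    return [
        (i, j)
        for i, (_tail, head) in enumerate(edges)
        for j in leaving.get(head, [])
    ]

# Hypothetical temporal vertex network: links run from earlier to later vertices
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(directed_line_graph(edges))
```

If the original vertices carry times, each link of the line graph can inherit the time of the shared vertex, which is how time on vertices turns into time on edges.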

Slide145

Temporal Edge -> Vertex Network

(Directed) Line Graph transformation

[Figure: temporal edge network of emails between persons A to E, each edge labelled with the time of the email, and the resulting temporal vertex network in which each vertex is an email and links form email chains]

Slide146

Temporal Spread vs. Temporal Events

Spreading on a geographical network is very different from a temporal events network.

Arrival time is the shortest path distance from the source.

[Figure: geographical network of vertices A to E in the x-y plane with arrival times marked, and the corresponding DAG]

Slide147

Temporal Spread vs. Temporal Events

Same geographical network, different source vertex.

Arrival time is the shortest path distance from the source.

[Figure: the same network of vertices A to E with arrival times from a different source, and the corresponding DAG]

Slide148

Temporal Spread vs. Temporal Events

Same underlying network, same process, different source vertex -> different DAG.

[Figure: the network of vertices A to E and the two DAGs it produces, one with source E and one with source C]
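The dependence on the source can be sketched as follows: compute arrival times as shortest path distances from the source, then keep only the links that carry the spread forward in time. The adjacency lists below are a hypothetical five-vertex network, not the one drawn on the slide, and the function names are made up for this sketch.

```python
from collections import deque

def arrival_times(adj, source):
    """Arrival time at each reachable vertex = shortest path distance from source."""
    time = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in time:
                time[v] = time[u] + 1
                queue.append(v)
    return time

def spreading_dag(adj, source):
    """Directed links u -> v along which the spread moves one step forward."""
    time = arrival_times(adj, source)
    return sorted(
        (u, v)
        for u in adj
        for v in adj[u]
        if u in time and v in time and time[v] == time[u] + 1
    )

# Hypothetical undirected network on vertices A to E
adj = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C"],
    "E": ["C"],
}
print("source E:", spreading_dag(adj, "E"))
print("source C:", spreading_dag(adj, "C"))
```

In a temporal events network the times, and hence the DAG, are fixed by the data; here the DAG is a by-product of where the process happened to start.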

Slide149

Minkowski Space

[Figure: space-time diagram with time and space axes and the future time-like region marked]

Slide150

Analysis of DAGs

Paths and Dimensions in Temporal Networks

Joana Teixeira

Third year project

Partner: Chon Lei (Jacky)

Supervisor: Dr Tim Evans

Slide151

2. Results - Theoretical Model

Dimension Calculation

Longest Path is related to Dimension [1] via a relation between:
the number of nodes in the interval
the length of the longest path
D, the dimension of the network
the maximal chain constant (limits presented in [1])

2D Minkowski, iterated 500 times. Dimension values, (1) and (2):
Longest: 1.970 ± 0.006, 1.808 ± 0.021
Greedy: 2.057 ± 0.011, 1.722 ± 0.029

[Figure: plot against Number of nodes, with curves for Longest and Greedy paths]

[1] Bollobas, Brightwell (1991)
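As a hedged sketch of the relation listed on this slide, the standard asymptotic form for random points in an interval of D-dimensional Minkowski space is given below. The symbols L (length of longest path), N (number of nodes in the interval) and m_D (maximal chain constant) are labels assumed here, and the slide's own fitting expressions may differ in detail.

```latex
L \simeq m_D \, N^{1/D}
\quad\Longrightarrow\quad
\ln L \simeq \ln m_D + \frac{1}{D}\,\ln N ,
\qquad
D \simeq \frac{\ln N}{\ln\!\left(L / m_D\right)} .
```

The middle form suggests one route to the dimension: the slope of ln L against ln N is 1/D, which avoids needing m_D; the last form uses a single interval but requires a value for m_D.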

Slide152

2. Results - Theoretical Model

Dimension Calculation

[Figure: two plots for 2D Minkowski, iterated 500 times, showing Path length against Number of nodes and against 1/N, with a fitting function]

Limits for the maximal chain constant for Minkowski space given in [1]:
2D: 1.596, 2.718

Values obtained from our model for large N; D=2:
2D: 1.571 ± 0.006, 1.901 ± 0.004

[1] Bollobas, Brightwell (1991)
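A minimal numerical sketch of a dimension calculation of this kind, under stated assumptions: in light-cone coordinates an interval of 1+1 dimensional Minkowski space becomes the unit square with the order "both coordinates larger", so the longest chain is a longest increasing subsequence. The dimension is then read off the slope of log(path length) against log(N). The function names, sizes, run count and fitting choice are hypothetical and are not the project's actual procedure; chain length is counted in points here, not links.

```python
import bisect
import math
import random

def longest_chain(n, rng):
    """Points in the longest chain among n random points in a 2D interval."""
    # Sort by the first light-cone coordinate, then find the longest
    # increasing subsequence in the second (patience sorting).
    vs = [v for _u, v in sorted((rng.random(), rng.random()) for _ in range(n))]
    tails = []
    for v in vs:
        k = bisect.bisect_left(tails, v)
        if k == len(tails):
            tails.append(v)
        else:
            tails[k] = v
    return len(tails)

def estimate_dimension(sizes, runs, seed=0):
    """Least-squares slope of log(mean chain length) vs log(N); returns 1/slope."""
    rng = random.Random(seed)
    xs, ys = [], []
    for n in sizes:
        mean_len = sum(longest_chain(n, rng) for _ in range(runs)) / runs
        xs.append(math.log(n))
        ys.append(math.log(mean_len))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return 1.0 / slope

print(estimate_dimension([200, 800, 3200], runs=20))
```

For finite N the estimate is expected to sit somewhat away from 2 because of corrections to the leading scaling, which is one reason to compare different fitting forms and different path definitions.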