Progress Report for Visualizing Ties (Nick Gramsky)


Work to date has involved a comprehensive literature review of the topic, formulation of the problem and process, and further refinement of classification methods. Additionally, I have updated the main project page to include the motivation behind this project and started writing scripts to gather a Twitter data set. Furthermore, I have written a few macros to calculate the metadata necessary for any network once it is entered into a spreadsheet.

Project Refinement:

Users will still be able to visualize the change in reciprocity or general interaction (general degree) for each dyad in a link-node diagram. This visualization method will show how relationships vary over a temporal period but do so in a single instance of the network. Color will still be used to distinguish between the classifications of a given attribute.

Earlier potential methods to classify relationships between nodes involved defining a location within a 2x3 matrix. This method was ambiguous: it did not clarify whether the software would calculate the change in reciprocity or the change in general interactions between nodes. I have since decided to make the two mutually exclusive and require the user to choose to visualize either the change in general interactions or the change in reciprocity between nodes.

Each of the two attributes (reciprocity/general interaction) will continue to be classified as evolving in one of three ways:


Related Work:

A somewhat extensive literature review has yet to reveal anything that visualizes and analyzes the variance of individual relationships with regard to reciprocity. Furthermore, I have yet to see anything that does the same for general interactions. Variance in degree between nodes, and what it says about the relationship, is discussed often in research papers, but I have yet to see the question of how to classify such variance broached. Poisson methods in [6] are used to classify reciprocal relationships in general, but fail to quantify how the rate of reciprocity changes over the lifespan. [5] visualizes reciprocity between contributors within a blog, but does not do so via link-node methods, nor does it show the rate of change of reciprocity.

Classification Methods:

Linear least-squares line fitting will determine how both attributes vary over time. The methods defined below will calculate data points based on all of the interactions between the nodes in a dyad, and the slope of the best-fit straight line through those points will then be calculated. The slope of this line will classify the relationship as increasing or decreasing, and the value of the slope will vary based on the method used. There will be a level of fidelity for each classification, and ultimately users will use a slider within NodeXL to define the granularity between classifications.
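For reference, the slope of the least-squares line through n data points (x_i, y_i) has the standard closed form shown below. The threshold epsilon and the label "stable" are assumptions made only for illustration: epsilon stands in for the slider-controlled granularity, and this report does not name the third class.

```latex
m = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
         {n\sum_{i=1}^{n} x_i^{2} - \left(\sum_{i=1}^{n} x_i\right)^{2}},
\qquad
\text{trend of the fitted quantity} =
\begin{cases}
\text{increasing} & \text{if } m > \varepsilon \\
\text{decreasing} & \text{if } m < -\varepsilon \\
\text{stable (assumed label)} & \text{if } |m| \le \varepsilon
\end{cases}
```

A larger epsilon widens the middle band, so fewer dyads would be colored as increasing or decreasing.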

General Interaction
Rather than time-bin events, I will track the delta between interactions. I will calculate the time between each pair of consecutive interactions and perform a linear least-squares line fit on the resulting data. The slope of the line will determine the evolution of the relationship. For example, if the slope is positive, the time between interactions is growing, so the relationship between the two nodes is decreasing in activity and will be colored accordingly.
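The following is a minimal sketch of this idea in Python, not the NodeXL implementation. It assumes the x-value of each data point is the index of the gap (the report does not say whether index or timestamp is used), that timestamps are plain numbers such as seconds, and that a hypothetical `tolerance` parameter plays the role of the slider. The function names and the "stable" label are likewise illustrative.

```python
from typing import List


def inter_event_gaps(timestamps: List[float]) -> List[float]:
    """Return the time between each pair of consecutive interactions."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def least_squares_slope(ys: List[float]) -> float:
    """Slope of the least-squares line through (0, ys[0]), (1, ys[1]), ...

    Assumption: the x-value is the gap index, not the timestamp.
    """
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two gaps (three interactions)")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify_activity(timestamps: List[float], tolerance: float = 0.0) -> str:
    """Classify a dyad's activity from the trend in its inter-event gaps.

    Growing gaps (positive slope) mean activity is decreasing.
    `tolerance` is a hypothetical stand-in for the slider setting.
    """
    slope = least_squares_slope(inter_event_gaps(timestamps))
    if slope > tolerance:
        return "decreasing"
    if slope < -tolerance:
        return "increasing"
    return "stable"  # assumed label for the third class


if __name__ == "__main__":
    # Gaps of 1, 2, 4, 8: interactions are spreading out over time.
    print(classify_activity([0, 1, 3, 7, 15]))
```

Using the gap index as x weights every interaction equally; using the timestamp instead would be an equally plausible choice and would change the magnitude of the slope but not its sign in these simple cases.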

The mathematical classification of the change in reciprocity has yet to be finalized. My preference is not to time-bin interactions between the actors that comprise the dyad, but rather to analyze each interaction across the entire lifespan of the network. Currently I am looking at using Poisson distributions as a means to determine whether reciprocal interactions between the two nodes that comprise a dyad are trending downward or upward. The exact formula is still being researched and worked out.

Screen Mockups (NodeXL):

The relationship classification/color assignment control box has been changed due to the change in use case. Users will decide to classify and visualize links between dyads based on variances in either general interactions or reciprocity. A mock-up of the selection window now looks as follows:


2 Possible Paths Forward:

I need to decide whether the first priority is to finish the NodeXL implementation or the user study. Due to the length of the semester and the limited number of students working on the project (just me!), I'm afraid that both a modification to the software and a quality user study might be a bit out of reach. The order of these goals will need to be discussed between Dr. Bederson and myself.

- Methods for either path:

- NodeXL implementation
  - Apply screenshots as noted above.
  - Provide automated mechanism to calculate, classify and visualize relationships within NodeXL.
  - Provide software updates to NodeXL.
  - Outline planned user study.

- User study only
  - Manually perform actions NodeXL could otherwise automatically perform.
  - Load several data sets and analyze the network with and without temporal relationship visualizations.
  - Perform user study on manually created visualizations.

Data Sets:

IEEE VAST 2008 Cell Phone mini-challenge:

Ben Shneiderman's Email archive: Personal exchange, not publicly accessible

InfoVis2004 Dataset: Fekete, J.-D., Grinstein, G., Plaisant, C., IEEE InfoVis 2004 Contest, the history of InfoVis, (2004)

Other potential data sets:
IPv4 Routed /24 Autonomous System (AS) Links Dataset from CAIDA:
Neighbor relationships (AS links) of routers running BGP on the Internet, from 2007 to the present

Occupy Wall Street hashtag network (Twitter) - I have begun working on scraping scripts to gather this data.

References (noteworthy to date):

[1] S. Hill, D. Agarwal, R. Bell, and C. Volinsky. Building an effective representation of dynamic networks. Journal of Computational and Graphical Statistics, Sept, 2006.
Comments: Possible model for better capturing change in reciprocity.

[2] Robert V. Hogg and Elliot A. Tanis. Probability and Statistical Inference, 4th ed. 1993.

[3] Newman, Mark. Networks: An Introduction. Oxford: Oxford University Press, 2010. Print.

[4] H. Ogata. "Computer Supported Social Networking for Augmenting Cooperation." Computer Supported Cooperative Work 10, 2001, pp. 189-209, Kluwer Academic Publishers.

[5] Adam Perer, Ben Shneiderman, and Douglas W. Oard. 2006. Using rhythms of relationships to understand e-mail archives. J. Am. Soc. Inf. Sci. Technol. 57, 14 (December 2006), 1936-1948.

[6] Sankaranarayanan, Kadhambari. Master of Computer Science thesis, University of Saskatchewan, 2010. Comments: One of few papers to visualize reciprocity.

[7] Zhang, H.; Dantu, R.; Cangussu, J. W. Quantifying Reciprocity in Social Networks. CSE (4). [S.l.]: IEEE Computer Society. 2009. p. 1031-1035. Comments: Uses Poisson method to quantify reciprocity in a network. Possible calculation method.

11/23/11: Comment (Ben)
All looks good. As discussed, you will focus on implementation over user study. Looking forward to seeing how this turns out.