The Sedge system has been extensively evaluated by running typical graph queries on several large graphs. We provide some of the sample datasets used in our paper, as well as some other interesting datasets. All sample graph files are in adjacency list format. Sedge can also accept other input graph formats (see the manual for details).
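As a rough illustration of reading an adjacency list file, the sketch below parses lines into a dictionary of neighbor lists. The exact layout of the Sedge sample files is not specified here, so the line format (`<source> <neighbor> <neighbor> ...`, whitespace-separated integer IDs, `#` for comments) and the function name `parse_adjacency_list` are assumptions; consult the manual for the actual format.

```python
def parse_adjacency_list(lines):
    """Parse adjacency-list lines into {vertex: [neighbors]}.

    Assumed (hypothetical) line format: '<src> <dst1> <dst2> ...'
    with whitespace-separated integer vertex IDs.
    """
    graph = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines (the '#' convention is an assumption).
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        src = int(fields[0])
        neighbors = [int(token) for token in fields[1:]]
        # A vertex may appear on several lines; merge its neighbor lists.
        graph.setdefault(src, []).extend(neighbors)
    return graph


# Small in-memory example; a real file could be passed as open(path).
sample_lines = ["# toy graph", "0 1 2", "1 2", "2", ""]
graph = parse_adjacency_list(sample_lines)
```

Because the function accepts any iterable of strings, a large file can be streamed line by line without loading it entirely into memory first.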

Graph dataset

Web graph

The web graph comes from http://law.dsi.unimi.it/datasets.php, which provides several snapshots of the UK web network. The following sample dataset is a segment of uk-2007-05.

Download UK web graph
uk-2007-05-seg a segment of uk-2007-05 (about 5M nodes)

Twitter graph

The Twitter graph is crawled from Twitter. For simplicity, we aggregated the multi-edges and their associated attributes into a single edge, which represents several messages sent from one user to another at different times. We provide a segment of the Twitter graph, with all user-related information removed for anonymity.

Download Twitter graph
twitter-seg a segment of the Twitter graph (about 10M nodes)
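The multi-edge aggregation described for the Twitter graph can be sketched as follows. This is only an illustration of the idea, not the preprocessing actually used for the dataset: the input record layout `(sender, receiver, timestamp)`, the attribute names `count` and `timestamps`, and the function name `aggregate_multi_edges` are all assumptions.

```python
from collections import defaultdict


def aggregate_multi_edges(messages):
    """Collapse repeated (sender, receiver) messages into one edge each.

    `messages` is an iterable of (sender, receiver, timestamp) tuples;
    this record layout is a hypothetical stand-in for the crawled data.
    """
    grouped = defaultdict(list)
    for sender, receiver, timestamp in messages:
        # Direction matters: (a, b) and (b, a) stay separate edges.
        grouped[(sender, receiver)].append(timestamp)
    # One edge per ordered pair, carrying the aggregated attributes.
    return {
        pair: {"count": len(stamps), "timestamps": sorted(stamps)}
        for pair, stamps in grouped.items()
    }


# Toy input with anonymized integer user IDs.
messages = [(1, 2, 30), (1, 2, 10), (2, 1, 20), (1, 3, 5)]
edges = aggregate_multi_edges(messages)
```

Keeping the sorted timestamps on the aggregated edge preserves the information that the messages were sent at different times while leaving a simple graph with at most one edge per ordered pair of users.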

Synthetic scale-free graph

The scale-free graph is generated using R-MAT. The graph exhibits "power-law" behavior and naturally shows "community" structure.
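For readers unfamiliar with R-MAT, a minimal sketch of the recursive edge-placement idea is shown below. It is not the generator used to produce the Sedge datasets: the function name `rmat_edges`, the quadrant probabilities (a = 0.57, b = 0.19, c = 0.19, leaving d = 0.05), and the fixed seed are assumptions chosen for illustration. Each edge is placed by repeatedly choosing one quadrant of the adjacency matrix, which fixes one bit of the source ID and one bit of the destination ID per level.

```python
import random


def rmat_edges(scale, num_edges, a=0.57, b=0.19, c=0.19, seed=0):
    """Generate directed edges over 2**scale vertices with R-MAT recursion.

    a, b, c are the probabilities of the top-left, top-right, and
    bottom-left quadrants; the bottom-right gets d = 1 - a - b - c.
    The default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    edges = []
    for _ in range(num_edges):
        src = dst = 0
        for _ in range(scale):
            # Each level appends one bit to both vertex IDs.
            src <<= 1
            dst <<= 1
            r = rng.random()
            if r < a:
                pass              # top-left: both bits 0
            elif r < a + b:
                dst |= 1          # top-right: destination bit 1
            elif r < a + b + c:
                src |= 1          # bottom-left: source bit 1
            else:
                src |= 1          # bottom-right: both bits 1
                dst |= 1
        edges.append((src, dst))
    return edges


# 2**10 = 1024 vertices and 5000 edges (duplicates and self-loops may occur).
edges = rmat_edges(scale=10, num_edges=5000)
```

This sketch keeps duplicate edges and self-loops; a practical generator would typically remove or re-draw them and may add noise to the quadrant probabilities at each level.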

RDF dataset

SP2Bench

The SP2Bench benchmark uses the DBLP library as its simulation basis. It can generate arbitrarily large RDF test data that mirrors vital real-world distributions found in the original DBLP data. The benchmark provides both the graph generator and the query generator (download).

FedBench

The FedBench benchmark is a comprehensive benchmark for testing cross-domain query processing on semantic data. Both the datasets and the queries can be found on its homepage.