Thursday, August 27, 2015

Simple gremlin queries in Titan...

Titan is an open source graph database. It isn't as easy to set up as Neo4j, but you can still be up and running in just a few minutes. Here is an example of using Titan with HBase as the storage backend.

Setup

I'm using HBase 0.98.13. I downloaded HBase, untarred the files, changed to the HBase directory and ran "bin/start-hbase.sh".

I cloned Titan from the GitHub repository and built it using the Maven package command.

git clone https://github.com/thinkaurelius/titan.git
cd titan
mvn clean package

I started the Gremlin console using "bin/gremlin.sh".

I followed the Titan HBase instructions for initializing Titan for use with HBase from within the Gremlin console. You can also create a properties file that contains the same settings and load it from the Gremlin console.

Gremlin

conf = new BaseConfiguration();
conf.setProperty("storage.backend","hbase");
conf.setProperty("storage.hbase.table", "test");

g = TitanFactory.open(conf);
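
The properties-file alternative mentioned above might look like the following sketch. The file name conf/titan-hbase-test.properties is a hypothetical choice, not something shipped with Titan; the two keys are the same ones set on the BaseConfiguration above.

```groovy
// Hypothetical file: conf/titan-hbase-test.properties
// It would contain the same two settings, one per line:
//
//   storage.backend=hbase
//   storage.hbase.table=test
//
// From the Gremlin console, open the graph by passing the file path
// (relative to the directory the console was started from).
g = TitanFactory.open('conf/titan-hbase-test.properties')
```

Keeping the settings in a file saves retyping them in every console session.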

Now let's add some vertices.

alice = g.addVertexWithLabel('human')
alice.setProperty('name', 'alice')
alice.setProperty('age',25)
bob = g.addVertexWithLabel('human')
bob.setProperty('name', 'bob')
bob.setProperty('age',21)
clark = g.addVertexWithLabel('human')
clark.setProperty('name', 'clark')
clark.setProperty('age',93)
darwin = g.addVertexWithLabel('human')
darwin.setProperty('name', 'darwin')
darwin.setProperty('age',206)
ernie = g.addVertexWithLabel('android')
ernie.setProperty('name', 'ernie')


Let's list the vertices and their properties.

g.V().map()
==>{name=ernie}
==>{name=alice, age=25}
==>{name=darwin, age=206}
==>{name=clark, age=93}
==>{name=bob, age=21}

And now let's make these humans friends with each other.

alice.addEdge('friend', bob)
alice.addEdge('friend', darwin)
bob.addEdge('friend', alice)
bob.addEdge('friend', darwin)
clark.addEdge('friend', darwin)
darwin.addEdge('friend',alice)
darwin.addEdge('friend', bob)
darwin.addEdge('friend', clark)
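
A quick way to check the new edges is to walk them from a single vertex. This is a minimal sketch that assumes the same TinkerPop 2 style steps used elsewhere in this post; it looks alice up by name instead of relying on the console variable. With the edges added above it should list bob and darwin, though the order may differ.

```groovy
// Find the vertex named alice, follow her outgoing 'friend' edges,
// and print the name of the vertex at the other end of each edge.
g.V.has('name', 'alice').out('friend').name
```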



Now let's remove ernie from the graph.

g.V.has('name', 'ernie').remove()
==>null

Now we can see that ernie is gone:

g.V.has('name', 'ernie').map()

(no results displayed, just the Gremlin prompt)

Let's add ernie back, but this time he's a human.

ernie = g.addVertexWithLabel('human')
ernie.setProperty('name', 'ernie')



Let's try finding out who has friends.

g.V().outE('friend').outV().name
==>darwin
==>darwin
==>darwin
==>alice
==>alice
==>bob
==>bob
==>clark


Wait, what happened? We see one entry for every friend edge, which is exactly what our Gremlin query asked for, but it isn't the list of distinct names we wanted.
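
The repeated names are not wasted information: each repeat is one outgoing friend edge, so counting them gives the number of friends each person has. Here is a sketch, assuming the TinkerPop 2 style groupCount and cap steps are available in this console, as the other steps used in this post suggest.

```groovy
// Tally how many times each name appears, one count per outgoing 'friend' edge.
// cap() emits the map built by groupCount() instead of the individual names.
g.V().outE('friend').outV().name.groupCount().cap()
```

With the edges added above, the resulting map should hold darwin=3, alice=2, bob=2 and clark=1. For a plain list of names, though, the duplicates still need to go.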

Let's try the dedup method.

g.V().outE('friend').outV().dedup().name
==>darwin
==>alice
==>bob
==>clark


Ahh! That's more like it! But how else can we get that list?

g.V.filter{it.outE('friend').hasNext()}.toList()._().name
==>darwin
==>alice
==>bob
==>clark


Nice! We have two ways to get a distinct list.
