4. Learning Neo4j

Graphs ahead

Now we needed to figure out how to store our chosen domain model in the chosen database. First we read up about graph databases, in particular our chosen one, Neo4j. The Neo4j data model consists of nodes and relationships, both of which can have key/value-style properties. What does that mean, exactly? Nodes are the graph database name for records, with property keys instead of column names. That's normal enough. Relationships are the special part. In Neo4j, relationships are first-class citizens, meaning they are more than a simple foreign-key reference to another record, relationships carry information. So we can link together nodes into semantically rich networks. This really appealed to us. Then we found that we were also able to index nodes and relationships by {key, value} pairs. We also found that we could traverse relationships both imperatively using the core API, and declaratively using a query-like Traversal Description. Besides those programmatic traversals there was the powerful graph query language called Cypher and an interesting looking DSL named Gremlin. So lots of ways of working with the graph.

We also learned that Neo4j is fully transactional and therefore upholds ACID guarantees for our data. Durability is actually a good thing and we didn't have to scale to trillions of users and movies yet. This is unusual for NOSQL databases, but easier for us to get our head around than non-transactional eventual consistency. It also made us feel safe, though it also meant that we had to manage transactions. Something to keep in mind later.

We started out by doing some prototyping with the Neo4j core API to get a feeling for how it works. And also, to see what the domain might look like when it's saved in the graph database. After adding the Maven dependency for Neo4j, we were ready to go.

Example 4.1. Neo4j Maven dependency

<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j</artifactId>
    <version>1.8.1</version>
</dependency>


Example 4.2. Neo4j core API (transaction code omitted)

enum RelationshipTypes implements RelationshipType { ACTS_IN };

GraphDatabaseService gds = new EmbeddedGraphDatabase("/path/to/store");
Node forrest=gds.createNode();
forrest.setProperty("title","Forrest Gump");
forrest.setProperty("year",1994);
gds.index().forNodes("movies").add(forrest,"id",1);

Node tom=gds.createNode();
tom.setProperty("name","Tom Hanks");

Relationship role=tom.createRelationshipTo(forrest,ACTS_IN);
role.setProperty("role","Forrest");

Node movie=gds.index().forNodes("movies").get("id",1).getSingle();
assertEquals("Forrest Gump", movie.getProperty("title"));
for (Relationship role : movie.getRelationships(ACTS_IN,INCOMING)) {
    Node actor=role.getOtherNode(movie);
    assertEquals("Tom Hanks", actor.getProperty("name"));
    assertEquals("Forrest", role.getProperty("role"));
}