Chapter 18. Programming model for Spring Data Graph

This chapter covers the fundamentals of the programming model behind Spring Data Graph. It discusses the AspectJ features used and the annotations provided by Spring Data Graph and how to use them. Examples for this section are taken from the imdb project of Spring Data Graph examples.

18.1. Overview of the AspectJ support

Behind the scenes Spring Data Graph leverages AspectJ aspects to modify the behavior of simple POJO entities to be able to be backed by a graph store. Each entity is backed by a node that holds its properties and relationships to other entities. AspectJ is used to intercept field access and to reroute it to the backing state (either its properties or relationships). For relationship entities the fields are similarly mapped to properties. There are two specially annotated fields for the start and the end node of the relationship.

The aspect introduces some internal fields and some public methods to the entities: getPersistentState() for accessing the backing state, relateTo for creating relationships, and getRelationshipTo for retrieving relationship entities. It also introduces finder methods like find(Class<? extends NodeEntity>, TraversalDescription) and equals and hashCode delegation.

Spring Data Graph internally uses an abstraction called EntityState that the field access and instantiation advices of the aspect delegate to, keeping the aspect code very small and focused on the pointcuts and delegation code. The EntityState then uses a number of FieldAccessor factories to create a FieldAccessor instance per field that does the specific handling needed for the concrete field.
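The delegation idea described above can be illustrated with a minimal, library-free sketch. All names here (DelegationSketch, Accessor, BackingState) are hypothetical stand-ins and not the Spring Data Graph classes; the entity delegates by hand where the real aspect would intercept field access.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for illustration only; NOT the Spring Data Graph classes.
class DelegationSketch {

    /** Handles reads and writes for one field of one entity. */
    interface Accessor {
        Object get();
        void set(Object value);
    }

    /** Plays the role of the backing node: a bag of named properties. */
    static final class BackingState {
        private final Map<String, Object> properties = new HashMap<>();
        private final Map<String, Accessor> accessors = new HashMap<>();

        /** Creates one accessor per field name on first use and caches it. */
        Accessor accessorFor(String field) {
            return accessors.computeIfAbsent(field, name -> new Accessor() {
                @Override public Object get() { return properties.get(name); }
                @Override public void set(Object value) { properties.put(name, value); }
            });
        }
    }

    /** Entity whose getter and setter delegate to the backing state. */
    static final class Movie {
        final BackingState state = new BackingState();

        String getTitle() { return (String) state.accessorFor("title").get(); }
        void setTitle(String title) { state.accessorFor("title").set(title); }
    }
}
```

The entity itself holds no title field in this sketch; every read and write goes through the per-field accessor, which is the point of the delegation.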

18.2. Using annotations to define POJO entities and relationships

Entities are declared using the @NodeEntity annotation. Relationship entities use the @RelationshipEntity annotation.

18.2.1. @NodeEntity: The basic building block

The @NodeEntity annotation is used to declare a POJO entity to be backed by a node in the graph store. Simple fields on the entity are mapped by default to properties of the node. Object references to other NodeEntities (whether single or Collection) are mapped via relationships. If the annotation parameter useShortNames is set to false, the properties and relationship names used will be prepended with the class name of the entity. If the parameter fullIndex is set to true, all fields of the entity will be indexed. If the partial parameter is set to true, this entity takes part in a cross-store setting where only the parts of the entity not handled by JPA will be mapped to the graph store.

Entity fields can be annotated with @GraphProperty, @RelatedTo, @RelatedToVia, @Indexed and @GraphId.

@NodeEntity
public class Movie {
	String title;
}

18.2.2. @RelatedTo: Connecting NodeEntities

Relationships to other NodeEntities are mapped to graph relationships. Those can either be single relationships (1:1) or multiple relationships (1:N). In most cases single relationships to other node entities don't have to be annotated as Spring Data Graph can extract all necessary information from the field using reflection. In the case of multiple relationships, the elementClass parameter of @RelatedTo must be specified because of type erasure. The direction (default OUTGOING) and type (inferred from field name) parameters of the annotation are optional.

Relationships to single node entities are created when setting the field and deleted when setting it to null. For multi-relationships the field provides a managed collection (Set) that handles addition and removal of node entities and reflects those in the graph relationships.

@NodeEntity
public class Movie {
	private Actor topActor;
}
@NodeEntity
public class Person {
	@RelatedTo(type = "topActor", direction = Direction.INCOMING)
	private Movie wasTopActorIn;
}
@NodeEntity
public class Actor {
	@RelatedTo(type = "ACTS_IN", elementClass = Movie.class)
	private Set<Movie> movies;
}
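To make the idea of a managed collection more concrete, here is a small, library-free sketch of a Set whose additions and removals are mirrored into a backing store. ManagedSetSketch, EdgeStore and ManagedSet are hypothetical names, and entities are reduced to plain strings; the real managed collection works against graph relationships.

```java
import java.util.AbstractSet;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative model only; NOT the Spring Data Graph implementation.
class ManagedSetSketch {

    /** Stand-in for the graph: stores "owner-[TYPE]->target" edges as strings. */
    static final class EdgeStore {
        final Set<String> edges = new HashSet<>();

        static String key(String from, String type, String to) {
            return from + "-[" + type + "]->" + to;
        }
    }

    /** A Set whose add and remove operations are mirrored into the edge store. */
    static final class ManagedSet extends AbstractSet<String> {
        private final Set<String> targets = new LinkedHashSet<>();
        private final EdgeStore store;
        private final String owner;
        private final String type;

        ManagedSet(EdgeStore store, String owner, String type) {
            this.store = store;
            this.owner = owner;
            this.type = type;
        }

        @Override public boolean add(String target) {
            boolean added = targets.add(target);
            if (added) store.edges.add(EdgeStore.key(owner, type, target));
            return added;
        }

        // AbstractSet routes remove(Object) through this iterator's remove().
        @Override public Iterator<String> iterator() {
            final Iterator<String> it = targets.iterator();
            return new Iterator<String>() {
                private String current;
                @Override public boolean hasNext() { return it.hasNext(); }
                @Override public String next() { current = it.next(); return current; }
                @Override public void remove() {
                    it.remove();
                    store.edges.remove(EdgeStore.key(owner, type, current));
                }
            };
        }

        @Override public int size() { return targets.size(); }
    }
}
```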

18.2.3. @RelationshipEntity: Rich relationships

To access the full data model of graph relationships, POJOs can also be annotated with @RelationshipEntity. Relationship entities can't be instantiated directly but are rather accessed via node entities, either by @RelatedToVia fields or by the relateTo or getRelationshipTo methods. Relationship entities may contain fields that are mapped to properties and two special fields that are annotated with @StartNode and @EndNode which point to the start and end node entities respectively. These fields are treated as read only fields.

@RelationshipEntity
public class Role {
	@StartNode
	private Actor actor;
	@EndNode
	private Movie movie;
}

18.2.4. @RelatedToVia: Connecting NodeEntities via RelationshipEntities

To provide easy programmatic access to the richer relationship entities of the data model a different annotation @RelatedToVia can be declared on fields of Iterables of the relationship entity type. These Iterables then provide read only access to instances of the entity that backs the relationship of this relationship type. Those instances are initialized with the properties of the relationship and the start and end node.

@NodeEntity
public class Actor {
	@RelatedToVia(type = "ACTS_IN", elementClass = Role.class)
	private Iterable<Role> roles;
}

18.2.5. @StartNode: Starting NodeEntity of RelationshipEntity

Annotation for the start node of a relationship entity, read only.

18.2.6. @EndNode: Ending NodeEntity of RelationshipEntity

Annotation for the end node of a relationship entity, read only.

18.2.7. @Indexed: Making entities searchable by field value

The @Indexed annotation can be declared on fields that are intended to be indexed by the Neo4j IndexManager, triggered by value modification. The resulting index can be used to later retrieve nodes or relationships that contain a certain property value (for example a name). Often an index is used to establish the start node for a traversal. Indexes are accessed by a Finder for a particular NodeEntity or RelationshipEntity, created via a FinderFactory.

GraphDatabaseContext exposes the indexes for Nodes and Relationships. Indexes can be named, for instance to keep separate domain concepts in separate indexes. That's why it is possible to specify an index name with the @Indexed annotation. The name can also be specified at the entity level; it is then the default index name for all fields of the entity. If no index name is specified, it defaults to the one configured with Neo4j ("node" and "relationship").

18.2.8. @GraphTraversal

The @GraphTraversal annotation leverages the delegation infrastructure used by the Spring Data Graph aspects. It provides dynamic fields which, when accessed, return an Iterable of NodeEntities that are the result of a traversal starting at the current NodeEntity. The TraversalDescription used for this is created by a TraversalDescriptionBuilder whose class is referred to by the traversalBuilder attribute of the annotation. The class of the expected NodeEntities is provided with the elementClass attribute.

18.2.9. @GraphProperty: Cross-store persisted fields

It is not necessary to annotate fields as they are persisted by default; all fields that contain primitive values are persisted directly to the graph. All fields convertible to String using the Spring conversion services will be stored as a string. Transient fields are not persisted. This annotation is mainly used for cross-store persistence.

18.3. Indexing

The Neo4j graph database can use different index providers for exact lookups and fulltext searches. Lucene is used as an index provider implementation. There is support for distinct indexes for nodes and relationships, which can be configured to be of fulltext or exact types.

Using the standard Neo4j API, Nodes and Relationships and their indexed field-value combinations have to be added manually to the appropriate index. When using Spring Data Graph, this task is simplified by applying an @Indexed annotation on entity fields. This will result in updates to the index on every change. Numerical fields are indexed numerically so that they are available for range queries. All other fields are indexed with their string representation. The @Indexed annotation can also set the index name to be used. If @Indexed annotates the entity class, the index name for the whole entity is preset to that value. Not providing index names defaults them to "node" and "relationship" respectively.

Query access to the index happens with the Node- and RelationshipFinders that are created via an instance of org.springframework.data.graph.neo4j.finder.FinderFactory. The methods findByPropertyValue and findAllByPropertyValue work on the exact indexes and return the first or all matches. To do range queries, use findAllByRange (please note that currently both values are inclusive).

@NodeEntity
class Person {
    @Indexed(indexName = "people")
    String name;

    // automatically indexed numerically
    @Indexed
    int age;

}

@NodeEntity
@Indexed(indexName="groups")
class Group {
    @Indexed
    String name;

    @RelatedTo(elementClass = Person.class, type = "people" )
    Set<Person> people;
}

NodeFinder<Person> finder = finderFactory.createNodeEntityFinder(Person.class);

// exact finder
Person mark = finder.findByPropertyValue("people","name","mark");

// numeric range queries
for (Person middleAgedDeveloper : finder.findAllByRange(null, "age", 20, 40)) {
    Developer developer=middleAgedDeveloper.projectTo(Developer.class);
}
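The inclusive semantics of range queries can be illustrated without the library. The following sketch uses a sorted map as a stand-in for a numeric index; RangeIndexSketch and its methods are hypothetical names, and the real lookup is served by the Lucene-backed index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative model of an inclusive numeric range lookup; NOT the library's index.
class RangeIndexSketch {

    // indexed value -> names of the entities holding that value, sorted by value
    private final NavigableMap<Integer, List<String>> byAge = new TreeMap<>();

    /** Adds an entity name under its numeric value. */
    void index(String name, int age) {
        byAge.computeIfAbsent(age, k -> new ArrayList<>()).add(name);
    }

    /** Returns all names with from <= age <= to; both bounds are inclusive. */
    List<String> findAllByRange(int from, int to) {
        List<String> result = new ArrayList<>();
        // subMap with both flags set to true includes the boundary keys
        for (List<String> names : byAge.subMap(from, true, to, true).values()) {
            result.addAll(names);
        }
        return result;
    }
}
```

With entries at ages 19, 20, 40 and 41, a query for the range 20 to 40 returns the entries at 20 and 40 but not those at 19 or 41.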

Neo4jTemplate also offers index support, providing auto-indexing for fields at creation time of nodes and relationships. There is an autoIndex method that can also add indexes for a set of fields in one go.

For querying the index, the template offers query-methods that take either the exact match parameters or a query object / query expression and push the results wrapped uniformly as Paths to the supplied PathMapper to be converted or collected.

18.4. Finding nodes with finders

Spring Data Graph also comes with a type bound Repository-like Finder implementation that provides methods for locating nodes and relationships:

  • using direct access by id (findById),

  • iterating over all nodes of a node entity type (findAll),

  • counting the instances of a node entity type (count),

  • iterating over all indexed instances with a certain property value (findAllByPropertyValue),

  • getting a single instance with a certain property value (findByPropertyValue),

  • iterating over all indexed instances within a certain numerical range (inclusive) (findAllByRange),

  • iterating over a traversal result (findAllByTraversal).

The Finder instances are created via a FinderFactory to be bound to a concrete node or relationship entity class. The FinderFactory is created in the Spring context and can be injected.

NodeFinder<Person> finder = finderFactory.createNodeEntityFinder(Person.class);
Person dave=finder.findById(123);
int people = finder.count();
Person mark = finder.findByPropertyValue("name", "mark");
Iterable<Person> devs = finder.findAllByPropertyValue("occupation","developer");
Iterable<Person> davesFriends = finder.findAllByTraversal(dave,
    Traversal.description().pruneAfterDepth(1)
    .relationships(KNOWS).filter(returnAllButStartNode()));

18.5. Transactions in Spring Data Graph

Neo4j is a transactional datastore which only allows modifications within transaction boundaries and fulfills the ACID properties. Reading from the store is also possible outside of transactions.

Spring Data Graph integrates with transaction managers configured using Spring. The simplest scenario of just running the graph database uses a SpringTransactionManager provided by the Neo4j kernel to be used with Spring's JtaTransactionManager. Note: The explicit XML configuration given below is encoded in the Neo4jConfiguration configuration bean that uses Spring's @Configuration functionality. This simplifies the configuration. An example is shown further below.

<bean id="transactionManager" class="org.springframework.transaction.jta.JtaTransactionManager">
<property name="transactionManager">
    <bean class="org.neo4j.kernel.impl.transaction.SpringTransactionManager">
        <constructor-arg ref="graphDatabaseService"/>
    </bean>
</property>
<property name="userTransaction">
    <bean class="org.neo4j.kernel.impl.transaction.UserTransactionImpl">
        <constructor-arg ref="graphDatabaseService"/>
    </bean>
</property>
</bean>

<tx:annotation-driven mode="aspectj" transaction-manager="transactionManager"/>

For scenarios running multiple transactional resources there are two options. First, you can have Neo4j participate in an externally configured transaction manager using the new SpringProvider, by enabling the corresponding configuration parameter for your graph database, either via the Spring config or via the configuration file (neo4j.properties).

<context:annotation-config />
<context:spring-configured/>

<bean id="transactionManager" class="org.springframework.transaction.jta.JtaTransactionManager">
<property name="transactionManager">
    <bean id="jotm" class="org.springframework.data.graph.neo4j.transaction.JotmFactoryBean"/>
</property>
</bean>

<bean class="org.neo4j.kernel.EmbeddedGraphDatabase" destroy-method="shutdown">
<constructor-arg value="target/test-db"/>
<constructor-arg>
    <map>
        <entry key="tx_manager_impl" value="spring-jta"/>
    </map>
</constructor-arg>
</bean>

<tx:annotation-driven mode="aspectj" transaction-manager="transactionManager"/>
                

You can configure a stock XA transaction manager (e.g. Atomikos, JOTM, or an application server's transaction manager) to be used with Neo4j and the other resources. Alternatively, for a less safe but fast best-effort one-phase commit, use the ChainedTransactionManager that ships with Spring Data Graph. It takes a list of transaction managers as constructor parameters and handles them in the given order for transaction start and in the reverse order for commit or rollback.

<bean id="transactionManager"
        class="org.springframework.data.graph.neo4j.transaction.ChainedTransactionManager" >
    <constructor-arg>
        <list>
        <bean class="org.springframework.orm.jpa.JpaTransactionManager" id="jpaTransactionManager">
            <property name="entityManagerFactory" ref="entityManagerFactory"/>
        </bean>
        <bean
            class="org.springframework.transaction.jta.JtaTransactionManager">
            <property name="transactionManager">
                <bean class="org.neo4j.kernel.impl.transaction.SpringTransactionManager">
                    <constructor-arg ref="graphDatabaseService" />
                </bean>
            </property>
            <property name="userTransaction">
                <bean  class="org.neo4j.kernel.impl.transaction.UserTransactionImpl">
                    <constructor-arg ref="graphDatabaseService" />
                </bean>
            </property>
        </bean>
        </list>
    </constructor-arg>
</bean>
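The ordering described for the chained approach (start in list order, commit or roll back in reverse order) can be sketched in a few lines of plain Java. This is only a model of the ordering under simplified assumptions, not the ChainedTransactionManager implementation; ChainedTxSketch, Resource, Chain and logging are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of chained, best-effort transaction ordering.
class ChainedTxSketch {

    /** Minimal stand-in for a transactional resource. */
    interface Resource {
        void begin();
        void commit();
        void rollback();
    }

    static final class Chain {
        private final List<Resource> resources;

        Chain(List<Resource> resources) {
            this.resources = new ArrayList<>(resources);
        }

        /** Starts the resources in the order they were configured. */
        void begin() {
            for (Resource r : resources) r.begin();
        }

        /** Commits in reverse order: the last resource started commits first. */
        void commit() {
            for (int i = resources.size() - 1; i >= 0; i--) resources.get(i).commit();
        }

        /** Rolls back in reverse order as well. */
        void rollback() {
            for (int i = resources.size() - 1; i >= 0; i--) resources.get(i).rollback();
        }
    }

    /** Test helper: a resource that records each call in the given log. */
    static Resource logging(final String name, final List<String> log) {
        return new Resource() {
            @Override public void begin() { log.add(name + ".begin"); }
            @Override public void commit() { log.add(name + ".commit"); }
            @Override public void rollback() { log.add(name + ".rollback"); }
        };
    }
}
```

The sketch also shows why such a chain is only best effort: if a later commit in the loop fails, the resources that have already committed are not undone.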

18.6. Session handling - attached and detached entities

By default newly created node entities are in a detached state. When persist() is called on the entity it is attached to the graph store and its properties and relationships are persisted as well. Changing an attached entity inside a transaction will write through the changes to the datastore. Whenever an entity is changed outside of a transaction it will be considered detached. The changed data is stored in the entity itself and not written back to the datastore.

All entities that are returned by library functions are initially in an attached state. Changing them outside of a transaction detaches them. For writing the changes back it is necessary to persist() them again.

Persisting an entity not only persists that single entity but will traverse its existing and new relationships and persist the cluster of detached entities that it is part of. The borders of this cluster are formed by attached entities. The persist operation creates its own, implicit transaction. When it is called within an external transaction it participates in that transaction; otherwise it is an atomic operation.

Please keep in mind that the session handling behaviour is still under heavy development. The defaults and also other aspects of the behaviour are likely to change in subsequent releases. At the moment there is no support for the creation of relationships outside of transactions, and more complex operations like creating whole subgraphs outside of transactions are not supported either.

@NodeEntity
class Person {
    String name;
}
Person p = new Person().persist();
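The cluster behaviour of persist can be pictured as a traversal that stops at attached entities. The following sketch is a simplified model of that description, not the algorithm the library uses; PersistClusterSketch and Entity are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative model of "persist the cluster of detached entities".
class PersistClusterSketch {

    static final class Entity {
        final String name;
        boolean attached;
        final List<Entity> related = new ArrayList<>();

        Entity(String name, boolean attached) {
            this.name = name;
            this.attached = attached;
        }
    }

    /**
     * Attaches the start entity and every detached entity reachable from it
     * through detached entities only. Returns the names in attachment order.
     */
    static List<String> persist(Entity start) {
        List<String> persisted = new ArrayList<>();
        Deque<Entity> todo = new ArrayDeque<>();
        todo.push(start);
        while (!todo.isEmpty()) {
            Entity e = todo.pop();
            if (e.attached) continue; // attached entities form the border; also stops revisits
            e.attached = true;
            persisted.add(e.name);
            for (Entity r : e.related) todo.push(r);
        }
        return persisted;
    }
}
```

In a chain a, b, c, d where only c is already attached, persisting a attaches a and b; d stays detached because it lies behind the attached border entity c.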

18.7. Reified types for entities

There are several ways to represent the Java type hierarchy of the data model in the graph. In general for all node and relationship entities type information is needed to perform certain repository operations. Some of this type information is saved in the graph database.

Implementations of NodeTypeStrategy take care of persisting this information on entity instance creation. They also provide the repository methods that use this type information to perform their operations like findAll, count, etc.

There are three available implementations to choose from.

  • IndexingNodeTypeStrategy

    Stores entity types in the integrated index. Each entity node gets indexed with its type and any supertypes that are also @NodeEntity-annotated. The special index used for this is called __types__. Additionally, in order to get the type of an entity node, each node has a property __type__ with the type of that entity.

  • SubReferenceNodeTypeStrategy

    Stores entity types in a tree in the graph representing the type hierarchy. Each entity has an INSTANCE_OF relationship to a type node representing that entity's type. The type may or may not have a SUBCLASS_OF relationship to another type node.

  • NoopNodeTypeStrategy

    Does not store any type information, and hence does not support finding by type, counting by type, or retrieving the type of any entity.

The default implementation is IndexingNodeTypeStrategy for new graphs. If using an existing graph, Spring Data Graph will default to the strategy first used when the graph was created.
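As a rough illustration of the index-based approach, the sketch below records each new entity under its own type and under every annotated supertype, and keeps the concrete type as a separate property. It is a plain-Java model with hypothetical names (TypeIndexSketch, the Entity marker annotation standing in for @NodeEntity), not the IndexingNodeTypeStrategy source.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative model of storing type information in an index.
class TypeIndexSketch {

    /** Hypothetical marker standing in for @NodeEntity. */
    @Retention(RetentionPolicy.RUNTIME)
    @interface Entity {}

    @Entity static class Person {}
    @Entity static class Developer extends Person {}

    // plays the role of the types index: type name -> ids of its instances
    private final Map<String, List<Long>> typesIndex = new HashMap<>();
    // plays the role of the per-node type property: id -> concrete type name
    private final Map<Long, String> typeProperty = new HashMap<>();

    /** Records the concrete type and indexes the id under all annotated types. */
    void onCreate(long id, Class<?> type) {
        typeProperty.put(id, type.getName());
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            if (c.isAnnotationPresent(Entity.class)) {
                typesIndex.computeIfAbsent(c.getName(), k -> new ArrayList<>()).add(id);
            }
        }
    }

    /** Counts the instances of a type, including instances of subtypes. */
    long count(Class<?> type) {
        return typesIndex.getOrDefault(type.getName(), Collections.<Long>emptyList()).size();
    }

    /** Returns the concrete type name recorded for the given id. */
    String typeOf(long id) {
        return typeProperty.get(id);
    }
}
```

Indexing under supertypes is what lets a count or findAll on Person also cover Developer instances in this model.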

18.8. Methods added to entity classes

The node and relationship aspects introduce (via ITD, inter-type declaration) several methods to the entities that make common tasks easier. Unfortunately these methods are not generified yet, so the results have to be cast to the correct return type.

persisting the node entity initially and after changes made outside of a transaction; persist participates in an existing transaction or creates its own implicit transaction

nodeEntity.persist()

accessing node and relationship ids

nodeEntity.getNodeId() and relationshipEntity.getRelationshipId()

accessing the node or relationship backing the entity

entity.getPersistentState()

equals and hashCode are delegated to the underlying state

entity.equals() and entity.hashCode()

creating relationships to a target node entity and returning the relationship-entity instance

nodeEntity.relateTo(targetEntity, relationshipClass, relationshipType)

retrieving a single relationship-entity

nodeEntity.getRelationshipTo(targetEntity, relationshipClass, relationshipType)

creating relationships to a target node entity and returning the relationship

nodeEntity.relateTo(targetEntity, relationshipType)

retrieving a single relationship

nodeEntity.getRelationshipTo(targetEntity, relationshipType)

removing a single relationship

nodeEntity.removeRelationshipTo(targetEntity, relationshipType)

removing the node entity, its relationships and index entries

entity.remove()

projecting to a different target type

entity.projectTo(targetClass)

traversing, starting at the current node

nodeEntity.findAllByTraversal(targetType, traversalDescription)

18.9. Dynamic typing - Projection to unrelated, fitting types

As the underlying data model of a graph database doesn't imply and enforce strict type constraints like a relational model does, it offers much more flexibility on how to model your domain classes and which of those to use in different contexts.

For instance an order can be used in these contexts: customer, procurement, logistics, billing, fulfillment and many more. Each of those contexts requires its distinct set of attributes and operations. As Java doesn't support mixins, one would put the sum of all of those into the entity class, thereby making it very big, brittle and hard to understand. Being able to take a basic order and project it to a different order type (not related in the inheritance hierarchy, or even an interface) that is valid in the current context and only offers the attributes and methods needed there would be very beneficial.

Spring Data Graph offers initial support for projecting node and relationship entities to different target types. All instances of this projected entity share the same backing node or relationship, so data changes are reflected immediately.

This could for instance also be used to handle nodes of a traversal with a unified (simpler) type (e.g. for reporting or auditing) and only project them to a concrete, more functional target type when the business logic requires it.

// not related to Person at all
@NodeEntity
class Trainee {
    String name;
    @RelatedTo(elementClass=Training.class)
    Set<Training> trainings;
}

for (Person person : finder.findAllByPropertyValue("occupation","developer")) {
    Developer developer = person.projectTo(Developer.class);
    if (developer.isJavaDeveloper()) {
        trainInSpringData(developer.projectTo(Trainee.class));
    }
}
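That projections share one backing node, so that changes are visible through every projected type at once, can be shown with a small library-free model. ProjectionSketch, PersonView and TraineeView are hypothetical names, and a plain map stands in for the node.

```java
import java.util.Map;

// Illustrative model of two unrelated types projected onto the same backing state.
class ProjectionSketch {

    /** One view over the backing properties. */
    static final class PersonView {
        private final Map<String, Object> node;

        PersonView(Map<String, Object> node) { this.node = node; }

        String getName() { return (String) node.get("name"); }
        void setName(String name) { node.put("name", name); }

        /** "Projects" to an unrelated type by handing over the same map. */
        TraineeView asTrainee() { return new TraineeView(node); }
    }

    /** A second, unrelated view with its own context-specific operations. */
    static final class TraineeView {
        private final Map<String, Object> node;

        TraineeView(Map<String, Object> node) { this.node = node; }

        String getName() { return (String) node.get("name"); }

        int getCompletedTrainings() {
            return (Integer) node.getOrDefault("trainings", 0);
        }

        void completeTraining() {
            // increments the shared counter, starting at 1 when absent
            node.merge("trainings", 1, (a, b) -> (Integer) a + (Integer) b);
        }
    }
}
```

Because both views hold a reference to the same map, renaming the person through PersonView is immediately visible through TraineeView, and no copying or synchronisation step is needed.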

18.10. Neo4jTemplate

The Neo4jTemplate offers the convenient API of Spring templates for the Neo4j graph database. There are methods for creating nodes and relationships that automatically set provided properties and optionally index certain fields. Other methods (index, autoIndex) will index them.

For the querying operations Neo4jTemplate unifies the result with the Path abstraction that comes from Neo4j. Much like a result set, a path contains nodes() and relationships(), starting at a startNode() and ending with an endNode(); the lastRelationship() is also available separately. The Path abstraction also wraps results that contain just nodes or relationships. Using implementations of PathMapper<T> and PathMapper.WithoutResult (comparable with RowMapper and RowCallbackHandler) the paths can be converted to Java objects.

Query methods either take a field / value combination to look for exact matches in the index or a Lucene query object or string to handle more complex queries.

Traversal methods are the bread and butter of graph operations. As such, they are fully supported in the Neo4jTemplate. The traverseNext method traverses to the direct neighbours of the start node filtering the relationships according to its parameters.

The traverse method covers the full fledged traversal operation that takes a powerful TraversalDescription (most probably built from the Traversal.description() DSL) and runs it from the start node. Each path that is returned via the traversal is passed to the PathMapper to be processed accordingly.

The Neo4jTemplate provides configurable implicit transactions for all its methods. By default it creates a transaction for each call (which is a no-op if there is already a transaction running). If you call the constructor with the useExplicitTransactions parameter set to true, it won't create any transactions so you have to provide them using @Transactional or the TransactionTemplate.

Neo4jOperations neo = new Neo4jTemplate(graphDatabase);
Node michael = neo.createNode(_("name","Michael"),"name");
Node mark = neo.createNode(_("name","Mark"));
Node thomas = neo.createNode(_("name","Thomas"));
neo.createRelationship(mark,thomas, WORKS_WITH, _("project","spring-data"));
neo.index("devs",thomas, "name","Thomas");
neo.autoIndex("devs",mark, "name");
assert "Mark".equals(neo.query("devs","name","Mark",new NodeNamePathMapper()));

18.11. Bean Validation - JSR-303

Spring Data Graph supports property-based validation. Whenever a property is changed, it is checked against the annotated constraints (e.g. @Min, @Max, @Size). Validation errors throw a ValidationException. The constraints are evaluated using the validation support that comes with Spring. To use it, a validator has to be registered with the GraphDatabaseContext; if there is none, no validation is performed. Any registered Validator or (Local)ValidatorFactoryBean will be used.

@NodeEntity
class Person {
    @Size(min = 3, max = 20)
    String name;

    @Min(0)
    @Max(100)
    int age;
}
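The two behaviours described above, checking on every property change and skipping the check when no validator is registered, can be modelled in plain Java. This sketch does not use JSR-303 or Spring; ValidationSketch is a hypothetical name, an IntPredicate stands in for the registered validator, and IllegalArgumentException stands in for ValidationException.

```java
import java.util.function.IntPredicate;

// Illustrative model of validate-on-change; NOT the JSR-303 integration itself.
class ValidationSketch {

    static final class Person {
        // null models "no validator registered": changes are then not checked
        private final IntPredicate ageConstraint;
        private int age;

        Person(IntPredicate ageConstraint) {
            this.ageConstraint = ageConstraint;
        }

        /** Checks the new value before it is stored, so invalid state never lands. */
        void setAge(int age) {
            if (ageConstraint != null && !ageConstraint.test(age)) {
                throw new IllegalArgumentException("age violates constraint: " + age);
            }
            this.age = age;
        }

        int getAge() { return age; }
    }
}
```

A Person created with a 0 to 100 constraint rejects setAge(150) and keeps its previous value, while a Person created without a constraint accepts it.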