License: Apache License 2.0


Scala wrapper for Apache TinkerPop 3 Graph DSL


Gremlin-Scala for Apache Tinkerpop 3

A wrapper to use Apache Tinkerpop3 - a JVM graph traversal library - from Scala.

  • Beautiful DSL to create vertices and edges
  • Type safe traversals
  • Scala friendly function signatures
  • Minimal runtime overhead - only allocates additional instances if absolutely necessary
  • Nothing is hidden away: you can always easily access the underlying Gremlin-Java objects if needed, e.g. to access graph-db-specific things like indexes


Getting started

The examples project comes with working examples for different graph databases. Typically you just need to add a dependency on

"com.michaelpollmeier" %% "gremlin-scala" % "SOME_VERSION"
and one for the graph db of your choice to your sbt build definition (this readme assumes tinkergraph). The latest version is displayed at the top of this readme in the maven badge.

Using the sbt console

  • sbt gremlin-scala/Test/console

```scala
import gremlin.scala._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory
implicit val graph = TinkerFactory.createModern.asScala

val name = Key[String]("name")

val g = graph.traversal
g.V.hasLabel("person").value(name).toList
// List(marko, vadas, josh, peter)
```

Simple traversals

The examples below create traversals, which are lazy computations. To run a traversal, call a terminal step on it, e.g. toList or head.

import gremlin.scala._
import org.apache.tinkerpop.gremlin.process.traversal.{Order, P}
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory

implicit val graph = TinkerFactory.createModern.asScala
val g = graph.traversal

g.V //all vertices
g.E //all edges

g.V(1).outE("knows") //follow outgoing edges
g.V(1).out("knows") //follow outgoing edges to incoming vertex

val weight = Key[Double]("weight")
for { person

Warning: GremlinScala is not a monad, because the underlying Tinkerpop GraphTraversal is not. I.e. while GremlinScala offers map, flatMap etc. and you can use it in a for comprehension for syntactic sugar, it does not fulfil all monad laws.
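To make the "syntactic sugar" part concrete, here is a minimal sketch of how the Scala compiler rewrites a for comprehension into flatMap and map calls. It uses plain List values as a stand-in for traversals (an assumption made purely so the snippet runs without a graph), so it says nothing about which monad laws GremlinScala does or does not satisfy.

```scala
// List stands in for a traversal here, purely for illustration.
val people = List("marko", "josh")
val languages = List("java", "scala")

// The for comprehension ...
val sugared = for {
  person <- people
  language <- languages
} yield (person, language)

// ... is rewritten by the compiler into nested flatMap/map calls.
val desugared =
  people.flatMap(person => languages.map(language => (person, language)))
```

Any type that offers map and flatMap with suitable signatures can be used this way, whether or not it is a lawful monad.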

More working examples in TraversalSpec.

Vertices and edges with type safe properties

import gremlin.scala._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
import scala.language.postfixOps
implicit val graph = TinkerGraph.open.asScala

// Keys for properties which can later be used for type safe traversals
val Founded = Key[String]("founded")
val Distance = Key[Int]("distance")

// create labelled vertex
val paris = graph + "Paris"

// create vertex with typed properties
val london = graph + ("London", Founded -> "43 AD")

// create labelled edge
paris --- "OneWayRoad" --> london

// create edge with typed properties
paris --- ("Eurostar", Distance -> 495) --> london

// type safe access to properties
paris.out("Eurostar").value(Founded).head //43 AD
paris.outE("Eurostar").value(Distance).head //495
london.valueOption(Founded) //Some(43 AD)
london.valueOption(Distance) //None
paris.setProperty(Founded, "300 BC")

val Name = Key[String]("name")
val Age = Key[Int]("age")

val v1 = graph + ("person", Name -> "marko", Age -> 29) asScala

v1.keys // Set(Key("name"), Key("age"))
v1.valueMap // Map("name" -> "marko", "age" -> 29)
v1.valueMap("name", "age") // Map("name" -> "marko", "age" -> 29)

More working examples in SchemaSpec, ArrowSyntaxSpec and ElementSpec.

Compiler helps to eliminate invalid traversals

Gremlin-Scala aims to help you at compile time as much as possible. Take this simple example:

import gremlin.scala._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
implicit val graph = TinkerGraph.open.asScala
val g = graph.traversal
g.V.outE.inV  //compiles
g.V.outE.outE //does _not_ compile

In Gremlin-Groovy there's nothing stopping you from creating the second traversal - it will explode at runtime, as outgoing edges do not have outgoing edges. In Gremlin-Scala this simply doesn't compile.

Type safe traversals

You can label any step using as, and the compiler will infer the correct types for you in the select step using an HList (a type safe list, i.e. the compiler knows the types of the elements of the list). In Gremlin-Java and Gremlin-Groovy you get a Map[String, Any], so you have to cast to the type you think it will be, which is ugly and error prone. For example:
import gremlin.scala._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
def graph = TinkerFactory.createModern.asScala
val g = graph.traversal

// select all labelled steps
g.V(1).as("a").outE.as("b").select.toList
// returns a (Vertex, Edge) for each path

// select subset of labelled steps
val a = StepLabel[Vertex]()
val b = StepLabel[Edge]()
val c = StepLabel[Double]()

val traversal = g.V(1).as(a).outE("created").as(b).value("weight").as(c)

traversal.select((b, c)).head
// returns a (Edge, Double)

More working examples in SelectSpec. Kudos to shapeless and Scala's sophisticated type system that made this possible.

There is also a typesafe union step that supports heterogeneous queries:
val traversal =
// derived type: GremlinScala[(List[Edge], List[Vertex])] 
val (outEdges, outVertices) = traversal.head

A note on predicates

tl;dr: use gremlin.scala.P to create predicates of type P.

Many steps in Gremlin-Scala take a tinkerpop3 predicate of type P. Creating Ps that take collection types is dangerous though, because you need to ensure you're creating the correct P. For example, with tinkerpop's own P, P.within(Set("a", "b")) would be calling the wrong overload (which checks if the value IS the given set). In that instance you actually wanted to create P.within(Set("a", "b").asJava: java.util.Collection[String]). To avoid that confusion, it's best to just import gremlin.scala._ and create it as P.within(Set("a", "b")).
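As a rough sketch of the recommended form, the snippet below filters the TinkerPop "modern" sample graph by a set of names. It assumes that only gremlin.scala._ is imported, so that P resolves to gremlin.scala.P rather than to the TinkerPop class; the key name and the chosen names are illustrative.

```scala
import gremlin.scala._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory

implicit val graph = TinkerFactory.createModern.asScala
val g = graph.traversal
val Name = Key[String]("name")

// P is gremlin.scala.P here: the Scala Set is treated as a collection of
// candidate values, not as one single value to compare against.
val wanted = Set("marko", "josh")
val names = g.V.has(Name, P.within(wanted)).value(Name).toList.toSet
```

If tinkerpop's P is imported explicitly in the same file, it takes precedence over the wildcard import, and the overload problem described above comes back.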

Build a custom DSL on top of Gremlin-Scala

You can now build your own domain specific language, which is super helpful if you don't want to expose your users to the world of graphs and tinkerpop, but merely build an API for them. All you need to do is set up your ADT as case classes, define your DSL as Steps and create one implicit constructor (the only boilerplate code). The magic in gremlin.scala.dsl._ even allows you to write for comprehensions like this (DSL for tinkerpop testgraph):

case class Person  (name: String, age: Integer) extends DomainRoot
case class Software(name: String, lang: String) extends DomainRoot

val traversal = for { person

See the full setup and more tests in DslSpec.

Common and useful steps

// get a vertex by id

// get all vertices g.V.toList

// group all vertices by their label

// group vertices by a property
val age = Key[Int]("age")
g.V.has(age).group(By(age))

// order by property decreasing
val age = Key[Int]("age")
g.V.has(age).order(By(age, Order.decr))

More working examples in TraversalSpec.

Mapping vertices from/to case classes

You can save and load case classes as a vertex - implemented with a blackbox macro.

  • You can optionally specify the label of your class using @label
  • Option members will be automatically unwrapped, i.e. a Some[A] will be stored as the value of type A in the database, or null if it's None. If we didn't unwrap it, the database would have to understand Scala's Option type itself.
  • The same goes for value classes, i.e. a case class ShoeSize(value: Int) extends AnyVal will be stored as an Integer.
  • List members will be stored as multi-properties, i.e. Cardinality.list
  • Annotating members with @id and @underlying will instruct the marshaller to set the element id and/or the underlying element in the class. Note: you cannot specify the id when adding a vertex like this. Using @id only works when retrieving the vertex back from the graph, and it therefore must be an Option.
  • Your classes must be defined outside the scope where they are being used (e.g. in the code below the class Example cannot be inside object Main).

Warning: this may not work with your particular remote graph, depending on the implementation/configuration. That's because the graph may choose to only return referenced elements, which don't contain their properties.

// this does _not_ work in a REPL
import gremlin.scala._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

@label("my_custom_label")
case class Example(longValue: Long,
                   stringValue: Option[String],
                   @underlying vertex: Option[Vertex] = None)

object Main extends App {
  implicit val graph = TinkerGraph.open.asScala
  val example = Example(Long.MaxValue, Some("optional value"))
  val v = graph + example
  v.toCC[Example] // equal to example, but with vertex set

  // find all vertices with the label of the case class Example
  graph.V.hasLabel[Example]

  // modify the vertex like a case class
  v.updateAs[Example](_.copy(longValue = 0L))
}

You can also define your own marshaller, if the macro generated one doesn't quite cut it. For that and more examples check out the MarshallableSpec.

More advanced traversals

Here are some examples of more complex traversals from the examples repo. If you want to run them yourself, check out the tinkergraph examples in there.

What's the property distribution of all vertices?


What is Die Hard's average rating?

graph.V.has("movie", "name", "Die Hard")

Get the maximum number of movies a single user rated


What 80's action movies do 30-something programmers like? Group count the movies by their name and sort the group count map in decreasing order by value.

  .`match`(
    __.as("a").hasLabel("movie"),
    __.as("a").out("hasGenre").has("name", "Action"),
    __.as("a").has("year", P.between(1980, 1990)),
    __.as("a").inE("rated").as("b"),
    __.as("b").has("stars", 5),
    __.as("b").outV().as("c"),
    __.as("c").out("hasOccupation").has("name", "programmer"),
    __.as("c").has("age", P.between(30, 40))
  )
  .limit(Scope.local, 10)

What is the most liked movie in each decade?

  .groupBy { movie =>
    val year = movie.value2(Year)
    val decade = (year / 10)
    (decade * 10): Integer
  .map { moviesByDecade =>
    val highestRatedByDecade = moviesByDecade.mapValues { movies =>
        .sortBy { _.inE(Rated).value(Stars).mean().head }
        .reverse.head //get the movie with the highest mean rating

Serialise to and from files

Currently graphML, graphson and gryo/kryo are supported file formats. It is very easy to serialise and deserialise into those: see GraphSerialisationSpec. An easy way to visualise your graph is to export it into graphML and import it into gephi.

Help - it's open source!

If you would like to help, here's a list of things that need to be addressed:

  • add more graph databases and examples into the examples project
  • port over more TP3 steps - see TP3 testsuite and Gremlin-Scala StandardTests
  • ideas for more type safety in traversals
  • fill this readme and provide other documentation, or how-tos, e.g. a blog post or tutorial

Why such a long version number?

The first three numbers are the TP3 version number; only the last number is automatically incremented on every release of gremlin-scala.
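The scheme can be read mechanically. The helper below is a hypothetical sketch (splitVersion is not part of gremlin-scala, and "3.4.7.2" is only an illustrative input) that splits a four-part version into the TinkerPop version it tracks and gremlin-scala's own release counter.

```scala
// Hypothetical helper, not part of gremlin-scala.
// Splits "a.b.c.d" into the TinkerPop version "a.b.c" and the release counter d.
def splitVersion(version: String): (String, Int) = {
  val parts = version.split('.')
  require(parts.length == 4, s"expected four numeric components, got: $version")
  (parts.take(3).mkString("."), parts(3).toInt)
}

// Illustrative input only
val (tinkerpopVersion, release) = splitVersion("3.4.7.2")
```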


ScalaDays Berlin 2018

Further reading

For more information about Gremlin see the Gremlin docs and the Gremlin users mailinglist. Please note that while Gremlin-Scala is very close to the original Gremlin, there are differences to Gremlin-Groovy - don't be afraid, they hopefully all make sense to a Scala developer ;)

Random links:

  • Social Network using Titan Db: part 1 and part 2
  • Shortest path algorithm with Gremlin-Scala 3.0.0 (Michael Pollmeier)
  • Shortest path algorithm with Gremlin-Scala 2.4.1 (Stefan Bleibinhaus)

Random things worth knowing

  • org.apache.tinkerpop.gremlin.structure.Transaction is not thread-safe. It's implemented with Java's ThreadLocal class, so each thread sees its own transaction state.
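The practical consequence of ThreadLocal-based state can be seen with the JDK class alone. The sketch below does not use TinkerPop at all; txState is a hypothetical stand-in for per-thread transaction state, chosen only to show that a value set on one thread is invisible to another.

```scala
// Hypothetical stand-in for per-thread transaction state; not TinkerPop code.
val txState = new ThreadLocal[String] {
  override def initialValue(): String = "no transaction"
}

// Visible only on the current thread
txState.set("open")

var seenByOtherThread = ""
val other = new Thread(new Runnable {
  def run(): Unit = { seenByOtherThread = txState.get() }
})
other.start()
other.join()

println(txState.get())     // the current thread still sees the value it set
println(seenByOtherThread) // the other thread only sees the initial value
```

This is why work started in a transaction on one thread cannot simply be continued or committed from another thread.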


Releases happen automatically on commit, thanks to sbt-ci-release-early.

YourKit endorsement

YourKit supports open source projects with innovative and intelligent tools for monitoring and profiling Java and .NET applications. YourKit is the creator of YourKit Java Profiler, YourKit .NET Profiler, and YourKit YouMonitor.

Breaking changes

Marshallable now treats Sets as multi-properties, i.e. one property for each element in the set. This is similar to how List properties are handled and allows for a natural representation of vertex properties whose cardinality is set in case classes. This change breaks compatibility with Marshallable's previous behaviour, in which Sets were effectively treated as single cardinality properties, i.e. a single property whose value is the entire set.

The implementation for @label with non-literal values was dropped due to its bad performance. Please use String literals instead, e.g. @label("my_custom_label").

We now have a fully typed union step which supports heterogeneous queries. The old version is still available under a different name, since it may still be relevant in some situations where the union traversals are homogeneous.


The by modulator is now called By, e.g. order(By(age, Order.decr)). Background: case insensitive platforms like OSX (default) and Windows fail to compile object by and trait By, because they lead to two separate .class files whose names differ only in case. I chose this option because it conforms to Scala's naming best practices.

To fix problems with remote graphs and the arrow syntax (e.g. vertex1 --- "label" --> vertex2) there now needs to be an implicit ScalaGraph in scope. Background: the configuration for remote is unfortunately not stored in the Tinkerpop Graph instance, but in the TraversalSource. Since a vertex only holds a reference to the graph instance, this configuration must be passed somehow. ScalaGraph does contain the configuration, e.g. for remote connections, so we now pass it implicitly.

The type signature of GremlinScala changed: the former type parameter Labels is now a type member, which shortens the type if you don't care about Labels. The Labels were only used in a small percentage of steps, but had to be written out by users all the time even if they didn't care. Rewrite rules (old -> new), using Vertex as an example:

  • GremlinScala[Vertex, _] -> GremlinScala[Vertex] (existential type: most common, i.e. the user doesn't use or care about the Labels)
  • GremlinScala[Vertex, HNil] -> GremlinScala.Aux[Vertex, HNil] (equivalent: GremlinScala[Vertex] {type Labels = HNil})
  • GremlinScala[Vertex, Vertex :: HNil] -> GremlinScala.Aux[Vertex, Vertex :: HNil] (equivalent: GremlinScala[Vertex] {type Labels = Vertex :: HNil})

Notice: GremlinScala isn't a case class any more - it shouldn't have been in the first place.
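For readers unfamiliar with the type-member plus Aux pattern, here is a self-contained miniature of it. Step, size and relabel are hypothetical names invented for this sketch, and Unit stands in for an HList of labels; this is not the real GremlinScala class, only the general shape of the change.

```scala
// Hypothetical miniature of the pattern; this is not the real GremlinScala class.
trait Step[End] {
  type Labels // formerly a second type parameter
  def ends: List[End]
}

object Step {
  // Aux re-exposes the type member as a type parameter where it matters
  type Aux[End, L] = Step[End] { type Labels = L }

  def apply[End, L](values: List[End]): Aux[End, L] =
    new Step[End] {
      type Labels = L
      def ends: List[End] = values
    }
}

// Most signatures can ignore Labels entirely
def size[End](step: Step[End]): Int = step.ends.size

// Signatures that care about Labels spell them out via Aux
def relabel[End, L](step: Step.Aux[End, L]): Step.Aux[End, L] = step

val s = Step[String, Unit](List("a", "b"))
val n = size(relabel(s))
```

The benefit is the one described above: code that never looks at the labels only has to mention one type argument.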


The filter step changed its signature and now takes a traversal: filter(predicate: GremlinScala[End, _] => GremlinScala[_, _]). The old filter(predicate: End => Boolean) has been renamed, in case you still need it. This change might affect your for comprehensions.

The reasoning for the change is that it's discouraged to use lambdas. Instead we now create anonymous traversals, which can be optimised by the driver, sent over the wire as gremlin binary for remote execution etc.
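A traversal-based filter might look like the sketch below, run against the TinkerPop "modern" sample graph. The key name and the "created" edge label come from that sample graph; the exact type signature of filter depends on the gremlin-scala version in use, so treat this as an illustration rather than a reference.

```scala
import gremlin.scala._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory

implicit val graph = TinkerFactory.createModern.asScala
val g = graph.traversal
val Name = Key[String]("name")

// Keep only vertices that have at least one outgoing "created" edge.
// The argument builds an anonymous traversal; it does not inspect the
// element with arbitrary Scala code.
val creators = g.V.filter(_.out("created")).value(Name).toList.toSet
```

A vertex passes the filter when the inner traversal yields at least one result for it.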
