Querying documents and graphs in one database with AQL – easily

General Tags: , 1 Comment

Note: The following article was originally written as an answer in ArangoDB google group. It may help other people to understand the scope of ArangoDB and/or AQL, so we posted it here as well.

AQL, the query language, provides access to the data which is stored inside collections. The collections contain documents, identified by unique keys.


For the following examples, I’ll create a “users” collection with a few documents inside:

arangod> db._create("users");
[
    ArangoCollection 10021952,
    "users"   (type document,
    status loaded)
]
arangod> fred = db.users.save({
    "name":"Fred",
    "gender":"m",
    "age":19,
    "state":"FL"
});
{ "_id" : "10021952/11660352", "_rev" : 11660352 }
arangod> john = db.users.save({
    "name" : "John",
    "gender" : "m",
    "age" : 34 });
{ "_id" : "10021952/11791424", "_rev" : 11791424 }
arangod> carla = db.users.save({
    "name" : "carla",
    "gender" : "f",
    "age": 42,
    "state" : "CA" });
{ "_id" : "10021952/11856960", "_rev" : 11856960 }
arangod> ana = db.users.save({
    "name" : "ana",
    "gender" : "f",
    "age" : 21,
    "state" : "NJ" });
{ "_id" : "10021952/11922496", "_rev" : 11922496 }
arangod> stu = db.users.save({
    "name" : "stu",
    "gender" : "m",
    "age" : 45, "likes" : [ "music " ]
});
{ "_id" : "10021952/12184640", "_rev" : 12184640 }

You can now access the data in this collection via AQL queries like this:
Return all users the collection, with all attributes:

stmt = db._createStatement({ "query" : "FOR u IN users RETURN u" });
cursor = stmt.execute();
while (cursor.hasNext())
{
    print(cursor.next());
}

Return a specific user (“fred”) from the collection, identified by its unique id:

stmt = db._createStatement({ "query" : "FOR u IN users FILTER u._id ==  @id RETURN u" });
stmt.bind("id", fred._id);
cursor = stmt.execute();
while (cursor.hasNext())
{
   print(cursor.next());
}

Return a few documents from the collection, identified by some non-key attributes (returns users stu and john):

stmt = db._createStatement({ "query" : "FOR u IN users FILTER u.age > 19 && u.gender == 'm' RETURN u" });
cursor = stmt.execute();
while (cursor.hasNext()) {
    print(cursor.next());
}

Create some projection:

stmt = db._createStatement({
   "query" : "FOR u IN users SORT u.age RETURN
   { name : u.name,
      isAbove30 : u.age > 30 }"
});
cursor = stmt.execute();
while (cursor.hasNext())
{
    print(cursor.next());
}

As you can see, you can use AQL for various kinds of data accesses (all documents, by primary key, by non-key-attributes etc.). Using an AQL query for simple data access patterns works fine but is a bit of an overkill. There are also specialised methods for simple data access patterns. But this should only be relevant if you plan on issuing a lot of queries.

In addition to the simple access patterns, AQL also supports rather complex queries. You can do many things with it that you can do with SQL, though AQL is more specialised for non-relational data.

We haven’t covered graphs yet. Graphs in ArangoDB can be established by connecting documents via so-called “edge” documents. The connected documents are then called “vertices” as in graph theory. Node that vertices and edges are documents, too. That means they can have any attributes you like. You can query these documents either as part of a graph using AQL or dedicated other functionality, but also individually (not as part of a graph).

To illustrate this, we’ll use the documents in the “users” collection as the vertices, and create an extra collection named “knows” for the edges. We have used the “users” data already (and we still can) but we can now use the documents in the “users” collection as part of a graph. Here’s an example setup for the connections between the users.

arangod> db._createEdgeCollection("knows");
[ArangoCollection 13669578, "knows" (type edge, status loaded)]
 
arangod> db.knows.save(john, stu, { "what" : "john knows stu" });
{ "_id" : "13669578/15111370", "_rev" : 15111370 }
 
arangod> db.knows.save(carla, john, { "what" : "carla knows john" });
{ "_id" : "13669578/15176906", "_rev" : 15176906 }
 
arangod> db.knows.save(carla, ana, { "what" : "carla knows ana" });
{ "_id" : "13669578/15242442", "_rev" : 15242442 }
 
arangod> db.knows.save(ana, carla, { "what" : "ana knows carla" });
{ "_id" : "13669578/15307978", "_rev" : 15307978 }
 
arangod> db.knows.save(carla, fred, { "what" : "carla knows fred" });
{ "_id" : "13669578/15373514", "_rev" : 15373514 }
 
arangod> db.knows.save(fred, john, { "what" : "fred knows john" });
{ "_id" : "13669578/15439050", "_rev" : 15439050 }
 
arangod> db.knows.save(fred, stu, { "what" : "fred knows stu" });
{ "_id" : "13669578/15504586", "_rev" : 15504586 }
 
arangod> db.knows.save(stu, fred, { "what" : "stu knows fred" });
{ "_id" : "13669578/15570122", "_rev" : 15570122 }

We can now use this mini-graph to run some AQL queries on it:
Query for “who is known (directly) by Carla” returns:

    ["carla", "John"]
    ["carla", "ana"]
    ["carla", "Fred"]
stmt = db._createStatement({
  "query" : "FOR p IN PATHS(users, knows,'outbound')
   FILTER p.source._id == @id && LENGTH(p.edges) == 1
   RETURN p.vertices[*].name
"});
stmt.bind("id", carla._id);
cursor = stmt.execute();
while (cursor.hasNext()) {
        print(cursor.next());
}

Query connections from Carla to others, via other users known by Carla
returns:

["carla", "John", "stu"]
["carla", "John", "stu", "Fred"]
["carla", "Fred", "John"]
["carla", "Fred", "John", "stu"]
["carla", "Fred", "stu"]
stmt = db._createStatement({ 
    "query" : "FOR p IN PATHS(users, knows, 'outbound') 
    FILTER p.source._id == @id && LENGTH(p.edges) > 1 RETURN
        p.vertices[*].name" 
    });
stmt.bind("id", carla._id);
cursor = stmt.execute();
while (cursor.hasNext()) {
  print(cursor.next());
}

Query “Who knows Carla returns: ana”

stmt = db._createStatement({ "query" : "FOR p IN PATHS(users, knows,
        'inbound') FILTER p.source._id == @id && LENGTH(p.edges) == 1 RETURN
        p.vertices[1].name" });
stmt.bind("id", carla._id);
cursor = stmt.execute();
while (cursor.hasNext()) {
  print(cursor.next());
}

In ArangoDB 1.1, PATHS() is the only graph-related functionality available in AQL. There is more you can do with graphs in ArangoDB. We hope to release ArangoDB version 1.2 soon, and it will provide a lot more graph-related functionality in AQL.

About Jan Steemann

Jan is a member of ArangoDB's core development team. He is an expert in data modelling with nosql & relational databases and writing high performance web applications. For ArangoDB, he wrote much of AQL (ArangoDB's query language).
  • jsteemann

    Actually I didn’t write this as an article…
    Seems someone else posted a digest of what I wrote on the ArangoDB Google group here.