Getting Connected - MongoClient Interface

We start by looking at how to connect to a MongoDB server via the driver's API. The main entry point to the driver is the MongoClient interface. The MongoFactory class creates a MongoClient instance from either a MongoClientConfiguration or a MongoDB connection string.

MongoClient mongo = MongoFactory.createClient( "mongodb://server:port/db?maxConnectionCount=10" );

MongoClientConfiguration config = new MongoClientConfiguration();
config.addServer("server:port");
config.setMaxConnectionCount(10);

MongoClient mongo2 = MongoFactory.createClient( config );

The returned MongoClient instance represents the logical connection to the MongoDB servers. Call its close() method to cleanly shut down the connections to the servers.

Manipulating Documents - The BSON Library

The driver includes a strongly typed BSON implementation. The BSON package (and the driver in general) makes extensive use of builders for constructing BSON Documents.

A document is built by creating a builder and then adding the appropriate fields to the document in the desired order. The builder supports method call chaining for quick document construction.

DocumentBuilder builder = BuilderFactory.start();

builder.add("a", 123).add("b", 123L).add("c", "123");

Document        document = builder.build();

Most of the API methods of the driver accept a DocumentAssignable where a Document is needed. This allows passing a DocumentBuilder directly instead of manually calling the build() method.

To make iterating over the elements of a document easier, the Document interface extends the Iterable interface, so a document can be used in an enhanced for loop over its contained Elements.

Document document = ...;

for( Element element : document ) {
  // Process the element.
}

For accessing individual fields of the document there are two versions of the get(...) method. The Document.get(String) method takes the name of the element to retrieve and returns that element, if it exists in the document. The Document.get(Class, String) method takes the class of the element and its name and returns the element only if it exists and is assignable to that class.

//{
//  field : "abc",
//  intField : 123
//}
Document       document = BuilderFactory.start().add("field",    "abc")
                                                .add("intField", 123).build();

Element        field    = document.get("field");
IntegerElement intField = document.get(IntegerElement.class, "intField");

That works for looking at a single field of a document, but what about nested fields? Consider the document shown in the comments of the next code block. To collect every comment's "text" field we would have to pull out the comments array, iterate over its sub-documents, extract each text field, and save the results in a list. Since this is such a common pattern, the driver provides a helper method that does the iteration for you: Document.find(String...).

//{
//  comments : [
//     { text : "Now is the time...",       user : "alice" },
//     { text : "Well, it was the time.",   user : "bob"   },
//     { text : "Could be the time again.", user : "carol" },
//     { text : "Let me know when.",        user : "david" },
//     { text : "Now is the time...",       user : "edith" }
//  ]
//}
DocumentBuilder docBuilder = BuilderFactory.start();
ArrayBuilder    comments   = docBuilder.pushArray("comments");
comments.push().add("text", "Now is the time...").add("user", "alice");
comments.push().add("text", "Well, it was the time.").add("user", "bob");
comments.push().add("text", "Could be the time again.").add("user", "carol");
comments.push().add("text", "Let me know when.").add("user", "david");
comments.push().add("text", "Now is the time...").add("user", "edith");

Document            document = docBuilder.build();

List<Element>       elements = document.find("comments", ".*", "text");
List<StringElement> stringElements = document.find(StringElement.class, "comments", ".*", "text");

Both find(...) methods take a "query path" to follow to locate the elements of interest. The components of the query path may be regular expressions for greater flexibility. As stated in the Javadoc, both methods may return an empty list but never return null. The sister method findFirst(...) returns the first element matching the query path.
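
To make the query path idea concrete, the following stand-alone sketch uses only the JDK and plain Map and List values in place of the driver's Document and Element types. The QueryPathSketch class and its find and comment methods are hypothetical and are not part of the driver. The sketch assumes that each path component is a regular expression that must match a field name at that depth, and that array entries are addressed by their index ("0", "1", ...).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; not part of the driver's API.
class QueryPathSketch {

    /** Collects every value reached by following the regex path from the root. */
    static List<Object> find(Object root, String... path) {
        List<Object> results = new ArrayList<>();
        collect(root, path, 0, results);
        return results; // Possibly empty, never null.
    }

    private static void collect(Object node, String[] path, int depth, List<Object> results) {
        if (depth == path.length) {
            results.add(node); // The whole path matched.
            return;
        }
        Pattern pattern = Pattern.compile(path[depth]);
        if (node instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
                if (pattern.matcher(String.valueOf(entry.getKey())).matches()) {
                    collect(entry.getValue(), path, depth + 1, results);
                }
            }
        } else if (node instanceof List) {
            List<?> list = (List<?>) node;
            for (int i = 0; i < list.size(); ++i) {
                // Assumption: array entries are named by their index.
                if (pattern.matcher(Integer.toString(i)).matches()) {
                    collect(list.get(i), path, depth + 1, results);
                }
            }
        }
    }

    /** Builds one comment sub-document as an ordered map. */
    static Map<String, Object> comment(String text, String user) {
        Map<String, Object> comment = new LinkedHashMap<>();
        comment.put("text", text);
        comment.put("user", user);
        return comment;
    }

    public static void main(String[] args) {
        List<Object> comments = new ArrayList<>();
        comments.add(comment("Now is the time...", "alice"));
        comments.add(comment("Well, it was the time.", "bob"));

        Map<String, Object> document = new LinkedHashMap<>();
        document.put("comments", comments);

        // Every "text" value under any entry of "comments".
        System.out.println(find(document, "comments", ".*", "text"));
        // A path that matches nothing yields an empty list.
        System.out.println(find(document, "missing", "text"));
    }
}
```

The recursion advances one path component per level of nesting, which is why the ".*" component in the middle is needed to step through the array entries.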

So we can easily loop over all of the first level elements in the document and pick out specific fields. What if we need to process the entire tree of elements in the document? In that case you can implement the Visitor interface and receive callbacks as the document is traversed.

Document document = ...;
Visitor  visitor = ...;

document.visit(visitor);
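
As a rough illustration of the callback style, here is a stand-alone sketch of a visitor walking a small element tree. Everything in it (SketchVisitor, SketchElement, PathCollector and the factory methods) is a hypothetical, simplified stand-in written against the JDK only; the driver's own Visitor interface has its own set of callbacks and is not reproduced here. The sketch assumes that the visitor itself recurses into nested documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins; the driver's Visitor and Element types differ.
class VisitorSketch {

    /** One callback per kind of element in this sketch. */
    interface SketchVisitor {
        void visitString(String name, String value);
        void visitInteger(String name, int value);
        void visitDocument(String name, List<SketchElement> elements);
    }

    /** An element only knows how to report itself to a visitor. */
    interface SketchElement {
        void accept(SketchVisitor visitor);
    }

    static SketchElement string(String name, String value) {
        return visitor -> visitor.visitString(name, value);
    }

    static SketchElement integer(String name, int value) {
        return visitor -> visitor.visitInteger(name, value);
    }

    static SketchElement document(String name, SketchElement... elements) {
        List<SketchElement> children = Arrays.asList(elements);
        return visitor -> visitor.visitDocument(name, children);
    }

    /** Records the dotted path and value of every leaf element it is called back for. */
    static class PathCollector implements SketchVisitor {
        final List<String> paths = new ArrayList<>();
        private String prefix = "";

        @Override
        public void visitString(String name, String value) {
            paths.add(prefix + name + "=" + value);
        }

        @Override
        public void visitInteger(String name, int value) {
            paths.add(prefix + name + "=" + value);
        }

        @Override
        public void visitDocument(String name, List<SketchElement> elements) {
            String saved = prefix;
            prefix = prefix + name + ".";
            for (SketchElement element : elements) {
                element.accept(this); // Recurse so the whole tree is walked.
            }
            prefix = saved;
        }
    }

    public static void main(String[] args) {
        SketchElement root = document("doc",
                string("field", "abc"),
                document("nested", integer("intField", 123)));

        PathCollector collector = new PathCollector();
        root.accept(collector);
        System.out.println(collector.paths);
    }
}
```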

Talking to the Server - MongoDatabase and MongoCollection

Commands, queries, updates and deletes are sent to the server via the MongoDatabase and MongoCollection interfaces. They are created from a MongoClient instance.

MongoClient mongo = ...;

MongoDatabase db = mongo.getDatabase( "database_name" );
MongoCollection collection = db.getCollection( "collection_name" );

Most non-administrative methods on the MongoCollection interface have three forms:

  1. Synchronous - Normal method call semantics. The response to the request is the return value of the method.
    Document findOne(Document)
  2. Asynchronous Future - The response to the request can be retrieved from a Future.
    Future<Document> findOneAsync(Document)
  3. Asynchronous Callback - The response to the request will be provided via a callback.
    void findOneAsync(Callback<Document>, Document)
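
The three forms can be compared side by side in the stand-alone sketch below. It does not use the driver: CallFormsSketch is a hypothetical class that "answers" each request from a background task via the JDK's CompletableFuture, with a String standing in for a Document and a Consumer standing in for the driver's Callback. Only the calling conventions are meant to carry over.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical stand-in for a collection; replies come from a background task, not a server.
class CallFormsSketch {

    /** 2. Asynchronous future: returns immediately, the reply arrives later. */
    static CompletableFuture<String> findOneAsync(String query) {
        return CompletableFuture.supplyAsync(() -> "reply to " + query);
    }

    /** 3. Asynchronous callback: the reply is handed to the callback when it is ready. */
    static void findOneAsync(Consumer<String> callback, String query) {
        findOneAsync(query).thenAccept(callback);
    }

    /** 1. Synchronous: sends the request and blocks until the reply is available. */
    static String findOne(String query) {
        return findOneAsync(query).join();
    }

    public static void main(String[] args) throws Exception {
        // Synchronous: the reply is the return value.
        System.out.println(findOne("a"));

        // Future: other work can be done before calling get().
        Future<String> future = findOneAsync("b");
        System.out.println(future.get());

        // Callback: wait on a latch so the JVM does not exit before the reply is delivered.
        CountDownLatch done = new CountDownLatch(1);
        findOneAsync(reply -> {
            System.out.println(reply);
            done.countDown();
        }, "c");
        done.await();
    }
}
```

In this sketch the synchronous form is simply the future form followed by a blocking wait, which is one plausible way to layer the three variants.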

Use of the asynchronous variants is optional but provides the ability to send multiple requests to the server without blocking for each reply. In the example below we submit 10,000 delete requests to the server and then collect the result of each delete after sending the last request. We expect the first request to finish while we are still sending requests and to only need to wait for the last few replies in the second loop.

MongoCollection    collection = ...;

List<Future<Long>> deleteReplies = new ArrayList<Future<Long>>();
for( int i = 0; i < 10000; ++i ) {
  Future<Long> future = collection.deleteAsync(where("_id").equals(i));
  deleteReplies.add(future);
}
long totalCount = 0;
for( Future<Long> future : deleteReplies ) {
  totalCount += future.get().longValue();
}
System.out.println("Deleted " + totalCount + " documents.");

When querying for documents, a MongoIterator is returned. It is important to close the iterator if its documents are not exhausted. This can be done with a traditional try/finally block or with the try-with-resources statement introduced in Java 7.

MongoCollection collection = ...;
try ( MongoIterator<Document> iter = collection.find( where("_id").lessThan(42) ) )
{
   for( Document document : iter ) {
       // Process the document.
   }
}

Under the hood the iterator uses the power of the asynchronous driver to request more documents from the server while the application is iterating over the current set of documents. This can have a significant positive impact on performance. Even greater performance improvements can be achieved if the batch size for the iterator is tuned such that the time to receive each batch matches the time to process each batch.
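
The overlap between fetching and processing can be sketched without the driver. The PrefetchSketch class below is hypothetical and is not how MongoIterator is implemented; it assumes a batch source that returns an empty list once the results are exhausted, and it uses the JDK's CompletableFuture to request the next batch as soon as the current one is handed to the caller.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

// Hypothetical sketch of batch prefetching; not the driver's MongoIterator implementation.
class PrefetchSketch implements Iterator<Integer> {
    private final IntFunction<List<Integer>> fetchBatch; // Returns an empty list when exhausted.
    private Iterator<Integer> current = new ArrayList<Integer>().iterator();
    private CompletableFuture<List<Integer>> pending;
    private int nextBatchNumber = 0;

    PrefetchSketch(IntFunction<List<Integer>> fetchBatch) {
        this.fetchBatch = fetchBatch;
        requestNextBatch();
    }

    /** Starts fetching the next batch in the background. */
    private void requestNextBatch() {
        final int batchNumber = nextBatchNumber++;
        pending = CompletableFuture.supplyAsync(() -> fetchBatch.apply(batchNumber));
    }

    @Override
    public boolean hasNext() {
        while (!current.hasNext() && pending != null) {
            List<Integer> batch = pending.join(); // Waits only if the batch has not arrived yet.
            if (batch.isEmpty()) {
                pending = null; // No more batches.
            } else {
                current = batch.iterator();
                requestNextBatch(); // This fetch overlaps with processing of the current batch.
            }
        }
        return current.hasNext();
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }

    /** Simulated server: three batches of consecutive integers, then nothing. */
    static List<Integer> simulatedBatch(int batchNumber, int batchSize) {
        List<Integer> batch = new ArrayList<>();
        if (batchNumber < 3) {
            for (int i = 0; i < batchSize; ++i) {
                batch.add(batchNumber * batchSize + i);
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        Iterator<Integer> iter = new PrefetchSketch(n -> simulatedBatch(n, 4));
        int sum = 0;
        while (iter.hasNext()) {
            sum += iter.next();
        }
        System.out.println("Sum over the simulated documents: " + sum);
    }
}
```

When the time to fetch a batch roughly equals the time to process one, the join() call rarely has to wait, which is the tuning goal described above.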

Query and Command Support

For queries and common commands a set of domain objects and associated builders are provided. The query builder provides a natural mechanism for defining even the most complicated of queries.

 import static com.allanbank.mongodb.builder.QueryBuilder.and;
 import static com.allanbank.mongodb.builder.QueryBuilder.or;
 import static com.allanbank.mongodb.builder.QueryBuilder.not;
 import static com.allanbank.mongodb.builder.QueryBuilder.where;
 
 Document query = 
           or( 
              where("f").greaterThan(23).lessThan(42).and("g").lessThan(3),
              and( 
                where("f").greaterThanOrEqualTo(42),
                not( where("g").lessThan(3) ) 
              )
           );
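
To check how the pieces of the query combine, it can help to write the same boolean structure as ordinary Java. The sketch below is a hypothetical helper that uses no driver classes. It assumes documents in which both f and g are present and hold integers, so it does not capture how the server treats missing or non-numeric fields.

```java
// Hypothetical helper mirroring the boolean structure of the query above.
class QueryLogicSketch {

    /** Returns true when a document with integer fields f and g would satisfy the query. */
    static boolean matches(int f, int g) {
        // where("f").greaterThan(23).lessThan(42).and("g").lessThan(3)
        boolean firstBranch = (f > 23) && (f < 42) && (g < 3);
        // and( where("f").greaterThanOrEqualTo(42), not( where("g").lessThan(3) ) )
        boolean secondBranch = (f >= 42) && !(g < 3);
        // or( firstBranch, secondBranch )
        return firstBranch || secondBranch;
    }

    public static void main(String[] args) {
        System.out.println(matches(30, 1)); // f in (23, 42) and g < 3
        System.out.println(matches(50, 5)); // f >= 42 and g not < 3
        System.out.println(matches(50, 1)); // f >= 42 but g < 3
    }
}
```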

Aggregate Pipeline Support

A more complex example of the builder support is provided by the aggregate command. A set of helper methods and classes has been created to ease the effort required to construct complex pipelines of operators, including the structures and expressions they contain. The aggregate builder extends and integrates with the BSON DocumentBuilder, QueryBuilder and Expressions support. Consider the following pipeline, inspired by an example in the MongoDB documentation:

 import static com.allanbank.mongodb.builder.AggregationGroupField.set;
 import static com.allanbank.mongodb.builder.AggregationGroupId.id;
 import static com.allanbank.mongodb.builder.AggregationProjectFields.includeWithoutId;
 import static com.allanbank.mongodb.builder.QueryBuilder.where;
 import static com.allanbank.mongodb.builder.Sort.asc;
 import static com.allanbank.mongodb.builder.Sort.desc;
 import static com.allanbank.mongodb.builder.expression.Expressions.field;
 import static com.allanbank.mongodb.builder.expression.Expressions.set;
 
 DocumentBuilder b1 = BuilderFactory.start();
 DocumentBuilder b2 = BuilderFactory.start();
 Aggregate.Builder builder = new Aggregate.Builder();
 
 builder.match(where("state").notEqualTo("NZ"))
         .group(id().addField("state")
                    .addField("city"),
                set("pop").sum("pop"))
         .sort(asc("pop"))
         .group(id("_id.state"), 
                set("biggestcity").last("_id.city"),
                set("biggestpop").last("pop"),
                set("smallestcity").first("_id.city"),
                set("smallestpop").first("pop"))
         .project(
                 includeWithoutId(),
                 set("state", field("_id")),
                 set("biggestCity",
                         b1.add(set("name", field("biggestcity"))).add(
                                 set("pop", field("biggestpop")))),
                 set("smallestCity",
                         b2.add(set("name", field("smallestcity"))).add(
                                 set("pop", field("smallestpop")))))
         .sort(desc("biggestCity.pop"));