MongoDB: Full text search options

We are planning to store millions of documents in MongoDB, and full text search is a hard requirement. I have read that Elasticsearch and Solr are the best available solutions for full text search.

-- Is Elasticsearch mature enough to be used for MongoDB full text search? We will also be sharding the collections. Does Elasticsearch work with sharded collections?

-- What are the advantages and disadvantages of using Elasticsearch or Solr?

-- Is MongoDB itself capable of full text search?

asked May 16, 2015 in SOLR by rajesh

2 Answers

MongoDB has some search capabilities, but they are not as feature-rich as those of dedicated search engines.

We use MongoDB with Solr to make content searchable. We prefer Solr because:

  • It is easy to configure and customize.
  • It has a large community (this is really helpful if you are working with open-source tools).

Since we have not worked with ES, I cannot say much about it. You can find some discussions of Solr vs ES at the links below.

Solr vs ES 1
Solr vs ES 2
Solr vs ES 3
answered May 16, 2015 by rajesh

Model Data to Support Keyword Search


Keyword search is not the same as text search or full text search, and does not provide stemming or other text-processing features. See the Limitations of Keyword Indexes section for more information.

As of version 2.4, MongoDB provides a text search feature. See Text Indexes for more information.

If your application needs to query the content of a field that holds text, you can perform exact matches on the text or use $regex for regular expression pattern matches. However, for many operations on text, these methods do not satisfy application requirements.

This pattern describes one method for supporting keyword search in MongoDB: the keywords are stored in an array in the same document as the text field. Combined with a multi-key index, this pattern can support an application's keyword search operations.
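To make the shortcoming concrete, here is a minimal sketch in plain JavaScript, run outside MongoDB. The sample documents and their summary field are invented for illustration; the three filters only imitate what an exact-match query and a $regex query would select.

```javascript
// Hypothetical sample data; the summary field and its text are assumptions.
const sampleVolumes = [
  { title: "Moby-Dick", summary: "A whaling voyage turns into a hunt for revenge." },
  { title: "Two Years Before the Mast", summary: "A sailor recounts his voyages around Cape Horn." }
];

// Exact match: succeeds only when the whole field equals the search string.
const exact = sampleVolumes.filter(v => v.summary === "voyage");

// Regular expression: finds the substring anywhere, but has no notion of words.
const byRegex = sampleVolumes.filter(v => /voyage/i.test(v.summary));

// Searching for the plural misses the singular form: there is no stemming.
const byPlural = sampleVolumes.filter(v => /voyages/i.test(v.summary));

console.log(exact.length, byRegex.length, byPlural.length); // prints: 0 2 1
```

The exact match finds nothing, the pattern match finds both volumes only because "voyage" happens to be a prefix of "voyages", and the plural pattern misses Moby-Dick entirely.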


To add structures to your document to support keyword-based queries, create an array field in your documents and add the keywords as strings in the array. You can then create a multi-key index on the array and create queries that select values from the array.


Suppose you have a collection of library volumes for which you want to provide topic-based search. For each volume, you add the array topics and put in it as many keywords as that volume needs.

For the Moby-Dick volume you might have the following document:

{ title : "Moby-Dick" ,
  author : "Herman Melville" ,
  published : 1851 ,
  // stored as a string so the leading zero is preserved
  ISBN : "0451526996" ,
  topics : [ "whaling" , "allegory" , "revenge" , "American" ,
    "novel" , "nautical" , "voyage" , "Cape Cod" ]
}

You then create a multi-key index on the topics array:

db.volumes.createIndex( { topics: 1 } )

The multi-key index creates a separate index entry for each keyword in the topics array. For example, the index contains one entry for whaling and another for allegory.
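The idea of one index entry per array element can be sketched in memory with plain JavaScript. This is only an illustration of the concept, not how MongoDB implements its indexes: buildKeywordIndex and lookup are hypothetical helpers, and the second volume is made-up sample data.

```javascript
// Build a keyword -> document positions map, one entry per distinct array element.
// Hypothetical helper for illustration; not a MongoDB API.
function buildKeywordIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, position) => {
    // A Set ensures a keyword repeated within one document is indexed once.
    for (const keyword of new Set(doc[field] || [])) {
      if (!index.has(keyword)) index.set(keyword, []);
      index.get(keyword).push(position);
    }
  });
  return index;
}

// Return the documents whose array contains the keyword (exact string match).
function lookup(index, docs, keyword) {
  return (index.get(keyword) || []).map(position => docs[position]);
}

const library = [
  { title: "Moby-Dick",
    topics: ["whaling", "allegory", "revenge", "American",
             "novel", "nautical", "voyage", "Cape Cod"] },
  // Assumed second volume, added only so the lookup returns more than one hit.
  { title: "Treasure Island", topics: ["pirates", "voyage", "novel"] }
];

const topicIndex = buildKeywordIndex(library, "topics");
const voyageTitles = lookup(topicIndex, library, "voyage").map(doc => doc.title);

console.log(topicIndex.size);  // 9 distinct keywords
console.log(voyageTitles);     // [ 'Moby-Dick', 'Treasure Island' ]
```

Because each keyword is its own entry, a lookup goes straight to the matching documents, but only for keywords that match exactly: a lookup for "whale" returns nothing even though "whaling" is indexed.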

You then query based on the keywords. For example:

db.volumes.findOne( { topics : "voyage" }, { title: 1 } )


An array with a large number of elements, such as one with several hundred or several thousand keywords, will incur greater indexing costs on insertion.

Limitations of Keyword Indexes

MongoDB can support keyword searches using specific data models and multi-key indexes; however, these keyword indexes are not a substitute for, or comparable to, full-text products in the following respects:

  • Stemming. Keyword queries in MongoDB cannot parse keywords for root or related words.
  • Synonyms. Keyword-based search features must provide support for synonym or related queries in the application layer.
  • Ranking. The keyword lookups described in this document do not provide a way to weight results.
  • Asynchronous Indexing. MongoDB builds indexes synchronously, which means that the indexes used for keyword search are always current and can operate in real time. However, asynchronous bulk indexes may be more efficient for some kinds of content and workloads.
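
As a rough illustration of handling synonyms in the application layer, the sketch below expands a keyword into a list of terms and builds a filter document using the $in operator. The synonym table and the buildTopicFilter helper are hypothetical; a real application would have to maintain its own synonym data.

```javascript
// Hypothetical synonym table kept by the application, not by MongoDB.
const SYNONYMS = new Map([
  ["voyage", ["journey", "expedition"]],
  ["nautical", ["maritime", "naval"]]
]);

// Expand a keyword into itself plus any known synonyms, then build a filter
// that matches documents whose topics array contains at least one of them.
function buildTopicFilter(keyword) {
  const terms = [keyword, ...(SYNONYMS.get(keyword) || [])];
  return { topics: { $in: terms } };
}

const topicFilter = buildTopicFilter("voyage");
console.log(JSON.stringify(topicFilter));
// prints: {"topics":{"$in":["voyage","journey","expedition"]}}
```

In the mongo shell, the resulting object could be passed to a query such as db.volumes.find(topicFilter, { title: 1 }). Stemming and ranking would still need separate handling.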
answered May 16, 2015 by rajesh