Elasticsearch, Nest and Lucene.net

elasticsearch lucene.net nest

I know that Elasticsearch is based on Lucene, but I wonder whether Elasticsearch gives me any benefits for developing a search engine compared with coding against Lucene.Net directly. Sorry if the question is a bit simple, but I am confused after researching the options for creating a search engine.

I found more examples for simple Lucene.Net search but not many for Elasticsearch and NEST. Another question: what is the difference between NEST and Elasticsearch? Are they the same?

If someone could shed some light here, maybe with a nice sample, I would appreciate it. What I need is an easy, quick and fast search engine. What would be the best option? Other alternatives are welcome too, but only .NET (C# or VB). Thanks.

Best Answer

Lucene

Lucene, and its .NET port Lucene.Net, is a search engine library for supporting full-text search in an application; it builds an inverted index based on the Document (and the fields within the Document) that you feed it. An example of this is search within the NuGet Gallery source, where a NuGet package and its properties are converted to a document to pass to Lucene. The inverted index is stored across files within a directory.
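To make the inverted index idea concrete, here is a minimal, language-neutral sketch in Python. It is not the Lucene.Net API: the `tokenize`, `build_inverted_index` and `search` helpers and the sample package data are all made up for illustration, and a real analyzer does far more than lowercase and split on whitespace.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a real analyzer does much more)."""
    return text.lower().split()


def build_inverted_index(documents):
    """Map each term to the set of document ids whose fields contain it."""
    index = defaultdict(set)
    for doc_id, fields in documents.items():
        for value in fields.values():
            for term in tokenize(value):
                index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query (AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents, loosely modelled on the NuGet package example.
packages = {
    1: {"id": "Lucene.Net", "description": "full-text search library"},
    2: {"id": "NEST", "description": "Elasticsearch client library"},
    3: {"id": "Newtonsoft.Json", "description": "JSON framework"},
}

index = build_inverted_index(packages)
```

A query is then a lookup per term followed by a set intersection, which is why searching stays fast as the number of documents grows: the work depends on the terms in the query, not on scanning every document.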

Elasticsearch

Elasticsearch is a distributed search engine that uses Lucene under the covers. An Elasticsearch cluster can be made up of one or more nodes, where each node can hold a number of primary and replica shards; each shard is a complete Lucene index. This infrastructure enables fast performance and allows horizontal scaling to handle search across a large amount of data, since you are no longer limited by the constraints of a single Lucene index on a single machine. In addition, you can achieve high availability with fault tolerance and disaster recovery, since each shard can have replica copies on other nodes, meaning there is no single point of failure. NEST is not a separate search engine; it is the official high-level .NET client for talking to Elasticsearch. An example of Elasticsearch with NEST is up on my blog.
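As a rough illustration of how documents spread across shards and how a search fans out over them, here is a toy sketch in Python. It is not how Elasticsearch is implemented: `Shard` and `ToyCluster` are hypothetical names, `zlib.crc32` is a stand-in hash chosen only because it is deterministic, and replicas, nodes and relevance scoring are left out entirely.

```python
import zlib
from collections import defaultdict


class Shard:
    """Stand-in for one Lucene index: a tiny term -> document ids map."""

    def __init__(self):
        self.terms = defaultdict(set)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.terms[term].add(doc_id)

    def search(self, term):
        return set(self.terms.get(term.lower(), set()))


class ToyCluster:
    """Routes each document to one shard and searches all shards."""

    def __init__(self, shard_count):
        self.shards = [Shard() for _ in range(shard_count)]

    def shard_for(self, doc_id):
        # Assumption: hash of the document id modulo the shard count.
        return zlib.crc32(doc_id.encode("utf-8")) % len(self.shards)

    def index(self, doc_id, text):
        self.shards[self.shard_for(doc_id)].add(doc_id, text)

    def search(self, term):
        # Scatter the query to every shard, then gather and merge the hits.
        hits = set()
        for shard in self.shards:
            hits |= shard.search(term)
        return hits


cluster = ToyCluster(shard_count=3)
cluster.index("repo-1/readme.md", "distributed search engine")
cluster.index("repo-2/main.cs", "search client for dotnet")
cluster.index("repo-3/notes.txt", "unrelated notes")
```

Because each shard only holds part of the data, the shards can live on different machines and be searched in parallel, which is the property that lets a cluster grow beyond what a single index on a single machine can handle.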

Which to use?

Well, it depends on your use case (it nearly always does, right?). If your application is one that gets installed onto a machine and all data is persisted locally, you might decide to use the Lucene library within the application and persist the index directory to local disk. Similarly, if you have a simple web application that runs on a single server with a small number of users, then using Lucene may also be a sensible choice. On the other hand, if your application runs across multiple machines in a web farm and requires search capabilities, going with a distributed search engine like Elasticsearch would be a good idea.

How well does Elasticsearch scale? Back in 2013, GitHub was using Elasticsearch to index 2 billion documents (all the code files in every repository on the site) across 44 separate Amazon EC2 instances, each with two terabytes of ephemeral SSD storage, giving a total of 30 terabytes of primary data. Stack Overflow also uses Elasticsearch to power search on this site (perhaps a dev could comment with some figures/metrics?).
