Apache Solr vs Elasticsearch
API
Feature | Solr 7.2.1 | Elasticsearch 6.2.4 |
---|---|---|
Format | XML, CSV, JSON | JSON |
HTTP REST API | ||
Binary API | SolrJ | TransportClient, Thrift (through a plugin) |
JMX support | | ES-specific stats are exposed through the REST API |
Official client libraries | Java | Java, Groovy, PHP, Ruby, Perl, Python, .NET, JavaScript (see the official list of clients) |
Community client libraries | PHP, Ruby, Perl, Scala, Python, .NET, JavaScript, Go, Erlang, Clojure | Clojure, ColdFusion, Erlang, Go, Groovy, Haskell, Java, JavaScript, .NET, OCaml, Perl, PHP, Python, R, Ruby, Scala, Smalltalk, Vert.x (see the complete list) |
3rd-party product integration (open-source) | Drupal, Magento, Django, ColdFusion, WordPress, OpenCMS, Plone, Typo3, eZ Publish, Symfony2, Riak (via Yokozuna) | Drupal, Django, Symfony2, WordPress, Couchbase |
3rd-party product integration (commercial) | DataStax Enterprise Search, Cloudera Search, Hortonworks Data Platform, MapR | SearchBlox, Hortonworks Data Platform, MapR, etc. (see the complete list) |
Output | JSON, XML, PHP, Python, Ruby, CSV, Velocity, XSLT, native Java | JSON, XML/HTML (via plugin) |
Infrastructure
Feature | Solr 7.2.1 | Elasticsearch 6.2.4 |
---|---|---|
Master-slave replication | | Not an issue because shards are replicated across nodes. |
Integrated snapshot and restore | Filesystem | Filesystem, AWS Cloud Plugin for S3 repositories, HDFS Plugin for Hadoop environments, Azure Cloud Plugin for Azure storage repositories |
Indexing
Feature | Solr 7.2.1 | Elasticsearch 6.2.4 |
---|---|---|
Data Import | DataImportHandler - JDBC, CSV, XML, Tika, URL, Flat File | [DEPRECATED in 2.x] Rivers modules - ActiveMQ, Amazon SQS, CouchDB, Dropbox, DynamoDB, FileSystem, Git, GitHub, Hazelcast, JDBC, JMS, Kafka, LDAP, MongoDB, neo4j, OAI, RabbitMQ, Redis, RSS, Sofa, Solr, St9, Subversion, Twitter, Wikipedia |
ID field for updates and deduplication | ||
DocValues | ||
Partial Doc Updates | with stored fields | with _source field |
Custom Analyzers and Tokenizers | ||
Per-field analyzer chain | ||
Per-doc/query analyzer chain | ||
Index-time synonyms | | Supports Solr and WordNet synonym formats |
Query-time synonyms | Solr 6 provides proper multi-word synonyms via SynonymGraphFilter | Synonym Graph Token Filter is in beta in ES 6.2 |
Multiple indexes | ||
Near-Realtime Search/Indexing | ||
Complex documents | ||
Schemaless | ||
Multiple document types per schema | One set of fields per schema, one schema per core | |
Online schema changes | Schemaless mode or via dynamic fields. | Only backward-compatible changes. |
Apache Tika integration | ||
Dynamic fields | ||
Field copying | | via multi-fields |
Hash-based deduplication | | Murmur plugin or ER plugin |
Index-time sorting | ||
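
The "Partial Doc Updates" row above notes that Solr relies on stored fields while Elasticsearch relies on the _source field. As a minimal sketch of what the two request bodies typically look like, the helper names below are hypothetical, and the unique key field name `id` and the field `price` are assumptions for illustration. Nothing is sent over the network; the code only builds the JSON payloads.

```python
import json

def solr_atomic_update(doc_id, changes):
    """Build a Solr atomic-update body (assumes the uniqueKey field is 'id').

    Each changed field is wrapped in a {"set": value} modifier so that Solr
    replaces only that field and rebuilds the rest from stored fields.
    """
    doc = {"id": doc_id}
    for field, value in changes.items():
        doc[field] = {"set": value}
    return [doc]

def es_partial_update(changes):
    """Build an Elasticsearch _update body: the partial document is merged
    into the existing _source."""
    return {"doc": dict(changes)}

solr_body = solr_atomic_update("doc1", {"price": 99})
es_body = es_partial_update({"price": 99})

print(json.dumps(solr_body))
print(json.dumps(es_body))
```

The Solr body would usually be posted to the collection's update handler and the Elasticsearch body to the document's `_update` endpoint; the exact URLs depend on the deployment.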
Searching
Feature | Solr 7.2.1 | Elasticsearch 6.2.4 |
---|---|---|
Lucene Query parsing | ||
Structured Query DSL | JSON Query DSL is new in Solr 7.x | |
Span queries | via SOLR-2703 | |
Spatial/geo search | ||
Multi-point spatial search | ||
Faceting | | Top N term accuracy can be controlled with shard_size |
Advanced Faceting | New Analytics component and JSON faceting API | blog post |
Geo-distance Faceting | ||
Pivot Facets | ||
More Like This | ||
Boosting by functions | ||
Boosting using scripting languages | ||
Push Queries | Via Streaming Expressions | Percolation. Distributed percolation supported in 1.0 |
Field collapsing/Results grouping | ||
Query Re-Ranking | | via Rescoring or a plugin |
Index-based Spellcheck | | Phrase Suggester |
Wordlist-based Spellcheck | ||
Autocomplete | ||
Document-oriented Autocomplete | Solr suggesters return phrases, not documents. | |
Learning to Rank | | Via https://github.com/o19s/elasticsearch-learning-to-rank |
Query elevation | workaround | |
Intra-index joins | via parent-child query | via has_child and top_children queries |
Inter-index joins | Joined index has to be single-shard and replicated across all nodes. | |
Resultset Scrolling | | via scan search type |
Filter queries | also supports filtering by native scripts | |
Filter execution order | local params and cache property | |
Alternative QueryParsers | DisMax, eDisMax | query_string, dis_max, match, multi_match etc |
Negative boosting | Supported, but awkward: it involves positively boosting the inverse set of negatively-boosted documents. | |
Search across multiple indexes | Can search across multiple compatible collections | |
Result highlighting | ||
Custom Similarity | ||
Searcher warming on index reload | | Warmers API |
Term Vectors API | ||
SQL queries | Via Parallel SQL. SolrCloud only | |
Distributed Map/Reduce processing | Via Streaming Expressions. SolrCloud only | |
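
The "Negative boosting" row above describes the Solr workaround of positively boosting the inverse set. A minimal sketch of that idea, assuming the eDisMax query parser and its `bq` boost-query parameter: the helper name, the example clause `category:refurbished`, and the boost value are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode

def demote_query_params(user_query, demote_clause, boost=10.0):
    """Build Solr request parameters that demote documents matching demote_clause.

    Solr has no direct negative boost here, so the boost query selects the
    inverse set instead: (*:* -clause) matches every document except those
    matching the clause, and that set receives a positive boost.
    """
    inverse_set = f"(*:* -{demote_clause})^{boost:g}"
    return {"defType": "edismax", "q": user_query, "bq": inverse_set}

params = demote_query_params("laptop", "category:refurbished")
query_string = urlencode(params)  # ready to append to a /select URL

print(params["bq"])
print(query_string)
```

Documents in the demoted category still match the main query; they simply rank lower because every other document gets the extra boost.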
Customizability
Feature | Solr 7.2.1 | Elasticsearch 6.2.4 |
---|---|---|
Pluggable API endpoints | ||
Pluggable search workflow | via SearchComponents | |
Pluggable update workflow | via UpdateRequestProcessor | |
Pluggable Analyzers/Tokenizers | ||
Pluggable QueryParsers | ||
Pluggable Field Types | ||
Pluggable Function queries | ||
Pluggable scoring scripts | ||
Pluggable hashing | ||
Pluggable webapps | | [site plugins DEPRECATED in 5.x] blog post |
Automated plugin installation | | Installable from GitHub, Maven, Sonatype or elasticsearch.org |
Distributed
Feature | Solr 7.2.1 | Elasticsearch 6.2.4 |
---|---|---|
Self-contained cluster | Depends on separate ZooKeeper server | Only Elasticsearch nodes |
Automatic node discovery | ZooKeeper | internal Zen Discovery or ZooKeeper |
Partition tolerance | The partition without a ZooKeeper quorum will stop accepting indexing requests or cluster state changes, while the partition with a quorum continues to function. | Partitioned clusters can diverge unless discovery.zen.minimum_master_nodes is set to at least N/2+1, where N is the number of master-eligible nodes in the cluster. If configured correctly, the partition without a quorum will stop operating, while the other continues to work. |
Automatic failover | If all nodes storing a shard and its replicas fail, client requests will fail, unless requests are made with the shards.tolerant=true parameter, in which case partial results are returned from the available shards. | |
Automatic leader election | ||
Shard replication | ||
Sharding | ||
Automatic shard rebalancing | Solr Autoscaling is new in Solr 7. | It can be machine, rack, availability zone, and/or data center aware. Arbitrary tags can be assigned to nodes, and it can be configured not to assign a shard and its replicas to nodes with the same tags. |
Change # of shards | Shards can be added (when using implicit routing) or split (when using compositeId). Cannot be lowered. Replicas can be increased anytime. | Each index has 5 shards by default. Number of primary shards cannot be changed once the index is created. Replicas can be increased anytime. The Shrink Index API lets you shrink the index into a new index with fewer shards. |
Shard splitting | | You can use the Index Splitting API to index to a new index with primary shards split. |
Relocate shards and replicas | Can be done by creating a shard replica on the desired node and then removing the shard from the source node | Can move shards and replicas to any node in the cluster on demand |
Control shard routing | shards or _route_ parameter | routing parameter |
Pluggable shard/replica assignment | New Autoscaling API replaces the old rule-based replica assignment | Probabilistic shard balancing with Tempest plugin |
Avoid duplicate indexing on replicas | Solr 7 provides 3 kinds of replica types: NRT (default and the pre-Solr 7 behavior), tlog and pull. Non-SolrCloud master-slave replication can be achieved with tlog replica types. | |
Consistency | Indexing requests are synchronous with replication. An indexing request won't return until all replicas respond. No check for downed replicas. They will catch up when they recover. When new replicas are added, they won't start accepting and responding to requests until they are finished replicating the index. | Replication between nodes is synchronous by default, thus ES is consistent by default, but it can be set to asynchronous on a per-document indexing basis. Index writes can be configured to fail if there are not sufficient active shard replicas. The default is quorum, but all or one are also available. |
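
The "Partition tolerance" row above gives N/2+1 as the minimum value for discovery.zen.minimum_master_nodes. As a small worked sketch of that arithmetic, using integer division and treating N as the number of master-eligible nodes (the function names are hypothetical):

```python
def minimum_master_nodes(master_eligible):
    """Smallest strict majority of master-eligible nodes: floor(N / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible // 2 + 1

def partition_has_quorum(partition_size, master_eligible):
    """True if one side of a network partition can still elect a master."""
    return partition_size >= minimum_master_nodes(master_eligible)

for n in (3, 4, 5):
    print(n, "master-eligible nodes -> quorum of", minimum_master_nodes(n))
```

With 5 master-eligible nodes split 3/2, only the 3-node side reaches the quorum of 3, so at most one side of the partition keeps operating.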
Misc
Feature | Solr 7.2.1 | Elasticsearch 6.2.4 |
---|---|---|
Web Admin interface | bundled with Solr | Marvel or Kibana apps |
Visualisation | Banana (Port of Kibana) | Kibana |
Hosting providers | WebSolr, Searchify, Hosted-Solr, IndexDepot, OpenSolr, gotosolr | Found, Scalefastr, ObjectRocket, bonsai.io, Indexisto, qbox.io, IndexDepot, Compose.io, Sematext Logsene |