I'm trying to run some aggregation queries and am encountering an issue.
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "group_by": {
      "terms": {
        "field": "category"
      }
    }
  }
}
This returns:
"hits": {
"total": 180,
"max_score": 0,
"hits": []
},
"aggregations": {
"group_by": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 1,
"buckets": [
{
"key": "pf_rd_m",
"doc_count": 139
},
{
"key": "other",
"doc_count": 13
},
{
"key": "_encoding",
"doc_count": 12
},
{
"key": "ie",
"doc_count": 10
},
{
"key": "cadeaux",
"doc_count": 2
},
{
"key": "cartes",
"doc_count": 2
},
{
"key": "cheques",
"doc_count": 2
},
{
"key": "home",
"doc_count": 2
},
{
"key": "nav_logo",
"doc_count": 1
},
{
"key": "ref",
"doc_count": 1
}
]
}
}
As you can see, this says there are 180 documents in total, but if I sum the doc_count of every key in the buckets I get 184 (185 counting sum_other_doc_count), which is more than the total…
This is most likely due to Elasticsearch's tokenization mechanism (https://www.elastic.co/guide/en/elasticsearch/guide/current/aggregations-and-analysis.html).
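For illustration, the `_analyze` API shows how the standard analyzer splits a single value into several terms, each of which lands in its own bucket (the sample text here is hypothetical):

```json
GET /my_index/_analyze
{
  "analyzer": "standard",
  "text": "cheques cadeaux"
}
```

This produces two tokens, `cheques` and `cadeaux`, so a single document holding that value would increment the doc_count of two different buckets, which is why the bucket counts can sum to more than the total hit count.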
So I tried the solution described in that page, but it's still not working. Here is my mapping:
"properties":{
"status":{
"type":"integer",
"index":"analyzed"
},
"category":{
"type":"string",
"fields": {
"raw" : {
"type": "string",
"index": "not_analyzed"
}
}
},
"dynamic_templates": [
{ "notanalyzed": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "not_analyzed"
}
}
}
]
}
As you can see, I have a field named "category", and I added "raw" as a not_analyzed sub-field, but it still returns wrong numbers.
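One thing worth checking: adding a `raw` multi-field to an existing mapping does not backfill documents that were indexed before the change, so `category.raw` holds no values for them until the data is reindexed. A minimal sketch, assuming the `_reindex` API is available (Elasticsearch 2.3+) and that `my_index_v2` is a new index created beforehand with the corrected mapping (the name is illustrative):

```json
POST /_reindex
{
  "source": { "index": "my_index" },
  "dest": { "index": "my_index_v2" }
}
```

On older versions, the same effect requires a scan/scroll read plus bulk re-insert from the client side.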
When I try this:
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "group_by": {
      "terms": {
        "field": "category.raw"
      }
    }
  }
}
This returns:
"hits": {
"total": 180,
"max_score": 0,
"hits": []
},
"aggregations": {
"group_by": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": []
}
}
It's very strange. Any help?
Best Answer
As described in the documentation, to overcome this issue, at the expense of resources, the shard_size parameter can be used.
Again, from the documentation:
Shard Size
If you add the shard_size parameter to the query:
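The quoted answer is cut off here; based on the terms aggregation documentation it references, the query with shard_size would look something like this (the value 500 is illustrative):

```json
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "group_by": {
      "terms": {
        "field": "category",
        "shard_size": 500
      }
    }
  }
}
```

A larger shard_size makes each shard return more candidate terms to the coordinating node, which tightens doc_count_error_upper_bound and sum_other_doc_count at the cost of memory and network traffic.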