This plugin adds hierarchical aggregations to Elasticsearch. Each term is split on a provided separator (default "/") and then aggregated by level. For a complete example, see elastic/elasticsearch#8896.
Two different aggregations are available:
- `path_hierarchy` for hierarchical aggregations on `keyword` fields or scripts.
- `date_hierarchy` for hierarchical aggregations on `date` fields. For dates, this aggregation is better optimized than a script.
This is a multi bucket aggregation.
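To make the "split on a separator, then aggregate by level" idea concrete, here is a minimal Python sketch. It is only an illustration of the concept, not the plugin's implementation; the function name `hierarchy_counts` is hypothetical, and empty segments are dropped on the assumption that blank paths are not kept.

```python
from collections import Counter


def hierarchy_counts(paths, separator="/"):
    """Count documents per path prefix, one prefix per hierarchy level."""
    counts = Counter()
    for path in paths:
        # Assumption: empty segments (leading or chained separators) are ignored.
        parts = [part for part in path.split(separator) if part]
        for depth in range(1, len(parts) + 1):
            counts[separator.join(parts[:depth])] += 1
    return counts


paths = [
    "/My documents/Spreadsheets/Budget_2013.xls",
    "/My documents/Spreadsheets/Budget_2014.xls",
    "/My documents/Test.txt",
]
counts = hierarchy_counts(paths)
```

Each prefix plays the role of a bucket: the first level collects every document, and deeper levels collect only the documents that share the longer prefix.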
./bin/elasticsearch-plugin install "https://github.com/opendatasoft/elasticsearch-aggregation-pathhierarchy/releases/download/v7.6.0.0/pathhierarchy-aggregation-7.6.0.0.zip"
Requires Java 12
Parameters of `path_hierarchy`:
- `field` or `script`: field (or script) to aggregate on.
- `separator`: separator for the path hierarchy. Defaults to "/".
- `order`: how to sort the result. Allowed values are `_key`, `_count`, or a sub-aggregation name. Defaults to {"_count": "desc"}.
- `size`: how many buckets to return. Defaults to 10.
- `shard_size`: how many buckets each shard returns. It is set to `size` if smaller. Defaults to `size` if the search request only needs to go to a single shard, and to (size * 1.5 + 10) otherwise (more information here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_shard_size_3).
- `min_depth`: minimum depth level. Defaults to 0.
- `max_depth`: maximum depth level. -1 means no limit. Defaults to 3.
- `depth`: retrieve values for the specified depth only. Shortcut for setting `min_depth` and `max_depth` to the same value.
- `keep_blank_path`: keep blank paths as buckets. If set to false, chained separators are ignored. Defaults to false.
- `min_doc_count`: return only buckets containing at least `min_doc_count` documents. Defaults to 0.
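The `shard_size` rule is easy to misread, so here is a small Python sketch of it as described above. This is an assumption-labeled illustration, not code from the plugin; the function name `effective_shard_size` and its arguments are hypothetical.

```python
def effective_shard_size(size, shard_size=None, single_shard=False):
    """Return the shard_size that would apply, following the documented rule."""
    if shard_size is None:
        # Default: size on a single shard, size * 1.5 + 10 otherwise.
        return size if single_shard else int(size * 1.5 + 10)
    # An explicit shard_size smaller than size is raised to size.
    return max(shard_size, size)


default_multi_shard = effective_shard_size(10)
default_single_shard = effective_shard_size(10, single_shard=True)
```

With the default `size` of 10, a request that spans several shards would therefore ask each shard for 25 buckets.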
# Add data:
PUT filesystem
{
"mappings": {
"properties": {
"path": {
"type": "keyword"
}
}
}
}
PUT /filesystem/_doc/1
{
"path": "/My documents/Spreadsheets/Budget_2013.xls",
"views": 10
}
PUT /filesystem/_doc/2
{
"path": "/My documents/Spreadsheets/Budget_2014.xls",
"views": 7
}
PUT /filesystem/_doc/3
{
"path": "/My documents/Test.txt",
"views": 1
}
# Path hierarchy request:
GET /filesystem/_search?size=0
{
"aggs": {
"tree": {
"path_hierarchy": {
"field": "path",
"separator": "/"
},
"aggs": {
"total_views": {
"sum": {
"field": "views"
}
}
}
}
}
}
Result:
{"aggregations": {
"tree": {
"sum_other_doc_count": 0,
"buckets": [
{
"key": "My documents",
"doc_count": 3,
"total_views": {
"value": 18
},
"tree": {
"buckets": [
{
"key": "Spreadsheets",
"doc_count": 2,
"total_views": {
"value": 17
},
"tree": {
"buckets": [
{
"key": "Budget_2013.xls",
"doc_count": 1,
"total_views": {
"value": 10
}
},
{
"key": "Budget_2014.xls",
"doc_count": 1,
"total_views": {
"value": 7
}
}
]
}
},
{
"key": "Test.txt",
"doc_count": 1,
"total_views": {
"value": 1
}
}
]
}
}
]
}
}
}
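Because child levels are nested under the aggregation's own name, client code usually has to walk the response recursively. The following Python sketch flattens the nested buckets into full paths; the helper name `flatten_buckets` is hypothetical, and the sample data is the result above trimmed to keys and counts.

```python
def flatten_buckets(agg, agg_name="tree", separator="/", prefix=""):
    """Yield (full_path, doc_count) for every bucket, parents before children."""
    for bucket in agg.get("buckets", []):
        path = f"{prefix}{separator}{bucket['key']}" if prefix else bucket["key"]
        yield path, bucket["doc_count"]
        # Child levels are nested under the aggregation's own name ("tree" here).
        if agg_name in bucket:
            yield from flatten_buckets(bucket[agg_name], agg_name, separator, path)


# The "tree" aggregation from the result above, trimmed to keys and doc counts.
tree = {
    "buckets": [
        {
            "key": "My documents",
            "doc_count": 3,
            "tree": {
                "buckets": [
                    {
                        "key": "Spreadsheets",
                        "doc_count": 2,
                        "tree": {
                            "buckets": [
                                {"key": "Budget_2013.xls", "doc_count": 1},
                                {"key": "Budget_2014.xls", "doc_count": 1},
                            ]
                        },
                    },
                    {"key": "Test.txt", "doc_count": 1},
                ]
            },
        }
    ]
}

flat = list(flatten_buckets(tree))
```

In a real client, `tree` would be `response["aggregations"]["tree"]` from the search response.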
Example with a script that formats a date field as a path:
PUT calendar
{
"mappings": {
"properties": {
"date": {
"type": "date"
}
}
}
}
PUT /calendar/_doc/1
{
"date": "2012-01-10T02:47:28"
}
PUT /calendar/_doc/2
{
"date": "2012-01-05T01:43:35"
}
PUT /calendar/_doc/3
{
"date": "2012-05-01T12:24:19"
}
GET /calendar/_search?size=0
{
"aggs": {
"tree": {
"path_hierarchy": {
"script": "doc['date'].value.toOffsetDateTime().format(DateTimeFormatter.ofPattern('yyyy/MM/dd'))",
"order": {
"_key": "asc"
}
}
}
}
}
Result:
{"aggregations": {
"tree": {
"buckets": [
{
"key": "2012",
"doc_count": 3,
"tree": {
"buckets": [
{
"key": "01",
"doc_count": 2,
"tree": {
"buckets": [
{
"key": "05",
"doc_count": 1
},
{
"key": "10",
"doc_count": 1
}
]
}
},
{
"key": "05",
"doc_count": 1,
"tree": {
"buckets": [
{
"key": "01",
"doc_count": 1
}
]
}
}
]
}
}
]
}
}
}
Parameters of `date_hierarchy`:
- `field`: field to aggregate on. This parameter is mandatory.
- `interval`: date interval used to create the hierarchy. Allowed values are `years`, `months`, `days`, `hours`, `minutes`, `seconds`. Defaults to `years`.
- `order`: how to sort the result. Allowed values are `_key`, `_count`, or a sub-aggregation name. Defaults to {"_count": "desc"}.
- `size`: how many buckets to return. Defaults to 10.
- `shard_size`: how many buckets each shard returns. It is set to `size` if smaller. Defaults to `size` if the search request only needs to go to a single shard, and to (size * 1.5 + 10) otherwise (more information here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_shard_size_3).
- `min_doc_count`: return only buckets containing at least `min_doc_count` documents. Defaults to 0.
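As a rough illustration of how `interval` controls the depth of the hierarchy, the Python sketch below maps a date to one key per level. It assumes each level is keyed the way the script example formats dates (`yyyy/MM/dd`, extended to hours, minutes, and seconds); the plugin's actual bucket keys may differ, and `date_levels` is a hypothetical helper.

```python
from datetime import datetime

# Levels in hierarchy order, with an assumed key format for each.
LEVELS = ["years", "months", "days", "hours", "minutes", "seconds"]
FORMATS = {
    "years": "%Y",
    "months": "%m",
    "days": "%d",
    "hours": "%H",
    "minutes": "%M",
    "seconds": "%S",
}


def date_levels(value, interval="years"):
    """Return one key per hierarchy level, from years down to `interval`."""
    depth = LEVELS.index(interval) + 1
    return [value.strftime(FORMATS[name]) for name in LEVELS[:depth]]


keys = date_levels(datetime(2012, 1, 10, 2, 47, 28), interval="days")
```

A coarser interval simply stops the hierarchy earlier, so the default `years` yields a single level.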
Date hierarchy example:
PUT calendar
{
"mappings": {
"properties": {
"date": {
"type": "date"
}
}
}
}
PUT /calendar/_doc/1
{
"date": "2012-01-10T02:47:28"
}
PUT /calendar/_doc/2
{
"date": "2012-01-05T01:43:35"
}
PUT /calendar/_doc/3
{
"date": "2012-05-01T12:24:19"
}
GET /calendar/_search?size=0
{
"aggs": {
"tree": {
"date_hierarchy": {
"field": "date",
"interval": "days",
"order": {
"_key": "asc"
}
}
}
}
}
Plugin versions are available for (at least) all minor versions of Elasticsearch since 6.0.
The first three digits of the plugin version match the Elasticsearch version. The last digit is the plugin's own revision for that Elasticsearch version.
To install it, run this command from the Elasticsearch directory, replacing the URL with the correct link for your Elasticsearch version (see table):
./bin/elasticsearch-plugin install https://github.com/opendatasoft/elasticsearch-aggregation-pathhierarchy/releases/download/v7.6.0.0/pathhierarchy-aggregation-7.6.0.0.zip
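The versioning scheme can be checked mechanically. The Python sketch below splits a plugin version into its Elasticsearch part and its plugin revision, and builds a download link. The URL pattern is an assumption extrapolated from the single 7.6.0.0 link above, so confirm the real link for other versions; both function names are hypothetical.

```python
# Assumed pattern, inferred from the v7.6.0.0 release link only.
RELEASE_URL = (
    "https://github.com/opendatasoft/elasticsearch-aggregation-pathhierarchy"
    "/releases/download/v{version}/pathhierarchy-aggregation-{version}.zip"
)


def split_plugin_version(version):
    """Split 'a.b.c.d' into the Elasticsearch version 'a.b.c' and plugin revision d."""
    parts = version.split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected four numeric components, got {version!r}")
    return ".".join(parts[:3]), int(parts[3])


def release_url(version):
    """Build the download URL for a plugin version, validating its shape first."""
    split_plugin_version(version)
    return RELEASE_URL.format(version=version)


es_version, plugin_revision = split_plugin_version("7.6.0.0")
```

Here `es_version` is the Elasticsearch release the plugin targets, which should match the version of the cluster you install it on.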
This software is under The MIT License (MIT).