Get list of distinct values from existing query result set

Hi all,

I have an Elasticsearch query (get_Text_Records) from which I want to generate a list of distinct (unique) values.
The following query shows all records in a table:

{{get_Text_Records.data.hits.hits.map((hit) => {
    return {
        id: hit._source.id,
        author_id: hit._source.author_id,
        text: hit._source.text,
    };
});
}}

The ‘id’ field is unique for every record, but ‘author_id’ is not (one author may have produced several texts).
I want to generate a list of author_id values without duplicates and use it for another table. How can I do that?

I read about "collapse": { "field": "author_id" } but could not get it to work.
I would rather not re-run the query on the server and instead work with the existing results on the client side.

regards,
Tom

I think this is what you are looking for - Get distinct values from a field in ElasticSearch - #2 by vivek7mehta - Elasticsearch - Discuss the Elastic Stack

Thanks for the help. It works now, and I also managed to find out how the ‘collapse’ method works. First, the aggregation approach:

GET /indexdata/_search
{
  "query": {
    "match": { "written_text": "{{searchterm.text}}" }
  },
  "size": 0,
  "aggs": {
    "alltexts": {
      "terms": { "field": "author_id.keyword" }
    }
  }
}

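This gives back a small ‘buckets’ array with the ids and their corresponding counts. Note that a terms aggregation returns only the 10 most frequent values unless you set "size" inside "terms". BTW, you need to specify “.keyword” for the terms field name: author_id is presumably mapped as an analyzed text field, which cannot be aggregated on by default, while its .keyword sub-field holds the exact, unanalyzed value.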
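
For anyone who wants to turn that bucket array into a plain list of ids on the client, here is a minimal sketch. It assumes the response has the usual terms-aggregation shape (aggregations.alltexts.buckets, each entry with key and doc_count). The sample data and the helper name bucketKeys are made up for illustration.

```javascript
// Hypothetical sample, shaped like the response of the aggregation query above.
const sampleResponse = {
  aggregations: {
    alltexts: {
      buckets: [
        { key: "a1", doc_count: 3 },
        { key: "b2", doc_count: 1 },
      ],
    },
  },
};

// Return the bucket keys (the distinct field values) of a named terms aggregation.
// Falls back to an empty list when the aggregation is missing from the response.
function bucketKeys(response, aggName) {
  const agg = response.aggregations && response.aggregations[aggName];
  return agg ? agg.buckets.map((bucket) => bucket.key) : [];
}

const authorIds = bucketKeys(sampleResponse, "alltexts");
console.log(authorIds);
```

In a binding this would probably reduce to something like get_Text_Records.data.aggregations.alltexts.buckets.map((b) => b.key), assuming the query's data property holds the raw response as it does for hits.hits in the first post.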

GET /indexdata/_search
{
  "query": {
    "match": { "written_text": "{{searchterm.text}}" }
  },
  "collapse": { "field": "author_id.keyword" },
  "sort": ["author_id.keyword"],
  "from": 0
}

This also gives back an array with all ids, but each one comes with the entire record it appears in. That is much more data than needed and should be restricted, but I do not know how (yet).
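
Coming back to the original question of working with existing results on the client side, the following is a rough sketch rather than a tested solution. It assumes hits shaped like those in the first post (each hit has _source.author_id). The sample data and the helper name distinctAuthorIds are hypothetical.

```javascript
// Hypothetical sample, shaped like get_Text_Records.data.hits.hits.
const sampleHits = [
  { _source: { id: 1, author_id: "a1", text: "first" } },
  { _source: { id: 2, author_id: "b2", text: "second" } },
  { _source: { id: 3, author_id: "a1", text: "third" } },
];

// Collect author_id from every hit; a Set drops duplicates and
// keeps the order in which each id was first seen.
function distinctAuthorIds(hits) {
  return [...new Set(hits.map((hit) => hit._source.author_id))];
}

// Shape the ids as row objects, which is what a table widget usually expects.
const rows = distinctAuthorIds(sampleHits).map((author_id) => ({ author_id }));
console.log(rows);
```

This only reduces what the table shows, not what the server sends. On the server side, source filtering in the search request (for example "_source": ["author_id"]) may be one way to cut down the returned fields, though that has not been tried against this index.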

regards
