C# - Elasticsearch NEST: get an aggregation result within a timespan, excluding another timespan
Following this question about unique users within a timespan, I'd like to filter out users who also appeared within another given timespan.
For example, I want the list of users who visited only in 2016 and not in 2017. Simply filtering on 2016 does not work, because a user found in the 2016 timespan might have appeared in 2017 as well. So one possible approach is to compute the set [2016..2017 users] - [2017 users].
My attempt sends two queries ([2016..2017 users] and [2017 users]) to ES, then filters in the application using userlist_20162017.Except(userlist_2017).
But this seems like an inefficient approach. Can this be achieved with a single Elasticsearch NEST query?
using System;
using System.Linq;
using Nest;

void Main()
{
    var client = new ElasticClient(connectionSettings);
    var twoYearsAgo = new DateTime(2016, 1, 1);
    var yearAgo = new DateTime(2017, 1, 1);

    // 2016..2017 users
    var searchResponse20162017 = client.Search<Visitor>(s => s
        .Size(0)
        .Query(q => q
            .DateRange(c => c
                .Field(p => p.CreationDate)
                .GreaterThan(twoYearsAgo)
                .LessThan(DateTime.UtcNow)
            )
        )
        .Aggregations(a => a
            .Terms("unique_users", c => c
                .Field(f => f.OwnerUserId)
                .Size(int.MaxValue)
            )
        )
    );

    // 2017 users
    var searchResponse2017 = client.Search<Visitor>(s => s
        .Size(0)
        .Query(q => q
            .DateRange(c => c
                .Field(p => p.CreationDate)
                .GreaterThan(yearAgo)
                .LessThan(DateTime.UtcNow)
            )
        )
        .Aggregations(a => a
            .Terms("unique_users", c => c
                .Field(f => f.OwnerUserId)
                .Size(int.MaxValue)
            )
        )
    );

    var uniqueUser20162017 = searchResponse20162017.Aggs.Terms("unique_users").Buckets.Select(b => b.KeyAsString).ToList();
    var uniqueUser2017 = searchResponse2017.Aggs.Terms("unique_users").Buckets.Select(b => b.KeyAsString).ToList();

    // Final result: set difference of the two id lists. Seems naive and inefficient.
    var uniqueUser2016Only = uniqueUser20162017.Except(uniqueUser2017);
}
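It's possible with a filter sub aggregation. First, get the unique ids in the range 2016 to 2017 with a terms aggregation, then perform a filter sub aggregation on each id that counts only the documents not in the range 2017. If the document count of a terms aggregation bucket is equal to the document count of its filter aggregation, that id is in 2016 and not in 2017.
Here's an example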
using System;
using Nest;

void Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var defaultIndex = "examples";
    var connectionSettings = new ConnectionSettings(pool)
        .DefaultIndex(defaultIndex);
    var client = new ElasticClient(connectionSettings);

    // Start from a clean index
    if (client.IndexExists(defaultIndex).Exists)
        client.DeleteIndex(defaultIndex);

    var examples = new[]
    {
        new Example(1, new DateTime(2016, 01, 01)),
        new Example(1, new DateTime(2017, 01, 01)),
        new Example(2, new DateTime(2016, 01, 01)),
        new Example(3, new DateTime(2017, 01, 01)),
    };

    client.Bulk(b => b
        .IndexMany(examples)
        .Refresh(Refresh.WaitFor));

    client.Search<Example>(s => s
        .Size(0)
        // Unary + puts the query in a bool filter clause (no scoring)
        .Query(q => +q
            .DateRange(c => c
                .Field(p => p.Date)
                .GreaterThanOrEquals(new DateTime(2016, 01, 01))
                .LessThan(new DateTime(2018, 01, 01))
            )
        )
        .Aggregations(a => a
            .Terms("ids_in_2016_and_2017", c => c
                .Field(f => f.ExampleId)
                .Size(int.MaxValue)
                .Aggregations(aa => aa
                    .Filter("ids_only_in_2016", f => f
                        // ! negates the range (bool must_not): documents NOT in 2017
                        .Filter(ff => +!ff
                            .DateRange(d => d
                                .Field(p => p.Date)
                                .GreaterThanOrEquals(new DateTime(2017, 01, 01))
                                .LessThan(new DateTime(2018, 01, 01))
                            )
                        )
                    )
                )
            )
        )
    );
}

public class Example
{
    public Example(int exampleId, DateTime date)
    {
        ExampleId = exampleId;
        Date = date;
    }

    public int ExampleId { get; set; }
    public DateTime Date { get; set; }
}
ExampleId 2 is in 2016 and not in 2017: its doc count for 2016 and 2017 is equal to its doc count for only 2016
{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "ids_in_2016_and_2017" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1,
          "doc_count" : 2,
          "ids_only_in_2016" : {
            "doc_count" : 1
          }
        },
        {
          "key" : 2,
          "doc_count" : 1,
          "ids_only_in_2016" : {
            "doc_count" : 1
          }
        },
        {
          "key" : 3,
          "doc_count" : 1,
          "ids_only_in_2016" : {
            "doc_count" : 0
          }
        }
      ]
    }
  }
}
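The doc-count comparison can be checked without a cluster. The following is a minimal in-memory sketch, not NEST code: it uses LINQ over the same four sample documents to mimic the outer date range, the terms buckets, and the filter sub aggregation. The variable names (`rangeStart`, `excludedStart`, `rangeEnd`, `idsOnlyIn2016`) are made up for illustration, and it assumes a compiler that supports top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;

// Same four sample documents as the example: (id, date).
var docs = new[]
{
    (ExampleId: 1, Date: new DateTime(2016, 1, 1)),
    (ExampleId: 1, Date: new DateTime(2017, 1, 1)),
    (ExampleId: 2, Date: new DateTime(2016, 1, 1)),
    (ExampleId: 3, Date: new DateTime(2017, 1, 1)),
};

var rangeStart = new DateTime(2016, 1, 1);    // start of the outer range (inclusive)
var excludedStart = new DateTime(2017, 1, 1); // start of the excluded range (inclusive)
var rangeEnd = new DateTime(2018, 1, 1);      // end of both ranges (exclusive)

var idsOnlyIn2016 = docs
    // Outer query: documents dated in 2016 or 2017.
    .Where(d => d.Date >= rangeStart && d.Date < rangeEnd)
    // Terms aggregation: one bucket per id.
    .GroupBy(d => d.ExampleId)
    .Select(g => new
    {
        Id = g.Key,
        DocCount = g.Count(),
        // Filter sub aggregation: documents in the bucket NOT dated in 2017.
        NotIn2017 = g.Count(d => !(d.Date >= excludedStart && d.Date < rangeEnd))
    })
    // Equal counts mean none of this id's documents fell in 2017.
    .Where(b => b.DocCount == b.NotIn2017)
    .Select(b => b.Id)
    .ToList();

Console.WriteLine(string.Join(", ", idsOnlyIn2016)); // prints "2"
```

The intermediate counts match the buckets in the response: id 1 has 2 vs 1, id 2 has 1 vs 1, and id 3 has 1 vs 0, so only id 2 survives the comparison.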
*OP appended: getting the result as a list of user ids.
var list = searchResponse1.Aggs.Terms("ids_in_2016_and_2017").Buckets
    .Select(o => new
    {
        UserId = o.Key,
        // true when the bucket's total doc count equals its "only in 2016" doc count
        DocCount = o.DocCount == ((Nest.SingleBucketAggregate)o.Aggregations["ids_only_in_2016"]).DocCount
    })
    .Where(x => x.DocCount == true)
    .Select(x => x.UserId)
    .ToList();