How to get facet ranges in solr results?



Assume that I have a field called price for the documents in Solr and I have that field faceted. I want to get the facets as ranges of values (e.g. 0-100, 100-500, 500-1000, etc.). How do I do it?

I can specify the ranges beforehand, but I also want to know whether it is possible to calculate the ranges (say for 5 values) automatically based on the values in the documents.

asked May 16, 2015 in SOLR by rajesh
0 votes

3 Answers

0 votes

To answer your first question, you can get facet ranges by using the generic facet query support. Here's an example:

http://localhost:8983/solr/select?q=video&rows=0&facet=true&facet.query=price:[* TO 500]&facet.query=price:[500 TO *]
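For more than two predefined ranges, the same request can be generated from a list of boundaries. The following is a minimal sketch, not part of the original answer: the helper name `facet_query_url` and the boundary values are assumptions chosen for illustration, and the URL is the local one from the example above.

```python
from urllib.parse import urlencode

def facet_query_url(base_url, query, field, boundaries):
    """Build a Solr URL with one facet.query per adjacent pair of boundaries."""
    # "*" marks the open lower end of the first range and the open upper end of the last.
    edges = ["*"] + [str(b) for b in boundaries] + ["*"]
    params = [("q", query), ("rows", "0"), ("facet", "true")]
    for low, high in zip(edges, edges[1:]):
        # Solr range syntax needs an uppercase TO.
        params.append(("facet.query", f"{field}:[{low} TO {high}]"))
    # urlencode percent-encodes the brackets, colon and spaces.
    return base_url + "?" + urlencode(params)

url = facet_query_url("http://localhost:8983/solr/select", "video", "price", [100, 500, 1000])
print(url)
```

As in the hand-written URL, both ends of each range are inclusive, so a document priced exactly at a boundary is counted in two adjacent ranges.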

As for your second question (automatically suggesting facet ranges), that's not yet implemented. Some argue that this kind of querying would be best implemented in your application rather than letting Solr "guess" the best facet ranges.

answered May 16, 2015 by rajesh
0 votes
There may well be a better Solr-specific answer, but I work with straight Lucene, and since you're not getting much traction I'll take a stab. There, I'd create and populate a Filter with a FilteredQuery wrapping the original Query. Then I'd get a FieldCache for the field of interest. Enumerate the hits in the filter's BitSet, and for each hit, get the value of the field from the field cache and add it to a SortedSet. When you've got all of the hits, split the set into as many equal-sized subsets as you want ranges (five to seven is a good number according to the user interface guys), and rather than a single-valued constraint, your facets will be range queries with the lower and upper bounds of each of those subsets.

I'd recommend using some special-case logic for a small number of values; obviously, if you only have four distinct values, it doesn't make sense to try and make 5 range refinements out of them. Below a certain threshold (say 3*your ideal number of ranges), you just show the facets normally rather than ranges.
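The splitting step can be sketched independently of Lucene. The snippet below is plain Python rather than the Filter/FieldCache code described above; it assumes the field values of the hits have already been collected into a list, and the function name `range_facets` and its parameters are hypothetical.

```python
def range_facets(values, n_ranges=5, min_factor=3):
    """Split the distinct values of the hits into n_ranges (low, high) bounds.

    Returns None when there are too few distinct values, which signals the
    caller to show ordinary single-value facets instead of ranges.
    """
    distinct = sorted(set(values))  # stands in for the SortedSet
    if len(distinct) < min_factor * n_ranges:
        return None
    ranges = []
    for i in range(n_ranges):
        # Integer arithmetic keeps the chunk sizes within one element of each other.
        lo = i * len(distinct) // n_ranges
        hi = (i + 1) * len(distinct) // n_ranges
        chunk = distinct[lo:hi]
        ranges.append((chunk[0], chunk[-1]))
    return ranges

print(range_facets(list(range(1, 31)), n_ranges=5))
print(range_facets([1, 2, 3, 4], n_ranges=5))
```

Note that this balances the number of distinct values per range, not the number of documents per range.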
answered May 16, 2015 by rajesh
0 votes
I have worked out how to calculate sensible dynamic facets for product price ranges. The solution involves some pre-processing of documents and some post-processing of the query results, but it requires only one query to Solr, and should even work on old versions of Solr like 1.4.

Round up prices before submission
First, before submitting the document, round up the price to the nearest "nice round facet boundary" and store it in a "rounded_price" field. Users like their facets to look like "250-500" not "247-483", and rounding also means you get back hundreds of price facets, not millions of them. With some effort the following code can be generalised to round nicely at any price scale:

    public static decimal RoundPrice(decimal price)
    {
        if (price < 25)
            return Math.Ceiling(price);
        else if (price < 100)
            return Math.Ceiling(price / 5) * 5;
        else if (price < 250)
            return Math.Ceiling(price / 10) * 10;
        else if (price < 1000)
            return Math.Ceiling(price / 25) * 25;
        else if (price < 2500)
            return Math.Ceiling(price / 100) * 100;
        else if (price < 10000)
            return Math.Ceiling(price / 250) * 250;
        else if (price < 25000)
            return Math.Ceiling(price / 1000) * 1000;
        else if (price < 100000)
            return Math.Ceiling(price / 2500) * 2500;
        else
            return Math.Ceiling(price / 5000) * 5000;
    }
Permissible prices go 1,2,3,...,24,25,30,35,...,95,100,110,...,240,250,275,300,325,...,975,1000 and so forth.
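One possible way to generalise the rounding to any price scale is to derive the step from the price's order of magnitude instead of listing every threshold. The Python sketch below is an assumption about what such a generalisation could look like, not the answer's own code. It reproduces the boundaries above for prices below 100000 and simply continues the same pattern beyond that, where the hard-coded version switches to a fixed step of 5000.

```python
import math
from decimal import Decimal

def round_price(price):
    """Round a price up to the next 'nice' facet boundary at any scale."""
    price = Decimal(str(price))
    if price < 25:
        return int(math.ceil(price))
    # Largest power of ten not exceeding the price, e.g. 100 for 247.
    magnitude = 10 ** price.adjusted()
    if price < Decimal("2.5") * magnitude:
        step = magnitude // 10          # 100..249 -> 10, 1000..2499 -> 100
    else:
        step = max(5, magnitude // 4)   # 25..99 -> 5, 250..999 -> 25
    return int(math.ceil(price / step) * step)

for p in (24.2, 26, 247, 483, 999, 12345):
    print(p, "->", round_price(p))
```

Decimal arithmetic is used so that boundary prices are not shifted by binary floating-point error.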

Get all facets on rounded prices
Second, when submitting the query, request all facets on rounded prices sorted by price: facet.field=rounded_price. Thanks to the rounding, you'll get at most a few hundred facets back.
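A sketch of this step on the client side, assuming the JSON response writer (wt=json) with its default flat facet layout, where values and counts alternate in one list. The parameter choices (facet.limit=-1 to lift the default cap, facet.mincount=1 to skip empty values) and the sample response are illustrative assumptions. The sketch re-sorts numerically on the client because, depending on the field type, values may come back as strings in lexicographic order.

```python
# Facet parameters to add to the normal search request (assumed settings).
FACET_PARAMS = {
    "facet": "true",
    "facet.field": "rounded_price",
    "facet.limit": "-1",     # return every facet value, not just the top ones
    "facet.mincount": "1",   # omit values that match no documents
    "wt": "json",
}

def parse_price_facets(response, field="rounded_price"):
    """Turn Solr's flat [value, count, value, count, ...] list into sorted pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    pairs = zip(flat[0::2], flat[1::2])
    # Sort by numeric value so "25" comes before "100".
    return sorted(pairs, key=lambda pair: float(pair[0]))

# Made-up response fragment, for illustration only.
sample = {"facet_counts": {"facet_fields": {"rounded_price": ["100", 5, "25", 3, "250", 2]}}}
print(parse_price_facets(sample))
```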

Combine adjacent facets into larger facets
Third, after you have the results, the user wants to see only 3 to 7 facets, not hundreds of facets. So, combine adjacent facets into a few large facets (called "segments"), trying to get a roughly equal number of documents in each segment. The following rather more complicated code does this, returning tuples of (start, end, count) suitable for performing range queries. The counts returned will be correct provided prices have been rounded up to the nearest boundary:

    // Requires: using System; using System.Collections.Generic; using System.Linq;
    public static List<Tuple<string, string, int>> CombinePriceFacets(
        int nSegments, ICollection<KeyValuePair<string, int>> prices)
    {
        var ranges = new List<Tuple<string, string, int>>();
        int productCount = prices.Sum(p => p.Value);
        int productsRemaining = productCount;
        if (nSegments < 2)
            return ranges;
        int segmentSize = productCount / nSegments;
        string start = "*";
        string end = "0";
        int count = 0;
        int totalCount = 0;
        int segmentIdx = 1;
        foreach (KeyValuePair<string, int> price in prices)
        {
            end = price.Key;
            count += price.Value;
            totalCount += price.Value;
            productsRemaining -= price.Value;
            // Close the current segment once it has reached its share of products.
            if (totalCount >= segmentSize * segmentIdx)
            {
                ranges.Add(new Tuple<string, string, int>(start, end, count));
                start = end;
                count = 0;
                segmentIdx += 1;
            }
            // The last segment is open-ended and takes everything that is left.
            if (segmentIdx == nSegments)
            {
                ranges.Add(new Tuple<string, string, int>(start, "*", count + productsRemaining));
                break;
            }
        }
        return ranges;
    }
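To see how the segments come out, here is the same merging logic ported to Python as a sketch for experimenting. The port and the sample facet counts are illustrative assumptions, not part of the original answer; the input is a list of (rounded price, count) pairs already sorted by price.

```python
def combine_price_facets(n_segments, prices):
    """Merge sorted (price, count) facets into n_segments (start, end, count) tuples."""
    ranges = []
    product_count = sum(count for _, count in prices)
    products_remaining = product_count
    if n_segments < 2:
        return ranges
    segment_size = product_count // n_segments
    start, count, total_count, segment_idx = "*", 0, 0, 1
    for end, facet_count in prices:
        count += facet_count
        total_count += facet_count
        products_remaining -= facet_count
        # Close the current segment once it has reached its share of products.
        if total_count >= segment_size * segment_idx:
            ranges.append((start, end, count))
            start, count = end, 0
            segment_idx += 1
        # The last segment is open-ended and takes everything that is left.
        if segment_idx == n_segments:
            ranges.append((start, "*", count + products_remaining))
            break
    return ranges

facets = [("10", 5), ("20", 5), ("50", 10), ("100", 5), ("250", 5)]
print(combine_price_facets(3, facets))
# [('*', '20', 10), ('20', '50', 10), ('50', '*', 10)]
```

With 30 products and 3 segments the target size is 10, so the first two segments close at 20 and 50 and the remaining 10 products fall into the open-ended last segment.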
Filter results by selected facet
Fourth, suppose ("250","500",38) was one of the resulting segments. If the user selects "$250 to $500" as a filter, simply do a filter query fq=price:[250 TO 500]
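Turning a selected segment into the filter query is a one-liner. The sketch below uses a hypothetical helper name, `segment_filter`, and assumes segments are the (start, end, count) tuples described in the previous step.

```python
def segment_filter(segment, field="price"):
    """Build the fq value for a (start, end, count) segment."""
    start, end, _count = segment
    # "*" bounds pass through unchanged and mean an open-ended range in Solr.
    return f"{field}:[{start} TO {end}]"

print(segment_filter(("250", "500", 38)))  # price:[250 TO 500]
```

The resulting string still needs URL-encoding when it is sent as the fq parameter.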
answered May 16, 2015 by rajesh