Query Execution
The API implements query functionality using the same Lease mechanism as used for update
operations.
This design decision is based on the assumption that Lucene components such as SearcherManager and
SearcherTaxonomyManager provide the functionality to ensure that searches we perform are in sync with
Lucene generation increments as a result of update operations.
Further testing is required to confirm the above assumption. Theoretically however, ensuring that the same Lease
is used to both execute the query and retrieve a document should mean that the TopDocs returned by a search
match the Document values that are subsequently retrieved. I.e. they are of the same document generation.
Querying
Three interfaces participate in a general abstraction to provide results of <R> for a query of <T>:
QueryFactory
The factory is used to encapsulate the building of a Lucene query which represents any domain-specific type.
It is essentially a mapping function from T to Query:
@FunctionalInterface
public interface QueryFactory<T> {
Query query(T value);
}
QueryExecutor
An executor is provides results for a query based on a domain-specific type. The default executors provided
by the API delegate query generation to a QueryFactory.
@FunctionalInterface
public interface QueryExecutor<T, R> {
Result<R> execute(T query);
}
Result
Query execution provides access to an iteration of domain-specific values.
public interface Result<T> extends Iterable<Hit<T>>, AutoCloseable {
long totalHits();
/**
* Provides this result as a closeable stream, adding a call to {@link #close()}
* on {@link Stream#onClose(Runnable)}.
* Should
* @return a stream of {@link Hit}s
*/
default Stream<Hit<T>> stream() {
return StreamSupport.stream(spliterator(), false)
.onClose(this::close);
}
/**
* Provides this result as a collection and releases resources
* @return a collection of {@link Hit}s
*/
default List<Hit<T>> toList() {
try(var stream = this.stream()) {
return stream.toList();
}
}
}
At time of writing, a Result is an AutoCloseable resource which assumes lazy retrieval of document data.
As stated in the introduction to this section, we ideally want the same Lucene searcher to both a) perform the
search and b) retrieve document data. Being AutoCloseable allows us to utilise try-with-resources achieve this
while also ensuring the searcher is released once we are done.
It is relatively simple to for an application to wrap this functionality in another function should eager retrieval
be required. Result provides stream and toList which can be used to facilitate this.
Examples
Basic query workflow
- ① Create a
QueryFactoryto act as a template for querying the_allfield withStringvalues - ② Use the factory from a
QueryExecutorto perform searches - ③ Use the executor to perform a search for documents matching an instance of
T - ④ Reuse the executor to perform subsequent searches
@Test
void textSearch(LuceneBackend backend) {
// ①
var parser = new QueryParser("_all", new StandardAnalyzer());
QueryFactory<String> queryByName = value -> {
try {
return parser.parse(value);
} catch (ParseException e) {
throw new QueryException("Failed to parse query", e);
}
};
// ②
var queryExecutor = new DefaultQueryExecutor<>(queryByName, backend.searcherLeaseFactory());
// ③
logger.info("Searching for \"Spain\"");
try(var result = queryExecutor.execute("Spain")) {
result.forEach(hit -> logger.info("[{}] Hit: {}", hit.score(), hit.value()));
assertThat(result.totalHits()).isEqualTo(2);
}
// ④
logger.info("Searching for \"spain +madrid\"");
try(var result = queryExecutor.execute("spain +madrid")) {
result.forEach(hit -> logger.info("[{}] Hit: {}", hit.score(), hit.value()));
assertThat(result.totalHits()).isEqualTo(1);
}
}
Mapping to domain types during iteration
Query execution can map results to domain types.
- ① Create a
QueryFactoryto act as a template for querying theiso.keywordfield - ② Create a
QueryExecutorwhose result iterator is mapped viadomainOperations::readDocument - ③ Use the executor to perform searches that provide instances of
ShortCountry
@Test
void keywordSearch(LuceneBackend backend,
DomainOperations<ShortCountry> domainOperations) {
// ①
QueryFactory<String> queryByIsoCode = value -> new TermQuery(new Term("iso.keyword", value));
// ②
var queryExecutor = new DefaultQueryExecutor<>(queryByIsoCode, backend.searcherLeaseFactory())
.withIterator(Result.IteratorFactory.mapping(domainOperations::readDocument));
// ③
try(var result = queryExecutor.execute("ESP")) {
result.forEach(hit -> logger.info("[{}] Country: {}", hit.score(), hit.value()));
assertThat(result.totalHits()).isEqualTo(1);
}
}
Unbounded / paged / streamed results
PagedQueryExecutor allows an unbounded number of hits to be returned. The simplest way to handle these results
is via the Result#stream() method.
- ① Create a
PagedQueryExecutor - ② Read all results made available via
Result#stream() - ③ Use any
Streamoperators to provide an alternative view of the results
@Test
void pagingResults(LuceneBackend backend) {
var queryByRegion = QueryFactories.text("region.text");
var queryOptions = QueryOptions.DEFAULT;
// ①
var pagedQueryExecutor = new DefaultPagedQueryExecutor<>(queryByRegion, backend.searcherLeaseFactory());
// ②
try(var stream = pagedQueryExecutor.execute("americas").stream()) {
var pagedCount = stream.count();
logger.info("Paged executor found {} hits", pagedCount);
assertThat(pagedCount).isGreaterThan(queryOptions.maxHits());
}
// ③
try(var stream = pagedQueryExecutor.execute("americas").stream()) {
var limitedCount = stream.limit(20).count();
logger.info("Limited result contain {} hits", limitedCount);
assertThat(limitedCount).isEqualTo(20);
}
}
Example application abstraction
Depending on the application requirements we might want to abstract away from the try-with-resources syntax. For example, when only a single result or only a known maximum number of results is required.
In such cases we could define a domain-specific operation, such as:
@FunctionalInterface
public interface CountryLookup {
Optional<ShortCountry> lookup(String term);
}
where it is trusted that either:
termwill uniquely match a single country, or- scoring rules will determine the best match out of many
An implementation of this function might then be specified as an injectable dependency, similar to:
@Singleton
@Provides
public CountryLookup countryByISOCodeLookup(@ISO QueryExecutor<String, ShortCountry> findByIsoCode) {
return iso -> {
try(var result = findByIsoCode.execute(iso)) {
var iterator = result.iterator();
return iterator.hasNext()
? Optional.of(iterator.next().value())
: Optional.empty();
}
};
}
and injected and used at the application level:
@Test
void lookupFunction(CountryLookup countryLookup) {
var countryOptional = countryLookup.lookup("CAN");
countryOptional.ifPresent(country -> logger.info("Found country: {}", country));
assertThat(countryOptional).isPresent();
assertThat(countryOptional.map(ShortCountry::name)).isEqualTo(Optional.of("Canada"));
}