Process large amounts of Elasticsearch data using TIBCO ActiveMatrix BusinessWorks 5
Sometimes you want to run a longitudinal study of patterns in your Elasticsearch data, which means analyzing the entire event stream that matches your criteria. The scroll API lets you ask Elasticsearch for every entry matching a query and receive the results in sequential chunks that together make up the entire set of matching records.
The following is an excerpt from the Elastic webpage that explains the API:
While a search request returns a single “page” of results, the Elasticsearch scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database.
You would want BusinessWorks to pipe the data to a stream processing application built with something like Kafka Streams or Apache Spark. As a basis, I used patterns that I found in client helpers written in Python and Java.
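To make the scroll pattern concrete before moving to BusinessWorks, here is a minimal Python sketch of the loop as I understand it: open a scroll with the first search, keep requesting pages until one comes back empty, then clear the scroll. It uses the standard REST endpoints (`/_search?scroll=...`, `/_search/scroll`), but the function names `make_http_send` and `scroll_hits`, the `send` transport signature, and the default page size and keep-alive are my own assumptions for illustration, not code taken from any client library. It also assumes an unauthenticated cluster reachable over plain HTTP.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Iterator

# Hypothetical transport signature: (HTTP method, path, JSON body) -> decoded JSON response.
Send = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def make_http_send(base_url: str) -> Send:
    """Build a transport that talks to Elasticsearch over plain HTTP (assumes no auth)."""
    def send(method: str, path: str, body: Dict[str, Any]) -> Dict[str, Any]:
        request = urllib.request.Request(
            base_url.rstrip("/") + path,
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method=method,
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    return send


def scroll_hits(send: Send, index: str, query: Dict[str, Any],
                page_size: int = 1000, keep_alive: str = "1m") -> Iterator[Dict[str, Any]]:
    """Yield every hit matching `query`, fetching one page at a time."""
    # The first request opens the scroll context and returns the first page.
    page = send("POST", f"/{index}/_search?scroll={keep_alive}",
                {"size": page_size, "query": query})
    scroll_id = page.get("_scroll_id")
    try:
        while page["hits"]["hits"]:
            yield from page["hits"]["hits"]
            # Each follow-up request returns the next page; the scroll id may change.
            page = send("POST", "/_search/scroll",
                        {"scroll": keep_alive, "scroll_id": scroll_id})
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side scroll context once the generator is exhausted or closed.
        if scroll_id:
            send("DELETE", "/_search/scroll", {"scroll_id": scroll_id})
```

A caller would build a transport with something like `make_http_send("http://localhost:9200")` and iterate over `scroll_hits(send, "logs", {"match_all": {}})`, where the URL and the `logs` index name are placeholders. Passing the transport in as a function keeps the loop itself free of network code, which makes it easy to try against canned responses.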