DaskElasticSearch API

DaskElasticSearch

DaskElasticClient([host, port, …]) Basic object.
DaskElasticClient.read([query, index, …]) Method to read from elasticsearch index/ices
DaskElasticClient.save(data, index, doc_type) Save a dask dataframe to Elasticsearch

Read Dataframe

class dask_elk.client.DaskElasticClient(host='localhost', port=9200, client_klass=<class 'elasticsearch.client.Elasticsearch'>, username=None, password=None, wan_only=False, **client_kwargs)

Basic object. Client to connect with Elasticsearch

Parameters:
  • host (list[str]|str) – The host to connect to
  • port (int) – The port to connect to
  • client_klass (elasticsearch.Elasticsearch) – The class to use to create the Elasticsearch client instance
  • username (str|None) – The username if X-pack is activated
  • password (str|None) – The password if X-pack is activated
  • wan_only (bool) – If client should perform Node lookup. Set to True if ELK behind firewall or if nodes not accessed directly
  • client_kwargs (dict[str,T]) – Arguments to pass to Elasticsearch client object created by client_klass
read(query=None, index=None, doc_type=None, number_of_docs_per_partition=1000000, size=1000, fields_as_list=None, **kwargs)

Method to read from elasticsearch index/ices

Parameters:
  • query (dict(str,T)|None) – The query to push down to ELK.
  • index (str) – String of the index/ices to execute query on.
  • doc_type (str) – Type index belongs too
  • number_of_docs_per_partition (int) – Number of documents for each partition/task created by the readers
  • size (int) – The scroll size
  • fields_as_list (str|None) – Comma separated list of fields to be treated as object
  • kwargs – Additional keyword arguments to pass to the search method of python Elasticsearch client
Returns:

Dask Dataframe containing the data

Return type:

dask.dataframe.DataFrame

save(data, index, doc_type, action='index')

Save a dask dataframe to Elasticsearch

Parameters:
  • data (dask.dataframe.DataFrame) – Dataframe to save into ELK
  • index (str) – The index to save dataframe
  • doc_type (str) – Index doc type
  • action (str) – index if indexing you data ‘update’ if updating data
Returns:

The data with the save applied on it

Return type:

dask.dataframe.DataFrame