DaskElasticSearch API¶
DaskElasticSearch¶
DaskElasticClient ([host, port, …]) |
Basic object. |
DaskElasticClient.read ([query, index, …]) |
Method to read from elasticsearch index/ices |
DaskElasticClient.save (data, index, doc_type) |
Save a dask dataframe to Elasticsearch |
Read Dataframe¶
-
class
dask_elk.client.
DaskElasticClient
(host='localhost', port=9200, client_klass=<class 'elasticsearch.client.Elasticsearch'>, username=None, password=None, wan_only=False, **client_kwargs)¶ Basic object. Client to connect with Elasticsearch
Parameters: - host (list[str]|str) – The host to connect to
- port (int) – The port to connect to
- client_klass (elasticsearch.Elasticsearch) – The class to use to create the Elasticsearch client instance
- username (str|None) – The username if X-pack is activated
- password (str|None) – The password if X-pack is activated
- wan_only (bool) – If client should perform Node lookup. Set to True if ELK behind firewall or if nodes not accessed directly
- client_kwargs (dict[str,T]) – Arguments to pass to Elasticsearch client object created by client_klass
-
read
(query=None, index=None, doc_type=None, number_of_docs_per_partition=1000000, size=1000, fields_as_list=None, **kwargs)¶ Method to read from elasticsearch index/ices
Parameters: - query (dict(str,T)|None) – The query to push down to ELK.
- index (str) – String of the index/ices to execute query on.
- doc_type (str) – Type index belongs too
- number_of_docs_per_partition (int) – Number of documents for each partition/task created by the readers
- size (int) – The scroll size
- fields_as_list (str|None) – Comma separated list of fields to be treated as object
- kwargs – Additional keyword arguments to pass to the search method of python Elasticsearch client
Returns: Dask Dataframe containing the data
Return type: dask.dataframe.DataFrame
-
save
(data, index, doc_type, action='index')¶ Save a dask dataframe to Elasticsearch
Parameters: - data (dask.dataframe.DataFrame) – Dataframe to save into ELK
- index (str) – The index to save dataframe
- doc_type (str) – Index doc type
- action (str) – index if indexing you data ‘update’ if updating data
Returns: The data with the save applied on it
Return type: dask.dataframe.DataFrame