CSV endpoint sink
This data sink registers an HTTP publisher endpoint from which entities can be retrieved in CSV format.
A pipe that references the CSV endpoint sink does not pump any entities. In practice, this means that no pump is configured for the pipe; the only way for entities to flow through the pipe is for an HTTP client to retrieve them from the CSV endpoint.
It exposes the URLs:
URL |
---|
|
|
The exposed URL may support additional parameters such as since and limit; see the API reference for the full details.
Note that you can optionally specify the filename to use in the Content-Disposition header of the HTTP response as the last path element of the URL.
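As a rough illustration of how a client might assemble these URLs, the sketch below builds the endpoint address with an optional filename path element and optional query parameters. The helper name `build_csv_url` is hypothetical, and the URL layout is inferred from the example URLs at the end of this page rather than taken from the API reference; the accepted values for since and limit are not described here, so check the API reference before relying on them.

```python
from urllib.parse import quote, urlencode


def build_csv_url(base_url, pipe_id, filename=None, **params):
    """Build a CSV endpoint URL (hypothetical helper, layout assumed from the example URLs)."""
    url = f"{base_url.rstrip('/')}/publishers/{quote(pipe_id, safe='')}/csv"
    if filename:
        # The optional last path element is used in the Content-Disposition header.
        url += "/" + quote(filename, safe="")
    # Drop parameters that were not supplied, e.g. since=None.
    query = {key: value for key, value in params.items() if value is not None}
    if query:
        url += "?" + urlencode(query)
    return url


plain_url = build_csv_url("http://localhost:9042/api", "my-entities")
named_url = build_csv_url(
    "http://localhost:9042/api", "my-entities", filename="some_other_filename.csv"
)
limited_url = build_csv_url("http://localhost:9042/api", "my-entities", limit=10)
```

Keeping the URL construction in one place makes it easy to switch between the plain endpoint and the variant that carries a filename.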
Prototype
{
"type": "csv_endpoint",
"columns": ["properties","to","use","as","columns"],
"quoting": "all|minimal|non-numeric|none",
"delimiter": ",",
"doublequote": true,
"include_header": true,
"escapechar": null,
"lineterminator": "\r\n",
"quotechar": "\"",
"encoding": "utf-8",
"encode_error_strategy": "replacement-strategy-to-use",
"skip-deleted-entities": true,
"filename": "my_data.csv",
"content_disposition": "attachment"
}
Properties
Property | Type | Description | Default | Req |
---|---|---|---|---|
columns | List<String> | A list of string keys to look up in the entity to construct the CSV columns. If | | Yes |
quoting | Enum<String> | A string from the set of “all”, “minimal”, “non-numeric” and “none” that describes how the fields of the CSV file will be quoted. A value of “all” means all fields will be quoted, even if they don’t contain the | | |
delimiter | String | The character to use as field separator. It will also affect which fields will be quoted if the | | |
doublequote | Boolean | Controls how instances of | | |
include_header | Boolean | Controls if the | | |
escapechar | String | A one-character string used by the sink to escape | | |
lineterminator | String | A character sequence to use as the EOL marker in the CSV output. The default is carriage return plus linefeed ( | | |
quotechar | String | A one-character string that controls how to quote field values. The default is the double quote character. See | | |
| Boolean | If | | |
encoding | String | Which encoding to use when converting the output to string values. The default is | | |
encode_error_strategy | String enum | An enumeration of “ignore”, “replace”, “xmlcharrefreplace” and “backslashreplace” that tells the sink how to deal with illegal characters in the output data when the | “backslashreplace” | |
skip-deleted-entities | Boolean | This can be set to | | |
filename | String | This property provides a hint to HTTP clients on what filename to use when downloading data (via the | | |
content_disposition | String | This property provides a hint to HTTP clients how to render the file data. The valid values are | | |
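The formatting property names above closely mirror the dialect parameters of Python's standard csv module, and the encode_error_strategy values match Python's codec error handlers. Assuming the sink behaves similarly (this page does not confirm it), the sketch below shows how the quoting modes and error strategies affect output. The function names `render_csv` and `encode_output` are hypothetical, the argument defaults are copied from the prototype (with "minimal" quoting chosen arbitrarily), and rendering a missing key as an empty string is an assumption.

```python
import csv
import io

# Map the documented quoting names onto Python csv constants (assumed equivalence).
QUOTING_MODES = {
    "all": csv.QUOTE_ALL,
    "minimal": csv.QUOTE_MINIMAL,
    "non-numeric": csv.QUOTE_NONNUMERIC,
    "none": csv.QUOTE_NONE,
}


def render_csv(entities, columns, quoting="minimal", delimiter=",",
               doublequote=True, include_header=True, escapechar=None,
               lineterminator="\r\n", quotechar='"'):
    """Render entities as CSV text, one column per listed key."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(
        buffer,
        quoting=QUOTING_MODES[quoting],
        delimiter=delimiter,
        doublequote=doublequote,
        escapechar=escapechar,
        lineterminator=lineterminator,
        quotechar=quotechar,
    )
    if include_header:
        writer.writerow(columns)
    for entity in entities:
        # Assumption: a missing key becomes an empty field.
        writer.writerow([entity.get(column, "") for column in columns])
    return buffer.getvalue()


def encode_output(text, encoding="utf-8", strategy="backslashreplace"):
    """Encode text, handling unencodable characters with the given strategy."""
    return text.encode(encoding, errors=strategy)


entities = [{"_id": "a", "foo": 1, "bar": "x,y"}]
columns = ["_id", "foo", "bar"]
minimal = render_csv(entities, columns, quoting="minimal")  # only "x,y" is quoted
everything = render_csv(entities, columns, quoting="all")   # every field is quoted
escaped = encode_output("é", encoding="ascii")              # b"\\xe9"
```

In Python's csv module, "none" quoting raises an error when a field contains the delimiter and no escapechar is set, which is one reason to pair that mode with an escape character.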
Example configuration
The pipe configuration below exposes the my-entities publisher endpoint and reads the entities from the my-entities dataset, picking the _id, foo, bar and zoo properties as columns in the CSV file:
{
"_id": "my-entities",
"name": "My published csv endpoint",
"type": "pipe",
"sink": {
"type": "csv_endpoint",
"columns": ["_id", "foo", "bar", "zoo"],
"filename": "my_data.csv"
}
}
The data will be available at http://localhost:9042/api/publishers/my-entities/csv (or alternatively at http://localhost:9042/api/publishers/my-entities/csv/some_other_filename.csv).
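A client could consume the endpoint along the following lines. This is a minimal sketch using only the Python standard library: the function names are hypothetical, it assumes the default formatting options with the header row enabled and UTF-8 encoding, and it omits any authentication your installation may require. The sample payload at the bottom is made-up illustration data, not real endpoint output.

```python
import csv
import io
from urllib.request import urlopen


def parse_csv_payload(payload, encoding="utf-8"):
    """Decode a CSV response body and return one dict per row, keyed by header."""
    text = payload.decode(encoding)
    return list(csv.DictReader(io.StringIO(text, newline="")))


def fetch_rows(url):
    """Download the CSV document from the endpoint and parse it."""
    with urlopen(url) as response:
        return parse_csv_payload(response.read())


# Parsing a hand-written sample body; all values come back as strings.
sample = b"_id,foo,bar,zoo\r\na,1,2,3\r\n"
rows = parse_csv_payload(sample)

# Against a running instance one might call:
# rows = fetch_rows("http://localhost:9042/api/publishers/my-entities/csv")
```

Separating the parsing from the download keeps the parsing logic testable without a running endpoint.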