• Added the dependency_tracking property to service metadata. It can be used to specify various dependency tracking related properties.


  • Added the max_entity_bytes_size property to the dataset sink.
  • Added the global_defaults.max_entity_bytes_size property to service metadata.


  • Added the global_defaults.default_compaction_type property to service metadata.


  • The union_datasets source now has a prefix_ids property that can be set to false to avoid adding the dataset id as a prefix on entity ids.
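
    A minimal config sketch of this option; the pipe and dataset ids are made up for illustration:

    ```json
    {
      "_id": "combined-people",
      "type": "pipe",
      "source": {
        "type": "union_datasets",
        "datasets": ["crm-people", "hr-people"],
        "prefix_ids": false
      }
    }
    ```

    With prefix_ids set to false, entities keep their original _id values instead of getting "crm-people:" or "hr-people:" prefixes.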


  • The transform function rename will now rename properties with a null value. The old behaviour ignored such properties, but that was considered to be a bug.


  • Added support for create_table_if_missing SQL sink property for the Oracle, Oracle TNS and MySQL systems. Previously only the MS SQL and PostgreSQL systems supported this option.
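
    A hedged sketch of a SQL sink using this option against an Oracle system; the system id and table name are hypothetical:

    ```json
    {
      "sink": {
        "type": "sql",
        "system": "my-oracle-system",
        "table": "CUSTOMERS",
        "create_table_if_missing": true
      }
    }
    ```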


  • Added support for optional string cast value(s) as argument to the uuid DTL function.


  • The default value of the read_timeout property has been changed from 7200 seconds to 1800 seconds for the URL system and the Microservice system.


  • Added the fail! DTL function.


  • The replace DTL function now takes a dict argument that lets one specify more than one string replacement.
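
    A hedged sketch of the new dict form of replace, assuming the dict maps each search string to its replacement and the value expression comes last:

    ```json
    ["replace", {"æ": "ae", "ø": "oe", "å": "aa"}, "_S.name"]
    ```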


  • Updated the documentation for the supports_signalling property on dataset sources and the global_defaults.use_signalling_internally property of the service metadata section.
  • The JSON push sink and REST sink no longer include header values or entity data in the traceback details of the execution log on failures.
  • The execution log and dead letter entities no longer include copies of the source or sink configuration properties of the pipe.
  • The properties of the event entities in the execution log are now truncated at 10 MB to avoid excessive event entity sizes. Note that this cut-off value might be decreased further in the future.
  • If the pump fails due to exceeding retry limits, the entity in question is no longer included in the traceback properties. Instead it's put in a separate exception_entity property. Note that this property is not included in the monitoring data, so you cannot devise notification rules that refer to it.



  • The RDF source will no longer add the <rdflibtoplevelelement> root wrapper element to literals with datatype. This is a breaking change.


  • Added the hex DTL function.
  • Updated the integer DTL function to parse hexadecimal values.
  • The dataset sink now has a property called prevent_multiple_versions that makes the pipe fail if an entity already exists in the sink dataset. This is useful if one wants to prevent multiple versions of the same entity from being written.
  • The dataset sink now has a property called suppress_filtered. The default value is false unless the run is a full sync, the source is of type dataset, and include_previous_versions is false. This property makes it possible to opt in to or out of a specific pipe optimization: entities that are filtered out in a transform are suppressed early so that they are not passed to the sink. This optimization should only be used when the pipe produces exactly one version per _id in the output, and it is useful when the pipe filters out a lot of entities.
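
    A config sketch of prevent_multiple_versions on a dataset sink; the dataset id is illustrative:

    ```json
    {
      "sink": {
        "type": "dataset",
        "dataset": "orders-golden",
        "prevent_multiple_versions": true
      }
    }
    ```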



  • Index version 2 is now the default version for dataset indexes. This index implementation (version 2) supports bidirectional traversal and that can be used to expose incremental feeds for one or more subsets of a dataset.



  • DTL property path strings can now be quoted. In practice this means that you can have periods in path elements if you quote them. Example: "foo.'john.doe''s'.bar" is now equivalent to ["path", ["list", "foo", "john.doe's", "bar"], "_S."]. A quoted path element must begin and end with a single quote. Single quotes can be escaped with ''.
  • Extended the JSON Pull Protocol document with information about response headers and an example using dataset subsets.


  • We've added support for a feature called completeness. When a pipe completes a successful run the sink dataset will inherit the smallest completeness timestamp value of the source datasets and the related datasets. Input pipes will use the current time as the completeness timestamp value. This mechanism has been introduced so that a pipe can hold off processing source entities that are more recent than the source dataset's completeness timestamp value. The propagation of these timestamp values is done automatically. Individual datasets can be excluded from completeness timestamp calculation via the exclude_completeness property on the pipe. One can enable the completeness filtering feature on a pipe by setting the completeness property on the dataset source to true.
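
    A hedged sketch of enabling completeness filtering on a pipe; the pipe and dataset ids are made up:

    ```json
    {
      "_id": "enrich-orders",
      "type": "pipe",
      "source": {
        "type": "dataset",
        "dataset": "orders",
        "completeness": true
      },
      "exclude_completeness": ["static-country-codes"]
    }
    ```

    Here the static lookup dataset is excluded from the completeness timestamp calculation so it does not hold back processing.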


  • Pipes now have a property called reprocessing_policy that can be set to cause automatic resets when external factors indicate that the pipe should be reset.


  • The dataset sink now has a property called set_initial_offset that specifies how the sink should set the initial offset on the sink dataset (a.k.a. the populated flag).


  • Added experimental support for automatic scheduling of internal (dataset to dataset) pipes and JSON pipes that read from external Sesam datasets via the REST API. See the supports_signalling property of these sources and the global use_signalling_internally and use_signalling_externally options in service metadata section. Please note the limitations and usage notes.


  • The embedded source now has configurable continuation properties, i.e. supports_since, is_chronological and is_since_comparable.


  • The "dtl" transform will now fail if the target entity's _id property is either missing or is not a string. It will also do so if the arguments to "create" and "create-child" is not a dict or is missing the _id property or the _id property is of a non-string type. This is a change in default behaviour, but it is possible to opt-out of this new behaviour by setting the id_required property to false. It would make it easier to discover logic errors.


  • The track_children property on the dataset sink is now inferred to be true if any of the pipe's transforms use the create-child DTL function. It is possible to override this by setting the property's value to false.


  • The lookup DTL function has been deprecated and replaced with the lookup-entity function. Note that the dataset referenced in its first argument must be populated before the parent pipe will run.


  • The valid characters in pipe and system ids have been restricted to be valid DNS name components. In practice this means that the first character must be a letter or a digit and the rest must be letters, digits and hyphens. The maximum length is 62. Invalid ids will trigger a validation warning.


  • A source that has supports_since=true, is_since_comparable=false and is_chronological=true will now use the chronological continuation strategy. Earlier it used no continuation strategy.


  • Added the discard DTL transform which can be used to discard the target entity. It is similar to filter, but will drop the target entity on the floor and not send it to the sink for deletion.
  • Added the case and case-eq DTL transforms. These are the sisters of the identically named DTL functions.


  • Made the URL system throw an error if it receives an invalid 'Content-Length' response header value. The URL system used to ignore such errors; the new ignore_invalid_content_length_response_header property can be set to restore the old behaviour.



  • Added a new coerce_to_decimal property to the Oracle and Oracle TNS systems. If set to true, it will force the use of the decimal type for all "numeric" types (i.e. numbers with precision and scale information). Currently, what type the column data ends up as is not clearly defined by the Oracle backend driver, so in some cases it may yield a float value instead of a decimal value. This property should always be set to true if your flows care whether numeric values are floats or decimals. The default value is false.
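
    A sketch of an Oracle system with the new property set; the connection properties are placeholders:

    ```json
    {
      "_id": "my-oracle-system",
      "type": "system:oracle",
      "host": "db.example.com",
      "database": "ORCL",
      "username": "$ENV(oracle-user)",
      "password": "$SECRET(oracle-password)",
      "coerce_to_decimal": true
    }
    ```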


  • We've changed the default strategy for pipe execution logging. By default, runs that result in no processed or changed entities are no longer logged. You can opt in to the previous behaviour by editing the log_events_noop_runs, log_events_noop_runs_changes_only and notification_granularity pump properties.


  • There is now a new index implementation (version 2) that supports bidirectional traversal and that can be used to expose incremental feeds for one or more subsets of a dataset. Index version 1 is currently the default. Nodes must be started with a special command line option in order to change the default value. Version 2 will be made the default at some point once we have enough experience with it.
  • The dataset and json sources now support the subset property. This property is used to specify a subset of the source dataset.
  • The hops and apply-hops DTL functions now support the prefilters property. This property is used to specify a subset of the dataset that it is hopped to.
  • The GET /api/datasets/{dataset_id}/indexes API endpoint now includes the indexes' version number.
  • The DELETE /datasets/{dataset_id}/indexes/{index_int_id} API endpoint has been added. It can be used to delete a dataset index.


  • Compaction is now incremental, so it will continue from where it got to the last time.
  • Compaction will be performed by the dataset sink if compaction.sink is set to true in the pipe configuration. This is only available for pipes using the dataset sink. If sink compaction is enabled, no scheduled compaction will be done on the dataset as it is no longer necessary. Index compaction will still require scheduled compaction, but this does not require a lock on the dataset. Note that sink compaction is currently experimental.
  • Automatic compaction will now kick in if there are 10% or 10000 new dataset offsets since the last compaction. The 10000 cap is fixed for now.
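
    A sketch of enabling the experimental sink compaction described above; ids are illustrative:

    ```json
    {
      "_id": "my-pipe",
      "type": "pipe",
      "compaction": {
        "sink": true
      },
      "sink": {
        "type": "dataset",
        "dataset": "my-output"
      }
    }
    ```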


  • The dataset sink will now mark the sink dataset as populated when all input datasets are populated and all entities have been read from them. Earlier it marked the sink dataset as populated after the first completed run. This was typically not what you wanted as it caused the sink datasets to be prematurely populated, which then caused unnecessary dependency tracking.
  • Added the initial_datasets property to the merge, merge_datasets, union_datasets, and diff_datasets sources. This property should only be used if some of the input datasets will never be populated. The property should then list the datasets that have to be populated before the sink dataset is marked as populated.


  • Casting decimal numbers containing a "scientific notation" shorthand (i.e. "1E-3", "10E14" etc) to a string using the DTL string function will now expand the exponent to its full representation (i.e. "1E2" -> "100", "1E-3" -> "0.001"). This is a change in behaviour.


  • Added support for specifying SOCKS5 proxies for the URL, REST and Twilio systems.


  • ["matches", "x*", ["list"]] now returns false instead of true. Note that this is a breaking change, but the old behaviour was considered a bug as it is both non-intuitive and most likely not what you want.


  • Added the sslmode property to the PostgreSQL system. Its default value (prefer) reflects the PostgreSQL client library default, hence you should only set this property if you need other behaviour than the default.
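
    A sketch of a PostgreSQL system that requires SSL; other connection properties are omitted for brevity:

    ```json
    {
      "_id": "my-postgres",
      "type": "system:postgresql",
      "sslmode": "require"
    }
    ```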



  • Added compaction.growth_threshold property to the pipe configuration. This lets you specify when dataset compaction kicks in.
  • The compaction.keep_versions property can now also be set to 0 and 1. The default value is 2, which is needed for dependency tracking to be fully able to find reprocessable entities. Setting it to a lower value means that dependency tracking is best effort only.
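
    A hedged sketch of these compaction settings on a pipe. Whether growth_threshold takes a fraction or an absolute count is an assumption here, so verify against the compaction documentation before copying:

    ```json
    {
      "compaction": {
        "growth_threshold": 0.5,
        "keep_versions": 2
      }
    }
    ```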


  • Added a new recreate_table_on_first_run boolean flag to the sql sink - it controls whether Sesam should recreate the table from schema_definition when the pipe is reset or runs for the first time. Note that this requires the create_table_if_missing property to also be set to true to take effect.
  • Altered the way the PK is created on schema definition generation. If the sink type is sql and create_table_if_missing is set to true, the default primary key is the _id property of the entities. Previously it would always look for a property with the same contents as _id (which is still the default for non-sql sink pipes).


  • Added a fallback_to_single_entities_on_batch_fail boolean flag to the pump configuration. The default (true) reflects the current behaviour. It can be useful to set it to false if the cost of processing a single entity at a time is high and there are a lot of entities in a batch (for example in a typical MS SQL sink in initial bulk upload mode).
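
    A sketch of disabling the single-entity fallback on a pump:

    ```json
    {
      "pump": {
        "fallback_to_single_entities_on_batch_fail": false
      }
    }
    ```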


  • Datasets that are not populated will no longer be compacted.


  • Receiver and publisher pipes can now be disabled.


  • Added support in the split DTL function for splitting a string into characters using the empty separator.


  • Added a translation GUI for the GDPR platform. This GUI makes it much easier to customize the various texts used by the GDPR portal.


  • Added the case-eq and case DTL functions. These can be used to express more complex conditional expressions. Earlier one had to nest if functions to achieve the same thing.


  • Changed the base64-encode and base64-decode DTL functions to only accept bytes and string input respectively.
  • Added support for bytes input to the string casting function. The encoding used is utf-8.
  • Added a bytes casting function that casts strings to (utf-8 encoded) bytes representation.


  • Added an RDF transform, similar to the XML transform. It will render entities to an NTriples string and embed it in the transformed entity.
  • Added the base64-encode and base64-decode DTL functions.


  • Added support for having secrets that apply only to one specific system.


  • Changed the default behaviour of the :ref:`CSV source <csv_source>`: if dialect is set, it will override the default value of auto_dialect. Previously you had to both turn off auto_dialect and set dialect. Note that if auto_dialect is false and no dialect has been set, the excel dialect is used as default.
  • The is_chronological property on the SQL source is now dynamic: it is true if the updated_column and table properties are set.
  • Added the is_chronological_full property to the SQL source. If explicitly set to false, a full run will not consider the source to be chronological even though it is chronological in incremental runs. The default value is the value of is_chronological, but it can be set to false.


  • The old dead_letter_dataset pump configuration option (string) has been deprecated and replaced by use_dead_letter_dataset, which is a boolean flag (false by default). If set to true, the id of the dead letter dataset is automatically generated and linked to the parent pipe id (system:dead-letter:pipe-id). Note that entities written to this new dataset will no longer have the pipe id as part of their _id property. This new dataset will inherit the ACLs from its parent pipe (like pump execution datasets). If the pipe is removed, the automatically created dataset is also removed. The old dead_letter_dataset property will continue to work as before but will be removed at some future date.
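
    A sketch of the new boolean flag that replaces the old string property:

    ```json
    {
      "pump": {
        "use_dead_letter_dataset": true
      }
    }
    ```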


  • Added the checkpoint_interval property to the pipe. The default has been changed from 1 to 100, which means that the pipe offset is now saved after every 100 batches instead of after every batch. With the default batch_size this is effectively every 10000 entities (i.e. 10000/batch_size). Note that the pipe offset is always saved at the end of every sync if it changed.
  • Pipes that perform deletion tracking will now have their pipe offset and deletion tracking state saved every 15 minutes or so. If a pipe is interrupted it will now be able to continue doing deletion tracking from where it last saved its state.
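
    A sketch of tuning the checkpoint interval on a pipe; the value shown is illustrative:

    ```json
    {
      "_id": "my-pipe",
      "type": "pipe",
      "checkpoint_interval": 10
    }
    ```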


  • Added the ljust and rjust DTL functions. They can be used to left-justify and right-justify strings.


  • A partial rescan can now be scheduled on a pump by specifying the two properties partial_rescan_count and partial_rescan_delta.


  • Added the hash128 DTL function. It generates 128 bit integer hashes from bytes and strings.


  • The sink dataset and the dead-letter dataset will now be asserted when the pipe is loaded. Receiver datasets, i.e. sink datasets that are used in combination with the http_endpoint source, will be automatically populated at the same time. Note that it is possible to opt-out of this behaviour by setting auto_populate_dataset to false on the http_endpoint source. Dead-letter datasets are automatically populated, and it is not possible to opt-out.

    Note that this is a change in behaviour, but in most situations it is the right thing to do. If the initial push to the receiver is a full sync, then it might be good to set auto_populate_dataset to false. The reason why this is useful for full syncs is because pipes doing hops against the dataset will then wait until the sync is complete and the dataset is populated.
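
    A sketch of opting out of automatic population on a receiver pipe:

    ```json
    {
      "source": {
        "type": "http_endpoint",
        "auto_populate_dataset": false
      }
    }
    ```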


  • Processing of namespaced identifiers has gotten a decent performance boost.
  • Regression: The make-ni DTL function will now return a sorted list of NIs. Earlier the sorting was done by sorting the keys of the source entity, which is a much more expensive thing to do.


  • Added support for circuit breakers, a safety mechanism that one can enable on the dataset sink. The circuit breaker will trip if the number of entities written to a dataset in a pipe run exceeds a certain configurable limit.


  • Added the round DTL function. It rounds to the nearest digit using the "round half to even" rule.


  • Added oauth2 (BackendServerClient profile, aka "client credentials") option to the URL system.


  • Changed the default value of the node configuration setting "pipe_cleanup_after_deletion" to "true". This means the node will remove any pipe-related data when a pipe is deleted (execution logs, ACLs, pipe offsets etc.).


  • Added the map-values function. It maps over the values of dictionaries and returns a list of mapped values.


  • The combine DTL function now allows a single argument. This is useful when you want to turn an expression into a list of values. It is extra useful when you don't quite know if the value is a list or not. Example: ["combine", "_S.x"]


  • Added a content_disposition configuration property to be able to specify the type in the Content-Disposition HTTP response header of the HTTP endpoint sinks.
  • Added the possibility to specify the filename of the HTTP endpoint sinks as the last element of the URL (overrides any filename set in the configuration of the sink).


  • Added the url-unquote function that URL unquotes any URL quoted characters in its input. See the related url-quote function.


  • The RDF source and SDShare source now support the sort_lists property to automatically sort resulting properties containing lists (i.e. RDF statements having the same predicate). It is true by default.



  • Added encrypt-pgp and decrypt-pgp DTL functions that can encrypt strings to OpenPGP messages using a PGP public key and decrypt these messages back to strings using a PGP private key and its associated password.


  • Added encrypt-pki and decrypt-pki DTL functions that can asymmetrically encrypt strings to bytes and decrypt bytes to strings using a PKI public/private key-pair in PEM format (PKCS#8). The encryption is performed using RSA 2048 bits with SHA-1 hashes and OAEP/MGF1 padding.




  • Added the intersects DTL function. This boolean function returns true if there is an overlap between the values in the two arguments.

  • The DTL compiler will now issue a warning if you try to perform two or more join expressions between the same two dataset aliases. It is there to notify you of possible cardinality issues and to tell you about the tuples function, which may be used to avoid cardinality issues.

    When there are two or more join expressions between the same two dataset aliases only the first one is treated as a join expression; the rest of them are equality comparisons. One can use the tuples function to combine them into one big join expression at the cost of composite indexes being used.


    Note that the eq function serves a dual purpose. It can both be used for join expressions and for equality comparisons. These two differ in that a join uses intersection (similar to the intersects function) while the equality comparison is an exact match. Use the intersects function if you want to check for intersection/overlap instead of an exact match.


  • The default value of the keep_existing_solr_ids configuration property in the Sesam Databrowser sink has been changed from true to false.


  • The JSON push sink now supports customizable HTTP headers via a headers property.



  • If a pipe is running and the pipe config is modified, the pipe will no longer be stopped. Instead, an "An old version of the pipe is still running" warning will be displayed, and it is up to the user whether they want to stop the running pipe or not.



  • Added a track_dead_letters option to the pump configuration. If set to true, it will delete "dead" entities from the dead letter dataset if a later version of it is successfully written to the sink. Note that using this option incurs a performance cost so use with care.


  • It is now possible to specify track-dependencies on all the HOPS_SPECs in a specific hops DTL function. This change was made so that one can disable tracking for any of the HOPS_SPECs, not just the last one.


  • The json-parse and json-transit-parse DTL functions now accept an optional default value expression. The default value expression is used when the input value is not valid JSON.


  • The datetime-parse and datetime-format DTL functions now accept an optional timezone argument. This makes it possible to parse datetime strings and format datetime values in specific timezones.


  • When a pipe is reset then the pipe's retry queue is now also reset.
  • Bug fix: It is now possible to interrupt pumps that are performing retries.
  • Indexing of datasets changed so that each dataset is indexed for a maximum of five minutes in each iteration. This prevents some datasets from being blocked from indexing when there are other large datasets being indexed.




  • Added functionality for preventing all pipes from automatically running (useful in some debugging scenarios). See the Low level debugging page for details.


  • Added an is_sorted property to the RDF source to indicate that the input data is sorted on subject, enabling the source to avoid loading the entire file into memory. Note that it only works for nt (NTriples) format files without blank nodes.


  • Added a write_retry_delay property to pipe pumps. This is used in conjunction with max_consecutive_write_errors when the system the pipe is writing to is known to be sporadically (non-transiently) unavailable. See the Pump section for details.




  • Added the indexes property to the dataset sink. If set to "$ids" then an index will be maintained for the $ids property. This index will then be used by the dataset browser to look up entities both by _id and $ids.
  • The default value of the max_depth property in hops has been changed from null to 10. This means that the default is to stop the recursion at level 10.


  • The JSON push protocol has been simplified to make it easier to write receivers. It will now always send the entities as an array, even if it contains just a single object. The JSON push sink has been updated to reflect this. If you need single-object JSON POST/PUT operations, you should use the REST sink instead.
  • Systems now support environment variables in their config, like pipes do.


  • Added the tuples DTL function that can be used to create composite join keys.


  • The equality property on the merge source is now optional.


  • Changed the default value of the "schedule_interval" pump configuration property. Before, the default value was 30 seconds for all pipes. The new default value for pipes with a dataset sink and a dataset sink is now 30 seconds +/- 1.5 seconds. For all other pipes, the default is 900 seconds +/- 45 seconds. (The +/- part helps stagger the start-time of the pipes, so that we don't get lots of pipes starting at the same instant.)
  • Added a warning in the GUI for non-internal pipes that don't have a "schedule_interval" or a "cron_expression" attribute set.


  • Extended all systems to accept a new property worker_threads that limits the number of concurrent pipes that can run against a particular system. The default value is 10. For input pipes the source system is used and for output pipes the sink system is used. For internal pipes (i.e. dataset to dataset pipes or receiver/publisher endpoints), the pool has 50 worker threads.
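
    A sketch of limiting concurrency against a slow external system; the system id and thread count are illustrative:

    ```json
    {
      "_id": "legacy-erp",
      "type": "system:url",
      "worker_threads": 2
    }
    ```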


  • Extended the URL system and REST system to accept default custom request headers using the headers property. Also fixed the REST system schema to reflect authentication options and the jwt_token property.


  • Extended the in DTL function to allow a single value in the second argument.



  • Added the _R variable, which can be used to refer to the root context in a DTL transform.


  • The base_url property of the URL system and REST system has been deprecated. It has been superseded by the url_pattern property.



  • Added the is-changed DTL function that can be used to compare data from the current and the previous version of the source entity.




  • Added a substring DTL function that returns a substring of another string given a start and end index.


  • Added include_replaced property to the dataset source. This property is used to filter out entities that are replaced by the merge source.


  • Added url_pattern property to URL system. This property gives you more control over how absolute URLs are produced. It can be used instead of the base_url property.


  • Added a jwt authentication scheme and jwt_token property to the URL system.


  • Added text_body_template and text_body_template_property properties to the :ref:`EMail message sink <mail_message_sink>`. Use these to explicitly construct a plain-text version of your messages when sending multi-part messages.


  • For security reasons, the Mail and SMS sinks no longer support file-based templates. Note that this is a non-backwards compatible change. You can use environment variables and upload your existing template files using the environment variable API or the corresponding Management Studio form.


  • Datasets are now scheduled for automatic compaction once every 24 hours. The default is to keep the last 2 versions up until the current time. It is possible to customize the automatic compaction. See documentation on compaction for more information.


  • The SQL source no longer includes columns with null values by default. You can include them by setting the preserve_null_values property of the SQL source to true. Note that this is a change of the previous default behaviour.
  • The CSV source no longer includes empty string values by default. You can include these by setting the CSV source property preserve_empty_strings to true. Note that this is a change in the default behaviour.


  • The dict function now takes zero, one, or an even number of arguments. If zero arguments are given, an empty dict is returned. If an even number of arguments are given, a new dict is constructed with each pair of arguments as key and value. The latter is convenient for easy construction of dicts.
  • The transform functions add and default now take an expression in their first argument. This means that the properties can be dynamic and that there can be more than one. rename now takes dynamic arguments in the first and second positions.
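
    A sketch of the even-argument form of dict, building a dict from alternating key and value expressions:

    ```json
    ["dict", "brand", "_S.make", "model", "_S.model"]
    ```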


  • Documented the pool_recycle option on SQL systems and changed its default from -1 (no recycling) to 1800 (30 minutes).


  • Added the merge source. This is a data source that is able to infer the sameness of entities across multiple datasets.



  • Added a uuid DTL function. It takes no parameters and returns a UUID object (type 4).


  • Added a disable_set_last_seen property to the Pipe properties. If set to true, it will not be possible to set or reset the last seen bookmark on the pipe using the API (i.e. protecting it from accidental changes by principals with write permission on the pipe).


  • Added a read_retry_delay property to pipe pumps. This is used in conjunction with max_read_retries when the source is known to be sporadically (non-transiently) unavailable. See the Pump section for details.


  • The documentation on cron expressions now makes it clear that they are evaluated in the UTC timezone.


  • The concat DTL function now takes a variable number of arguments. This avoids constructing unnecessary lists.


  • The url-quote DTL function now takes an optional SAFE_CHARS argument. This is especially useful when you don't want to quote the / character.


  • The section on Continuation Support has been extended. Each source now has a Continuation support table that shows the source's support for continuations.


  • Added the json and json-transit DTL functions.
  • The group-by DTL function has been changed to always return string keys. The string keys are JSON transit encoded (the same type of string as the json-transit function produces). The reason is that the entity data model (and JSON) only supports string keys. group-by has also gotten an optional STRING_FUNCTION argument which lets you specify a custom function to create the string keys.
  • The sorted, sorted-descending, min, max DTL functions have been updated to support mixed type ordering.





  • Added the range DTL function.


  • Added the Embedded source. This is a data source that lets you embed data inside the configuration of the source. This is convenient when you have a small and static dataset.


  • Added the XML transform and XML endpoint sink. These can be used to generate XML documents inline in entities or published to external consumers, respectively.


  • Changed the CSV endpoint sink to not output deleted entities by default. Added a new skip-deleted-entities config parameter that can be set to false if one wants deleted entities to appear in the CSV output.


  • Added DTL Reference Guide section that explains how joins work.


  • Reworked DTL math functions to reflect that float is an allowed type in entities. If the function parameters are of mixed types, the result will be coerced to the most precise type, i.e. float+decimal=decimal, int*float=float, int/int=decimal and so on. Note that this is a change in behaviour: entities that previously only had decimal types after using DTL math functions may now end up with float values instead if the input was of type float. Use the DTL decimal cast function to coerce the result to decimal if this is important to the application.
  • Added is-float and float DTL functions. Changed the is-decimal function so it no longer returns true if the argument is a float. You will now have to combine is-float and is-decimal in an or clause to test for both types.


  • Added Elasticsearch support, which includes a system and a sink.
  • The Solr sink now supports batching.
  • Added the commit_at_end property to the Solr sink and the Sesam databrowser sink.
  • Moved the commit_within property from the Solr system to the Solr sink and the Sesam databrowser sink. The reason is that the commit rate is really specific to how and where it is used. This change is backward compatible, as the default value is taken from the system. It is recommended to update the configuration files accordingly.
  • Moved the prefix_includes and keep_existing_solr_ids properties from the Solr system to the Sesam databrowser sink. The reason is that they are only relevant there. This change is backward compatible, as the default value is taken from the system. It is recommended to update the configuration files accordingly.


  • Fixed the documentation for the merge DTL transform; it mistakenly stated that the merge transformation would not overwrite existing attributes in the target entity.
  • Updated the GET /api/config endpoint to format the JSON in a more human-readable way.



  • Added the datetime-shift DTL function.
  • Added support for timezones to the datetime-parse DTL function.
  • Added missing sink and source prototypes in the "Edit pipe" GUI in Management Studio.
  • Fixed a bug that prevented users from adding a system in Management Studio.