Changelog¶
2025-10-10¶
Added support for Automatic Private IP Addressing (APIPA) to the redundant VPN feature in Sesam. Contact support if you need to set up a redundant VPN.
2025-09-18¶
Added Kafka system, Kafka source and Kafka sink.
2025-08-28¶
Added support for Flexi subscriptions, allowing for more flexible and cost efficient scaling of subscriptions as needs change.
2025-08-04¶
The decode_jwt Jinja filter now supports a new argument jwks_url. This can be used to obtain keys dynamically from a JSON Web Key Set (JWKS) URL, instead of specifying it under the key argument.
2025-06-26¶
Documented more of the properties that exist in the pump execution dataset.
2025-06-25¶
The validation_expression property now supports looking up global secrets. If the secret used in the expression is set both on the system and as a global secret, the system secret takes priority.
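The priority rule can be pictured with a minimal Python sketch. The function name and the two dictionaries are hypothetical and only model the lookup order described in this entry; they are not part of Sesam's API.

```python
def resolve_secret(name, system_secrets, global_secrets):
    """Return the secret value, preferring the system-level secret."""
    if name in system_secrets:
        # A secret set on the system wins over a global secret of the same name.
        return system_secrets[name]
    return global_secrets[name]


# Hypothetical example data.
system_secrets = {"api_key": "system-value"}
global_secrets = {"api_key": "global-value", "shared_token": "global-token"}
```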
2025-06-17¶
Added the bulk_schema property to the SQL sink. use_bulk_operations=true now works with a custom database schema in the SQL sink using the Microsoft SQL Server system.
2025-06-16¶
Added the sslmode property to the MySQL system.
2025-04-22¶
We are rolling out a major upgrade of the SQL system where we’ve upgraded the database drivers and the SQLAlchemy library that underpins it. This fixes several known vulnerabilities in the library and the drivers.
Note that there is a change in behaviour in the Oracle driver, which no longer pads values with up to Y trailing zeros in columns of datatype NUMBER(X,Y), e.g. it will now return 2.91 instead of 2.9100 in a column of type NUMBER(6,4). The sql source may for this reason produce updated results for existing entities.
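A small Python illustration of why such values can look changed: the text differs even though the number is the same. This assumes the values are compared as strings; the helper name is hypothetical.

```python
from decimal import Decimal


def same_number(old: str, new: str) -> bool:
    """Compare two numeric strings by value, ignoring trailing zeros."""
    return Decimal(old) == Decimal(new)


old_value = "2.9100"  # value as previously returned for NUMBER(6,4)
new_value = "2.91"    # value as returned after the driver upgrade

text_changed = old_value != new_value
number_unchanged = same_number(old_value, new_value)
```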
2025-04-11¶
Early retirement for the GDPR platform that was scheduled for end-of-life June 30th 2025.
Removed support for multitenancy and the Management Console tool due to lack of use.
Updated roadmap.
2025-04-02¶
The Notifications feature will reach end-of-life June 30th 2025. It is superseded by the Metrics feature.
The GDPR platform tool will reach end-of-life June 30th 2025.
2025-03-17¶
Added the trigger_on property to the following transforms:
2025-03-10¶
Added the option to specify retry strategies for different HTTP status codes in the URL, REST and microservice systems. This can be configured in the new retry_strategy property.
Documented batch_retries in the pump properties.
2025-03-09¶
Pump execution log entities now have a property tokens if custom authentication or OAuth2 authentication is used on any of the connected systems. The property exposes the expiry dates for the tokens used by these systems. The purpose of this is to show the resulting output of the expires_in_expression and expires_at_expression Jinja expressions, which can potentially produce the wrong date if they are misconfigured.
2025-02-28¶
Added the owner property to POST api_json_web_tokens.
Added a new method, PUT api_json_web_tokens, which allows updating JWT metadata: name, description and owner.
2025-02-27¶
Added new Jinja filters: bytes, base64_encode, base64_decode, datetime, and datetime_filter. These filters work the same way as the corresponding DTL functions and should produce the same output.
Added a new decode_jwt Jinja filter which decodes a JWT given a public key. The output is the decoded JWT in JSON format.
Added a new section which documents our available Jinja filters.
2025-02-06¶
Added support for configuring a get_refresh_token_operation when using custom authentication in the REST system. This is intended for authentication schemes that use tokens similar to OAuth2 refresh tokens. These refresh tokens are then typically used for fetching the access token. This new operation will run before the get_token_operation if configured.
The responses from the get_token_operation and the new get_refresh_token_operation are now available in the new token object, which can be accessed with Jinja expressions. This means that there is no need to configure the access_token_property anymore, since all properties inside the response(s) can be accessed with {{ token.<property-name> }}.
Marked the custom authentication feature as experimental.
2025-01-14¶
Added support for using custom authentication in the REST system. This enables flexible configuration for fetching an access token that will be used for authentication towards the system. The access token will be refreshed periodically, similar to how the existing OAuth2 machinery works. For existing systems that depend on a microservice for fetching an access token, it is highly recommended to switch over to using custom authentication instead so that a redeployment is not triggered whenever a new access token is fetched.
Some example configurations on how to use custom authentication can be found here.
2024-11-27¶
Added support for TTL (time to live) compaction for deletes. This can be enabled by setting ttl_deletes_hours in the pipe’s compaction section. When enabled, entities will be compacted away if the latest version of the entity has "_deleted": true and is older than ttl_deletes_hours. All versions of the entity will be compacted away, and the only way to recover them is to restore from a backup that contains the entities.
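The rule above can be sketched in Python as follows. This is only an illustration of the stated condition; the function name and the way the version timestamp is passed in are assumptions, not Sesam's implementation.

```python
from datetime import datetime, timedelta, timezone


def is_compacted_away(latest_version: dict, updated: datetime,
                      ttl_deletes_hours: float, now: datetime) -> bool:
    """True if the entity (all versions) would be removed by TTL compaction."""
    is_deleted = latest_version.get("_deleted") is True
    is_old_enough = now - updated > timedelta(hours=ttl_deletes_hours)
    return is_deleted and is_old_enough


# Hypothetical example: a delete that is 48 hours old, with a 24 hour TTL.
now = datetime(2024, 11, 27, 12, 0, tzinfo=timezone.utc)
deleted_at = now - timedelta(hours=48)
```

Note that a live entity is never removed by this rule, no matter how old its latest version is.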
2024-10-11¶
Added support for connectors.
Added support for multitenancy.
Added support for webhooks.
Updated roadmap.
2024-10-09¶
Integrated Search has been extended to add support for phrase search.
2024-10-07¶
Added verify_ssl as a global default in the service metadata. This determines the default value of the verify_ssl property on URL systems. The default value is false.
2024-09-19¶
Extended Integrated Search to allow using a well-defined query syntax. Improvements have been made to the search results for namespaced identifiers that have been merged. They now have the same query result page.
Added the trigger_on property to the http transform.
2024-03-13¶
Added documentation for how to change the logging level for workernodes.
2024-01-31¶
Added the new properties schema_url and system to the JSON schema transform. The schema_url can be used to avoid embedding the schema in your pipe configuration by pointing to an externally stored schema instead. If this is used, the system must be set, and it must point to a valid URL system.
2024-01-12¶
Added the possibility to specify permissions to be applied to a system in a permissions pipe property.
2023-12-22¶
Added support for conditional properties for the System, Pipe and service metadata configuration entities.
2023-12-20¶
Added support for using DTL to calculate the value of the completeness property on the dataset source at runtime.
Added the completeness DTL function.
2023-12-12¶
Added the coalesce-args DTL function. This function is different from coalesce in that it evaluates its arguments in order and stops when it finds an argument that is not null. This can in many situations be a lot more efficient.
Fixed a bug where timestamps were not parsed correctly during partial rescans.
2023-11-27¶
Extended the prevent_multiple_versions property of dataset sinks to also accept the enum "ignore" (in addition to true or the default value false). If set to "ignore" the pipe will silently ignore any updates to existing entities in the dataset (whereas a true value makes the pipe fail when encountering updates).
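The three settings can be summarised with a small Python sketch. The function and its return values are hypothetical and only model the behaviour described above.

```python
def apply_update(prevent_multiple_versions, entity_exists: bool) -> str:
    """Model what happens when an entity arrives at a dataset sink."""
    if not entity_exists:
        # New entities are always written.
        return "written"
    if prevent_multiple_versions == "ignore":
        # Updates to existing entities are silently dropped.
        return "ignored"
    if prevent_multiple_versions is True:
        # Updates to existing entities make the pipe fail.
        raise RuntimeError("update to an existing entity is not allowed")
    # Default (false): a new version of the entity is written.
    return "written"
```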
2023-11-07¶
Added a new phonenumber-parse DTL function.
Added a new phonenumber-format DTL function.
2023-10-11¶
Clarified that the system-level headers property on REST systems is used on all requests executed by the system. The keys in this property can be overridden in the individual operations but cannot be discarded.
2023-09-01¶
Active use of the sesam-py client will now prevent developer and developer-pro subscriptions from being hibernated. This feature was introduced in version 2.8.0.
2023-08-18¶
Hibernation for developer subscriptions is extended to developer pro subscriptions as well.
Any automated CI system that requires 24/7 uptime should be moved to a single node. You can still do CI testing with a developer subscription, but hibernation wake-up time must be expected.
2023-08-17¶
Execution log entries circuit-breaker-commit and circuit-breaker-rollback are now written when a circuit breaker is committed or rolled back.
Added the trace property available on the REST transform, REST source, REST sink and HTTP endpoint source to the global_defaults section of the service metadata. This property, if set, represents the default value for the trace property on these components when not set explicitly in their config. The intention is to be able to turn this feature on globally when debugging or doing development without having to change the individual components.
2023-08-11¶
Added a new global default run_at_startup_if_not_populated to the service metadata. This setting determines the default value of run_at_startup_if_not_populated for pumps.
2023-08-10¶
2023-08-07¶
Added a new next_page_termination_strategy option not-full-page and a new property page_size to the REST system. When this new strategy is enabled, paging will terminate if the number of entities in the response is less than the specified page_size. This new property can also be used in Jinja expressions.
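As a minimal Python sketch of the not-full-page check (the function name is hypothetical):

```python
def is_last_page(entities: list, page_size: int) -> bool:
    """Paging stops when a response holds fewer entities than page_size."""
    return len(entities) < page_size
```

With page_size set to 100, a response of 100 entities leads to another request, while a response of 37 entities ends the paging.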
2023-07-04¶
We will from now on spin down developer subscriptions that have had no interaction recently. “Interaction” is defined as clicking around in the Management Studio in the given subscription. After it has been interacted with it will be spun up again, taking about 15 minutes. Improvements to the UI to reflect this are being worked on.
2023-06-30¶
Added a new refresh_window option to the oauth2 section of the URL system and REST systems. When using refresh tokens, this value (in seconds) is the window to pre-emptively refresh a token that is about to expire. It’s 30 seconds by default. Set this property to 0 if the system doesn’t allow tokens to be refreshed before they expire.
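The effect of the window can be sketched in Python as below. The function and the use of plain epoch seconds are assumptions made for illustration only.

```python
def should_refresh(expires_at: float, now: float, refresh_window: float = 30.0) -> bool:
    """True once the token is within refresh_window seconds of expiring."""
    return now >= expires_at - refresh_window
```

With the default of 30 seconds, a token expiring at t=1000 is refreshed from t=970 onwards. With refresh_window set to 0 it is only refreshed once it has expired.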
2023-06-26¶
Added a new next_page_termination_strategy option same-response to the REST system that is enabled by default. When enabled, paging will terminate if the response is equal to the previous response.
2023-05-15¶
Corrected the documentation of sources that have the supports_signalling property to reflect that the threshold for turning off implicit signalling is an hour, not two minutes. Note that you should explicitly turn signalling support on or off using the supports_signalling property if you need to have control over this on your pipe.
2023-05-08¶
Added support for Tripletex authentication to the URL system and REST systems.
Added a group DTL function.
2023-05-02¶
A dataset source with subset now respects the include_previous_versions property (which is false by default). Before this change historical versions were included. The dataset entities API will also now respect the history request parameter for subsets.
2023-04-27¶
Updated the documentation of the path DTL function with a description of how non-string items in the PROPERTY_PATH list are treated (they are ignored).
2023-04-25¶
Added a new require_populated_input setting as a global default in the service metadata and as a property on the dataset, merge, merge_datasets and union_datasets sources. It can be used to prevent a pipe from running unless the pipe’s source datasets have been populated.
2023-03-29¶
Added page and is_first_page bound parameters to the Jinja expressions for the REST transform and REST source. These are useful for including or excluding properties when doing paged operations.
Added a "manual" enum to the since_property_location of the REST source. If set, the source will not attempt to add any continuation-related parameter automatically.
2023-03-24¶
Updated our Terms of Service.
2023-03-17¶
We decided to revert our recent change of the default value of allowed_status_codes in the REST transform from 200-299 to 200. The change caused some problems with non-idempotent sinks. The default value is now 200-299.
2023-03-14¶
allowed_status_codes and ignored_status_codes can now be specified on REST operations, but they can only be used with the REST transform.
2023-03-07¶
Added the possibility to specify permissions to be applied to the pipe in a permissions pipe property.
2023-02-28¶
Added the validation_expression property to the HTTP endpoint source. This allows custom request validation for receiver endpoints. This is particularly useful when clients cannot use JWT tokens for authentication.
2023-02-24¶
Added a new error_expression property to the operation object properties in the REST system (and any local variants). It is available to the REST source and REST transform and is intended to be used to test for error conditions in responses from systems that don’t use HTTP error codes properly. If it renders to a non-empty string the source or transform will fail. The content of the rendered error is included in the exception raised to the pipe.
2023-02-23¶
Added a new initial_completeness property to the dataset source.
2023-01-31¶
Restricted access to pipe runner API for subscriptions not having developer_mode enabled. The motivation is to avoid running tests in production systems as that is disruptive/destructive.
2023-01-30¶
Extended the completeness feature to propagate the completeness value of all upstream datasets. You can now also specify the specific upstream datasets that you want a dataset source to have completeness for.
2023-01-26¶
Changed the default value of side_effects from false to true for the REST transform and HTTP transforms. Note that this is a change of behavior and will prevent previews from including these types of transforms by default. The motivation for this change is to prevent unintentional changes in the external systems accessed by the transforms when previewing a pipe. You can manually change side_effects to false if you’re sure your transforms are free from such side-effects or if you don’t mind changes happening when previewing a pipe.
2023-01-25¶
Added the since bound parameter to the payload, headers and params operation object properties in the REST system (and any local variants) for the REST source.
Documented some additional bound parameters available for paged responses in the templated properties for the REST system (and any local variants) and REST source and REST transform.
2023-01-24¶
Added support for the missing "HEAD" and "OPTIONS" HTTP methods for operation objects in the REST system (and any local variants). Note that "HEAD" requests will always result in an empty response body, so they will not work with replace_entity set to true in the REST transform and they require a response_property to be set for the REST source.
2023-01-23¶
Added a special Jinja template marker string "sesam:markjson" that can be used to generate JSON values (objects, lists and single values) from strings in the payload, params and headers operation objects in the REST system (and any local variants). This feature is considered experimental and may change or be removed.
2023-01-20¶
Added a special Jinja template marker string "sesam:markskip" that can be used to conditionally drop properties from the payload, params and headers operation objects in the REST system (and any local variants). This feature is considered experimental and may change or be removed.
2023-01-19¶
Added a new trace property on the REST transform, REST source and REST sink. It can be used to log the HTTP requests and responses these components send and receive, which can be useful during development or debugging.
Renamed the trace.log_authorization_header_redacted_bytes property of the HTTP endpoint source to trace.log_secret_redacted_bytes.
Added docs on how to enable trace in the Preview panel in Management Studio.
2023-01-18¶
Added “entity” and “source_entity” as bound parameters in various Jinja templateable properties in the REST system, REST transform, REST source and REST sink.
2023-01-17¶
Added a new next_page_termination_strategy option same-next-page-request to operations in the REST system (and any local variants). If included in the next_page_termination_strategy values, it will terminate the paging if it detects that the request to issue is identical to the previous request (i.e. the headers, url, parameters and payload are all the same values). Added this new strategy to the default next_page_termination_strategy, which is now a list of next-page-link-empty and same-next-page-request.
Added an “experimental” note to next_page_termination_strategy to indicate that this property is still under development and subject to change/removal.
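The same-next-page-request comparison in this entry can be sketched in Python as follows. Representing a request as a dictionary with these four keys is an assumption made for illustration.

```python
def same_request(previous: dict, upcoming: dict) -> bool:
    """True if url, headers, params and payload are all unchanged."""
    keys = ("url", "headers", "params", "payload")
    return all(previous.get(key) == upcoming.get(key) for key in keys)


# Hypothetical example requests.
page_1 = {"url": "https://example.org/items", "headers": {}, "params": {"page": 1}, "payload": None}
page_2 = {"url": "https://example.org/items", "headers": {}, "params": {"page": 2}, "payload": None}
```

Paging would continue from page_1 to page_2, but stop if the next request to issue were identical to page_2.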
2023-01-11¶
It’s now possible to specify an operations property directly on the REST transform, REST source and REST sink. If present both in the pipe and the system, the pipe version will take precedence. Note that only the system version allows secrets. This is primarily intended as a convenience feature during development; in a production environment if multiple pipes use the same operations configuration, you should consider storing it on the REST system so it can be reused and maintained in one place.
2023-01-10¶
Added support for http basic authentication to the Elasticsearch system.
Added new options to the trace property of the HTTP endpoint source: log_authorization_header_redacted_bytes, log_response_body_maxsize and log_response_headers.
2023-01-09¶
Changed the default allowed_status_codes in the REST transform from 200-299 to 200.
REST transform, REST source and REST sink: reverted the payload merge behavior from 2022-12-08. It will now work the way it did previously, i.e. as a default fallback mechanism. If payload is defined in multiple places, the order of precedence is 1) entity, 2) sink/source/transform and 3) operation. If you need to add a secret to the payload you should add it only to the operation section on the REST system and then use the properties property on the pipe side to dynamically add properties from the entities to the payload via Jinja templating.
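The fallback order for payload can be sketched in Python like this. The function and argument names are hypothetical; the point is that the first defined payload wins and nothing is merged.

```python
def resolve_payload(entity_payload=None, pipe_payload=None, operation_payload=None):
    """Pick the payload by precedence: entity, then sink/source/transform, then operation."""
    for candidate in (entity_payload, pipe_payload, operation_payload):
        if candidate is not None:
            return candidate
    return None
```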
2023-01-06¶
Documented the response_headers_property configuration property for the REST source.
Documented the index_mapping_properties, index_check_document and first_run_delete_query configuration properties for the Elasticsearch sink.
2023-01-04¶
Added a new rescan_when_config_changes setting as a pipe property and as a global default in the service metadata.
2023-01-03¶
All Jinja templates now use a stricter “undefined variables” check. This means that any reference to a non-existing variable in the template will now throw an exception instead of in some cases rendering an empty string. Note that this is a change in behavior.
For security reasons, all Jinja templates are by default executed in a restricted sandbox environment. Note that this means some functions and objects may no longer be available.
2022-12-30¶
Added a new property mark_deletion_tracked to the dataset sinks. If set to true (the default is false), a "$deletion_tracked": true property will be added to entities deleted by deletion tracking during full runs or rescans.
2022-12-28¶
The scope sub-property of the oauth2 config element of the URL system and REST system now accepts single strings as well as arrays of strings.
Added a new experimental trigger_on property to the REST transform. This property can be used to selectively pass through entities based on a property of the entity, for instance allowing a chain of REST transforms to use different transforms for different operations.
REST system: added new payload_type enum "text" and changed the default to "json" if the payload_type is not set. Note that this is a change of behavior. Setting the payload_type to "text" sets the content-type of the request to "text/plain" if the payload is not of type bytes (and isn’t set explicitly in the headers property of the operation). If the type of the payload is bytes the content-type will be set to "application/octet-stream". All other types will be serialized to a JSON encoded string.
The headers and params properties of the operations section of the REST system can now be templated using Jinja expressions.
The payload property of the operations section of the REST system and in the REST source, REST transform and REST sink configurations can now be templated using Jinja expressions.
Added previous_body and previous_headers named parameters to relevant “templateable” properties of the REST system and in the REST source and REST transform. Note that these are only set for systems that support paging, for all pages except the first one. Use Jinja’s “is defined” tests in templates that use these to set default values for the first page.
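The "text" payload_type rules in this entry can be sketched in Python as below. This is one reading of the description, not Sesam's implementation: the function name is hypothetical, and it assumes that payloads that are neither strings nor bytes are JSON-encoded and still sent as "text/plain".

```python
import json


def build_text_request(payload, headers=None):
    """Return (body, headers) for an operation with payload_type "text"."""
    headers = dict(headers or {})
    explicit = any(name.lower() == "content-type" for name in headers)
    if isinstance(payload, bytes):
        body, content_type = payload, "application/octet-stream"
    elif isinstance(payload, str):
        body, content_type = payload, "text/plain"
    else:
        # Assumption: other types are serialized to a JSON encoded string.
        body, content_type = json.dumps(payload), "text/plain"
    if not explicit:
        # An explicitly configured content-type header is left untouched.
        headers["content-type"] = content_type
    return body, headers
```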
2022-12-22¶
Added a new trace property to the HTTP endpoint source. It can be used to log incoming requests to the pipe’s execution log, which can be useful during development or debugging.
Documented the do_float_as_int and do_float_as_decimal properties in the HTTP endpoint source. (These properties have existed for a very long time, they have just not been documented until now.)
2022-12-16¶
Added a next_page_termination_strategy property to operations in the REST system. This can be used to define how the REST source and REST transform decide when to terminate when using pagination. The default value is next-page-link-empty, which means that the paging is considered done if the next_page_link template evaluates to null (or an empty string). The other strategies are empty-result and same-next-page-link, which terminate pagination if an empty result is returned or if the next page link is the same as the current page link, respectively. The strategies can be combined as an array.
Added url and request_params bound variables to the next_page_link template. The motivation for this is to support more services that need to construct their pagination links with parts of the current query parameters.
Fixed a bug in the REST transform that would cause it to attempt to merge the properties property in the entity with the static version defined in the operation or transform configuration. The correct behavior is to use the entity version if it exists and then fall back to the transform and operation, in that order, if it does not.
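The three next_page_termination_strategy values described in this entry can be sketched in Python as follows. The function is hypothetical, and it assumes that when several strategies are combined, paging stops as soon as any one of them matches.

```python
def should_stop_paging(strategies, next_page_link, current_page_link, entities):
    """Decide whether paging is done, given one strategy name or a list of them."""
    checks = {
        "next-page-link-empty": next_page_link is None or next_page_link == "",
        "empty-result": len(entities) == 0,
        "same-next-page-link": next_page_link == current_page_link,
    }
    if isinstance(strategies, str):
        strategies = [strategies]
    # Assumption: any matching strategy terminates the paging.
    return any(checks[name] for name in strategies)
```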
2022-12-13¶
Added a new if_transform_empty property to the REST transform. It can be used to make the transform fail if it returns an unexpected empty response. The default is to allow empty responses, which could lead to deletion tracking downstream. This property is analogous to the if_source_empty property for sources.
2022-12-08¶
The payload property of an operation in the REST system will now be merged with the payload from the pipe if both are dicts. The motivation for this change is to allow payload properties that contain static secrets to be defined in the system.
Added a new allowed_status_codes property to the REST transform. It can be used to pass through non-ok responses for further processing.
Added a new response_status_property to both the REST transform and REST system operation elements that, if specified, holds which property to use for the status code of the response.
Documented the response_headers_property configuration property for the REST transform and REST system operation element.
2022-12-02¶
Added a new debug option to the pump configuration section: max_seconds_per_entity. It can be used to pinpoint entities that are particularly slow to transform. It will make the pipe fail if the batch uses on average more than the configured number of seconds per entity. It should be used in conjunction with batch_size set to 1 on the pipe to be exact; the execution log will include the first entity in the batch that triggers this limit.
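A minimal Python sketch of the average-based check, showing why batch_size set to 1 gives exact results. The function name and the timing inputs are assumptions for illustration.

```python
def exceeds_limit(batch_seconds: float, batch_size: int, max_seconds_per_entity: float) -> bool:
    """True if the batch used more than the limit on average per entity."""
    return batch_seconds / batch_size > max_seconds_per_entity
```

A batch of 100 entities that takes 50 seconds averages 0.5 seconds per entity and passes a limit of 1, even if a single entity took 40 of those seconds. With batch_size set to 1, that slow entity forms its own batch and trips the limit.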
2022-12-01¶
Added support for OAuth 2 refresh token grants to the URL system and REST system.
2022-11-15¶
Made the since variable available to the url property in the REST system operation configuration. Note it’s only applicable to REST sources with continuation support.
Updated the documentation of the REST component Jinja templates with what variables are available to them.
2022-11-11¶
A new payload type multipart-form applicable to the REST sink and REST transform has been added.
Fixed the example for using the form or multipart-form payload types. It should use a single dictionary of key value pairs, not a list.
2022-11-09¶
The Diff datasets source has been deprecated.
The REST source is no longer considered experimental.
2022-10-11¶
Added a configuration warning to pipes with chained DTL transforms where transforms other than the first one use hops with dependency tracking enabled.
Added configuration warning to pipes that have hops with dependency tracking enabled, but do not use the “dataset” source.
2022-10-03¶
Pipe runs triggered by pumps using cron expressions or scheduled intervals larger than one hour (3600 seconds) are persisted, so if the service is down when they should have run they will be run as soon as the service starts up again.
2022-09-06¶
Deletion tracking done by background rescan is now done in batches and is interleaved with incremental synchronization. This means that deletion tracking will no longer stop-the-world.
2022-09-01¶
We’ve updated our Subscription Fee and payment terms. Note that prices are now listed in U.S. dollars. For existing customers, the changes will take effect from December 1st 2022.
2022-08-17¶
Added the if_source_empty property to sources and the global default global_defaults.if_source_empty to the service metadata. This property determines the behaviour of pipes when their source returns no entities. Previously synced entities will normally be deleted from the pipe dataset when it finishes running, even if no entities are received. Setting this new property to fail will prevent this by making the pipe fail before it can perform a new sync.
2022-08-09¶
Added the escape_null_bytes property to the CSV source. If set to true, any null characters in the input CSV file will be escaped before parsing the data. This prevents the source pipe from failing due to attempted reads of lines containing null characters. The property is set to false by default for performance reasons.
2022-08-08¶
Added the verify_ssl property to the LDAP system. If use_ssl is set to true then this property controls if the certificate used for the connection should be verified. It is true by default.
2022-08-05¶
Added the custom_ca_pem_chain property to the LDAP system. This property can hold a custom chain of certificates (in PEM format) that will be used to validate the SSL connection if use_ssl is set to true.
2022-07-27¶
Added a new property global_defaults.always_index_ids to the service metadata. Enabling this will make all dataset sinks maintain an index on the $ids property, without the need for specifying the indexes property on each individual sink.
2022-07-01¶
Added a “discard-inferred-schema” pump operation to the service API. This operation will discard any inferred schema entries for the pipe and write a special “pump-discard-inferred-schema” entity to the pipe execution log for reference. This operation can only be done on non-running pipes.
Behavioural change: all pipes that have infer_pipe_entity_types set to true, and have a source with continuation support, will now discard their inferred schemas upon being reset.
2022-06-30¶
Added a new property include_completeness to pipes. This property specifies a list of dataset ids that should contribute to the completeness timestamp value of the sink dataset. By default, this property is equal to the pipe’s input datasets, minus any datasets listed in exclude_completeness.
Pipes that fail to infer their schemas due to limitations on the resulting schema size will no longer fail. The inferred schema will instead be truncated and marked as such and the pipe will not attempt to do schema inference the next time it runs.
2022-06-08¶
The VPN feature now supports high availability for connections. This means that you can set up redundant connections that can be failed over to. This is a multi subscription only feature.
2022-05-20¶
2022-05-12¶
A pipe with automatic reprocessing enabled will now automatically reset if the dependency tracking threshold is reached.
2022-05-03¶
Transforms now have a side_effects property that specifies if the transform has side-effects or not. A side-effect means that it causes changes to the system that it talks to. If the transform alters the system in any way, then this property must be set to true to prevent inadvertent changes to the system by features like pipe preview.
Corrected a bug that, for multi subscriptions, would cause the default maximum number of concurrent pipes to be 20 instead of 10 for a SQL system and essentially unlimited for non-SQL systems. Note that the default number of concurrent pipes for all systems is controlled by the worker_threads property available on all systems and is 10 by default.
2022-04-25¶
Documented the resource quotas for microservices.
The default value of max_merged in the merge source is now set as a global default in the service metadata, and the default value has been increased to 50000 entities. This is a very high number of entities for the merge source to handle at once, and merge sources will start using up large amounts of RAM before hitting this default limit. It is recommended to reduce this limit to prevent such high memory usage and then reconfigure any pipes that attempt to merge too many entities.
2022-04-19¶
Added a new property max_merged with a default value of 100 entities to the merge source. Pipes that attempt to merge more entities than max_merged will fail with this change. The motivation for adding this new property is that merge sources generally should not be merging that many entities in the first place, and the merge process can end up using excessive amounts of RAM.
2022-04-07¶
Schema inferencing has been extended to collect namespaces used in NI values.
2022-03-31¶
Added support for Metrics.
The new data option Metrics and monitoring in test and production pricing replaces the per-pipe monitoring option. Pipe monitoring will still be available for existing subscriptions that are already using it.
2022-03-25¶
New developer subscription size Developer Pro is now available.
Added support for Durable Data.
2022-03-24¶
Subscriptions created in the portal are now provisioned with the Clustered architecture.
2022-03-21¶
The Databrowser tool will reach end-of-life December 31st 2023. It is superseded by the Integrated Search feature. We will notify the current subscribers soon.
Added a property ignore_non_existent_datasets to the merge, merge_datasets and union_datasets sources. By default, listing one or more datasets in initial_datasets that do not exist does not prevent the source from being populated. Setting ignore_non_existent_datasets to false will make the pipe fail if any non-existent datasets are listed in datasets.
Fixed a bug where the initial_datasets property was initialized as an empty list in the merge, merge_datasets and union_datasets sources if initial_datasets was not explicitly set. The property now defaults correctly to the same list of datasets listed in datasets. This is a breaking change.
The dataset and diff_datasets sources now warn the user if any input datasets do not exist. This also applies to the merge, merge_datasets and union_datasets sources if ignore_non_existent_datasets is false.
2022-03-10¶
Restructured this documentation site. What’s Sesam is targeted at architects and decision makers. User guide is targeted at users of Sesam, with new subsections for Data synchronization, Data modelling, Data platforms and Operations.
2022-03-03¶
Pipes with manual or off pump mode can now be disabled and enabled.
2022-02-11¶
As part of the Clustered architecture everywhere initiative we are now in the process of migrating in-cloud subscriptions over to it. You can find the provisioning status of a subscription in Subscription > Basics in the Management Studio. There you can see which provisioner version it is running (version 1 is the old single-machine service, version 2 is the new clustered service; if self-hosted it will say self-hosted).
Changes to the user experience:
Pipes are now being provisioned asynchronously; this is reflected in the UI.
Config upload using sesam-py can take a little longer.
2022-01-25¶
The lower keys, upper keys and undirected graph transforms have been deprecated. DTL transforms can replace the functionality of lower keys and upper keys transforms.
2022-01-24¶
Added a new property remove_pk_char_trailing_spaces to the SQL sink. This property is enabled by default and fixes an issue with updating table rows when the primary key is of type nchar or char.
2022-01-20¶
Added custom header functionality to HTTP transforms.
2022-01-12¶
Added domain name validation to the docker.hosts property on microservice systems. This ensures that domain names are in a format that is accepted by Kubernetes.
2022-01-03¶
Added a new resolved_entity property to write-error entities in the execution log. It contains the entity that was used to resolve the write-error if it is different from the original entity that caused the write-error. This property is also set for any tracked dead letters that have been resolved (on the deleted dead letter).
Fixed a bug where the resolved property was not set (to true) if a write-error entity was successfully retried.
2021-12-20¶
Renamed the prefilters property in the hops DTL function to subsets. prefilters had some known issues and is now deprecated. Note that you may have to reset the pipe if you change from prefilters to subsets. All new pipes should use subsets to get the documented behaviour.
2021-12-17¶
Added a custom_ca_pem_chain property to the URL system and REST system. This property can hold a custom chain of certificates (in PEM format) that will be used to validate the SSL connection if verify_ssl is set to true.
2021-12-11¶
Our security team has investigated the impact of CVE-2021-44228. The following components have been analysed as they could potentially be affected:
Integrated search. This component uses Elasticsearch under the hood. The version of Elasticsearch that we use is not affected according to this Elastic Security announcement.
Legacy Databrowser. This component uses Apache Solr under the hood. The version of Solr that we use is not affected according to this Solr Security announcement.
GDPR Portal. This component uses Apache Solr under the hood. The version of Solr that we use is not affected according to this Solr Security announcement.
Unofficial OCI images that are hosted as microservices. These components can be affected, and our users need to make sure they only run code that they trust.
2021-11-29¶
Changed the default value of the global_defaults.use_signalling_internally property of the service metadata section to true. This property was previously false by default.
2021-11-26¶
Integrated search is now available for subscriptions running on the Clustered Architecture.
VPN is now configurable for subscriptions running on the Clustered Architecture.
2021-11-19¶
The IP address of our log shipping receiver endpoint has changed from 13.74.166.9 to 52.142.116.113. If you run a self-hosted service and have blocked outgoing traffic then you need to update the firewall accordingly. See the Self-hosted service document.
Changed the name of “The Microsoft Azure SQL Data Warehouse system” to “Microsoft SQL Server system” and “The MSSQL system” to “Legacy Microsoft SQL system”.
The “Legacy Microsoft SQL system” has been superseded by the “Microsoft SQL Server system” and will likely be deprecated in the future.
The “Microsoft SQL Server system” has a new type "system:sqlserver" which replaces the old "system:mssql-azure-dw", which is kept as an alias for now.
Additional note: the recommended “Microsoft SQL Server system” uses official Microsoft (ODBC) drivers while the “Legacy Microsoft SQL system” uses open source drivers. The Microsoft ODBC drivers should support all current Microsoft SQL Server compatible products, including Azure Synapse Analytics (previously known as Azure SQL DataWarehouse). Note that switching from the “Legacy Microsoft SQL system” ("system:mssql") to the preferred “Microsoft SQL Server system” ("system:sqlserver" aka "system:mssql-azure-dw") can lead to minor data differences in properties due to the different driver backends.
2021-11-11¶
Added an encode_error_strategy property to the CSV endpoint. It tells the sink how to deal with encoding errors when the encoding is different from “utf-8”. The default is to use a “backslashed unicode” replacement, but other strategies can be chosen.
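As a rough illustration of what a “backslashed unicode” replacement looks like, the sketch below uses Python's built-in backslashreplace error handler. This is an analogy only: the strategy names accepted by encode_error_strategy are not listed in this entry, and the function name and the latin-1 target encoding are assumptions made for the example.

```python
def encode_with_backslash_fallback(text: str, encoding: str = "latin-1") -> bytes:
    """Encode text; characters the encoding cannot represent become escapes.

    Hypothetical helper: it mirrors the idea of a "backslashed unicode"
    replacement, not Sesam's actual implementation.
    """
    return text.encode(encoding, errors="backslashreplace")
```

Characters that the target encoding can represent are written unchanged; only the unencodable ones are replaced by an escape such as \u20ac.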
2021-11-09¶
Added a “discard-retries” pump operation to the service API. It is available in the UI as a “Discard retry queue” menu item on pipes. This operation will make the next pipe run ignore any previous write error retries by writing a special “pump-discard-retries” entity to the pipe’s execution log. This operation can only be done on non-running pipes.
2021-10-25¶
Added a byte_order_mark property to the CSV endpoint and XML endpoint sinks. If true these sinks will emit a UTF-8 byte order mark (BOM) to the start of the file/stream. It’s false by default and should only be used in conjunction with a UTF-8 encoding.
2021-10-11¶
The http_endpoint source will now get its completeness value from the “X-Dataset-Completeness” http request header, if it is present. If the header is not present, the current time will be used instead, just as before.
2021-09-29¶
Added a new Quick Reference document for faster and easier navigation to configuration types and DTL transforms and functions.
2021-09-28¶
Added the (experimental) ni-collapse and ni-expand DTL functions. Note that these are only meant to work with the global_defaults.symmetric_namespace_collapse service metadata option set to true (false by default while this functionality is in experimental state).
2021-09-27¶
The “Datasets” page has been removed.
A dataset is managed by a pipe and considered a part of the pipe. All the details about a dataset have therefore been moved to the pipe page of the pipe that writes to the dataset (under Output). Internal datasets can be found under “Datahub” > “Internal datasets”.
2021-09-01¶
Added an explanation about why you should not hop to the sink dataset.
2021-08-16¶
Clarified when the is_first and is_last flags can be expected to be set in the Sesam JSON Push Protocol. These flags are only set when running a full sync (i.e. not when in incremental mode). They are intended to signal to the client the start and end of a full sync run across multiple requests.
Fixed a bug in the JSON (push) sink that set the is_first flag also on incremental syncs.
2021-08-04¶
Added a header property to the JSON source. This property can be used to specify additional header values to be set when doing HTTP GET requests. This was added to make the JSON source symmetrical with the JSON (push) sink. Note that both the JSON source and sink adhere to the Sesam specific JSON Pull Protocol. Consider using the more general REST source or sink if you’re interacting with a non-Sesam JSON capable REST API.
2021-06-14¶
Added a json_content_types property to the REST system. This property can be used to specify additional JSON content types to accept besides the default “application/json”. The content must still be valid JSON. Note that the REST source will no longer attempt to parse all responses as JSON but check the content-type against the list of recognised content-types first. If the response content-type is not in this list, it will be treated as “unknown” and an empty entity containing a property with the response body (and optionally the content type) will be emitted for further processing with DTL. Support for response_include_content_type and response_property has been added to the REST source for this scenario.
2021-06-09¶
Added an initial_since_value property to the source configuration. This property holds the “since” value to use by the source when the pipe offset is unset (or has been reset).
The since_default property of the SPARQL source has been deprecated, please use initial_since_value instead.
2021-05-31¶
We’ve updated our Subscription Fee, payment terms. For existing customers, the changes will take effect from September 1st 2021.
2021-05-20¶
Added a Sesam Community section.
2021-05-06¶
If pipes with sources with the chronological strategy fail, they now save their pipe offset based on last successful batch in the pipe run. This improvement makes it more likely that a failing pipe is able to make progress.
2021-05-05¶
Added rate_limiting_retries and rate_limiting_delay properties to the REST source, REST transform, REST sink and REST system. These can be used to retry failed requests that return an HTTP 429 error code.
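The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the REST system's implementation: the function name, the send callable, the default values, and the assumption that the delay is a fixed number of seconds between attempts are all hypothetical.

```python
import time
from typing import Callable, Tuple


def request_with_rate_limit_retries(
    send: Callable[[], Tuple[int, str]],
    rate_limiting_retries: int = 3,      # assumed default, for illustration only
    rate_limiting_delay: float = 1.0,    # assumed to be seconds between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry while it answers HTTP 429, up to the retry limit."""
    status, body = send()
    attempts = 0
    while status == 429 and attempts < rate_limiting_retries:
        sleep(rate_limiting_delay)  # wait before trying again
        status, body = send()
        attempts += 1
    return status, body
```

Passing the sleep function as a parameter keeps the sketch easy to test without real waiting.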
2021-05-03¶
The payload_property of the REST source and REST transform now supports traversing a path in the response body using a “dotted” notation.
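A “dotted” path lookup of this kind can be pictured with the small sketch below. The helper name and the choice to return None when a step is missing are assumptions; the entry does not say how the REST source handles a path that does not resolve.

```python
from typing import Any


def get_by_dotted_path(body: Any, path: str) -> Any:
    """Follow a dotted path such as 'result.items' through nested dicts."""
    current = body
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # assumption: an unresolved step yields no payload
        current = current[key]
    return current
```

For a response body like {"result": {"items": [...]}} the path result.items would select the inner list.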
2021-04-29¶
Added a configuration hint for controlling the deployment of microservices. The new eager_load_microservices option will allow Sesam to hold off starting up microservices which are not connected to any pipes. This option is true by default, in line with previous behaviour. The option can be overridden per system using the eager_load flag in the Microservice system configuration. Individual microservices which need to be run eagerly should have the option eager_load set to true explicitly in anticipation of the default changing.
2021-04-15¶
Added ‘dialect’ keyword to Microsoft Azure SQL Data Warehouse server system to indicate whether it’s a normal SQL server or a Synapse server. Note that it uses the ‘HEAP’ table type when used to create new tables.
2021-03-25¶
The driver for the LDAP system has been changed to version 2.4 of LDAP3. The new driver gives the same results as the old driver in our tests, but it is still possible that there may be some subtle changes in how the new driver interacts with the LDAP server. The newer version implements some security fixes.
2021-03-22¶
The mail message sink will now automatically add a Date header to the email message.
Added support for specifying a list of HTTP response status codes to ignore in the REST transform.
2021-03-19¶
Added support for paginated responses to the REST transform as well.
The REST transform response-property, replace-entity and response-include-content-type properties have been deprecated. Use response_property, replace_entity and response_include_content_type instead.
2021-03-15¶
Added experimental REST source. This source is intended to be able to replace some of the connectors that currently require Microservices.
2021-03-12¶
Notification status changes on Status page is now fully automated.
2021-03-05¶
Added default operation, properties and payload values to the REST sink and REST transform.
2021-02-19¶
The driver for the MySQL database type has been changed to the latest stable version of PyMySQL (the old driver was from 2015, and we wanted to use a more recent driver). The new driver gives the same results as the old driver in our tests, but it is still possible that there may be some subtle changes in how the new driver interacts with the MySQL database (for instance in how data is converted between Sesam’s internal format and the fields in a database table).
2021-02-18¶
A new property equality_sets has been added to the merge source. This property can be used instead of (or in combination with) the equality property, and should make it a bit easier to configure the equality-rules correctly.
2021-02-15¶
Open Sesam will shut down March 31st, 2021. It unfortunately did not gain as much traction among our users as we had hoped and we are focusing more on the core product. We will notify the users by email soon.
2021-02-11¶
The default batch_size value of pipes that use the REST sink has been changed to 1 (used to be 100).
2021-02-05¶
We are optimizing the maximum number of concurrent running pipes in small subscriptions. The rationale is to get better overall performance. Note that this also affects self-hosted subscriptions.
Documented the compaction settings in the global defaults section of the service metadata. Note that you should be careful when changing these values, as this can lead to loss of data and/or influence dependency tracking functionality.
2021-02-01¶
We automatically upgrade a Small subscription type to a Medium subscription type if the data storage usage exceeds 40 Gb. We also upgrade a Medium subscription type to Large subscription type if the data storage usage exceeds 350 Gb. Note that this also affects self-hosted subscriptions.
2021-01-11¶
Added experimental support for running a pipe rescan in the background while simultaneously doing normal incremental pipe-runs.
2020-12-01¶
Changed the receive endpoint for log shipping. See Self-hosted service.
2020-11-20¶
New circuit breaker feature for uploading configuration available in service metadata. Prevents the node from updating its configuration if the new configuration would result in the deletion of more than 10 and more than 10% of existing components (for example when using the /config API). The circuit breaker can be activated by setting the service metadata property global_defaults.use_config_circuit_breaker to true.
2020-11-18¶
The blacklist and whitelist properties of the SQL sink have been deprecated. You can use DTL to filter properties to achieve the same functionality.
Note that these deprecated properties cannot be used to avoid inserting values into or overwriting values of existing table columns (partial table updates) or to support identity columns.
For the special case of identity columns (columns with automatically assigned values) some RDBMS systems such as MS SQL Server allow you to define a “writable view” that can be used as a workaround for this. We have added some information to the documentation on this use case for MS SQL Server.
2020-11-13¶
In the pump configuration section the use_dead_letter_dataset property has been deprecated and the dead_letter_dataset property has been un-deprecated. Please update your configuration. The dead_letter_dataset should contain a per-pipe unique user dataset id. The motivation for this reversal is that we wish to migrate away from using system datasets for any “dead letters” in a pipe.
2020-10-23¶
Documented the REST transform.
2020-10-09¶
Fixed a bug in datetime-shift and other functions that do implicit or explicit timezone-conversion where we didn’t have the correct historic daylight saving information. This affects the following ranges: 1895-1901, 1916, 1940-1945, 1959-1965 and any year after 2038.
2020-08-24¶
Changed default compaction type to sink. To go back to the previous default, you can set sink compaction to false on individual pipes or set the global default property default_compaction_type to background in the service metadata.
2020-08-21¶
Added an optional description property to pipes and systems. It can be either a string or a list of strings.
Added an optional comment property to pipes, systems, sources, sinks, pumps and transforms. It can be either a string or a list of strings.
2020-08-17¶
The dataset sink property set_initial_offset now accepts the onload enum value. This enum value sets the sink dataset’s initial offset when the pipe is loaded / configured.
2020-08-13¶
The encrypt-pki, encrypt-pgp and their corresponding decrypt DTL functions now support using ‘$SECRET()’ syntax in their key and password parameters
2020-08-04¶
Documented the instance property of the MS SQL system. Please note the potential consequences for firewall rules when using this property.
2020-06-19¶
Experimental pipe entity type inferencing is now enabled by default. Change the default value by setting the service metadata property global_defaults.infer_pipe_entity_types to false.
2020-05-28¶
Added the Restore completed and Pump offset set notification rule types.
2020-03-27¶
Added the dependency_tracking property to service metadata. It can be used to specify various dependency tracking related properties.
2020-03-23¶
Added the max_entity_bytes_size property to the dataset sink.
Added the global_defaults.max_entity_bytes_size property to service metadata.
2020-03-18¶
Added the global_defaults.default_compaction_type property to service metadata.
2020-03-05¶
The union_datasets source now has a prefix_ids property that can be set to false to not add the dataset id as the prefix on entity ids.
2020-03-03¶
The transform function rename will now rename properties with a null value. The old behaviour ignored such properties, but that was considered to be a bug.
2020-02-12¶
Added support for the create_table_if_missing SQL sink property for the Oracle, Oracle TNS and MySQL systems. Previously only the MS SQL and PostgreSQL systems supported this option.
2020-01-08¶
The default value of the read_timeout property has been changed from 7200 seconds to 1800 seconds for the URL system and the Microservice system.
2019-12-19¶
The replace DTL function now takes a dict argument that lets one specify more than one string replacement.
2019-12-18¶
Updated the documentation for the supports_signalling property on dataset sources and the global_defaults.use_signalling_internally property of the service metadata section.
The JSON push sink and REST sink no longer include header values or entity data in the traceback details of the execution log on failures.
The execution log and dead letter entities no longer include copies of the source or sink configuration properties of the pipe.
The properties of the event entities in the execution log are now truncated at 10 MB to avoid excessive event entity sizes. Note that this cut-off value might be decreased further in the future.
If the pump fails due to exceeding retry limits, the entity in question is no longer included in the traceback properties. Instead it’s put in a separate exception_entity property. Note that this property is not included in the monitoring data, so you cannot devise notification rules that refer to it.
2019-12-17¶
Added support for Config groups.
2019-11-25¶
The RDF source will no longer add the <rdflibtoplevelelement> root wrapper element to literals with datatype http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral. This is a breaking change.
2019-10-28¶
Added the hex DTL function.
Updated the integer DTL function to parse hexadecimal values.
The dataset sink now has a property called prevent_multiple_versions that makes the pipe fail if an entity already exists in the sink dataset. This is useful if one wants to prevent multiple versions of the same entity from being written.
The dataset sink now has a property called suppress_filtered. The default value is false unless it is a full sync and the source is of type dataset and include_previous_versions is false. The purpose of this property is to make it possible to opt in to or opt out of a specific optimization in the pipe. The optimization is to suppress entities that are filtered out in a transform early so that they are not passed to the sink. This optimization should only be used when the pipe produces exactly one version per _id in the output. The optimization is useful when the pipe filters out a lot of entities.
2019-10-07¶
Sink compaction, merge source, LDAP source, Email message sink, SMTP system, SMS message sink, Twilio system, REST system, and REST sink are no longer experimental.
The reference DTL function has been deprecated.
The Kafka system, Kafka source and Kafka sink have been deprecated.
2019-09-04¶
Index version 2 is now the default version for dataset indexes. This index implementation (version 2) supports bidirectional traversal and that can be used to expose incremental feeds for one or more subsets of a dataset.
2019-09-04¶
Added new Pump finished overdue notification rule type.
Added new Pump failed notification rule type.
2019-08-27¶
DTL property path strings can now be quoted. In practice this means that you can have periods in path elements if you quote them. Example: "_S.foo.'john.doe''s'.bar" is now equivalent to ["path", ["list", "foo", "john.doe's", "bar"], "_S."]. A quoted path element must begin and end with a single quote. Single quotes can be escaped with ''.
Extended the JSON Pull Protocol document with information about response headers and an example using dataset subsets.
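The quoting rules for DTL property paths described in this entry can be sketched as a small splitter. This is an illustrative reading of those rules, not Sesam's parser: the function name is hypothetical, and the input is assumed to be the part of the path string that follows the variable prefix (for example the text after "_S.").

```python
from typing import List


def split_property_path(path: str) -> List[str]:
    """Split a dotted path into elements, honouring single-quoted elements."""
    elements: List[str] = []
    i, n = 0, len(path)
    while i < n:
        if path[i] == "'":
            # Quoted element: read up to the closing quote; '' is a literal quote.
            i += 1
            chars: List[str] = []
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted path element")
                if path[i] == "'":
                    if i + 1 < n and path[i + 1] == "'":
                        chars.append("'")
                        i += 2
                        continue
                    i += 1  # consume the closing quote
                    break
                chars.append(path[i])
                i += 1
            elements.append("".join(chars))
        else:
            # Unquoted element: read up to the next period (or the end).
            end = path.find(".", i)
            if end == -1:
                end = n
            elements.append(path[i:end])
            i = end
        # After an element we expect either the end or a separating period.
        if i < n:
            if path[i] != ".":
                raise ValueError("expected '.' at position %d" % i)
            i += 1
    return elements
```

Applied to the element part of the example in this entry, the sketch yields the three elements foo, john.doe's and bar.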
2019-08-26¶
We’ve added support for a feature called completeness. When a pipe completes a successful run the sink dataset will inherit the smallest completeness timestamp value of the source datasets and the related datasets. Inbound pipes will use the current time as the completeness timestamp value. This mechanism has been introduced so that a pipe can hold off processing source entities that are more recent than the source dataset’s completeness timestamp value. The propagation of these timestamp values is done automatically. Individual datasets can be excluded from completeness timestamp calculation via the exclude_completeness property on the pipe. One can enable the completeness filtering feature on a pipe by setting the completeness property on the dataset source to true.
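The propagation rule above (inherit the smallest completeness timestamp, skipping excluded datasets) can be pictured with the following sketch. The function names and the dictionary-based input are assumptions for illustration; the real propagation happens automatically inside the service.

```python
from datetime import datetime
from typing import Dict, Iterable, Optional


def inherited_completeness(
    timestamps: Dict[str, datetime],
    exclude_completeness: Iterable[str] = (),
) -> Optional[datetime]:
    """Return the smallest completeness timestamp among the upstream datasets.

    timestamps maps dataset id to its completeness value; ids listed in
    exclude_completeness are left out of the calculation.
    """
    excluded = set(exclude_completeness)
    candidates = [ts for ds, ts in timestamps.items() if ds not in excluded]
    return min(candidates) if candidates else None


def is_ready(entity_updated: datetime, completeness: datetime) -> bool:
    """An entity newer than the completeness value is held back for now."""
    return entity_updated <= completeness
```

Taking the minimum means the sink dataset is never considered more complete than its least complete input.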
2019-08-19¶
Pipes now have a property called reprocessing_policy that can be set to cause automatic resets when external factors indicate that the pipe should be reset.
2019-08-12¶
The dataset sink now has a property called set_initial_offset that specifies how the sink should set the initial offset on the sink dataset (a.k.a. the populated flag).
2019-05-31¶
Added experimental support for automatic scheduling of internal (dataset to dataset) pipes and JSON pipes that read from external Sesam datasets via the REST API. See the supports_signalling property of these sources and the global use_signalling_internally and use_signalling_externally options in the service metadata section. Please note the limitations and usage notes.
2019-04-23¶
The embedded source now has configurable continuation properties, i.e. supports_since, is_chronological and is_since_comparable.
2019-04-01¶
The “dtl” transform will now fail if the target entity’s _id property is either missing or is not a string. It will also do so if the argument to “create” or “create-child” is not a dict, is missing the _id property, or has an _id property of a non-string type. This is a change in default behaviour, but it is possible to opt out of this new behaviour by setting the id_required property to false. The stricter check makes it easier to discover logic errors.
2019-03-26¶
The track_children property on the dataset sink is now inferred to be true if any of the pipe’s transforms use the create-child DTL function. It is possible to override this by setting the property’s value to false.
2019-03-22¶
The lookup DTL function has been deprecated and replaced with the lookup-entity function. Note that the dataset referenced in its first argument must be populated before the parent pipe will run.
2019-03-14¶
The valid characters in pipe and system ids have been restricted to be valid DNS name components. In practice this means that the first character must be a letter or a digit and the rest must be letters, digits and hyphens. The maximum length is 62. Invalid ids will trigger a validation warning.
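The id rule stated above can be expressed as a short check. This is a sketch under assumptions: the function name is hypothetical, and “letters” is read as ASCII letters in either case, which the entry does not spell out.

```python
import re

# First character: letter or digit. Remaining characters: letters, digits, hyphens.
_ID_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]*")
MAX_ID_LENGTH = 62  # maximum length given in the entry above


def is_valid_component_id(component_id: str) -> bool:
    """Return True if the id satisfies the character and length rules."""
    return (
        len(component_id) <= MAX_ID_LENGTH
        and _ID_PATTERN.fullmatch(component_id) is not None
    )
```

Under this reading an id such as my-pipe-1 passes, while ids that start with a hyphen or contain an underscore would trigger the validation warning.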
2019-03-13¶
A source that has supports_since=true, is_since_comparable=false and is_chronological=true will now use the chronological continuation strategy. Earlier it used no continuation strategy.
2019-02-27¶
2019-02-15¶
Made the URL system throw an error if it received an invalid ‘Content-Length’ response header value. The URL system used to ignore such errors; the new ignore_invalid_content_length_response_header property can be set to get the old behaviour.
2019-02-14¶
Added the docker.hosts property to the microservice system. This allows adding custom hostname to IP address mappings to the microservice container.
2019-02-13¶
Added a new coerce_to_decimal property to the Oracle and Oracle TNS systems. If set to true, it will force the use of the decimal type for all “numeric” types (i.e. numbers with precision and scale information). Currently what type the column data ends up as is not clearly defined by the oracle backend driver so in some cases it may yield a float value instead of a decimal value. This property should always be set to true if your flows care if numeric values are floats or decimals. The default value is false.
2019-02-07¶
We’ve changed the default strategy for pipe execution logging. By default, we will now never log any runs which resulted in no processed/changed entities. You can opt in to the previous behaviour by editing the log_events_noop_runs, log_events_noop_runs_changes_only and notification_granularity pump properties.
2019-02-04¶
There is now a new index implementation (version 2) that supports bidirectional traversal and that can be used to expose incremental feeds for one or more subsets of a dataset. Index version 1 is currently the default. Nodes must be started with a special command line option in order to change the default value. Version 2 will be made the default at some point once we have enough experience with it.
The dataset and json sources now support the subset property. This property is used to specify a subset of the source dataset.
The hops and apply-hops DTL functions now support the prefilters property. This property is used to specify a subset of the dataset that it is hopped to.
The GET /api/datasets/{dataset_id}/indexes API endpoint now includes the indexes’ version number.
The DELETE /datasets/{dataset_id}/indexes/{index_int_id} API endpoint has been added. It can be used to delete a dataset index.
2019-01-28¶
Compaction is now incremental, so it will continue from where it got to the last time.
Compaction will be performed by the dataset sink if compaction.sink is set to true in the pipe configuration. This is only available for pipes using the dataset sink. If sink compaction is enabled no scheduled compaction will be done on the dataset as this is no longer necessary. Index compaction will still require scheduled compaction, but this does not require a lock on the dataset. Note that sink compaction is currently experimental.
Automatic compaction will now kick in if there are 10% or 10000 new dataset offsets since the last compaction. The 10000 cap is fixed for now.
2019-01-03¶
The dataset sink will now mark the sink dataset as populated when all input datasets are populated and all entities have been read from them. Earlier it marked the sink dataset as populated after the first completed run. This was typically not what you wanted as it caused the sink datasets to be prematurely populated, which then caused unnecessary dependency tracking.
Added the initial_datasets property to the merge, merge_datasets, union_datasets, and diff_datasets sources. This property should only be used if some of the input datasets will never be populated. The property should then list the datasets that have to be populated before the sink datasets should be populated.
2018-12-07¶
Casting decimal numbers containing a “scientific notation” shorthand (i.e. “1E-3”, “10E14” etc) to a string using the DTL string function will now expand the exponent to its full representation (i.e. “1E2” -> “100”, “1E-3” -> “0.001”). This is a change in behaviour.
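The expansion of the exponent can be reproduced with Python's decimal module, shown here only to make the examples in this entry concrete. The helper name is hypothetical and the sketch says nothing about how the DTL string function is implemented.

```python
from decimal import Decimal


def decimal_to_plain_string(value: Decimal) -> str:
    """Render a decimal without scientific notation, e.g. 1E-3 as 0.001."""
    return format(value, "f")
```

With this helper, Decimal("1E2") renders as 100 and Decimal("1E-3") as 0.001, matching the examples in the entry.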
2018-11-12¶
["matches", "x*", ["list"]] now returns false instead of true. Note that this is a breaking change, but the old behaviour was considered a bug as it is both non-intuitive and most likely not what you want.
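One way to picture the change to matches is the sketch below. It rests on two assumptions that this entry does not confirm: that the pattern is glob-style, and that the function is true only when every value matches. Under those assumptions an empty list is trivially “all matching” unless it is handled explicitly.

```python
from fnmatch import fnmatchcase
from typing import List


def matches(pattern: str, values: List[str]) -> bool:
    """Return True only if there is at least one value and all values match."""
    if not values:
        return False  # new behaviour: an empty list no longer counts as a match
    return all(fnmatchcase(value, pattern) for value in values)
```

Without the explicit empty check, all() over an empty sequence is True, which mirrors the old result described in this entry.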
2018-10-31¶
Added the sslmode property to the PostgreSQL system. Its default value (prefer) reflects the PostgreSQL client library default, hence you should only set this property if you need other behaviour than the default.
2018-10-25¶
Added the Kafka system, Kafka source and Kafka sink.
2018-10-16¶
Added the compaction.growth_threshold property to the pipe configuration. This lets you specify when dataset compaction kicks in.
The compaction.keep_versions property can now also be set to 0 and 1. The default value is 2, which is needed for dependency tracking to be fully able to find reprocessable entities. Setting it to a lower value means that dependency tracking is best effort only.
2018-09-24¶
Added a new recreate_table_on_first_run boolean flag to the sql sink. It controls if Sesam should recreate the table from schema_definition when the pipe is reset or runs for the first time. Note that this requires the create_table_if_missing property to also be set to true to take effect.
Altered the way the PK is created on schema definition generation. If the sink type is sql and create_table_if_missing is set to true, the default primary key is the _id property of the entities. Previously it would always look for a property with the same contents as _id (which is still the default for non-sql sink pipes).
2018-09-03¶
Added a fallback_to_single_entities_on_batch_fail boolean flag to the pump configuration. The default reflects the current behaviour (true). It can be useful to set it to false if the cost of processing a single entity at a time is high and there are a lot of entities in a batch (for example in a typical MS SQL sink in initial bulk upload mode).
2018-08-24¶
Datasets that are not populated will no longer be compacted.
2018-08-10¶
Receiver and publisher pipes can now be disabled.
2018-08-02¶
Added support in the split DTL function to split string into characters using the empty separator.
2018-07-04¶
Added a translation GUI for the GDPR platform. This GUI makes it much easier to customize the various texts used by the GDPR portal.
2018-06-26¶
2018-06-25¶
Changed the base64-encode and base64-decode DTL functions to only accept bytes and string input respectively.
Added support for bytes input to the string casting function. The encoding used is utf-8.
Added a bytes casting function that casts strings to (utf-8 encoded) bytes representation.
2018-06-19¶
Added a RDF transform, similar to the XML transform. It will render entities to a NTriples string and embed it in the transformed entity.
Added the base64-encode and base64-decode DTL functions.
2018-06-06¶
Changed default behaviour of the CSV source: if dialect is set, this will override the default value of auto_dialect. Previously you would have to both turn off auto_dialect and set dialect. Note that if auto_dialect is false and no dialect has been set, the excel dialect is used as default.
The is_chronological property on the SQL source is now dynamic as it is true if the updated_column and table properties are set.
Added the is_chronological_full property to the SQL source. If explicitly set to false then a full run will not consider the source to be chronological even though it is chronological in incremental runs. The default value is the value of is_chronological, but it can be set to false.
2018-06-05¶
The old dead_letter_dataset pump configuration option (string) has been deprecated and replaced by use_dead_letter_dataset, which is a boolean flag (false by default). If set to true, the id of the dead letter dataset is automatically generated and linked to the parent pipe id (system:dead-letter:pipe-id). Note that entities written to this new dataset will no longer have the pipe id as part of their _id property. This new dataset will inherit the ACLs from its parent pipe (like pump execution datasets). If the pipe is removed, the automatically created dataset is also removed. The old dead_letter_dataset property will continue to work as before but will be removed at some future date.
2018-05-29¶
Added the checkpoint_interval property to the pipe. The default has been changed from 1 to 100, which means that the pipe offset is now saved after every 100 batches instead of after every batch. The default is effectively every 10000 entities, but since it is dependent on batch_size the default value is 100 (i.e. 10000/batch_size). Note that the pipe offset is always saved at the end of every sync if it changed.
Pipes that perform deletion tracking will now have their pipe offset and deletion tracking state saved every 15 minutes or so. If a pipe is interrupted it will now be able to continue doing deletion tracking from where it last saved its state.
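The arithmetic behind the checkpoint_interval default (roughly 10000 entities divided by batch_size) can be written out as below. The function name, the integer division, and the lower bound of one batch are assumptions; the entry only states the 10000/batch_size relationship and the resulting default of 100.

```python
def checkpoint_interval_for(batch_size: int, target_entities: int = 10000) -> int:
    """Number of batches between saved pipe offsets for a given batch size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Assumption: never checkpoint less often than once per batch.
    return max(1, target_entities // batch_size)
```

With a batch_size of 100 this gives the documented default of 100 batches.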
2018-05-02¶
2018-04-30¶
A partial rescan can now be scheduled on a pump by specifying the two properties partial_rescan_count and partial_rescan_delta.
2018-04-27¶
Added the hash128 DTL function. It generates 128 bit integer hashes from bytes and strings.
2018-04-26¶
The sink dataset and the dead-letter dataset will now be asserted when the pipe is loaded. Receiver datasets, i.e. sink datasets that are used in combination with the http_endpoint source, will be automatically populated at the same time. Note that it is possible to opt out of this behaviour by setting auto_populate_dataset to false on the http_endpoint source. Dead-letter datasets are automatically populated, and it is not possible to opt out.
Note that this is a change in behaviour, but in most situations it is the right thing to do. If the initial push to the receiver is a full sync, then it might be good to set auto_populate_dataset to false. The reason why this is useful for full syncs is that pipes doing hops against the dataset will then wait until the sync is complete and the dataset is populated.
2018-04-23¶
Processing of namespaced identifiers has gotten a decent performance boost.
Regression: The make-ni DTL function will now return a sorted list of NIs. Earlier the sorting was done by sorting the keys of the source entity, which is a much more expensive thing to do.
2018-04-19¶
Added support for circuit breakers, a safety mechanism that one can enable on the dataset sink. The circuit breaker will trip if the number of entities written to a dataset in a pipe run exceeds a certain configurable limit.
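The circuit breaker idea can be sketched in a few lines of Python. The exception class, function name and `limit` parameter are hypothetical, and what Sesam does with entities already written when the breaker trips is not covered here.

```python
class CircuitBreakerTripped(Exception):
    """Raised when a pipe run writes more entities than the configured limit."""


def count_writes(entities, limit):
    """Count entities as they are written; trip once the count exceeds `limit`."""
    written = 0
    for _ in entities:
        written += 1
        if written > limit:
            raise CircuitBreakerTripped(
                f"{written} entities written, limit is {limit}"
            )
    return written
```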
2018-04-09¶
Added the round DTL function. It rounds to the nearest digit using the “round half to even” rule.
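The “round half to even” rule can be illustrated with Python's decimal module. The helper name and its `digits` parameter are assumptions made for this sketch and do not describe the signature of the DTL function.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(value, digits=0):
    """Round `value` to `digits` decimals; ties go to the nearest even digit."""
    quantum = Decimal(10) ** -digits  # digits=2 gives Decimal("0.01")
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_EVEN)
```

Under this rule 2.5 rounds down to 2 while 3.5 rounds up to 4, which avoids the upward bias of always rounding halves up.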
2018-03-20¶
Added oauth2 (BackendServerClient profile, aka “client credentials”) option to the URL system.
2018-03-07¶
Changed the default value of the node configuration setting “pipe_cleanup_after_deletion” to “true”. This means the node will remove any pipe-related data when a pipe is deleted (execution logs, ACLs, pipe offsets etc.).
2018-03-05¶
Added the map-values function. It maps over the values of dictionaries and returns a list of mapped values.
2018-02-27¶
The combine DTL function now allows a single argument. This is useful when you want to turn an expression into a list of values. It is extra useful when you don’t quite know if the value is a list or not. Example:
["combine", "_S.x"]
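The single-argument use of combine can be approximated in Python as follows. This is only a sketch of the "list or not a list" normalisation described in this entry; the function name mirrors the DTL one for readability, and details such as null handling are not modelled.

```python
def combine(*values):
    """Flatten one level: list arguments are extended, other values are appended."""
    result = []
    for value in values:
        if isinstance(value, list):
            result.extend(value)
        else:
            result.append(value)
    return result
```

Calling it with a single value returns a list whether or not the value was a list to begin with.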
2018-01-22¶
Added a content_disposition configuration property to be able to specify the type in the Content-Disposition HTTP response header to the HTTP endpoint sinks.
Added the possibility to specify the filename of the HTTP endpoint sinks as the last element of the URL (overrides any filename set in the configuration of the sink).
2018-01-16¶
Added the url-unquote function that URL unquotes any URL quoted characters in its input. See the related url-quote function.
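For readers unfamiliar with URL unquoting, Python's standard library offers the same kind of operation. The sketch below uses urllib.parse and is only an analogy; it makes no claim about how the DTL function is implemented.

```python
from urllib.parse import quote, unquote


def round_trip(text):
    """Quote every reserved character in `text`, then unquote it again."""
    quoted = quote(text, safe="")
    return quoted, unquote(quoted)
```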
2018-01-15¶
The RDF source and SDShare source now support the sort_lists property to automatically sort resulting properties containing lists (i.e. RDF statements having the same predicate). It is true by default.
2017-12-15¶
The JSON source now supports the page_size property.
2017-12-14¶
Added encrypt-pgp and decrypt-pgp DTL functions that can encrypt strings to OpenPGP messages using a PGP public key and decrypt these messages back to strings using a PGP private key and its associated password.
2017-12-12¶
Added encrypt-pki and decrypt-pki DTL functions that can asymmetrically encrypt strings to bytes and decrypt bytes to strings using a PKI public/private key-pair in DEM format (PKCSv8). The encryption is performed using RSA 2048 bits with SHA-1 hashes and OAEP/MGF1 padding.
2017-11-23¶
Added Databrowser documentation.
2017-11-22¶
Added the Pattern match notification rule type.
2017-11-15¶
Added the intersects DTL function. This boolean function returns true if there is an overlap between the values in the two arguments.
The DTL compiler will now issue a warning if you try to perform two or more join expressions between the same two dataset aliases. It is there to notify you of possible cardinality issues and to tell you about the tuples function, which may be used to avoid cardinality issues.
When there are two or more join expressions between the same two dataset aliases only the first one is treated as a join expression; the rest of them are equality comparisons. One can use the tuples function to combine them into one big join expression at the cost of composite indexes being used.
Warning
Note that the eq function serves a dual purpose. It can be used both for join expressions and for equality comparisons. These two are different in that a join uses intersection (similar to the intersects function) and the equality comparison is an exact match. Use the intersects function if you want to check for intersection/overlap instead of an exact match.
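The difference between intersection and exact match can be shown with a small Python sketch. The helper names are hypothetical and the values are assumed to be hashable; the sketch illustrates the two semantics only, not the DTL implementation.

```python
def as_set(value):
    """Treat a single value as a one-element collection."""
    return set(value) if isinstance(value, list) else {value}


def intersects(a, b):
    """True if the two arguments share at least one value (join-style semantics)."""
    return bool(as_set(a) & as_set(b))


def exact_match(a, b):
    """True only if the two arguments are equal as a whole."""
    return a == b
```

Two lists that share one value overlap but are not an exact match, which is why the two uses of eq can give different results.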
2017-11-08¶
The JSON push sink now supports customizable HTTP headers via a headers property.
2017-10-12¶
Documented the JSON Pull Protocol.
2017-10-09¶
If a pipe is running and the pipe-config is modified, the pipe will no longer be stopped. Instead, the warning “An old version of the pipe is still running” will be displayed, and it is up to the user whether to stop the running pipe or not.
2017-09-06¶
Improved and expanded documentation on namespaced identifiers and the features related to it.
Moved the deprecations to a separate document.
2017-09-05¶
Added a track_dead_letters option to the pump configuration. If set to true, it will delete “dead” entities from the dead letter dataset if a later version of it is successfully written to the sink. Note that using this option incurs a performance cost so use with care.
2017-08-23¶
It is now possible to specify track-dependencies on all the HOPS_SPEC in a specific hops DTL function. This change was made so that one can disable tracking for any of the HOP_SPECs, not just the last one.
2017-08-16¶
The json-parse and json-transit-parse DTL functions now accept an optional default value expression. The default value expression is used when the input value is not valid JSON.
2017-08-08¶
The datetime-parse and datetime-format DTL functions now accept an optional timezone argument. This makes it possible to parse datetime strings and format datetime values in specific timezones.
2017-06-29¶
When a pipe is reset then the pipe’s retry queue is now also reset.
Bug fix: It is now possible to interrupt pumps that are performing retries.
Indexing of datasets changed so that each dataset is indexed for a maximum of five minutes in each iteration. This prevents some datasets from being blocked from indexing when there are other large datasets being indexed.
2017-06-26¶
Added the enumerate DTL function that can be used to enumerate values, i.e. combine values with an enumeration count.
Added the json-parse and json-transit-parse DTL functions.
2017-06-23¶
Added a conditional transform. This works the same way as conditional sinks and sources.
2017-06-20¶
Added functionality for preventing all pipes from automatically running (useful in some debugging scenarios). See the Low level debugging page for details.
2017-06-16¶
Added an is_sorted property to the RDF source to indicate that the input data is sorted on subject, enabling the source to avoid loading the entire file into memory. Note that it only works for nt (NTriples) format files without blank nodes.
2017-06-12¶
Added a write_retry_delay property to pipe pumps. This is used in conjunction with max_consecutive_write_errors when the system the pipe is writing to is known to be sporadically (non-transiently) unavailable. See the Pump section for details.
2017-06-08¶
The Security document now contains a description of users, roles and permissions in Sesam.
2017-05-31¶
Added support for bulk operations in the SQL sink. Bulk operations are currently only supported for the MSSQL and Microsoft Azure SQL Data Warehouse systems.
2017-05-29¶
Added the indexes property to the dataset sink. If set to "$ids" then an index will be maintained for the $ids property. This index will then be used by the dataset browser to look up entities both by _id and $ids.
The default value of the max_depth property in hops has been changed from null to 10. This means that the default is to stop the recursion at level 10.
2017-05-26¶
The JSON push protocol has been simplified to make it easier to write receivers. It will now always send the entities as an array, even if it contains just a single object. The JSON push sink has been updated to reflect this. If you need single-object JSON POST/PUT operations, you should use the REST sink instead.
Systems now support environment variables in their config like pipes do.
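The "always an array" rule of the simplified JSON push protocol described in this entry can be sketched like this. The function name is hypothetical and the sketch covers only the shape of the request body, not headers or sync parameters.

```python
import json


def push_payload(entities):
    """Serialise entities as a JSON array, even when given a single object."""
    if isinstance(entities, dict):
        entities = [entities]
    return json.dumps(list(entities))
```

A receiver written against this shape never needs to distinguish between a single object and a list.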
2017-04-28¶
The equality property on the merge source is now optional.
2017-04-24¶
Changed the default value of the “schedule_interval” pump configuration property. Before, the default value was 30 seconds for all pipes. The new default value for pipes with a dataset source and a dataset sink is now 30 seconds +/- 1.5 seconds. For all other pipes, the default is 900 seconds +/- 45 seconds. (The +/- part helps stagger the start-time of the pipes, so that we don’t get lots of pipes starting at the same instant.)
Added a warning in the GUI for non-internal pipes that don’t have a “schedule_interval” or a “cron_expression” attribute set.
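The staggered defaults in this entry can be illustrated with a short Python sketch. The function name and the `dataset_to_dataset` flag are hypothetical, and a uniform distribution is assumed because the entry does not say how the offset is chosen.

```python
import random


def default_schedule_interval(dataset_to_dataset, rng=random):
    """Return a default interval in seconds with a random stagger applied."""
    # 30 s +/- 1.5 s for dataset-to-dataset pipes, 900 s +/- 45 s for all others.
    base, spread = (30.0, 1.5) if dataset_to_dataset else (900.0, 45.0)
    return base + rng.uniform(-spread, spread)
```

In both cases the spread is 5% of the base value, so pipes created together drift apart instead of all starting at the same instant.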
2017-03-30¶
Extended all systems to accept a new property worker_threads that limits the number of concurrent pipes that can run against a particular system. The default value is 10. For inbound pipes the source system is used and for outbound pipes the sink system is used. For internal pipes, the pool has 50 worker threads (i.e. for dataset to dataset pipes or receiver/publisher endpoints).
2017-03-24¶
Extended the URL system and REST system to accept default custom request headers using the headers property. Also fixed the REST system schema to reflect authentication options and the jwt_token property.
2017-03-16¶
The JSON Push Protocol document now contains examples of how to use curl to perform incremental and full syncs.
2017-03-15¶
Added the _R variable, which can be used to refer to the root context in a DTL transform.
2017-03-14¶
The base_url property of the URL system and REST system has been deprecated. It has been superseded by the url_pattern property.
2017-03-09¶
Added the is-changed DTL function that can be used to compare data from the current and the previous version of the source entity.
2017-03-02¶
Added a conditional source and conditional sink that can pick from a list of actual candidates, typically controlled by an environment variable.
2017-03-01¶
Added a substring DTL function that returns a substring of another string given a start and end index.
2017-02-28¶
2017-02-20¶
Added the url_pattern property to the URL system. This property gives you more control over how absolute URLs are produced. It can be used instead of the base_url property.
2017-02-14¶
Added a jwt authentication scheme and jwt_token property to the URL system.
2017-02-06¶
Added text_body_template and text_body_template_property properties to the Email message sink. Use these to explicitly construct a plain-text version of your messages if sending multi-part messages.
2017-02-03¶
For security reasons, the Mail and SMS sinks no longer support file-based templates. Note that this is a non-backwards compatible change. You can use environment variables and upload your existing template files using the environment variable API or the corresponding Management Studio form.
2017-02-01¶
Datasets are now scheduled for automatic compaction once every 24 hours. The default is to keep the last 2 versions up until the current time. It is possible to customize the automatic compaction. See documentation on compaction for more information.
2017-01-26¶
The SQL source no longer includes columns with null values by default. You can include them by setting the preserve_null_values property of the SQL source to true. Note that this is a change of the previous default behaviour.
The CSV source no longer includes empty string values by default. You can include these by setting the CSV source property preserve_empty_strings to true. Note that this is a change in the default behaviour.
2017-01-23¶
The dict function now takes zero, one or an even number of arguments. If zero arguments are given then an empty dict is returned. If an even number of arguments is given then a new dict is returned with each pair of arguments as key and value. The latter is convenient for easy construction of dicts.
The transform functions add and default now take an expression in their first argument. This means that the properties can be dynamic and that there can be multiple. rename now takes dynamic arguments in the first and second positions.
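The zero-argument and even-argument forms of the dict function described in this entry can be mimicked in Python as below. The name `dtl_dict` is hypothetical, and the single-argument form is deliberately left out because the entry does not describe its result.

```python
def dtl_dict(*args):
    """Build a dict from alternating key/value arguments."""
    if not args:
        return {}
    if len(args) == 1:
        # The one-argument form exists in DTL but is not covered by this sketch.
        raise NotImplementedError("single-argument form is not modelled here")
    if len(args) % 2 != 0:
        raise ValueError("expected zero, one or an even number of arguments")
    # Pair up arguments: positions 0, 2, 4... are keys, 1, 3, 5... are values.
    return dict(zip(args[0::2], args[1::2]))
```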
2017-01-11¶
Documented the pool_recycle option on SQL systems and changed its default from -1 (no recycling) to 1800 (30 minutes).
2017-01-06¶
Added the merge source. This is a data source that is able to infer the sameness of entities across multiple datasets.
2017-01-04¶
Added an unhandled_template_variable_replacement property to the Email Message sink.
2016-12-20¶
Added a uuid DTL function. It takes no parameters and returns a UUID object (type 4).
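A type 4 UUID is the randomly generated variant. As a point of comparison, Python's standard library produces the same kind of value; the wrapper name below is hypothetical and nothing here describes the internals of the DTL function.

```python
import uuid


def new_uuid():
    """Return a random (version 4) UUID."""
    return uuid.uuid4()
```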
2016-12-19¶
Added a disable_set_last_seen property to the Pipe properties. If set to true, it will not be possible to set or reset the last seen bookmark on the pipe using the API (i.e. protecting it from accidental changes by principals with write permission on the pipe).
2016-12-15¶
Added a read_retry_delay property to pipe pumps. This is used in conjunction with max_read_retries when the source is known to be sporadically (non-transiently) unavailable. See the Pump section for details.
2016-12-07¶
The documentation on cron expressions now makes it clear that they are evaluated in the UTC timezone.
2016-12-06¶
The concat DTL function now takes a variable number of arguments. This avoids constructing unnecessary lists.
2016-11-30¶
The url-quote DTL function now takes an optional SAFE_CHARS argument. This is especially useful when you don’t want to quote the / character.
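The effect of a "safe characters" argument can be seen with Python's urllib.parse.quote, which has a comparable `safe` parameter. This is an analogy only; the defaults of the DTL function may differ from Python's.

```python
from urllib.parse import quote


def quote_path(path, keep_slashes=True):
    """Quote `path`, optionally leaving "/" characters untouched."""
    return quote(path, safe="/" if keep_slashes else "")
```

With `keep_slashes=True` the path separators survive, while spaces and other reserved characters are still percent-encoded.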
2016-11-22¶
The section on Continuation Support has been extended. Each source now has a Continuation support table that shows the source’s support for continuations.
2016-11-09¶
Added the json and json-transit DTL functions.
The group-by DTL function has been changed to always return string keys. The string keys are JSON transit encoded (the same type of string as the json-transit function produces). The reason is that the entity data model (and JSON) only supports string keys. group-by has also gotten an optional STRING_FUNCTION argument which lets you specify a custom function to create the string keys.
The sorted, sorted-descending, min, max DTL functions have been updated to support mixed type ordering.
2016-11-07¶
Added the microservice system (Experimental).
2016-11-03¶
Added the filename property to the HTTP endpoint sink, XML endpoint sink and CSV endpoint sink. This property provides a hint to HTTP clients on what filename to use when downloading data (via the Content-Disposition header property).
2016-10-18¶
Added the Embedded source. This is a data source that lets you embed data inside the configuration of the source. This is convenient when you have a small and static dataset.
2016-10-17¶
Added the XML transform and XML endpoint sink. These can be used to generate XML documents inline in entities or published to external consumers, respectively.
2016-10-13¶
Changed the CSV endpoint sink to not output deleted entities by default. Added a new skip-deleted-entities config parameter that can be set to false if one wants deleted entities to appear in the CSV output.
2016-10-04¶
Reworked DTL math functions to reflect that float is an allowed type in entities. If the function parameters are of mixed types, the result will be coerced to the type that is the most precise. I.e. float+decimal=decimal, int*float=float, int/div=decimal and so on. Note that this is a change in behaviour: entities that previously only had decimal values after using DTL math functions on float input may now end up with float values instead. Use the DTL decimal cast function to coerce the result to decimal if this is important to the application.
Added is-float and float DTL functions. Changed the is-decimal function so that it no longer returns true if the argument is a float. You will now have to add both an is-float and an is-decimal in an or clause to test for both types.
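The split between the two type tests can be pictured with Python's float and Decimal types. The helper names are hypothetical stand-ins for the DTL functions, and Python's own types are used only as an analogy for the entity data model.

```python
from decimal import Decimal


def is_float(value):
    """True only for float values."""
    return isinstance(value, float)


def is_decimal(value):
    """True only for Decimal values; a float no longer counts."""
    return isinstance(value, Decimal)


def is_float_or_decimal(value):
    """The combined test that previously a single decimal check would cover."""
    return is_float(value) or is_decimal(value)
```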
2016-09-28¶
Added Elasticsearch support, which includes a system and a sink.
Added the commit_at_end property to the Solr sink.
Moved the commit_within property from the Solr system to the Solr sink. The reason is that the commit rate is really specific to how and where it is used. This change is backward compatible, as the default value is taken from the system. It is recommended to update the configuration files accordingly.
2016-09-28¶
Fixed the documentation for the merge DTL transform; it mistakenly stated that the merge transformation would not overwrite existing attributes in the target entity.
Updated the /api/config “GET” endpoint to format the JSON in a more human-readable way.
2016-09-22¶
Added index inspection on datasets.
Added new analyze-dtl operation.
Fixed automatic index creation for the run-dtl operation.
Linked to the changelog from the Management Studio.
2016-09-21¶
Added the datetime-shift DTL function.
Added support for timezones to the datetime-parse DTL function.
Added missing sink and source prototypes in the “Edit pipe” GUI in Management Studio.
Fixed a bug that prevented users from adding a system in Management Studio.
2016-09-20¶
Fixed missing validation in the /api/pipes “POST” endpoint and added support for the “force” parameter.
Fixed missing validation in the /api/pipes/{pipe_id}/config “PUT” endpoint and added support for the “force” parameter.
Fixed missing validation in the /api/systems “POST” endpoint and added support for the “force” parameter.
Fixed missing validation in the /api/systems/{system_id}/config “PUT” endpoint and added support for the “force” parameter.