Enrich data

In this phase we will add value to the data we collected previously by enriching it semantically.

Objectives:

After you complete this tutorial you will have learned the following:

  1. How to add NIs (namespaced identifiers) to your entity metadata

  2. How to add rdf:types to your entities

  3. How to add namespaces to your entities

Prerequisites

Before starting this tutorial, we suggest you complete the Collect data tutorial, because this tutorial uses the data collected there.



Add semantic value

In this step we will add semantic value to our data.

The HubSpot Data

In order to semantically enrich your HubSpot company data, follow the steps below.

  1. Navigate to Pipes

  2. Click on New pipe

  3. Paste and save the DTL configuration below

  4. Press Start to ensure your pipe runs

  5. Press Refresh to see the number of entities processed (should be 10). You can also inspect the entities on the pipe’s output page.

{
  "_id": "hubspot-company-enrich",
  "type": "pipe",
  "source": {
    "type": "dataset",
    "dataset": "hubspot-company-collect"
  },
  "transform": {
    "type": "dtl",
    "rules": {
      "default": [
        ["copy", "*"],
        ["merge-union",
          ["apply", "contact-ni", "_S.associations.contacts.results"]
        ],
        ["add", "rdf:type",
          ["ni", "hubspot:company"]
        ]
      ],
      "contact-ni": [
        ["add",
          ["concat", "_S.type", "-ni"],
          ["ni", "hubspot-contact", "_S.id"]
        ]
      ]
    }
  },
  "add_namespaces": true,
  "namespaces": {
    "identity": "hubspot-company",
    "property": "hubspot-company"
  }
}

Each company in the pipe’s output should now have two new properties, one for each type of contact association, containing NIs that point to the company’s contacts. The following example shows these properties for hubspot-company:5633255395.

"hubspot-company:company_to_contact-ni": [
  "~:hubspot-contact:5751",
  "~:hubspot-contact:5803",
  "~:hubspot-contact:6151"
],
"hubspot-company:company_to_contact_unlabeled-ni": [
  "~:hubspot-contact:5751",
  "~:hubspot-contact:5803",
  "~:hubspot-contact:6151"
]
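To make the effect of the contact-ni rule easier to follow, here is a minimal Python sketch that emulates it outside Sesam. It is not how Sesam evaluates DTL; it only mirrors the observable result. The helper names (ni, contact_ni_properties) and the sample entity are hypothetical, and the sketch assumes that each item in associations.contacts.results carries an id and a type, and that merge-union keeps each value once per property.

```python
from collections import defaultdict


def ni(namespace, value):
    """Build a namespaced identifier in the "~:namespace:value" form shown above."""
    return f"~:{namespace}:{value}"


def contact_ni_properties(company, property_namespace="hubspot-company"):
    """Emulate the "contact-ni" rule combined with "merge-union".

    Each association yields one property named "<type>-ni"; values that land
    on the same property name are collected into a single list.
    """
    merged = defaultdict(list)
    results = company.get("associations", {}).get("contacts", {}).get("results", [])
    for association in results:
        # ["concat", "_S.type", "-ni"], prefixed with the property namespace
        name = f"{property_namespace}:{association['type']}-ni"
        # ["ni", "hubspot-contact", "_S.id"]
        value = ni("hubspot-contact", association["id"])
        if value not in merged[name]:  # assumption: a union keeps each value once
            merged[name].append(value)
    return dict(merged)


# Hypothetical source entity, shaped like the association list the rule reads.
company = {
    "id": "5633255395",
    "associations": {"contacts": {"results": [
        {"id": "5751", "type": "company_to_contact"},
        {"id": "5751", "type": "company_to_contact_unlabeled"},
        {"id": "5803", "type": "company_to_contact"},
        {"id": "5803", "type": "company_to_contact_unlabeled"},
    ]}},
}

enriched = contact_ni_properties(company)
```

Because the property name is derived from the association type, a new association type in HubSpot would show up as an additional property without any change to the rule.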

The Enhetsregisteret Data

For the Enhetsregisteret data we will only add namespaces and the rdf:type property.

Follow the steps below to create the Enrich pipe for the Enhetsregisteret data.

  1. Navigate to Pipes

  2. Click on New pipe

  3. Paste and save the DTL configuration below

  4. Press Start to ensure your pipe runs

  5. Press Refresh to see the number of entities processed (should be 10). You can also inspect the entities on the pipe’s output page.

{
  "_id": "enhetsregisteret-company-enrich",
  "type": "pipe",
  "source": {
    "type": "dataset",
    "dataset": "enhetsregisteret-company-collect"
  },
  "transform": {
    "type": "dtl",
    "rules": {
      "default": [
        ["copy", "*"],
        ["add", "rdf:type",
          ["ni", "enhetsregisteret:company"]
        ]
      ]
    }
  },
  "add_namespaces": true,
  "namespaces": {
    "identity": "enhetsregisteret-company",
    "property": "enhetsregisteret-company"
  }
}

The output entities should now have a namespace on every property, along with the new rdf:type property.
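The following Python sketch illustrates what the namespace settings and the rdf:type rule do to a single entity. It is an emulation under stated assumptions, not Sesam's implementation: the functions add_namespaces and enrich and the sample entity are hypothetical, and the sketch assumes that properties which already carry a namespace (such as rdf:type) and system properties starting with an underscore are left unprefixed.

```python
def add_namespaces(entity, identity_ns, property_ns):
    """Prefix the _id value and the plain property names with their namespaces."""
    out = {}
    for key, value in entity.items():
        if key == "_id":
            # The identity namespace is applied to the value of _id.
            out[key] = f"{identity_ns}:{value}"
        elif key.startswith("_") or ":" in key:
            # Assumption: system and already-namespaced properties stay as they are.
            out[key] = value
        else:
            out[f"{property_ns}:{key}"] = value
    return out


def enrich(entity):
    """Mirror the enhetsregisteret-company-enrich pipe for one entity."""
    enriched = dict(entity)  # stands in for ["copy", "*"]
    enriched["rdf:type"] = "~:enhetsregisteret:company"  # ["ni", "enhetsregisteret:company"]
    return add_namespaces(
        enriched,
        identity_ns="enhetsregisteret-company",
        property_ns="enhetsregisteret-company",
    )


# Hypothetical source entity; the values are made up for illustration.
source = {"_id": "123456789", "navn": "EKSEMPEL AS"}
result = enrich(source)
```

Comparing result with an entity on the pipe's output page is a quick way to confirm that the identity and property namespaces were applied as intended.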